Safety models classify and moderate content based on custom policies you define.

OpenAI
GPT-OSS Safeguard 120B
gpt-oss-safeguard-120b
Parameters: 117B (5.1B active)
Context: 128K tokens
Strengths: Safety reasoning, bring-your-own-policy flexibility, full access to reasoning chains for debugging, configurable reasoning effort levels
Structured Outputs: Structured response formatting support
Best for: Content moderation, policy enforcement, LLM guardrails, and Trust & Safety labeling workflows
Configuration repo: tinfoilsh/confidential-gpt-oss-safeguard-120b
Safety Model: Classifies text against the custom safety policy you supply, rather than a fixed, built-in taxonomy.
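A bring-your-own-policy workflow typically sends the policy as the system message and the text to classify as the user message, then reads back a structured verdict. The sketch below shows that shape; the verdict field names (`violation`, `rationale`) and any endpoint details are illustrative assumptions, not confirmed specifics of this model's output format.

```python
import json

def build_messages(policy: str, content: str) -> list[dict]:
    """Package a custom safety policy and the text to classify as chat messages.

    The policy goes in the system role so the model applies it as its
    moderation rubric; the content to label goes in the user role.
    """
    return [
        {"role": "system", "content": policy},
        {"role": "user", "content": content},
    ]

def parse_verdict(raw: str) -> dict:
    """Parse a JSON verdict returned via structured outputs.

    The schema here (violation/rationale) is a hypothetical example of
    what you might request, not the model's fixed response format.
    """
    data = json.loads(raw)
    return {
        "violation": bool(data["violation"]),
        "rationale": data.get("rationale", ""),
    }

# Example: a one-rule policy and a candidate message.
policy = "Flag any content that solicits personal financial information."
messages = build_messages(policy, "Send me your card number to claim the prize.")
```

With an OpenAI-compatible client, `messages` would be passed to a chat-completions call with the model set to `gpt-oss-safeguard-120b`, and `parse_verdict` applied to the returned message content.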