Structured Outputs
Structured outputs ensure that model responses match specific formats such as JSON schemas, regex patterns, or predefined choices. Tinfoil uses vLLM’s guided decoding to constrain outputs by filtering next-token predictions, so responses conform to the requested format as they are generated.
Use response_format with the json_schema type for JSON outputs, or extra_body with structured_outputs for choice and regex constraints.
Benefits
- Format Enforcement: Token-level filtering ensures outputs match your exact format
- Type Safety: Works with Pydantic (Python), Zod (TypeScript), and native types in Go
- No Post-Processing: Outputs are guaranteed valid
- Deterministic Structure: Next-token prediction is restricted to tokens that keep the output valid, so the format is fixed even though the content can still vary with sampling
- Multiple Backends: Supports xgrammar and guidance backends
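To make the token-level filtering idea concrete, here is a toy sketch of a choice constraint applied during greedy decoding. It is not vLLM's implementation: the vocabulary, the scores, and the helper names (allowed_next_tokens, constrained_greedy_decode, toy_score) are all made up for illustration. The point is only that tokens which cannot lead to a valid output are masked out before the best token is picked.

```python
import math

def allowed_next_tokens(prefix: str, vocab: list[str], choices: list[str]) -> set[str]:
    """Return the tokens that keep prefix + token a prefix of some allowed choice."""
    allowed = set()
    for token in vocab:
        # Skip empty tokens so every decoding step makes progress.
        if token and any(choice.startswith(prefix + token) for choice in choices):
            allowed.add(token)
    return allowed

def constrained_greedy_decode(score, vocab: list[str], choices: list[str]) -> str:
    """Greedy decoding in which disallowed tokens are scored as -inf."""
    output = ""
    while output not in choices:
        allowed = allowed_next_tokens(output, vocab, choices)
        if not allowed:
            raise ValueError(f"no token in the vocabulary can extend {output!r}")
        # Mask every token outside the allowed set, then take the best remaining one.
        output += max(vocab, key=lambda t: score(output, t) if t in allowed else -math.inf)
    return output

# Hypothetical vocabulary and scores standing in for a real model.
vocab = ["great", "!", "pos", "neg", "itive", "ative"]
toy_scores = {"great": 5.0, "!": 3.0, "pos": 2.0, "neg": 1.0, "itive": 0.5, "ative": 0.4}

def toy_score(prefix: str, token: str) -> float:
    return toy_scores[token]

choices = ["positive", "negative"]
result = constrained_greedy_decode(toy_score, vocab, choices)
```

Without the mask, this toy model would emit "great" first because it has the highest score. With the mask, only "pos" and "neg" are eligible at the first step, and the decoded result is one of the allowed choices.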
Quick Start
Here are basic examples for each structured output type:
Choice
Restrict output to a predefined list:
from tinfoil import TinfoilAI

client = TinfoilAI(api_key="your-api-key")

response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[
        {"role": "user", "content": "Classify this sentiment: vLLM is wonderful!"}
    ],
    extra_body={"structured_outputs": {"choice": ["positive", "negative"]}},
)
print(response.choices[0].message.content)
Regex
Enforce regex patterns for formatted outputs:
from tinfoil import TinfoilAI

client = TinfoilAI(api_key="your-api-key")

response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[
        {"role": "user", "content": "Generate an example email address for Alan Turing, who works in Enigma. End in .com and new line."}
    ],
    # The pattern ends with a newline, and the stop sequence ends generation there.
    extra_body={"structured_outputs": {"regex": r"\w+@\w+\.com\n"}},
    stop=["\n"],
)
print(response.choices[0].message.content)
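If you want to double-check a regex-constrained response on the client side, a small local check with Python's re module can help. The sketch below assumes the stop sequence is trimmed from the returned content, so it restores the trailing newline before matching. The helper name matches_constraint is hypothetical and not part of the Tinfoil SDK.

```python
import re

# Same pattern that was sent in structured_outputs.
EMAIL_PATTERN = r"\w+@\w+\.com\n"

def matches_constraint(text: str, pattern: str = EMAIL_PATTERN) -> bool:
    """Check that text fully matches pattern, restoring a trimmed trailing newline."""
    # Assumption: the "\n" stop sequence was removed from the response content.
    candidate = text if text.endswith("\n") else text + "\n"
    return re.fullmatch(pattern, candidate) is not None

ok = matches_constraint("alan@enigma.com")
rejected = matches_constraint("alan.turing@enigma.com")  # "." is not matched by \w
```

Note that \w does not match a dot, so this particular pattern only allows a single word before the @ sign.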
JSON
Use response_format with json_schema type for reliable JSON generation:
from enum import Enum

from pydantic import BaseModel
from tinfoil import TinfoilAI

class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"

class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType

client = TinfoilAI(api_key="your-api-key")
json_schema = CarDescription.model_json_schema()

response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[
        {"role": "user", "content": "Output a JSON object with the brand, model, and car_type of the most iconic car from the 90's."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "car-description",
            "schema": json_schema,
        },
    },
)
print(response.choices[0].message.content)
Prompt explicitly for JSON. While structured outputs enforce valid JSON structure, the model produces more reliable results when your prompt explicitly requests JSON output and describes the expected fields. For example, use “Output a JSON object with…” rather than just “Generate…”.
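One way to keep the prompt and the schema in sync is to derive the field list in the prompt from the schema itself. The following is a minimal sketch under that assumption. The helpers describe_fields and build_json_prompt are hypothetical, and the schema is a hand-written JSON Schema dict rather than Pydantic output (fields defined through $ref would fall back to the generic label "value").

```python
def describe_fields(schema: dict) -> str:
    """List each top-level property as 'name (type)'."""
    parts = []
    for name, spec in schema.get("properties", {}).items():
        # Fall back to a generic label when the property has no inline type.
        parts.append(f"{name} ({spec.get('type', 'value')})")
    return ", ".join(parts)

def build_json_prompt(task: str, schema: dict) -> str:
    """Prefix the task with an explicit request for a JSON object and its fields."""
    return f"Output a JSON object with these fields: {describe_fields(schema)}. {task}"

car_schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "model": {"type": "string"},
    },
    "required": ["brand", "model"],
}

prompt = build_json_prompt("Describe the most iconic car from the 90's.", car_schema)
```

The resulting string can be passed as the user message content alongside the same schema in response_format.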
Advanced Features
Whitespace Pattern Override
Customize whitespace handling in JSON decoding by combining response_format with extra_body:
response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[...],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "my-schema",
            "schema": json_schema,
        },
    },
    extra_body={
        "structured_outputs": {
            "whitespace_pattern": r"[ \t\n]*"
        }
    },
)
Complex Nested Schemas
Build complex nested structures with Pydantic:
import json

from pydantic import BaseModel
from tinfoil import TinfoilAI

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class Employee(BaseModel):
    name: str
    age: int
    email: str | None
    addresses: list[Address]

class Company(BaseModel):
    name: str
    founded_year: int
    employees: list[Employee]
    headquarters: Address

client = TinfoilAI(api_key="your-api-key")

response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[
        {"role": "user", "content": "Output a JSON object for a company profile. The company is named TechCorp and has 2 employees. Include the company name, founded_year, employees (each with name, age, email, and addresses), and headquarters address."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "company",
            "schema": Company.model_json_schema(),
        },
    },
)

company_data = json.loads(response.choices[0].message.content)
print(f"Company: {company_data['name']}, Employees: {len(company_data['employees'])}")
Best Practices
Markdown-Wrapped Responses: Some models may wrap JSON responses in markdown code blocks (```json ... ```). Strip the formatting before parsing the JSON.
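A minimal sketch of that stripping step, using only the standard library, might look like the following. The helper names strip_markdown_fence and parse_json_response are hypothetical. The regex assumes the fence opens on its own line, optionally followed by a language tag such as json.

```python
import json
import re

# `{3} matches three backticks; the optional tag covers labels such as "json".
_FENCE_RE = re.compile(r"^\s*`{3}[a-zA-Z0-9_-]*\s*\n(.*?)\n?\s*`{3}\s*$", re.DOTALL)

def strip_markdown_fence(text: str) -> str:
    """Return the body of a fenced code block, or the stripped text if there is no fence."""
    match = _FENCE_RE.match(text)
    return match.group(1).strip() if match else text.strip()

def parse_json_response(text: str):
    """Parse JSON from a response that may or may not be wrapped in a markdown fence."""
    return json.loads(strip_markdown_fence(text))

fence = "`" * 3
wrapped = f'{fence}json\n{{"brand": "Mazda"}}\n{fence}'
plain = '{"brand": "Mazda"}'

from_wrapped = parse_json_response(wrapped)
from_plain = parse_json_response(plain)
```

Both calls return the same dictionary, so the helper is safe to apply whether or not the model added a fence.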
Use Low Temperature for Deterministic Outputs
response = client.chat.completions.create(
    model="<MODEL_NAME>",
    temperature=0.1,
    messages=[...],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "my-schema",
            "schema": schema,
        },
    },
)
Validate Responses
from pydantic import ValidationError

# MySchema stands for the Pydantic model whose schema was sent in response_format.
try:
    parsed = MySchema.model_validate_json(response.choices[0].message.content)
except ValidationError as e:
    print(f"Validation failed: {e}")
Enable Streaming for Large Responses
response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[...],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "my-schema",
            "schema": schema,
        },
    },
    stream=True,
)
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
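When streaming structured output, the partial text is usually not valid JSON until the stream finishes, so a common approach is to accumulate the deltas and parse once at the end. The sketch below illustrates this with stand-in chunk objects built from SimpleNamespace, which only mimic the chunk.choices[0].delta.content shape used above. The helpers collect_stream and fake_chunk are hypothetical.

```python
import json
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Concatenate the content deltas of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

def fake_chunk(text):
    """Build a stand-in object with the same attribute path as a real chunk."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

# Simulated stream: the JSON document is split across chunks.
fake_stream = [fake_chunk('{"brand": "Ma'), fake_chunk(None), fake_chunk('zda"}')]

full_text = collect_stream(fake_stream)
car = json.loads(full_text)  # parse only after the stream is complete
```

With a real client, the iterable returned by a stream=True call would be passed to collect_stream in place of fake_stream.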
Additional Resources