Documentation Index
Fetch the complete documentation index at: https://docs.tinfoil.sh/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Tinfoil’s web search lets models augment their answers with fresh information from the web. Search runs inside a secure enclave and talks directly to Exa, a search provider with Zero Data Retention (ZDR). That gives you:- Query privacy: queries go from the enclave to Exa over TLS. Tinfoil never sees the query contents in plaintext.
- User anonymity from the search provider: all users share a single enclave-held API key, so Exa only sees the enclave’s IP address.
- Legal protections: Exa’s ZDR agreement ensures queries are never written to persistent storage or sent to external subprocessors.
- Optional PII protection: a safeguard model can block outgoing queries that contain sensitive information before they are sent to Exa.
- Optional prompt-injection protection: a safeguard model can scan search results and fetched pages for prompt-injection attempts and filter them out before the model sees them.
Read our blog post on private AI web search for background.
Choosing an API surface
You can enable web search on either of Tinfoil’s OpenAI-compatible endpoints:/v1/responses— recommended. Search progress and citations are surfaced through OpenAI’s nativeweb_search_callitems andresponse.web_search_call.*/response.output_text.annotation.addedstreaming events./v1/chat/completions— supported for compatibility. Citations are surfaced on the final message and ondelta.annotationsin streaming. The OpenAI Chat Completions spec has no native live-progress event for web search, so if you want a live progress UI there, see Streaming progress markers.
Quick start
Enable web search by addingweb_search_options to a Chat Completions request, or a web_search tool to a Responses request. Optionally add pii_check_options and/or prompt_injection_check_options to enable the safety filters.
Chat Completions (/v1/chat/completions)
Responses (/v1/responses)
Request options
Top-level request fields
| Field | API | Required | Description |
|---|---|---|---|
web_search_options | Chat Completions | Yes, to enable | Enables web search. Accepts the tuning fields listed below. |
tools: [{ "type": "web_search", ... }] | Responses | Yes, to enable | Enables web search. Per-tool fields match web_search_options. |
pii_check_options | Both | No | Block queries containing PII from being sent to Exa. Presence of the key enables the filter. |
prompt_injection_check_options | Both | No | Filter prompt-injection attempts out of search results and fetched pages. Presence of the key enables the filter. |
include: ["web_search_call.action.sources"] | Responses | No | Opt in to populating action.sources on web_search_call output items with the URLs each search returned. |
Search tuning fields
These fields are all optional. They have the same meaning on both APIs. On Chat Completions they go underweb_search_options.<field>; on Responses they go on the web_search tool entry (tools[].<field>).
| Field | Type | Description |
|---|---|---|
search_context_size | "low" | "medium" | "high" | Retrieval-depth tier. low favors short highlight snippets, medium is the default, high pulls more results and longer content per result. |
user_location | object | Approximate location context. Only approximate.country (ISO 3166-1 alpha-2) is honored today. |
filters.allowed_domains | string[] | Restrict search results to these hostnames. |
filters.excluded_domains | string[] | Drop these hostnames from search results. |
content_mode | "highlights" | "text" | Override what each result carries. Defaults to the tier choice implied by search_context_size. |
max_content_chars | integer | Per-result character budget for returned content. Defaults to the tier choice implied by search_context_size. |
category | string | Narrow search to a topical category (for example, news, research paper). |
start_published_date | YYYY-MM-DD or RFC 3339 | Only return results published on or after this date. |
end_published_date | YYYY-MM-DD or RFC 3339 | Only return results published on or before this date. |
max_age_hours | integer | Only return results from the last N hours. |
Safety filters are opt-in per request. Include
pii_check_options to block PII-bearing queries before they reach Exa, and prompt_injection_check_options to filter injection attempts out of search results and fetched pages. Both are independent; enabling one does not enable the other.Response format
Chat Completions
The response is a standard OpenAIchat.completion. Citations appear in two places:
- inline in the assistant text as ASCII markdown links (
[label](url)), - as structured
url_citationannotations on the assistant message, whosestart_index/end_indexspan the label text.
delta.annotations entries interleaved with delta.content:
The Chat Completions stream is a standard
chat.completion.chunk stream — there are no custom top-level events on this API. If you want live search progress on Chat Completions, see Streaming progress markers.Responses
Theresponse.output array contains a web_search_call item for each search or page fetch, followed by the assistant message item. Citations are attached to each output_text content part as flat url_citation annotations.
{type, url, start_index, end_index, title}) whereas Chat Completions nests it under url_citation.
When the request opts in with include: ["web_search_call.action.sources"], each search-kind web_search_call also carries the URLs the search returned:
Streaming events (Responses)
Responses streaming emits OpenAI’s standard events. The ones relevant to web search are:| Event | When |
|---|---|
response.output_item.added | A web_search_call item is surfaced, initially with status: "in_progress". |
response.web_search_call.in_progress | A search or page fetch is starting. |
response.web_search_call.searching | A search is being sent to the provider. Page fetches do not emit this event. |
response.web_search_call.completed | A search or fetch finished successfully. Not emitted on failure. |
response.output_item.done | Terminal event for the item. event.item.status is completed or failed. |
response.output_text.delta and response.output_text.done events. Citations arrive as response.output_text.annotation.added:
response.completed event carrying the full output and usage.
Failure and blocked statuses
A search or page fetch that fails surfaces withstatus: "failed" on the terminal web_search_call item (OpenAI’s web_search_call.status enum is in_progress, searching, completed, failed).
When a search is blocked by the PII filter, the spec-visible status is still failed because the OpenAI enum has no blocked value. The distinct “blocked by safety filter” signal is exposed through an optional _tinfoil sidecar that Tinfoil-aware clients can read to render a different affordance.
Optional: streaming progress markers (Chat Completions)
The OpenAI Chat Completions spec has no native event for web-search progress. If you want a live progress UI on Chat Completions (a spinner, a “searching the web…” line, a per-URL fetch indicator), you can opt in to Tinfoil progress markers. Send the request header:delta.content of chat.completion.chunk frames. Each marker is a standalone line:
statusfollows a simple lifecycle: onein_progressmarker when a search or fetch starts, followed by one terminal marker (completed,failed, orblocked).action.typeissearchfor web searches andopen_pagefor page fetches. A search that triggers multiple page fetches produces one marker pair per URL.erroris only present onfailedandblockedmarkers.sourcesis only present on terminal markers forsearchcalls that produced results. Each entry is a{url, title}pair attributing a citation to this specific call.titlemay be an empty string; clients should fall back to the URL or hostname in that case.
content string in non-streaming Chat Completions responses when the header is set, so you can render an identical progress timeline from either mode.
Clients that do not parse the markers will render them as text inside the assistant message. Either parse and strip them, or leave the header off to get a pristine stream with no markers.
Consuming markers on the client
A single regex is enough to extract and strip markers. The leading and trailing newlines are absorbed so the text before and after a marker collapses seamlessly:Optional: the _tinfoil sidecar
On the Responses API, web_search_call.status is restricted to OpenAI’s enum: in_progress, searching, completed, failed. Tinfoil exposes richer information through an optional vendor-extension field named _tinfoil that rides alongside the envelope:
_tinfoilis only present on failed calls. Successful calls omit it entirely._tinfoil.statusis only present when the unfiltered status differs from the envelope status. Today that means it appears when a call was blocked by safety filters._tinfoil.error.codeis present on every failed call. Known codes areblocked_by_safety_filterandtool_error.
_tinfoil is invisible unless you read it explicitly.
The sidecar appears on:
- the non-streaming
response.output[*]web_search_callitems, - the streaming
response.output_item.done.itemfor aweb_search_call.
Optional: usage reporting
To have aggregated token usage surfaced as an HTTP response header or trailer, send:X-Tinfoil-Usage-Metrics:
- as a response header for non-streaming requests,
- as an HTTP trailer for streaming requests.
chat.completion.chunk on streaming Chat Completions, request it through standard OpenAI stream_options:
PII protection
Thepii_check_options field prevents search queries containing sensitive personally identifiable information from being sent to Exa. When PII is detected, the query is blocked and the model responds without search results for that turn.
Blocked PII types:
- Government IDs: social security numbers, tax IDs, passport numbers, driver’s licenses, voter IDs, national IDs
- Financial: bank account numbers, credit card numbers, IBANs
- Contact: personal email addresses, personal phone numbers, home addresses
- Linkable identifiers: VINs, license plates, device serial numbers
- Identifying combinations: name + date of birth, name + address, or other combinations that identify a specific person
- Names alone
- Dates of birth alone
- Business or corporate contact information
- Public figures’ public information
- On the Responses API, a
web_search_calloutput item withstatus: "failed"and a_tinfoil.status: "blocked"sidecar. - On Chat Completions with progress markers enabled, a marker with
status: "blocked"anderror.code: "blocked_by_safety_filter".
Prompt-injection protection
Theprompt_injection_check_options field runs a safeguard model over each search snippet and fetched page before the content is handed back to the responding model. Results and pages that contain instructions aimed at hijacking the model (for example “ignore previous instructions”, embedded tool-use directives, or credential-exfiltration prompts) are dropped.
When every result for a query is filtered out, the model sees an empty result set for that search and answers without web grounding. Fetch failures due to injection filtering surface as status: "failed" on the corresponding web_search_call item.
Prompt-injection filtering is opt-in per request. Omit the field to skip the check.
Multi-turn conversations (Responses API)
The Responses API supportsprevious_response_id to continue a conversation across turns. Prior search results and fetched pages are carried forward so the model can build on them:

