Tinfoil’s document processing service extracts structured Markdown from uploaded documents, including PDFs, DOCX, PPTX, XLSX, HTML, CSV, and images. The entire service runs inside a secure enclave, and the VLM used for OCR and visual extraction runs in its own secure enclave, so your documents are never exposed to any operator. Born-digital PDFs are parsed using MuPDF inside a sandboxed subprocess with no access to the network, environment variables, or the filesystem; scanned pages and images are sent to the VLM for OCR.
You can use document processing in two ways:
Call /v1/convert/file directly when you want extracted Markdown (or page images) back from the document service.
Send a base64-encoded file through the OpenAI-compatible /v1/responses or /v1/chat/completions APIs. Tinfoil privately converts the attachment and forwards either Markdown (for text-only models) or per-page Markdown plus page images (for vision-capable models) to the model. You can override the default with the optional tinfoil_mode field.
Current scope: OpenAI-compatible file input accepts base64 file_data only. file_id and the /v1/files upload flow are not supported.
The document processing endpoint accepts multipart/form-data requests at /v1/convert/file. Upload one or more files with field name files.
You can control extraction behavior with the mode query parameter:

| Mode | Description |
| --- | --- |
| text (default) | Markdown from the text layer. VLM OCR only for scanned pages. |
| vision | Text plus VLM OCR for scanned pages and VLM visual descriptions (tables, charts, diagrams, formulas) for born-digital pages. |
| images | Per-page text plus page images as base64 PNG. No VLM. |
| raw | Text layer only. No VLM, no image rendering. |
| vlm | Full-page VLM OCR on every page. |

```typescript
import { SecureClient } from 'tinfoil'
import fs from 'fs'

const client = new SecureClient()

const fileBuffer = fs.readFileSync('doc.pdf')
const blob = new Blob([fileBuffer], { type: 'application/pdf' })
const formData = new FormData()
formData.append('files', blob, 'doc.pdf')

// Default mode: fast, no VLM for born-digital PDFs
const response = await client.fetch('/v1/convert/file', {
  method: 'POST',
  body: formData,
})
const result = await response.json()

// result.document.md_content contains the converted Markdown
console.log(result.document.md_content)
```
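To request a non-default mode, the mode query parameter is appended to the same path. A minimal sketch follows; `ConvertMode` and `convertPath` are hypothetical helper names introduced here for illustration, not part of the Tinfoil SDK.

```typescript
// The five modes listed in the mode table.
type ConvertMode = 'text' | 'vision' | 'images' | 'raw' | 'vlm'

// Hypothetical helper: build the request path for /v1/convert/file.
// Omitting the mode leaves the server default ("text") in effect.
function convertPath(mode?: ConvertMode): string {
  const base = '/v1/convert/file'
  if (mode === undefined) return base
  return `${base}?mode=${encodeURIComponent(mode)}`
}

// Usage with a SecureClient and FormData built as in the upload example
// (not executed here):
//   const response = await client.fetch(convertPath('vlm'), {
//     method: 'POST',
//     body: formData,
//   })
```

Keeping the mode in a union type lets the compiler reject values the endpoint does not document.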
The response includes the extracted Markdown content. When uploading a single file, the result is in document; for multiple files, results are in a documents array:
text mirrors the per-page slice of md_content; pure scans come back with an empty text field.
When uploading multiple files, the response uses documents (an array) instead of document:
```json
{
  "documents": [
    { "md_content": "# First document..." },
    { "md_content": "# Second document..." }
  ],
  "status": "success",
  "processing_time": 3.21
}
```
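Because the top-level key differs between single-file and multi-file uploads, calling code may want to normalize both shapes. The sketch below assumes only the fields shown in this section (`document`, `documents`, `md_content`, `status`, `processing_time`); `collectMarkdown` is a hypothetical helper name.

```typescript
// Illustrative types covering only the fields shown in this section.
interface ConvertedDocument {
  md_content: string
}

interface ConvertResponse {
  document?: ConvertedDocument
  documents?: ConvertedDocument[]
  status?: string
  processing_time?: number
}

// Hypothetical helper: return the Markdown of every converted file as an
// array, whether one file or several were uploaded.
function collectMarkdown(result: ConvertResponse): string[] {
  if (result.documents) return result.documents.map((d) => d.md_content)
  if (result.document) return [result.document.md_content]
  return []
}
```

Downstream code can then iterate over the returned array without branching on how many files were sent.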
This per-page interleave recovers visual elements that text extraction discards (illustrations, diagrams, color-coding, page decorations, and other layout cues) while still giving the model accurate, parser-extracted text.
When you instead attach a PDF as base64 file_data on /v1/responses or /v1/chat/completions with a vision-capable model, Tinfoil performs this same per-page interleave automatically.
For binary formats such as PDF, DOCX, PPTX, and images, Tinfoil processes the attachment through the private document-processing backend before forwarding it to the model. By default the router picks the best shape per attachment:

| Routed model | Default PDF / image behavior |
| --- | --- |
| Vision-capable | Per-page interleaved Markdown and page images. |
| Text-only | Markdown only, for speed. |

You can check whether a model is vision-capable via the multimodal field on GET /v1/models. DOCX, PPTX, XLSX, HTML, CSV, and plain text attachments are always forwarded as extracted Markdown regardless of the routed model. You can override the default per attachment with tinfoil_mode.
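The capability check and the resulting default can be sketched as follows. This assumes that GET /v1/models returns an OpenAI-style list whose entries carry an `id` and a boolean `multimodal` field; only the `multimodal` field name comes from this page, and the rest of the shape, the helper names, and the model IDs in the usage note are assumptions.

```typescript
// Assumed shape of one entry in the GET /v1/models listing.
interface ModelEntry {
  id: string
  multimodal?: boolean
}

// True only when the model is listed and explicitly marked multimodal.
function isVisionCapable(models: ModelEntry[], modelId: string): boolean {
  return models.some((m) => m.id === modelId && m.multimodal === true)
}

// Default shape for PDF and image attachments, per the routing table:
// vision-capable models get interleaved images, text-only models get text.
function defaultAttachmentShape(visionCapable: boolean): 'images' | 'text' {
  return visionCapable ? 'images' : 'text'
}
```

For example, with a listing that contains a hypothetical `vision-model` marked `multimodal: true`, `defaultAttachmentShape(isVisionCapable(models, 'vision-model'))` evaluates to `'images'`.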
Set the optional Tinfoil-specific tinfoil_mode field directly on the file content part to override the auto-default — for example to force VLM full-page OCR on a low-quality scan:
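A minimal sketch of such a content part for /v1/responses is shown here. The `input_file` part type and the `data:application/pdf;base64,` prefix follow OpenAI's Responses API convention and are assumptions as far as this page is concerned, which only states that file_data is base64; `pdfInputPart` is a hypothetical helper name.

```typescript
type TinfoilMode = 'auto' | 'text' | 'vision' | 'images' | 'raw' | 'vlm'

// File content part for the Responses API, with the optional
// Tinfoil-specific field sitting directly on the part.
interface InputFilePart {
  type: 'input_file'
  filename: string
  file_data: string
  tinfoil_mode?: TinfoilMode
}

// Hypothetical helper: wrap a base64-encoded PDF as a content part.
// The data-URL prefix is an assumption borrowed from OpenAI's examples.
function pdfInputPart(filename: string, base64: string, mode?: TinfoilMode): InputFilePart {
  const part: InputFilePart = {
    type: 'input_file',
    filename,
    file_data: `data:application/pdf;base64,${base64}`,
  }
  // Leave the field out entirely when no override is requested, so the
  // same body also works against APIs that do not know tinfoil_mode.
  if (mode !== undefined) part.tinfoil_mode = mode
  return part
}

// Force full-page VLM OCR on a low-quality scan:
//   pdfInputPart('scan.pdf', base64Pdf, 'vlm')
```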
The router consumes the field and strips it before the request is forwarded, so the upstream model never sees it.

| Value | Behavior |
| --- | --- |
| auto (default) | images for vision-capable models, text for text-only models. |
| text | Markdown from the text layer; VLM OCR only on scanned pages. |
| vision | Markdown plus VLM visual descriptions for figures, charts, and tables. |
| images | Per-page interleaved Markdown and images. Requires a vision-capable model; returns 400 otherwise. |
| raw | Text layer only. No VLM, no image rendering. |
| vlm | Full-page VLM OCR on every page. Highest quality, slowest. |

tinfoil_mode only affects PDF and image attachments; for DOCX, PPTX, XLSX, HTML, CSV, and plain text the field has no effect.
tinfoil_mode is a Tinfoil-specific extension and is not understood by OpenAI’s API. If your code needs to target both Tinfoil and OpenAI from the same request body, omit the field.
On Chat Completions the field nests inside the file object alongside filename and file_data:
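The nesting can be sketched as below. The outer `type: 'file'` wrapper and the data-URL prefix follow OpenAI's Chat Completions convention and are assumptions here; this page only specifies that tinfoil_mode sits inside the file object next to filename and file_data. `chatFilePart` and `userMessage` are hypothetical helper names.

```typescript
type TinfoilMode = 'auto' | 'text' | 'vision' | 'images' | 'raw' | 'vlm'

// File content part for Chat Completions: tinfoil_mode nests inside `file`.
interface ChatFilePart {
  type: 'file'
  file: {
    filename: string
    file_data: string
    tinfoil_mode?: TinfoilMode
  }
}

// Hypothetical helper: wrap a base64-encoded PDF as a chat content part.
function chatFilePart(filename: string, base64: string, mode?: TinfoilMode): ChatFilePart {
  const part: ChatFilePart = {
    type: 'file',
    file: {
      filename,
      // Data-URL prefix is an assumption borrowed from OpenAI's examples.
      file_data: `data:application/pdf;base64,${base64}`,
    },
  }
  if (mode !== undefined) part.file.tinfoil_mode = mode
  return part
}

// Hypothetical helper: a user message carrying the file and a text prompt.
function userMessage(prompt: string, part: ChatFilePart) {
  return {
    role: 'user' as const,
    content: [part, { type: 'text' as const, text: prompt }],
  }
}
```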
Per request: up to 10 files, 50 MB each, multipart/form-data only.
All non-2xx responses are {"error": "<message>"}.
/health reflects the state of the different pipeline elements:
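Separately from the health check, the per-request limits and the error format stated in this section could be handled client-side along these lines. This is a sketch under assumptions: the page does not say whether MB means 10^6 or 2^20 bytes, so the stricter decimal reading is used, and `checkLimits` and `errorMessage` are hypothetical helper names.

```typescript
const MAX_FILES = 10
// Assumption: 1 MB = 1,000,000 bytes (the stricter of the two readings).
const MAX_FILE_BYTES = 50 * 1000 * 1000

interface FileInfo {
  name: string
  size: number // size in bytes
}

// Return a description of the first violated limit, or null if none.
function checkLimits(files: FileInfo[]): string | null {
  if (files.length === 0) return 'at least one file is required'
  if (files.length > MAX_FILES) return `too many files: ${files.length} > ${MAX_FILES}`
  for (const f of files) {
    if (f.size > MAX_FILE_BYTES) return `${f.name} exceeds 50 MB`
  }
  return null
}

// Extract the message from a non-2xx body shaped like {"error": "<message>"}.
// Returns null when the body does not match that shape.
function errorMessage(body: unknown): string | null {
  if (typeof body === 'object' && body !== null && 'error' in body) {
    const err = (body as { error: unknown }).error
    return typeof err === 'string' ? err : null
  }
  return null
}
```

Checking limits before upload avoids sending large payloads that would be rejected anyway.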
The document upload API uses the same attestation mechanism as other Tinfoil services. Use SecureClient (as shown above) to verify attestation automatically.
Configuration Repo
View the open-source configuration for Tinfoil’s confidential document processing service.