All models listed below are accessible through any of our SDKs. See the SDK pages for language-specific usage examples.

Available Models

Below is a list of all models currently supported on Tinfoil, including their model IDs and types.
Available models and capabilities are subject to change. If you require SLA guarantees, specific model availability, or long-term production usage, please contact us to discuss your needs. We’re also happy to work with you to add support for your desired model.

Chat Models

Description: Chat models support conversational AI capabilities through the standard chat completions API. All chat models follow the OpenAI chat completion format.
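
For illustration, here is a minimal chat completion sketch using the OpenAI Python SDK against a Tinfoil endpoint. The base URL and API key below are placeholders, and the model ID is just one of the chat models listed in the cards that follow; see the SDK pages for the exact client setup in each language.

```python
from openai import OpenAI

# Placeholder endpoint and key; substitute your Tinfoil inference URL and API key.
client = OpenAI(
    base_url="https://<your-tinfoil-endpoint>/v1",
    api_key="<TINFOIL_API_KEY>",
)

# Any chat model ID from the cards below works here, e.g. deepseek-r1-0528.
response = client.chat.completions.create(
    model="deepseek-r1-0528",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the Pythagorean theorem in one sentence."},
    ],
)
print(response.choices[0].message.content)
```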

DeepSeek
DeepSeek R1
deepseek-r1-0528
Parameters: 671B
Context: 128K tokens
Strengths: State-of-the-art reasoning, advanced mathematical capabilities, enhanced function calling, reduced hallucination rate
Structured Outputs: Structured response formatting support
Best for: Complex reasoning tasks, mathematical problem-solving, advanced coding, and tasks requiring deep analytical thinking

Moonshot
Kimi K2 Thinking
kimi-k2-thinking
Context: 256K tokens
Strengths: Deep multi-step reasoning, stable long-horizon tool orchestration, advanced agentic coding, web browsing and research, native INT4 quantization for faster inference
Structured Outputs: Structured response formatting support
Best for: Complex agentic workflows, multi-step coding and debugging tasks, web research requiring multiple tool calls, long-form writing
🤖 Thinking Agent: End-to-end trained for interleaved reasoning and function calling. Maintains stable performance across extended tool orchestration sequences.

Moonshot
Kimi K2.5
kimi-k2-5
Parameters: 1T total (32B activated)
Context: 256K tokens
Strengths: Unified vision and text processing, image and video analysis, generates code from screenshots and mockups, parallel task execution across specialized sub-agents
Structured Outputs: Structured response formatting support
Best for: Building applications that process visual inputs, converting designs to code, video comprehension, orchestrating complex workflows with multiple parallel agents
🎨 Vision + Language: Jointly trained on images, video, and text. Handles visual reasoning tasks and can spawn coordinated sub-agents for complex problems.

OpenAI
GPT-OSS 120B
gpt-oss-120b
Parameters: 117B
Context: 128K tokens
Strengths: Powerful reasoning, configurable reasoning effort levels, full chain-of-thought access, native agentic abilities including function calling, web browsing, and Python code execution
Structured Outputs: Structured response formatting support
Best for: Production use cases requiring high reasoning capabilities, agentic operations, and specialized applications

OpenAI
GPT-OSS Safeguard 120B
gpt-oss-safeguard-120b
Parameters: 117B (5.1B active)
Context: 128K tokens
Strengths: Safety reasoning, bring-your-own-policy flexibility, full access to reasoning chains for debugging, configurable reasoning effort levels
Structured Outputs: Structured response formatting support
Best for: Content moderation, policy enforcement, LLM guardrails, and Trust & Safety labeling workflows
Safety Model: Classifies text content based on custom safety policies you provide.

Llama
Llama 3.3 70B
llama3-3-70b
Context: 128K tokens
Strengths: Multilingual understanding, dialogue optimization, strong reasoning
Structured Outputs: Structured response formatting support
Best for: Conversational AI applications and complex dialogue systems
Structured Outputs: All chat models support structured outputs for reliable data extraction and API integration. Full JSON schema validation available in Python, Node, and Go SDKs. See the Structured Outputs Guide for implementation examples.
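
As a sketch of a structured output request, assuming the endpoint accepts the OpenAI response_format JSON-schema parameter; the schema, model ID, and field names here are illustrative, and the Structured Outputs Guide is the authoritative reference.

```python
from openai import OpenAI

client = OpenAI(base_url="https://<your-tinfoil-endpoint>/v1", api_key="<TINFOIL_API_KEY>")

# Illustrative schema; see the Structured Outputs Guide for full validation options.
schema = {
    "name": "contact_card",
    "schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "email": {"type": "string"},
        },
        "required": ["name", "email"],
        "additionalProperties": False,
    },
}

response = client.chat.completions.create(
    model="llama3-3-70b",
    messages=[{"role": "user", "content": "Extract the contact: 'Reach Ada at ada@example.com.'"}],
    response_format={"type": "json_schema", "json_schema": schema},
)
print(response.choices[0].message.content)  # JSON string matching the schema
```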

Vision Models

Description: Vision models understand images and video for visual tasks including image analysis, video understanding, OCR, and screenshot-to-code generation.

Qwen
Qwen3-VL 30B
qwen3-vl-30b
Parameters: 30B (3B active)
Context: 256K tokens
Strengths: Advanced vision-language understanding, video analysis, GUI interaction, screenshot-to-code generation, spatial understanding, multilingual OCR
OCR Languages: Supports 32 languages
Best for: Image and video analysis, screenshot-to-code generation, OCR tasks, GUI automation, and vision-text understanding
📸 Multimodal: Processes both images and video. Supports long videos and documents with up to 256K context. See Image Processing Guide for usage examples.
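
A hedged sketch of an image request, assuming images are passed as OpenAI-style image_url content parts (a URL or a base64 data URI); the endpoint, key, and image URL are placeholders. See the Image Processing Guide for the supported input formats.

```python
from openai import OpenAI

client = OpenAI(base_url="https://<your-tinfoil-endpoint>/v1", api_key="<TINFOIL_API_KEY>")

response = client.chat.completions.create(
    model="qwen3-vl-30b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the UI shown in this screenshot."},
                # Placeholder URL; a base64 data URI also works with this content type.
                {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```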

Audio Models

Description: Audio models provide speech-to-text transcription and text-to-speech synthesis. They support both audio file transcription and high-quality speech generation.

OpenAI
Whisper Large V3 Turbo
whisper-large-v3-turbo
Capabilities: Speech-to-text transcription
Strengths: Fast processing, high accuracy, multiple language support
Best for: Audio transcription, voice-to-text applications
Audio Format: Supports .mp3 and .wav files
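
A minimal transcription sketch, assuming the standard OpenAI audio transcription route; the endpoint, key, and file name are placeholders.

```python
from openai import OpenAI

client = OpenAI(base_url="https://<your-tinfoil-endpoint>/v1", api_key="<TINFOIL_API_KEY>")

# .mp3 and .wav files are supported, per the card above.
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",
        file=audio_file,
    )
print(transcript.text)
```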

Mistral
Voxtral Small 24B
voxtral-small-24b
Parameters: 24B
Capabilities: Speech-to-text transcription, audio Q&A, summarization, translation, voice-triggered function calling
Audio Duration: Up to 30 minutes (transcription) or 40 minutes (understanding)
Languages: English, Spanish, French, Portuguese, Hindi, German, Dutch, Italian
Best for: Speech transcription with automatic language detection, answering questions from spoken input, generating summaries from audio, and triggering functions from voice commands
Audio + Text: Built on Mistral Small 3.1 foundation, combining speech processing with strong text capabilities including function calling from voice commands.
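
A sketch of audio Q&A with Voxtral, assuming the endpoint accepts OpenAI-style input_audio content parts in chat completions; that shape is an assumption, so check the SDK documentation for the exact audio input format. The endpoint, key, and file name are placeholders.

```python
import base64

from openai import OpenAI

client = OpenAI(base_url="https://<your-tinfoil-endpoint>/v1", api_key="<TINFOIL_API_KEY>")

# Encode a local recording as base64 for the input_audio content part.
with open("question.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="voxtral-small-24b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Answer the question asked in this recording."},
                {"type": "input_audio", "input_audio": {"data": audio_b64, "format": "wav"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```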

Embedding Models

Description: Embedding models convert text into high-dimensional vectors for semantic search, similarity comparisons, and other vector-based operations.

Nomic
Nomic Embed Text v1.5
nomic-embed-text
Dimensions: 768
Strengths: Multimodal embedding model
Best for: Semantic search, document similarity, clustering
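
A short embeddings sketch using the standard OpenAI embeddings route; the endpoint and key are placeholders, and the input strings are arbitrary examples.

```python
from openai import OpenAI

client = OpenAI(base_url="https://<your-tinfoil-endpoint>/v1", api_key="<TINFOIL_API_KEY>")

response = client.embeddings.create(
    model="nomic-embed-text",
    input=["How do I rotate an API key?", "Instructions for rotating API keys"],
)
vectors = [item.embedding for item in response.data]
print(len(vectors), len(vectors[0]))  # 2 vectors, 768 dimensions each
```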

Document Processing Models

Description: Document processing models handle file conversion, text extraction, and document parsing operations.

Docling
Docling Document Processing
docling
Capabilities: Document processing and conversion service
Strengths: PDF processing, Word document parsing, text extraction, format conversion with high accuracy
Best for: Document upload, processing, conversion, and text extraction workflows
📄 File Support: Supports PDF, Word documents, and other common document formats. See Document Processing Guide for usage examples.

Using Models

To use any of these models, you’ll need:
  1. API Key: Get your key from the Tinfoil dashboard
  2. SDK: Install the SDK for your preferred language
  3. Model ID: Use the model ID from the cards above in your API requests
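A minimal sketch tying these three pieces together, assuming an OpenAI-compatible client; the base URL is a placeholder and the environment variable name is only a convention.

```python
import os

from openai import OpenAI

# 1. API key from the Tinfoil dashboard (here read from an environment variable).
# 2. SDK: this sketch uses the OpenAI Python package; see the SDK pages for Tinfoil's own clients.
# 3. Model ID: any ID from the cards above.
client = OpenAI(
    base_url="https://<your-tinfoil-endpoint>/v1",
    api_key=os.environ["TINFOIL_API_KEY"],
)

response = client.chat.completions.create(
    model="llama3-3-70b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
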
For detailed usage examples and code samples, see the SDK documentation: