Guides
Image Processing
Learn how to use Tinfoil for image processing with multimodal models.
Image Upload
Multimodal Models Only: Image processing requires models with vision capabilities. Currently, only Mistral Small 3.1 24B supports image inputs. Other models (DeepSeek, Llama, Qwen) are text-only and cannot process images.
See the Model Catalog for complete model specifications and multimodal capabilities.
How It Works
Image processing works through the chat/completions endpoint using base64-encoded images. Images are sent as data URLs in the message content alongside your text prompt.
Converting Images to Base64
There are several ways to convert your images to base64 format:
API Usage
Best Practices
- Image Size: For optimal performance, resize large images before processing (recommended max: 4096x4096)
- Base64 Encoding: Ensure proper base64 encoding and include the correct MIME type in the data URL
- Multiple Images: You can include multiple images in a single chat completion by adding multiple image_url objects to the content array
- Compression: Consider compressing large images to reduce payload size and improve response times