Audio models handle speech-to-text transcription and audio understanding.


Whisper Large V3 Turbo
whisper-large-v3-turbo
Parameters: 809M
Capabilities: Speech-to-text transcription
Languages: 90+ languages
Strengths: Fast, accurate, multilingual
Best for: Audio transcription, voice-to-text applications
Configuration repo: tinfoilsh/confidential-audio-processing
Audio Format: Supports .mp3 and .wav files
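
The format restriction for Whisper Large V3 Turbo can be checked on the client before uploading a file. The following is a minimal sketch, not part of any Tinfoil SDK: the function name `is_supported_for_whisper` is hypothetical, and it assumes that .mp3 and .wav are the only accepted extensions and that the extension reflects the actual file contents.

```python
from pathlib import Path

# Assumption: the two extensions named on this page are the complete accepted set.
WHISPER_EXTENSIONS = {".mp3", ".wav"}


def is_supported_for_whisper(filename: str) -> bool:
    """Return True if the file extension is one of the listed audio formats.

    Hypothetical helper: it inspects only the file name, not the audio data.
    """
    return Path(filename).suffix.lower() in WHISPER_EXTENSIONS
```

The comparison is case-insensitive, so `interview.MP3` passes while `interview.flac` does not.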

Voxtral Small 24B
voxtral-small-24b
Parameters: 24B
Capabilities: Speech-to-text transcription, audio Q&A, summarization, translation, voice-triggered function calling
Audio Duration: Up to 30 minutes (transcription) or 40 minutes (understanding)
Languages: English, Spanish, French, Portuguese, Hindi, German, Dutch, Italian
Best for: Speech transcription with automatic language detection, answering questions from spoken input, generating summaries from audio, and triggering functions from voice commands
Configuration repo: tinfoilsh/confidential-voxtral-small-24b
Audio + Text: Built on the Mistral Small 3.1 foundation, combining speech processing with strong text capabilities, including function calling from voice commands.
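
The duration limits for Voxtral Small 24B depend on the task, which is easy to get wrong when batching files. A minimal sketch of a pre-flight check follows. The names `VOXTRAL_LIMITS_MINUTES` and `fits_voxtral` are hypothetical, the task labels are illustrative rather than API values, and the sketch assumes the stated limits are inclusive.

```python
# Limits taken from the Audio Duration field above, in minutes.
# Assumption: a clip of exactly the limit length is still accepted.
VOXTRAL_LIMITS_MINUTES = {"transcription": 30, "understanding": 40}


def fits_voxtral(task: str, duration_minutes: float) -> bool:
    """Return True if a clip of the given length is within the limit for the task.

    Hypothetical helper: `task` must be "transcription" or "understanding".
    """
    if task not in VOXTRAL_LIMITS_MINUTES:
        raise ValueError(f"unknown task: {task!r}")
    return duration_minutes <= VOXTRAL_LIMITS_MINUTES[task]
```

Under these assumptions, a 35-minute recording fits an understanding request such as Q&A or summarization, but would need to be split before transcription.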