Audio models handle speech-to-text transcription and audio understanding.

OpenAI
Whisper Large V3 Turbo
whisper-large-v3-turbo
Parameters: 809M
Capabilities: Speech-to-text transcription
Languages: 90+ languages
Strengths: Fast, accurate, multilingual
Best for: Audio transcription, voice-to-text applications
Configuration repo: tinfoilsh/confidential-audio-processing
Audio Format: Supports .mp3 and .wav files
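
Because the listing above names only .mp3 and .wav as supported formats, it can help to check a file's extension before sending it for transcription. The following is a minimal sketch that makes no network call. The helper names `is_supported_audio` and `build_transcription_request` are hypothetical, and the request shape (a `model` and a `file` field) is an assumption for illustration, not a documented API.

```python
from pathlib import Path

# Model ID and formats are taken from the listing above.
WHISPER_MODEL_ID = "whisper-large-v3-turbo"
SUPPORTED_EXTENSIONS = {".mp3", ".wav"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def build_transcription_request(path: str) -> dict:
    """Build a request description for a transcription call.

    The field names here are an assumption for illustration only;
    check the actual API reference before relying on them.
    """
    if not is_supported_audio(path):
        raise ValueError(
            f"Unsupported audio format: {Path(path).suffix or '(none)'}; "
            f"expected one of {sorted(SUPPORTED_EXTENSIONS)}"
        )
    return {"model": WHISPER_MODEL_ID, "file": path}


if __name__ == "__main__":
    print(build_transcription_request("meeting.mp3"))
```

The check is case-insensitive, so `MEETING.WAV` passes while `meeting.flac` raises a `ValueError` before any upload is attempted.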

Mistral
Voxtral Small 24B
voxtral-small-24b
Parameters: 24B
Capabilities: Speech-to-text transcription, audio Q&A, summarization, translation, voice-triggered function calling
Audio Duration: Up to 30 minutes (transcription) or 40 minutes (understanding)
Languages: English, Spanish, French, Portuguese, Hindi, German, Dutch, Italian
Best for: Speech transcription with automatic language detection, answering questions from spoken input, generating summaries from audio, and triggering functions from voice commands
Configuration repo: tinfoilsh/confidential-voxtral-small-24b
Audio + Text: Built on the Mistral Small 3.1 foundation, it combines speech processing with strong text capabilities, including function calling from voice commands.
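
The Voxtral listing gives two different duration limits: 30 minutes for transcription and 40 minutes for understanding tasks. The sketch below shows one way to check a clip against those limits on the client side. The function name `fits_voxtral_limit` and the task labels `"transcription"` and `"understanding"` are hypothetical names chosen for illustration, and the limits are treated as inclusive, which is an assumption.

```python
# Duration limits from the listing, converted to seconds.
VOXTRAL_MODEL_ID = "voxtral-small-24b"
VOXTRAL_LIMITS_SECONDS = {
    "transcription": 30 * 60,  # up to 30 minutes
    "understanding": 40 * 60,  # up to 40 minutes
}


def fits_voxtral_limit(duration_seconds: float, task: str) -> bool:
    """Return True if a clip of the given length fits the limit for the task.

    `task` must be one of the hypothetical labels defined above.
    """
    if task not in VOXTRAL_LIMITS_SECONDS:
        raise ValueError(f"Unknown task: {task!r}")
    if duration_seconds < 0:
        raise ValueError("Duration must be non-negative")
    # Assumption: a clip exactly at the limit is accepted.
    return duration_seconds <= VOXTRAL_LIMITS_SECONDS[task]


if __name__ == "__main__":
    # A 35-minute clip is too long to transcribe but fits an understanding task.
    clip_seconds = 35 * 60
    print(fits_voxtral_limit(clip_seconds, "transcription"))
    print(fits_voxtral_limit(clip_seconds, "understanding"))
```

A clip that exceeds the transcription limit could be split into shorter segments before submission; how to segment it is outside the scope of this sketch.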