TL;DR: How to set up private RAG with Verba + Tinfoil in 5 steps:
- Clone the Tinfoil fork of Verba from GitHub.
- Install dependencies and set up Weaviate (local or cloud).
- Configure Tinfoil API keys for both embeddings and chat models.
- Import your documents through Verba’s interface.
- Start chatting with your private knowledge base!
Introduction
Verba is an open-source Retrieval Augmented Generation (RAG) application that lets you chat with your own documents using AI. The Tinfoil fork extends Verba to work seamlessly with Tinfoil’s private inference API, ensuring your documents and conversations remain completely confidential. This integration combines:
- Verba: Document ingestion, chunking, and RAG pipeline management
- Weaviate: Vector database for semantic search capabilities
- Tinfoil: Private, confidential computing for embeddings and chat completions
Prerequisites
For this tutorial, you’ll need:
- Tinfoil API Key: Get an API key at tinfoil.sh
- Git: To clone the repository
- Docker: For containerized Weaviate deployment
You’re billed for all usage of the Tinfoil Inference API. See the Tinfoil Inference pricing page for current pricing information.
Security Warning: Never share your API key, never commit it to version control, and never bundle it into front-end client code.
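One common way to keep the key out of your repository, sketched below, is a local .env file that Git ignores. The TINFOIL_API_KEY variable name is an assumption here; use whatever name the fork actually reads.

```bash
# Store the key in a local .env file instead of in tracked source files.
# TINFOIL_API_KEY is an assumed variable name; check the fork's README for the real one.
echo 'TINFOIL_API_KEY=<your-tinfoil-api-key>' >> .env

# Make sure the .env file can never be committed.
echo '.env' >> .gitignore
```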
Installation and Setup
Step 1: Clone the Tinfoil Verba Fork
The Tinfoil fork includes pre-configured integrations for Tinfoil’s API endpoints; a combined command sketch for all three steps follows after Step 3 below.
Step 2: Set Tinfoil API key
Step 3: Bring up Docker Compose
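A minimal shell sketch of steps 1 through 3 follows. The repository URL, the TINFOIL_API_KEY variable name, and the Compose setup are assumptions; verify the exact values against the fork’s README.

```bash
# Step 1: clone the Tinfoil fork of Verba (repository URL assumed; verify on GitHub)
git clone https://github.com/tinfoilsh/verba.git
cd verba

# Step 2: export your Tinfoil API key (variable name assumed)
export TINFOIL_API_KEY="<your-tinfoil-api-key>"

# Step 3: bring up Verba and a local Weaviate instance with Docker Compose
docker compose up -d
```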
Configuration Options
Available Tinfoil Models
Verba can work with any of Tinfoil’s supported models. For chat models, you can choose from our high-performance reasoning models, advanced multimodal models, and multilingual dialogue models. For embeddings, we recommend using nomic-embed-text for high-quality text embeddings.
See our model catalog for the complete list of available models and their capabilities.
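To confirm which models your key can reach before wiring them into Verba, you can query the inference API directly. This sketch assumes an OpenAI-compatible /v1/models route at inference.tinfoil.sh; adjust the base URL to match the API documentation.

```bash
# List the models available to your key (endpoint layout assumed to be OpenAI-compatible).
curl https://inference.tinfoil.sh/v1/models \
  -H "Authorization: Bearer $TINFOIL_API_KEY"
```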
Running Verba
Start the Application
Launch Verba with your configuration, then open http://localhost:8000 in your browser.
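Upstream Verba installs a verba command-line entry point; assuming the fork keeps it, starting the server looks roughly like this (the install step and default port are upstream conventions, not verified against the fork):

```bash
# Install the Python package if you are not running the Docker Compose stack
pip install -e .

# Start the Verba web server (port 8000 is the upstream default)
verba start

# Then open http://localhost:8000 in your browser
```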
Using Verba with Tinfoil
Document Import and Processing
Verba processes your documents through several stages, all using Tinfoil’s private infrastructure:
1. Import Documents
Navigate to the “Import Data” section and add your files:
- Supported Formats: PDF, DOCX, TXT, MD, HTML, and more
- Upload Methods:
  - Single file upload
  - Bulk directory upload
  - URL import for web content
2. Document Processing Pipeline
Verba automatically:
- Extracts Text: Parses document content
- Chunks Content: Splits into semantic segments (configurable size)
- Generates Embeddings: Uses Tinfoil’s Nomic model privately
- Stores Vectors: Saves to Weaviate for semantic search
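Under the hood, the embedding step amounts to a request against Tinfoil’s embeddings endpoint with the nomic-embed-text model. The sketch below shows the shape of that call, assuming an OpenAI-compatible /v1/embeddings route (the base URL is an assumption):

```bash
# Roughly what the embedder does for each chunk (endpoint assumed OpenAI-compatible).
curl https://inference.tinfoil.sh/v1/embeddings \
  -H "Authorization: Bearer $TINFOIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "nomic-embed-text",
        "input": "A single document chunk to embed."
      }'
```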
3. Configuration Options
Customize processing in the Verba UI:
- Chunk Size: 256, 512, or 1024 tokens (512 recommended)
- Chunk Overlap: Overlap between chunks (20% recommended; with 512-token chunks, adjacent chunks share roughly 100 tokens)
- Embedding Model: Pre-configured for Tinfoil Nomic embeddings
Querying Your Knowledge Base
1. Basic Chat Interface
Ask questions about your documents through the chat interface.
2. Advanced Query Configuration
Configure RAG parameters:
- Retrieval Count: Number of relevant chunks to retrieve (3-5 recommended)
- Temperature: Response creativity (0.1-0.3 for factual responses)
- Max Tokens: Response length limit
- Context Window: How much retrieved content to include
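These parameters map directly onto the chat completion request Verba sends to Tinfoil. Below is a minimal sketch, again assuming an OpenAI-compatible endpoint; the model name is a placeholder for one from the catalog:

```bash
# A low-temperature, length-limited completion request (endpoint and model name are placeholders).
curl https://inference.tinfoil.sh/v1/chat/completions \
  -H "Authorization: Bearer $TINFOIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<chat-model-from-the-catalog>",
        "temperature": 0.2,
        "max_tokens": 512,
        "messages": [
          {"role": "system", "content": "Answer using only the provided context."},
          {"role": "user", "content": "Context: <retrieved chunks>\n\nQuestion: What does the handbook say about remote access?"}
        ]
      }'
```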
3. Source Attribution
Verba provides source citations for every response:
- Document names and page references
- Confidence scores for retrieved chunks
- Direct links to source content
Scaling Options
- Horizontal Scaling: Run multiple Verba instances behind a load balancer (see the sketch after this list)
- Weaviate Clustering: Scale vector database with Weaviate clusters
- Document Preprocessing: Separate ingestion and query workloads
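A quick way to experiment with horizontal scaling is Docker Compose’s --scale flag. The service name verba is an assumption about the compose file, and you still need a reverse proxy (nginx, Traefik, or similar) in front of the replicas:

```bash
# Run three Verba containers (service name "verba" is an assumption about the compose file).
# If the service publishes a fixed host port, remove that port mapping before scaling.
docker compose up -d --scale verba=3

# Compose does not provide a load balancer; point your reverse proxy at the replicas.
```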
Comparison with Traditional RAG
| Feature | Traditional RAG | Verba + Tinfoil |
| --- | --- | --- |
| Privacy | Data exposed to AI providers | End-to-end confidential |
| Security | Trust-based | Hardware-verified |
| Compliance | Limited control | Full data sovereignty |