Private RAG with Verba + Weaviate on Tinfoil
Build a fully private Retrieval Augmented Generation (RAG) chatbot using Verba, Weaviate, and Tinfoil’s confidential computing infrastructure.
TL;DR: How to set up private RAG with Verba + Tinfoil in 5 steps:
- Clone the Tinfoil fork of Verba from GitHub.
- Install dependencies and set up Weaviate (local or cloud).
- Configure Tinfoil API keys for both embeddings and chat models.
- Import your documents through Verba’s interface.
- Start chatting with your private knowledge base!
Your documents and conversations never leave your secure environment or get exposed to third parties.
Introduction
Verba is an open-source Retrieval Augmented Generation (RAG) application that lets you chat with your own documents using AI. The Tinfoil fork extends Verba to work seamlessly with Tinfoil’s private inference API, ensuring your documents and conversations remain completely confidential.
This integration combines:
- Verba: Document ingestion, chunking, and RAG pipeline management
- Weaviate: Vector database for semantic search capabilities
- Tinfoil: Private, confidential computing for embeddings and chat completions
The result is a fully private knowledge base where your sensitive documents never leave your controlled environment.
Prerequisites
Before starting, you’ll need:
- Tinfoil API Key: Get yours free at tinfoil.sh
- Git: To clone the repository
- Docker: For containerized Weaviate deployment
You’re billed for all usage of the Tinfoil Inference API. See Tinfoil Pricing for current rates.
Security Warning: Never share your API key, avoid including it in version control, and never bundle it in client-side code.
Installation and Setup
Step 1: Clone the Tinfoil Verba Fork
The Tinfoil fork includes pre-configured integrations for Tinfoil’s API endpoints:
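A minimal sketch of the clone-and-install step. The repository URL and the editable-install command follow upstream Verba's conventions and are assumptions; check Tinfoil's GitHub organization for the exact fork location.

```shell
# Hypothetical repository URL -- verify against Tinfoil's GitHub org
git clone https://github.com/tinfoilsh/verba.git
cd verba

# Install Verba and its Python dependencies (upstream Verba supports an
# editable pip install; the fork is assumed to do the same)
pip install -e .
```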
Step 2: Set Tinfoil API key
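For example, exported as an environment variable before launching Verba. The variable name `TINFOIL_API_KEY` is an assumption; confirm the exact name the fork reads in its configuration docs.

```shell
# Replace the placeholder with your own key from the Tinfoil dashboard.
# Never commit this value to version control.
export TINFOIL_API_KEY="tk-..."
```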
Step 3: Bring up Docker Compose
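Assuming the fork ships a Compose file at the repository root (as upstream Verba does), bringing up Weaviate and its companions looks like:

```shell
# Build and start the containers defined in the repo's docker-compose file
# in the background; `docker compose logs -f` tails startup output.
docker compose up -d --build
```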
Configuration Options
Available Tinfoil Models
Verba can work with any of Tinfoil’s supported models:
Chat Models
- deepseek-r1-70b: High-performance reasoning model
- mistral-small-3-1-24b: Advanced multimodal model
- llama3-3-70b: Multilingual dialogue model
- qwen2-5-72b: Powerful multilingual model with superior programming and mathematical reasoning
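To illustrate how these models are addressed, here is a sketch of a direct chat-completion call using only the standard library. It assumes Tinfoil exposes an OpenAI-compatible endpoint at the base URL below and reads the key from your environment; verify both against Tinfoil's API documentation.

```python
import json
import urllib.request

# Assumed endpoint -- confirm the exact base URL in Tinfoil's API docs
BASE_URL = "https://inference.tinfoil.sh/v1/chat/completions"

def build_request(model: str, question: str, api_key: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat-completion request for a Tinfoil model."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.2,  # low temperature suits factual RAG answers
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("llama3-3-70b", "Summarize my imported docs.", "tk-demo")
# Sending is a plain urllib.request.urlopen(req); the response JSON follows
# the OpenAI chat-completion schema.
```

In practice Verba makes these calls for you; a direct request like this is only useful for smoke-testing your key and model selection.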
Embedding Models
- nomic-embed-text: High-quality text embeddings (recommended)
Running Verba
Start the Application
Launch Verba with your configuration:
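Upstream Verba exposes a `verba start` CLI; the flags below follow its conventions and may differ slightly in the Tinfoil fork.

```shell
# Serve the Verba web UI on port 8000, reachable from outside the container
verba start --port 8000 --host 0.0.0.0
```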
Access Verba at http://localhost:8000
Using Verba with Tinfoil
Document Import and Processing
Verba processes your documents through several stages, all using Tinfoil’s private infrastructure:
1. Import Documents
Navigate to the “Import Data” section and add your files:
- Supported Formats: PDF, DOCX, TXT, MD, HTML, and more
- Upload Methods:
- Single file upload
- Bulk directory upload
- URL import for web content
2. Document Processing Pipeline
Verba automatically:
- Extracts Text: Parses document content
- Chunks Content: Splits into semantic segments (configurable size)
- Generates Embeddings: Uses Tinfoil’s Nomic model privately
- Stores Vectors: Saves to Weaviate for semantic search
All embedding generation happens through Tinfoil’s confidential computing enclaves.
3. Configuration Options
Customize processing in the Verba UI:
- Chunk Size: 256, 512, or 1024 tokens (512 recommended)
- Chunk Overlap: Overlap between chunks (20% recommended)
- Embedding Model: Pre-configured for Tinfoil Nomic embeddings
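The chunk-size and overlap settings above can be pictured with a minimal sketch. This is an illustration only: "tokens" are approximated by whitespace-separated words, whereas Verba's real chunkers use proper tokenizers and semantic boundaries.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap_ratio: float = 0.2) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size tokens.

    Simplified stand-in for Verba's chunking stage: each chunk repeats the
    last `chunk_size * overlap_ratio` tokens of its predecessor so that
    sentences spanning a boundary stay retrievable.
    """
    words = text.split()
    overlap = int(chunk_size * overlap_ratio)  # 512 * 0.2 -> 102 tokens
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

doc = " ".join(f"w{i}" for i in range(1000))
chunks = chunk_text(doc, chunk_size=512, overlap_ratio=0.2)
# 1000 words with step 410 -> 3 chunks, each sharing 102 words with the next
```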
Querying Your Knowledge Base
1. Basic Chat Interface
Ask questions about your documents:
2. Advanced Query Configuration
Configure RAG parameters:
- Retrieval Count: Number of relevant chunks to retrieve (3-5 recommended)
- Temperature: Response creativity (0.1-0.3 for factual responses)
- Max Tokens: Response length limit
- Context Window: How much retrieved content to include
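The retrieval step these parameters control can be sketched as a top-k selection over scored chunks. The scored pairs below are hypothetical; in Verba, Weaviate's vector search produces them.

```python
def top_k_chunks(scored_chunks, k=4, min_score=0.0):
    """Pick the k highest-scoring chunks to include as LLM context.

    `scored_chunks` is a list of (chunk_text, similarity_score) pairs, as a
    vector search might return; Retrieval Count corresponds to k.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    return [text for text, score in ranked[:k] if score >= min_score]

# Hypothetical search results for one query
results = [("intro", 0.61), ("pricing", 0.88), ("api", 0.79),
           ("faq", 0.42), ("setup", 0.83)]
context = top_k_chunks(results, k=3)
# -> ["pricing", "setup", "api"]
```

A `min_score` floor is one way to keep low-confidence chunks out of the prompt even when fewer than k good matches exist.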
3. Source Attribution
Verba provides source citations for every response:
- Document names and page references
- Confidence scores for retrieved chunks
- Direct links to source content
Scaling Options
- Horizontal Scaling: Run multiple Verba instances behind a load balancer
- Weaviate Clustering: Scale the vector database across a Weaviate cluster
- Document Preprocessing: Separate ingestion and query workloads
Comparison with Traditional RAG
| Feature | Traditional RAG | Verba + Tinfoil |
|---|---|---|
| Privacy | Data exposed to AI providers | End-to-end confidential |
| Security | Trust-based | Hardware-verified |
| Compliance | Limited control | Full data sovereignty |