Overview

Tinfoil uses confidential computing to provide a verifiably private runtime environment in the cloud. All model inference requests are routed through a confidential inference orchestrator that directs requests to the appropriate model enclaves. These secure enclaves are Trusted Execution Environments (TEEs): isolated regions of memory and CPU resources that provide hardware-backed security guarantees, including confidentiality and verifiability. The verifiability property is realized through an attestation architecture which ensures that:
  1. The secure enclave is genuine and properly configured, as attested to by AMD and NVIDIA.
  2. Only immutable and publicly-auditable code is executed inside the secure enclave.
  3. Only static model weights are loaded into the inference engine.

Components

This attestation architecture is illustrated in Figure 1 and consists of the following components.
  • confidential-inference-proxy: The confidential inference orchestrator that routes all model requests to the appropriate enclaves; it runs the same shim framework as the model enclaves, so it is attested in the same way
  • cvmimage: Confidential VM image based on Ubuntu, containing a CPU enclave-compatible kernel, the vLLM inference server, our tfshim, and the modelpack mount utilities
  • modelpack: Read-only volume containing model weights or other immutable data
  • tinfoil-config.yml: Manifest for models, shim configuration, and dependency versions
  • tfshim: Reverse proxy that runs inside the VM image and terminates TLS, enforces security policy, and serves the remote attestation document. The TLS keypair is generated inside the enclave and the private key never leaves it.
  • pri-image-builder: Converts the tinfoil-config file into a deployment config and publishes a new Sigstore Bundle on the Sigstore transparency log
  • edk2 ovmf: UEFI boot firmware
  • Sigstore: Transparency log record containing source code measurements (i.e., a SHA256 hash of the compiled code)
  • Verifier (on Client Device): Checks that the source code and runtime measurements match and that the TLS connection matches the attested public key

Figure 1: Overview of Tinfoil’s attestation architecture.

Immutability

The Confidential VM (CVM) is inherently stateless: it has no persistent data whatsoever, and all virtual disks are mounted read-only. Consequently, we need a means to verify the integrity of the read-only disk images to ensure they haven’t been modified by an attacker on the host. We use dm-verity to create an attested measurement of each disk image, which the CVM verifies at boot time. We use mkosi to build the rootfs and modelpack to create immutable disk images from Hugging Face model weights.
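The dm-verity check can be pictured as a Merkle hash tree over fixed-size disk blocks. The sketch below is illustrative only (real dm-verity adds a salt and a specific on-disk hash-tree layout); it shows why any modification to a read-only image changes the attested root hash.

```python
import hashlib

BLOCK_SIZE = 4096  # dm-verity's default data block size

def verity_root(data: bytes) -> str:
    """Compute a Merkle root over fixed-size blocks of `data`."""
    # Hash each data block individually...
    leaves = [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
              for i in range(0, len(data) or 1, BLOCK_SIZE)]
    # ...then reduce pairwise until a single root hash remains.
    while len(leaves) > 1:
        leaves = [hashlib.sha256(b"".join(leaves[i:i + 2])).digest()
                  for i in range(0, len(leaves), 2)]
    return leaves[0].hex()

image = b"\x00" * 10_000             # stand-in for a read-only disk image
root = verity_root(image)
tampered_root = verity_root(b"\x01" + image[1:])  # flip one byte
```

At boot, the CVM recomputes the hashes for each block it reads and rejects the disk if the root does not match the attested measurement.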

CPU-GPU Chain of Trust

Once the CVM verifies the integrity of its disks, it in turn queries the GPU to verify that it is genuine and correctly configured, as attested by NVIDIA. This creates a link between the CPU and GPU attestations. If the CPU fails to verify the GPU’s attestation, it aborts the boot process and returns an error.

Lifecycle

To run a model on Tinfoil, we first build the model into a deployment configuration, deploy it on our infrastructure, then verify its integrity on the client device.

Build-time

  1. Download model weights from Hugging Face or another model repository
  2. Use modelpack to create an immutable .mpk file of the weights (EROFS+dm-verity) and an info string ([root node hash]_[offset]_[block uuid]) that verifies the integrity of that file
  3. Create a tinfoil-config.yml with:
    • MPK info string for the model
    • tfshim config (domains, path ACL, allowed CORS origins)
    • CVM image and OVMF firmware versions
    • Memory and vCPU core count
  4. Commit the config file and a GitHub Actions workflow for pri-build-action to a new repo
  5. Tag a release; pri-build-action then publishes the release, including a measured deployment manifest
  6. Run the VM in QEMU
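The MPK info string from step 2 is simple to parse mechanically. A minimal sketch (the field meanings follow the `[root node hash]_[offset]_[block uuid]` format above; how each field is consumed is modelpack’s concern, not this sketch’s):

```python
import uuid
from typing import NamedTuple

class MpkInfo(NamedTuple):
    root_hash: str   # dm-verity root node hash (hex)
    offset: int      # offset recorded by modelpack
    block_uuid: str  # block UUID recorded by modelpack

def parse_mpk_info(info: str) -> MpkInfo:
    """Split a `[root node hash]_[offset]_[block uuid]` info string."""
    root_hash, offset, block_uuid = info.split("_")
    uuid.UUID(block_uuid)  # raises ValueError if the UUID is malformed
    return MpkInfo(root_hash, int(offset), block_uuid)

# Info string from the DeepSeek R1 deployment example on this page:
info = parse_mpk_info(
    "8e39a53227ccb0c3cffbed1c0013d4d63c74c1e01541b953ff021e91cb158330"
    "_39785418752_efe58861-8b9c-5e64-b0ee-85d9169acb44")
```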

Runtime setup

  1. QEMU starts the VM
  2. When the CVM boots, our initialization process does the following:
    1. Creates a ramdisk for all ephemeral data
    2. Ensures the tinfoil-config file matches the attested hash provided in the kernel command line
    3. Checks the NVIDIA GPU attestation with NVIDIA’s local-gpu-verifier
    4. Uses modelpack to mount each model weight directory
    5. Applies tfshim and vllm configurations from the attested config and starts each service
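Step 2 of the initialization process can be sketched as follows. The `tinfoil-config-hash` parameter name is hypothetical; the actual key used on the kernel command line is an implementation detail.

```python
import hashlib

def config_matches_cmdline(config_bytes: bytes, cmdline: str,
                           param: str = "tinfoil-config-hash") -> bool:
    """Compare the SHA256 of the config file against the hash passed
    on the kernel command line (parameter name is hypothetical)."""
    params = dict(kv.split("=", 1) for kv in cmdline.split() if "=" in kv)
    return params.get(param) == hashlib.sha256(config_bytes).hexdigest()

cfg = b"cvm-version: 0.0.27\n"   # stand-in for the tinfoil-config file
cmdline = ("console=ttyS0 tinfoil-config-hash="
           + hashlib.sha256(cfg).hexdigest())
```

Because the kernel command line is part of the attested launch measurement, the host cannot swap in a different config without changing the attestation.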

Connection-time verification

Before exchanging application data (e.g., chat completions) with an enclave, the verifier SDK completes the following checks. Note that both the confidential inference orchestrator and the target model enclave can be verified using the same process.
  1. Fetches the attestation document from the enclave, which includes the signed runtime measurements
  2. Verifies the certificate chain in the attestation up to the CPU vendor’s (AMD’s) hardcoded root certificate
  3. Fetches the Sigstore bundle from GitHub
  4. Verifies the Sigstore bundle to Sigstore’s root trust anchor
  5. Checks the measurement predicates to ensure the source code and runtime enclave measurements match
  6. Opens a TLS connection to the enclave and ensures the public key offered by the remote server matches the public key included in the attestation document (binding TLS to the attested key and guaranteeing TLS terminates inside a verified enclave)
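Step 6, the TLS key binding check, reduces to a fingerprint comparison. A minimal sketch, where the exact fingerprint encoding is an assumption of this sketch rather than the SDK’s wire format:

```python
import hashlib

def tls_key_matches_attestation(server_spki_der: bytes,
                                attested_fp_hex: str) -> bool:
    """Compare a SHA256 fingerprint of the server's public key (its
    DER-encoded SubjectPublicKeyInfo) against the fingerprint carried
    in the attestation document."""
    return hashlib.sha256(server_spki_der).hexdigest() == attested_fp_hex

spki = b"example-der-public-key"     # placeholder for real DER bytes
attested_fp = hashlib.sha256(spki).hexdigest()
legit = tls_key_matches_attestation(spki, attested_fp)
mitm = tls_key_matches_attestation(b"interceptor-key", attested_fp)
```

If the comparison fails, the client closes the connection before sending any application data.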

TLS Key Binding

TLS key binding ensures all TLS sessions terminate inside a verified enclave, never on a non-enclave host.
  • Attested key: The enclave’s attestation includes the enclave-generated TLS public key tied to the measured runtime.
  • In-enclave termination: tfshim terminates TLS inside the enclave; its private key is non-exportable and never leaves enclave memory.
  • Match check: The client Verifier compares the server’s TLS key to the attested key. If they differ, verification fails and no data is sent.
  • Outcome: Only verified TEEs can decrypt traffic; intermediaries can forward TCP but cannot terminate or read plaintext outside an enclave.

Inference Chain of Trust

  1. Routing: All inference requests first go through the inference orchestrator (running in a TEE), which examines the request to determine the target model
  2. Orchestrator Attestation: The orchestrator runs the same shim framework and generates CPU attestations, proving it’s running unmodified routing code in a TEE
  3. Enclave Attestation: The target inference enclave also runs the shim framework and provides its own CPU and GPU attestation for the inference workload
This architecture ensures that not only is the model inference happening in a verified enclave, but the request routing and load balancing logic is also verifiably running the expected open-source code in a secure environment.

Tinfoil Config File

The “tinfoil config file” is always called tinfoil-config.yml and placed at the root of a deployment repo. The private image builder action parses this file to create an attested deployment config and includes the SHA256 hash of the entire file as a kernel command line parameter to provide a cryptographic link to the running enclave. For example, our DeepSeek R1 deployment:
cvm-version: 0.0.27
ovmf-version: 0.0.2
cpus: 8
memory: 32768

models:
  - name: "deepseek-r1-0528"
    repo: "casperhansen/deepseek-r1-distill-llama-70b-awq@a1ab7653aae77fbabc536cbcbac5bb2e2fb5354f"
    mpk: "8e39a53227ccb0c3cffbed1c0013d4d63c74c1e01541b953ff021e91cb158330_39785418752_efe58861-8b9c-5e64-b0ee-85d9169acb44"
vllm-args: --quantization awq_marlin --max-model-len 65536

shim: # The shim config is passed directly to tfshim. See https://github.com/tinfoilsh/tfshim
  domains:
    - deepseek-r1-0528.model.tinfoil.sh
  listen-port: 443
  upstream-port: 8080 # Port of internal service (vllm)
  control-plane: https://api.tinfoil.sh
  paths: # Path ACL
    - /v1/chat/completions
    - /metrics
  origins: # for CORS
    - https://tinfoil.sh
    - https://chat.tinfoil.sh
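As an illustration of the `paths` section above, tfshim only forwards requests whose path appears in the shim config. Whether tfshim matches paths exactly or by prefix is an assumption of this sketch; see the tfshim repository for the actual behavior.

```python
# Path ACL taken from the shim config example above
ACL = ["/v1/chat/completions", "/metrics"]

def path_allowed(request_path: str, acl: list[str]) -> bool:
    """Exact-match path ACL check (matching semantics assumed)."""
    return request_path in acl
```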

Sigstore Bundle

Tinfoil uses a transparency log (Sigstore) to make the code and configuration used in deployments transparent and verifiable.
  • What it is: A signed transparency-log bundle containing in-toto statements about the source commit, build inputs, and the SHA256 measurements of built artifacts.
  • Producer: The private image builder publishes the bundle when a release is tagged for a deployment repo.
  • Verification: The Verifier SDK downloads the bundle, verifies its signatures to Sigstore’s root trust anchor, and extracts the expected measurements.
  • Purpose: These measurements are compared against the enclave’s attested runtime measurements to ensure the running code exactly matches the audited release.
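The comparison in the last step amounts to an exact match over every measured field. A sketch with illustrative field names and digests (the real bundle carries in-toto predicates, not this flat dictionary):

```python
def measurements_match(expected: dict, attested: dict) -> bool:
    """Require every measurement from the Sigstore bundle to appear,
    byte-for-byte, in the enclave's attested runtime measurements."""
    return all(attested.get(name) == digest
               for name, digest in expected.items())

# Illustrative measurement registers (names and digests are made up):
expected = {"ovmf": "aa11", "kernel": "bb22", "cmdline": "cc33"}
runtime = dict(expected)
ok = measurements_match(expected, runtime)
bad = measurements_match(expected, {**runtime, "kernel": "deadbeef"})
```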

Root trust anchor

“Root trust anchor” refers to the set of public keys and certificates that the Verifier SDK pins for Sigstore validation. Signatures and transparency-log data must verify back to these keys for verification to succeed. We ship these trust roots with the SDK, so verification does not depend on trusting our infrastructure.