Spire · tamper-evident LLM serving substrate

Prove
what your
AI did.

Spire is the AI-infrastructure layer beneath regulated AI. Every LLM call is hash-linked into an Ed25519-signed audit chain: the request, the response, the model version, the seed, the temperature, the route decision, the policy gate, and a cryptographic commitment over the KV-cache that produced the answer. Spire is India-DPDP-first, with patent applications in preparation, and it commits cache state to the chain, not just the prompt and response. When a regulator, an auditor, or a court asks you to replay an inference twelve months later, you can.

Reference implementation, pilot-ready: the v0.1 substrate is built and tested end to end, with production models, the TEE-attested replay node, and live connectors landing at alpha-customer commissioning. Available in early access — engage us to get started.


# Verify a tenant's whole audit chain — locally, with the published public key.
spire verify-chain --tenant 5f2c… --from 2026-04-01 --to 2026-06-30
# chain verification OK for tenant 5f2c…
#   total entries: 1,284,902   ·   entries in window: 88,140

1,795

Software tests

Pipeline + integration · Protocol + Stub + Production slot

4

Patent mechanisms

KV-replay · three-gate · pinned routing · lineage · applications in preparation

12

Inference providers

9 cloud + vLLM · SGLang · llama.cpp

14

Audited event types

request · response · cache_hit · route_decision · policy_gate …

≤50ms

Policy-gate p99 (target)

Design budget · refusal before the first transformer block

Ed25519

+ SHA-256 chain

Monthly CT-log anchor · 7-year retention

Compliance presets

DPDP · GDPR · CCPA · HIPAA · EU AI Act … + custom

01 — Who it's for

Built for AI that has to survive an audit.

In 2026, regulators in major jurisdictions moved from "you should log your AI" to "your AI logs must be replayable under audit." The EU AI Act Article 12 logging window opens 2 August 2026; India's DPDP Board April 2026 guidance demands records that permit replication of an automated decision; HIPAA §164.312(b) requires mechanisms that record and examine activity in systems holding electronic protected health information, which extends to clinical-AI systems that process it. Spire is the substrate that lets a regulated buyer answer those demands without a forensics project.

— Buyer · 01 · Tier-1 banking

The Chief AI Officer

Eight AI applications across three providers and self-hosted vLLM, with prompt logs in Splunk that truncate at 4 KB. When the CRO asks for the exact response, the model version, and whether it was cached, there is no answer. Spire makes every one of those questions answerable by replay.

— Buyer · 02 · Healthcare

The Head of Model Risk

Generative AI on patient notes, coding, and decision support under HIPAA §164.312(b) Audit Controls. Needs a mechanism that records activity in a form that permits independent post-hoc examination of an AI decision and the data underpinning it. That is the Spire evidence bundle.

— Buyer · 03 · Defence + public sector

The CISO

AI under DRDO / MoD procurement governance and NDAA Section 889 constraints. Wants India-hosted, sovereign, tamper-evident inference with HSM-held keys and a chain engineered to satisfy the technical requirements of Bharatiya Sakshya Adhiniyam §63 for electronic-record evidence. CERT-In + STQC empanelment is a regulatory track we are targeting (in progress).

— Buyer · 04 · Trusted-AI advisory

The AI-institute lead

A Big-4 trusted-AI practice that needs a named technical vendor to cite in its methodology and point to in attestation engagements. Spire is the artefact: ISO 42001 scope targeted (in progress), the audit chain as the deliverable.

02 — How it works

One call in. A signed, replayable record out.

Spire is a LiteLLM-based multi-tenant gateway with a per-request audit-chain shim. Each call passes a fixed pipeline: PII redaction at the interceptor, a manifest-pinned route decision, a three-gate policy check at the inference engine, the inference itself, then a cache-state commitment and a signed chain entry. Each link is a Protocol with a deterministic stub for tests and a production adapter slot, so the suite runs green with no external dependencies installed.

In plain terms: you can prove, months later, exactly what the AI saw and said — and no one can quietly change the record.

Step · 01

For any regulated workload

Redact, then route.

Microsoft Presidio strips Aadhaar, PAN, SSN, card, email, and phone PII at the gateway interceptor. The router then resolves the workload against a signed compliance-preset manifest before anything reaches a model.

spire.pii.presidio_redactor · spire.gateway.litellm_router

Step · 02

For cross-regime deployments

Pin the route to a signed manifest.

SignedRouter verifies the manifest's Ed25519 signature, walks the candidate (provider, region, model, cache-policy) set, and picks the first the manifest allows. No compliant route → NO_COMPLIANT_ROUTE, never a silent fail-open.

spire.compliance.signed_routing · SignedRouter.route
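
The first-allowed selection can be sketched in a few lines. This is a minimal illustration rather than Spire's implementation: `Candidate`, `allowed`, and `pick_route` are hypothetical names, and the manifest's Ed25519 signature is assumed to have been verified before this point.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    # Hypothetical stand-in for a (provider, region, model, cache-policy) route.
    provider: str
    region: str
    model: str
    cache_policy: str


NO_COMPLIANT_ROUTE = "NO_COMPLIANT_ROUTE"


def pick_route(candidates, allowed):
    """Return the first candidate the manifest allows, else fail closed."""
    for candidate in candidates:
        if candidate in allowed:
            return candidate
    # Fail closed: never fall back to a route the manifest does not list.
    return NO_COMPLIANT_ROUTE


# Assumed manifest contents: a single allowed route.
allowed = {Candidate("local_vllm", "mumbai", "llama-3.3-70b", "default")}
candidates = (
    Candidate("anthropic", "us-east", "claude-sonnet-4-6", "never_cache"),
    Candidate("local_vllm", "mumbai", "llama-3.3-70b", "default"),
)
chosen = pick_route(candidates, allowed)
```

Candidate order matters here: the caller expresses preference by ordering, and the manifest only filters.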

Step · 03

For consent + warrant + two-operator

Gate at the first transformer block.

ThreeGateEnforcer checks consent, warrant, and two-operator approval before any token is emitted. A failure returns an attested POLICY_DENY carrying the gate, the citation, the request hash, and the chain entry id.

spire.compliance.three_gate · ThreeGateEnforcer.evaluate
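
As a rough sketch of the gate order, assuming the three checks arrive as already-resolved values: `PolicyDeny` and `evaluate` below are hypothetical, and the signing and chain-entry id that the real POLICY_DENY carries are left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyDeny:
    # Hypothetical shape; signature and chain entry id are omitted here.
    gate: str
    citation: str
    request_hash: str


def evaluate(request_body: bytes, consent: bool, warrant: bool,
             operator_approvals: int, citation: str) -> Optional[PolicyDeny]:
    """Return None when all three gates pass, else a deny for the first failure."""
    request_hash = hashlib.sha256(request_body).hexdigest()
    gates = (
        ("consent", consent),
        ("warrant", warrant),
        ("two_operator", operator_approvals >= 2),
    )
    for name, passed in gates:
        if not passed:
            return PolicyDeny(name, citation, request_hash)
    return None


deny = evaluate(b'{"prompt": "..."}', consent=True, warrant=False,
                operator_approvals=2, citation="hypothetical-citation")
```

Because the check runs before inference, a deny means no token was generated for that request hash.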

Step · 04

For cloud burst or self-hosted

Infer, hot cache first.

A Redis hot semantic cache with Qdrant cold overflow sits in front of twelve providers behind one Protocol. A confidence-gated cache-hit policy decides reuse; the route, the cache tier, and the model version all land in the record.

spire.cache.semantic · spire.inference.<provider>_adapter
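
One way to picture a confidence-gated cache hit is a similarity threshold over embeddings. The sketch below is an assumption-laden illustration: the 0.95 threshold, the in-memory list, and the `lookup` helper are all hypothetical, and a real deployment would query Redis or Qdrant instead.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookup(query_vec, cache, threshold=0.95):
    """Return (response, score); response is None when the gate is not cleared."""
    best_response, best_score = None, 0.0
    for vec, response in cache:
        score = cosine(query_vec, vec)
        if score > best_score:
            best_response, best_score = response, score
    if best_score >= threshold:
        return best_response, best_score
    return None, best_score


# Toy two-dimensional embeddings, purely for illustration.
cache = [([1.0, 0.0], "cached answer A"), ([0.0, 1.0], "cached answer B")]
hit, hit_score = lookup([0.99, 0.05], cache)
miss, miss_score = lookup([0.7, 0.7], cache)
```

Returning the score on a miss as well lets the caller record why reuse was declined.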

Step · 05

For deterministic replay

Commit the cache state.

The replay service commits a Merkle root over the PagedAttention pages (self-hosted) or over the prompt-prefix and response chunks (cloud), plus the routing-manifest hash, the seed, temperature, and top-p — bound to the audit entry.

spire.audit_chain.replay · ReplayService.commit_inference
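
A minimal sketch of such a commitment, using only the standard library. The tree shape (duplicating the last node on odd levels), the 0x00/0x01 domain-separation prefixes, and the `commit` encoding are assumptions made for illustration; the source does not specify which conventions Spire uses.

```python
import hashlib


def merkle_root(pages):
    """Merkle root over byte strings, with leaf/node domain separation."""
    if not pages:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(b"\x00" + page).digest() for page in pages]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def commit(pages, manifest_hash: bytes, seed: int, temperature: str, top_p: str) -> str:
    """Bind the cache root to the routing manifest and sampling parameters."""
    h = hashlib.sha256()
    h.update(merkle_root(pages))
    h.update(manifest_hash)
    # Parameters are passed as strings so the encoding is stable across processes.
    h.update(f"{seed}|{temperature}|{top_p}".encode("utf-8"))
    return h.hexdigest()


pages = [b"kv-page-0", b"kv-page-1", b"kv-page-2"]
manifest_hash = hashlib.sha256(b"hypothetical manifest").digest()
original = commit(pages, manifest_hash, 7, "0.0", "1.0")
tampered = commit([b"kv-page-0", b"kv-page-X", b"kv-page-2"],
                  manifest_hash, 7, "0.0", "1.0")
```

Changing any single page, the manifest hash, or a sampling parameter yields a different commitment, which is what lets a later replay be checked against the original.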

Step · 06

For tamper-evidence end to end

Sign and hash-link.

AuditChain.append signs prev_entry_hash ‖ payload_hash ‖ ts with the per-tenant Ed25519 key and links it to the prior entry. Tampering with any past entry breaks verification at that index, and the monthly CT-log anchor makes it publicly detectable.

spire.audit_chain.chain · AuditChain.append

03 — The patent family

Four mechanisms the prompt-log tools do not address.

Most AI-governance tooling logs the prompt and the response. Spire goes further: it commits the cache page, signs the route decision, refuses at the inference engine, and content-addresses the training example. Four mechanisms in the patent family (#1, #2, #5, #9) cover those parts; the numbering is against our internal IP register, and the applications are in preparation. Each is implemented today as real Protocol architecture with deterministic stub adapters; production integrations plug behind the same Protocols.

Patent · #1

vs prompt-and-response observability

KV-cache forensic replay — Merkle commitment over the cache.

A Merkle commitment over the PagedAttention pages used in an inference is anchored to the chain at the time of inference. Self-hosted replay reconstructs the cache deterministically on a TEE-attested node; cloud-burst replay re-plays the prompt and seed and attests the response Merkle root matches. Every ReplayBundle carries the original event id, the operator who requested it, the verification proof, and Spire's signature.

spire.audit_chain.replay · compute_paged_attn_merkle_root

Patent · #2

vs refusal at the API gateway

Three-gate policy — enforced at the first transformer block.

Consent, warrant, and two-operator approval are checked before the first transformer block executes — so a regulator can show that no model computation occurred under a non-compliant request. A failed gate returns an attested POLICY_DENY, signed by the per-tenant key, that the operator can defend without further forensics. Gate-check budget ≤ 50 ms p99 (design target, not yet independently measured).

spire.compliance.three_gate · AttestedPolicyDeny

Patent · #5

vs silent cross-regime fail-open

Cryptographically-pinned compliance-preset routing.

The per-regime routing manifest is Ed25519-signed; the router verifies the signature, picks the first manifest-allowed route, and anchors the manifest hash in a ROUTE_DECISION event. A third party holding the manifest hash and the chain can prove a request was routed against a specific signed preset at a specific time. A deployment cannot silently fail open across DPDP, GDPR, HIPAA, or the EU AI Act.

spire.compliance.preset_manifest · verify_preset_manifest

Patent · #9

vs untraceable fine-tuning data

Provenance-anchored fine-tuning lineage.

Every training example is content-addressed; a lineage manifest is signed and chain-anchored. On a data-principal erasure request, DeletionCertService enumerates the affected examples and the manifests they appear in, soft-deletes the rows (DPDP §13 permits retaining the evidence trail), and issues a signed deletion certificate.

spire.provenance.deletion_cert · DeletionCertService
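
Content addressing and the erasure enumeration might look roughly like this. The record layout (`principal_id`, `text`), the manifest names, and both helper functions are hypothetical; the soft-delete and the signed certificate are out of scope for the sketch.

```python
import hashlib
import json


def content_address(example: dict) -> str:
    """SHA-256 over a canonical JSON encoding of one training example."""
    canonical = json.dumps(example, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def affected_by_erasure(principal_id, examples, manifests):
    """Map each affected example address to the manifests that include it."""
    addresses = {content_address(e) for e in examples
                 if e["principal_id"] == principal_id}
    return {addr: sorted(name for name, members in manifests.items()
                         if addr in members)
            for addr in sorted(addresses)}


examples = [
    {"principal_id": "p-1", "text": "note A"},
    {"principal_id": "p-2", "text": "note B"},
]
addr_a, addr_b = (content_address(e) for e in examples)
# Assumed lineage manifests: run name -> set of example addresses.
manifests = {"ft-run-01": {addr_a, addr_b}, "ft-run-02": {addr_b}}
report = affected_by_erasure("p-1", examples, manifests)
```

Sorting keys before hashing means the same example always gets the same address, whatever order its fields were written in.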

Applications are in preparation: mechanism #1 (KV-cache forensic replay) and a companion audit-chain-federated-learning mechanism target the Indian filing window in FY27 Q1; #2 targets FY27 Q2; #5 and #9 target FY27 Q3. Planned PCT national-phase filings for #1 and #2 (FY28) would put US / EU / India exclusivity on the 2046 horizon. Nothing in this family is filed yet.

04 — The substrate, link by link

Every link is a Protocol. Every Protocol is tested.

Spire follows the ICYCASTLE adapter pattern everywhere a capability touches an external dependency: a behaviour-defining Protocol, a deterministic StubAdapter for tests and zero-config dev, a production-adapter slot filled at deploy time, SQLite/Postgres storage with RLock-guarded transactions, REST endpoints, and a mandatory audit-chain hand-off. Business logic never calls a dependency directly.

Inference plane · one Protocol, twelve providers

anthropic_adapter — Claude Sonnet + Opus cloud burst.
openai_adapter / gemini_adapter — GPT-5 · Gemini 2.5 Pro.
together · fireworks · deepinfra — open-weight hosting.
groq_adapter / cerebras_adapter — low-latency silicon.
mistral_adapter — European provider.
vllm · sglang · llama_cpp — self-hosted + edge, PagedAttention.
gpu_rental — burst-capacity GPU node adapter.

Substrate · cache, store, eval, keys

cache.semantic — Redis hot + Qdrant cold, confidence gate.
cache.kv — PagedAttention · prefix-cache · prefix-routing.
vector_store — Qdrant · pgvector · LanceDB behind one base.
embeddings — BGE-M3 default, OpenAI fallback.
eval — Promptfoo · DeepEval · Ragas · Inspect AI runners.
observability — Langfuse + Arize Phoenix + OTLP forwarder.
key_management — BYOK · HSM · Vault · key rotation.

# Pin a route to a signed compliance-preset manifest, then commit the
# cache state to the audit chain: two of the four mechanisms in the patent family.
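from decimal import Decimal

from spire.compliance.signed_routing import SignedRouter, RoutingRequest, CandidateRoute
from spire.audit_chain.replay import ReplayService
from spire.types import InferenceProvider, ProviderRegion, CachePolicy

# `router` (a SignedRouter), `replay` (a ReplayService), `workload_id`,
# `dpdp_manifest_id`, and `pages` are assumed to be set up by the caller.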

decision = router.route(RoutingRequest(
    workload_id=workload_id,
    manifest_id=dpdp_manifest_id,
    candidates=(
        CandidateRoute(InferenceProvider.LOCAL_VLLM, ProviderRegion.MUMBAI,
                       "llama-3.3-70b", CachePolicy.DEFAULT),
        CandidateRoute(InferenceProvider.ANTHROPIC,  ProviderRegion.MUMBAI,
                       "claude-sonnet-4-6", CachePolicy.NEVER_CACHE),
    ),
))   # → RoutingDecision(manifest_hash=…, chosen=…, audit_entry_id=…)

commitment = replay.commit_inference(
    audit_entry_id=decision.audit_entry_id,
    inference_provider=decision.chosen.provider,
    inference_model=decision.chosen.model,
    inference_region=decision.chosen.region,
    paged_attn_pages=pages,                 # self-hosted: Merkle over KV pages
    routing_manifest_hash=decision.manifest_hash,
    seed=7, temperature=Decimal("0.0"), top_p=Decimal("1.0"),
)   # the inference is now deterministically replayable for 7 years

05 — The audit chain

Hash-linked. Signed. Publicly anchored.

The chain is the product. Each entry signs prev_entry_hash ‖ payload_hash ‖ ts with the tenant's Ed25519 key; the next entry's prev_entry_hash is the SHA-256 of this entry's signature, so any retro-edit breaks the link at that index. Genesis is the SHA-256 of the tenant id, so even the first entry commits to tenant identity. The canonical signing payload is byte-for-byte identical to SAAKSHA Rail's, so verifier tooling is interoperable across the ICYCASTLE family.
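
The linking rule can be illustrated with a small standard-library sketch. One loud assumption: HMAC-SHA256 stands in for the Ed25519 signature so the example needs no third-party package, which means this version is not publicly verifiable the way the real chain is. The `append` and `first_broken_index` helpers and the entry dict layout are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    # Stand-in only: HMAC-SHA256 keeps this sketch stdlib-only.
    # The chain described in the text uses Ed25519 signatures instead.
    return hmac.new(key, message, hashlib.sha256).digest()


def append(chain, key, tenant_id: bytes, payload: bytes, ts: str):
    """Append one entry that signs prev_entry_hash || payload_hash || ts."""
    if chain:
        prev = hashlib.sha256(chain[-1]["signature"]).digest()
    else:
        prev = hashlib.sha256(tenant_id).digest()  # genesis commits to the tenant
    payload_hash = hashlib.sha256(payload).digest()
    signature = sign(key, prev + payload_hash + ts.encode("utf-8"))
    chain.append({"prev": prev, "payload_hash": payload_hash,
                  "ts": ts, "signature": signature})


def first_broken_index(chain, key, tenant_id: bytes):
    """Return the index of the first entry that fails, or None if all pass."""
    expected_prev = hashlib.sha256(tenant_id).digest()
    for i, e in enumerate(chain):
        message = e["prev"] + e["payload_hash"] + e["ts"].encode("utf-8")
        if e["prev"] != expected_prev or not hmac.compare_digest(
                e["signature"], sign(key, message)):
            return i
        expected_prev = hashlib.sha256(e["signature"]).digest()
    return None


key, tenant = b"hypothetical-key", b"tenant-5f2c"
chain = []
for n in range(3):
    append(chain, key, tenant, f"event {n}".encode(), f"2026-04-0{n + 1}T00:00:00Z")
intact = first_broken_index(chain, key, tenant)

# Retro-edit the middle entry: verification now fails at that index.
chain[1]["payload_hash"] = hashlib.sha256(b"edited").digest()
broken = first_broken_index(chain, key, tenant)
```

Re-signing the edited entry would not hide the change either, because the next entry's `prev` commits to the original signature.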

Verify · FR-31

Anyone can verify — no Spire needed.

verify_chain() walks entries in timestamp order and asserts tenant consistency, link integrity, fingerprint match, and Ed25519 validity. A regulator with the published public key re-runs it offline; POST /v1/audit-chain/verify does it over REST.

spire.audit_chain.chain · verify_chain

Anchor · FR-30

Monthly CT-log anchor.

Each tenant root is Merkle-rolled monthly and submitted to a public certificate-transparency log (Google / Sectigo / Cloudflare Nimbus). The anchored hash is the public checkpoint: re-anchoring after a tamper attempt yields a divergent root the CT log captures forever.

spire.audit_chain.ct_log_anchor · MonthlyAnchorService

Replay · FR-02

Re-run the exact inference.

POST /v1/replay/{event_id} reconstructs the inference deterministically and returns a signed ReplayBundle with the Merkle verification proof and, where applicable, a SEV-SNP / TDX / Nitro TEE attestation. Target: ≥99.5% replay success inside the 7-year window (design goal, not yet independently measured).

spire.audit_chain.replay · verify_replay_bundle

Export · FR-24

One signed evidence bundle.

EvidenceBundleService packages a chosen event-set into a tamper-evident, BSA §63-aligned evidence bundle: a statutory PDF certificate per regime, a signed JSON manifest with per-file SHA-256, the chain entries, and the trace + eval history. Counter-signed by the operator (Aadhaar-eSign / DSC / PIV). It is engineered to satisfy the technical requirements for electronic-record evidence on India tenants — hash, algorithm, chain of custody, device identity, and operator. Admissibility in any proceeding is determined by the court.

spire.audit_chain.evidence_bundle · EvidenceBundleService.generate
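
The per-file SHA-256 manifest is the simplest part of the bundle to sketch. The file names, the manifest layout, and both helpers below are hypothetical, and the signature over the manifest and the operator counter-signature are omitted.

```python
import hashlib
import json


def build_manifest(files: dict) -> str:
    """Return a canonical JSON manifest listing a SHA-256 per bundled file."""
    entries = [{"path": path, "sha256": hashlib.sha256(data).hexdigest()}
               for path, data in sorted(files.items())]
    return json.dumps({"algorithm": "SHA-256", "files": entries},
                      sort_keys=True, separators=(",", ":"))


def changed_files(manifest_json: str, files: dict):
    """List paths that are missing or whose bytes no longer match the manifest."""
    manifest = json.loads(manifest_json)
    return [e["path"] for e in manifest["files"]
            if e["path"] not in files
            or hashlib.sha256(files[e["path"]]).hexdigest() != e["sha256"]]


# Placeholder file contents, purely for illustration.
files = {"chain_entries.json": b"[]", "certificate_dpdp.pdf": b"%PDF-1.7 ..."}
manifest_json = build_manifest(files)
clean = changed_files(manifest_json, files)

files["chain_entries.json"] = b"[{}]"
dirty = changed_files(manifest_json, files)
```

Signing the manifest once then covers every file in the bundle, since each file is bound to it by hash.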

# Independent, offline verification — the regulator holds only the public key.
from spire.audit_chain.chain import verify_chain, AuditChainVerificationError

try:
    verify_chain(entries, tenant_id, tenant_public_key_bytes)
    print("chain OK — every link, fingerprint and signature verified")
except AuditChainVerificationError as exc:
    # Pinpoints exactly where integrity broke.
    print(f"TAMPER at entry index {exc.entry_index}: {exc}")

# What each entry signs (canonical, whitespace-free, cross-process stable):
#   prev_entry_hash (32 B)  ||  payload_hash (32 B)  ||  ts_iso_utf8
# Genesis prev_entry_hash = SHA-256(tenant_id.bytes)  ·  algorithm = ED25519_SHA256

Fourteen audited event types: request, response, cache_hit, route_decision, policy_gate, eval_run, manifest_change, key_rotation, federation_grant, deletion_cert, bundle_issuance, replay, health_alert, tamper_alert. No persistence path in Spire skips the chain: every event-producing module hands off to AuditChain.append.

06 — Integration

Wraps your calls. Keeps your model.

Spire is a gateway, not a model. Point your existing openai / anthropic clients at the Spire control plane and every call is audited, routed, cached, and gated without an application rewrite. Cloud burst to nine named providers, self-host on your own GPUs, or mix the two behind a single signed manifest. Observability dual-stacks Langfuse and Arize Phoenix so your ML engineers keep the tooling they already trust.

Anthropic Claude · OpenAI GPT-5 · Google Gemini 2.5 Pro · Together AI · Fireworks AI · DeepInfra · Groq · Cerebras · Mistral · vLLM (self-host) · SGLang (long-context) · llama.cpp (edge)

# 1. Initialise a tenant — provisions the per-tenant Ed25519 root key in
#    the HSM (private key never leaves the partition) + the chain database.
spire init 5f2c4e7a-…-9b1d
#   root-key fingerprint: 9f86d081…  ·  audit-chain database: ~/.spire/…

# 2. Enrol a client application's API key (only the SHA-256 is stored).
spire enroll sk_live_… --tenant 5f2c4e7a-…-9b1d

# 3. Replay an inference for the regulator, twelve months later.
spire replay 8d3f…event --tenant 5f2c… --operator 11aa…
#   replay success: True  ·  replay mode: self_hosted  ·  merkle proof: …

# 4. Export a BSA §63-aligned evidence bundle for a DPDP Board enquiry.
spire evidence-bundle --tenant 5f2c… --operator 11aa… \
      --query "chat-with-policy 2026-04-14" --regime dpdp --output ./bundle.tar

# Run the control plane:
uvicorn spire.api:app --host 0.0.0.0 --port 8000

India-hosted by default: Yotta NM1 Mumbai primary + Yotta DK1 Greater Noida DR, with a Sify Tier-IV Hyderabad secondary. European tenants land at OVHcloud Frankfurt / Hetzner Falkenstein; US tenants at AWS us-east-2 with a HIPAA BAA and a FedRAMP Moderate path. The route region is part of the signed manifest — sovereignty is enforced, not promised.

07 — Compliance presets

Presets are policy. Not feature flags.

Each regime ships as a closed-set, signed manifest that drives routing, retention, consent gates, and cross-border restrictions. They are policy, not toggles — a region-sensitive behaviour reads from the preset or it does not happen. Custom presets are available on engagement, signed and chain-anchored like the rest.

Preset	What it pins
dpdp_2023	India DPDP §17 — replayable automated-decision records; April 2026 audit-trail guidance
gdpr	EU data-subject rights; EEA-region routing constraint
ccpa	California consumer privacy; opt-out provenance
quebec_loi25	Loi 25 automated-decision audit trail with model parameters at decision time
hipaa	§164.312(b) Audit Controls — post-hoc examination of clinical-AI decisions
eu_ai_act	Article 12 logging; high-risk-system event records (window opens 2 Aug 2026)
sebi_tech_vendor	SEBI technology-vendor governance for Indian capital-markets AI
rbi_outsourcing	RBI Master Directions on Outsourcing of IT services
sec_ai_disclosure	US SEC predictive-data-analytics disclosure + conflict-of-interest controls
pcpd_hk	Hong Kong PCPD AI audit trail
singapore_ai_v2	Singapore Model AI Governance Framework v2
china_deep_synthesis	PRC deep-synthesis + generative-AI provider logging
custom	Buyer-specific preset — signed, chain-anchored, available on engagement

Federation across tenants (fraud hotlist, adverse-event registry, shared eval) enforces the consent + warrant + two-operator gate — the same three-gate pattern used at the inference engine. No cross-tenant feature bypasses any of the three.

08 — Status & horizon

Built, not slideware. The substrate is in code.

The v0.1 substrate is implemented and tested end to end: every Protocol has a deterministic stub and a production-adapter slot, the audit chain is wired into every event-producing module, and the suite runs green. What remains before GA is commercial and operational — certification, the TEE-attested replay node, and the alpha-customer commissioning — not the core architecture.

1,795

Software tests, all green

Pipeline + integration (not a measure of model accuracy) · coverage gate ≥ 85% · mypy strict · ruff clean

≥99.5%

Target replay success

Inside the 7-year chain-retention window

Operator roles

Compliance officer · model-risk validator · auditor …

ISO 42001

Targeted (in progress)

Held: ISO 9001 · ISO 27001 (certs on request). In progress: ISO 42001 · SOC 2 Type II · CERT-In · STQC

Production-adapter slots that fill at alpha-customer commissioning: the TEE-attested GPU replay node (SEV-SNP / TDX / Nitro), the live CT-log client, the HSM-backed signer, and the reportlab PDF certificate renderer. The deterministic stubs that ship today keep the full suite runnable with no external dependencies installed. They are the intentional test layer, not placeholder product.

Get started.

Make your AI
audit-ready.
