A capability boundary for OpenClaw, LangChain, CrewAI, and your own delegation chains. Every agent-to-agent call is checked against a Lean-proven policy before it runs.
guard = Guard(permissions=["network:http:read"]) # an agent's boundary @guard.wrap(required_capabilities=["network:http:read"]) def web_fetch(url): ... web_fetch("https://example.com") # allowed → signed receipt # escalation → CertiorBlocked
$ pip install certior
Live on PyPI, full walkthrough in the quickstart.
Why now
Distinct threat classes enumerated in a single multi-agent framework - prompt injection, ClawDrain tool-chain exploits, data exfiltration, privilege escalation via delegation, and more.
arXiv:2603.12644
MCP servers reachable in production with no capability boundary on their tool surface - public scan, 2025. Any agent that connects inherits every tool.
Organizations breached via agent supply-chain attacks during 2025 - compromised skills, prompt-injected tool calls, capability over-grant. Pattern matching alone did not stop them.
How it works
Wrap any agent or pipeline. Each tool call passes through three independent checks before it runs. The decision is bound to a Lean-proven policy fingerprint so a customer can verify it offline.
Is the agent allowed to call this tool? In a delegation chain, every step's capability set must fit inside its parent's. A child can never escalate.
Per-policy detectors on prompts and tool outputs. HIPAA, SOX, and attorney-client presets out of the box; custom rules where you need them.
Per-agent hard ceiling. Every step debits the parent's budget. Runaway delegations stop themselves before they bill you.
Verified delegation in a multi-agent pipeline
The capability subset rule is what's machine-checked in Lean. The runtime just runs the rule; the proof says the rule is enough.
What's actually proven
Z3 runs on every tool call and proves the decision. Lean 4 machine-checks the policy model that Z3 enforces. Every signed receipt carries a fingerprint of the Lean source; an auditor reproduces the audit with a single lake build command.
Certior does not verify the LLM's behaviour. It verifies the boundary the LLM operates inside. If an action violates the policy, it is provably blocked - and the policy itself is proven sound.
Regulated teams in healthcare, finance, and legal feel it first because the auditor is watching. Every team running multi-agent systems hits it eventually.
A small group of early design partners is shaping the v1 surface with us.
What design partners get
lake build an auditor runs.