Warden
A governed MCP server, with the receipts: RBAC enforced outside the model, OpenTelemetry traces on every run, and an LLM-as-judge eval suite proving the agent stays inside policy.
Live demo: warden.alexlaguardia.dev. Browse recorded runs, replay traces, and fire a live agent run yourself (rate-limited).

The problem this demonstrates
Give an AI agent tool access to company data and you inherit two hard questions:
- Who is the agent acting as? When a support rep asks "what's our pipeline with Acme?", the agent must not return an answer that rep is not allowed to see.
- How do you know the agent behaved? "It seemed fine in testing" is not an answer you can take to a security review.
Warden answers both, end to end, in a system small enough to read in an afternoon:
- Governance lives outside the model. The role comes from the session identity (like OAuth token scopes), every read passes through one policy choke point, and the model cannot widen its own access by prompting harder.
- Behavior is measured, not vibed. A deterministic oracle computes ground truth through the same governance layer, and a stronger model judges each agent answer against that reference: accuracy, faithfulness, RBAC compliance, and whether the agent honestly said "I can't see that" instead of fabricating.
- Every run is traceable. OpenTelemetry spans cover each LLM completion and MCP tool call, and they are persisted and replayable on a timeline in the dashboard.
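To make the first point concrete, here is a minimal sketch of a tool handler that takes its role from the session and ignores anything the model puts in its arguments. Only the `WARDEN_ROLE` variable comes from this README; `GRANTS`, `session_role`, `handle_tool_call`, and the resource names are hypothetical and are not Warden's actual code.

```python
import os
from typing import Mapping

# Hypothetical grants, loosely following the roles described in this README.
GRANTS = {
    "admin": {"accounts", "deals", "invoices", "tickets"},
    "sales_west": {"accounts", "deals", "tickets"},
    "support": {"accounts", "tickets"},
}


def session_role(environ: Mapping[str, str] = os.environ) -> str:
    """Resolve the role from the session environment, never from the model."""
    role = environ.get("WARDEN_ROLE", "")
    if role not in GRANTS:
        raise ValueError(f"unknown or missing WARDEN_ROLE: {role!r}")
    return role


def handle_tool_call(arguments: dict, environ: Mapping[str, str] = os.environ) -> dict:
    """Serve one tool call. A 'role' key in the arguments is deliberately never read."""
    role = session_role(environ)
    resource = arguments["resource"]
    if resource not in GRANTS[role]:
        return {"access_denied": {"role": role, "resource": resource}}
    # Row data is omitted here; the point is which role gets stamped on the result.
    return {"role": role, "resource": resource, "rows": []}


# The model tries to widen its own access by claiming to be admin.
session = {"WARDEN_ROLE": "support"}
escalation_attempt = handle_tool_call({"resource": "invoices", "role": "admin"}, session)
print(escalation_attempt)
```

Because the handler never looks at a model-supplied role, prompting harder changes nothing: the denial is stamped with the session's role, not the one the model asked for.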
What the demo shows
| Page | What it proves |
|---|---|
| Console | Live agent runs through the governed server, as any of 3 roles |
| Diff | The same question answered as admin, sales, and support. Admin gets $125,000 of pipeline; support gets a policy denial and an honest refusal |
| Run detail | Span timeline (LLM vs tool time), tool inputs/outputs with the enforcing role stamped on every result |
| Evals | 12/12 cases passing across all governance primitives |
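
The LLM-vs-tool split on the run detail page is a simple aggregation over persisted spans. The sketch below assumes a flat span record with a `kind` field and millisecond timestamps. That shape, the `SpanRecord` name, and the sample timings are illustrative assumptions, not the OpenTelemetry SDK's types or Warden's schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpanRecord:
    """Hypothetical persisted span: just enough to draw a timeline."""
    name: str
    kind: str      # "llm" or "tool" (assumed labels)
    start_ms: int
    end_ms: int


def time_by_kind(spans: List[SpanRecord]) -> Dict[str, int]:
    """Sum span durations per kind, e.g. total LLM time vs total tool time."""
    totals: Dict[str, int] = {}
    for span in spans:
        totals[span.kind] = totals.get(span.kind, 0) + (span.end_ms - span.start_ms)
    return totals


# Made-up timings for one run: completion, tool call, completion.
sample_spans = [
    SpanRecord("chat", "llm", 0, 900),
    SpanRecord("query_resource", "tool", 900, 940),
    SpanRecord("chat", "llm", 940, 1600),
]
totals = time_by_kind(sample_spans)
print(totals)
```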

Architecture
server/            The governed MCP server
  data.py          Seed dataset: one fictional company across 3 sources
                   (CRM accounts + deals, billing invoices, support tickets)
  rbac.py          Policy engine: roles + 3 governance primitives
                   (resource-level access, region row-scoping, field redaction)
  store.py         GovernedStore: the single choke point every read passes through
  mcp_server.py    MCP server (official SDK, stdio): 4 registry/dispatch tools,
                   role fixed by WARDEN_ROLE env, never by the model
agent/
  runner.py        Claude tool-calling loop over the real MCP server, role-bound
eval/
  oracle.py        Deterministic ground truth, computed THROUGH the governance
                   layer, so the reference itself respects policy
  judge.py         LLM-as-judge (a stronger model judges the agent's model),
                   anchored to the oracle reference to mitigate judge bias
  golden.py        12 cases covering every RBAC primitive + honesty-on-denial
tracing/
  otel.py          Real OpenTelemetry spans (GenAI semantic conventions) via a
                   custom in-process SpanProcessor
  store.py         Run persistence (runs.db): answers, tool calls, spans
dashboard/
  api.py           FastAPI over runs.db + a rate-limited live-run endpoint
web/               Next.js console: run list, trace replay, role diff, evals
Roles: admin (everything), sales_west (West-region CRM + tickets, no billing), support (tickets everywhere + basic accounts, financial tier hidden).
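
The three primitives compose in a fixed order: check the resource, scope the rows, then redact fields. The sketch below is one way that could look for the roles above. It is not the contents of rbac.py or store.py. `RolePolicy`, `governed_read`, the sample accounts, the `tier` field name, and the choice to region-scope accounts are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class RolePolicy:
    resources: FrozenSet[str]                 # resource-level access
    region: Optional[str] = None              # row scoping; None means all regions
    redacted: FrozenSet[str] = frozenset()    # field redaction


# Hypothetical policies mirroring the role descriptions in this README.
POLICIES = {
    "admin": RolePolicy(frozenset({"accounts", "deals", "invoices", "tickets"})),
    "sales_west": RolePolicy(frozenset({"accounts", "deals", "tickets"}), region="West"),
    "support": RolePolicy(frozenset({"accounts", "tickets"}), redacted=frozenset({"tier"})),
}

# Made-up rows; the real seed data lives in data.py.
ACCOUNTS = [
    {"id": 1, "name": "Acme Corp", "region": "West", "tier": "enterprise"},
    {"id": 2, "name": "Globex", "region": "East", "tier": "startup"},
]


def governed_read(role: str, resource: str, rows: List[dict]) -> List[dict]:
    """Single choke point: every read applies all three primitives in order."""
    policy = POLICIES[role]
    # 1. Resource-level access.
    if resource not in policy.resources:
        raise PermissionError(f"{role} may not read {resource}")
    # 2. Region row-scoping.
    visible = [r for r in rows if policy.region is None or r.get("region") == policy.region]
    # 3. Field redaction.
    return [{k: v for k, v in r.items() if k not in policy.redacted} for r in visible]


admin_view = governed_read("admin", "accounts", ACCOUNTS)
west_view = governed_read("sales_west", "accounts", ACCOUNTS)
support_view = governed_read("support", "accounts", ACCOUNTS)
```

Here `admin_view` holds both accounts with every field, `west_view` holds only the West-region row, and `support_view` holds both rows with `tier` removed. A support read of `invoices` raises `PermissionError` before any row is touched.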
Design choices worth stealing
- Registry/dispatch tools, not one-tool-per-table. The server exposes list_resources, describe_resource, query_resource, and get_record. Adding a data source changes the registry, not the tool surface, and the policy engine stays in one place.
- The oracle goes through the governance layer. If the reference answer were computed against raw data, a correctly denied answer would score as "wrong." Governance-aware ground truth is what makes "the agent honestly declined" a passing grade.
- The judge is a stronger model than the agent (Opus judges Sonnet), and it is anchored to the oracle reference, to cut self-preference bias.
- Denials are structured, not stringly. Tools return an access_denied object; the eval then checks that the agent reported the limit honestly rather than guessing.
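
A compact sketch of how these choices fit together: a registry behind generic tools, a structured denial, and an oracle that reads through the same governed call. The tool names `list_resources` and `query_resource` and the $125,000 admin total come from this README. The signatures, the shape of the `access_denied` object, the `GRANTS` table, and the individual deal rows are assumptions for illustration.

```python
# Hypothetical registry: adding a data source means adding an entry here,
# not adding a new tool.
REGISTRY = {
    "deals": [
        {"id": 1, "account": "Acme Corp", "amount": 75000},
        {"id": 2, "account": "Acme Corp", "amount": 50000},
    ],
    "tickets": [
        {"id": 10, "account": "Acme Corp", "status": "open"},
    ],
}

GRANTS = {"admin": {"deals", "tickets"}, "support": {"tickets"}}


def list_resources(role: str) -> list:
    """Generic discovery tool: only resources this role may read."""
    return sorted(name for name in REGISTRY if name in GRANTS.get(role, set()))


def query_resource(role: str, resource: str, **filters) -> dict:
    """Generic read tool: returns rows, or a structured denial instead of a string."""
    if resource not in GRANTS.get(role, set()):
        return {"access_denied": {"role": role, "resource": resource}}
    rows = [r for r in REGISTRY[resource]
            if all(r.get(k) == v for k, v in filters.items())]
    return {"role": role, "rows": rows}  # enforcing role stamped on the result


def oracle_pipeline(role: str, account: str) -> dict:
    """Ground truth computed THROUGH the governed tool, so it respects policy."""
    result = query_resource(role, "deals", account=account)
    if "access_denied" in result:
        # The correct behavior for this role is an honest refusal.
        return {"expected": "refusal"}
    return {"expected": "answer", "value": sum(r["amount"] for r in result["rows"])}


admin_reference = oracle_pipeline("admin", "Acme Corp")
support_reference = oracle_pipeline("support", "Acme Corp")
```

With this shape, the reference for admin is a number and the reference for support is a refusal, so a judge anchored to it can pass an agent that says "I can't see that" and fail one that guesses a figure.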
Run it locally
pip install -r requirements.txt
python seed.py # build the demo company (warden.db)
python -m tests.test_rbac # 7/7 governance checks hold
export ANTHROPIC_API_KEY=sk-ant-... # needed for agent + evals
python -m agent.runner --role support --trace "What account tier is Acme Corp?"
python -m eval.run_evals # 12/12, writes eval/results.json
python -m tracing.seed_runs # seed demo runs for the dashboard
# dashboard
uvicorn dashboard.api:app --port 8710
cd web && npm install && npm run dev # http://localhost:3006
Stack
Python (FastAPI, official mcp SDK, OpenTelemetry SDK), Claude (Sonnet agent, Opus judge), SQLite, Next.js + Tailwind.
Built solo by Alex LaGuardia as a working answer to "how do you let an agent touch real data without trusting it blindly?"