Warden

A governed MCP server, with the receipts: RBAC enforced outside the model, OpenTelemetry traces on every run, and an LLM-as-judge eval suite proving the agent stays inside policy.

Live demo: warden.alexlaguardia.dev. Browse recorded runs, replay traces, and fire a live agent run yourself (rate-limited).

[Figure: same question, different role]

The problem this demonstrates

Give an AI agent tool access to company data and you inherit two hard questions:

Who is the agent acting as? When a support rep asks "what's our pipeline with Acme?", the agent must not return an answer that rep is not allowed to see.
How do you know the agent behaved? "It seemed fine in testing" is not an answer you can take to a security review.

Warden answers both, end to end, in a system small enough to read in an afternoon:

Governance lives outside the model. The role comes from the session identity (like OAuth token scopes), every read passes through one policy choke point, and the model cannot widen its own access by prompting harder.
Behavior is measured, not vibed. A deterministic oracle computes ground truth through the same governance layer, and a stronger model judges each agent answer against that reference: accuracy, faithfulness, RBAC compliance, and whether the agent honestly said "I can't see that" instead of fabricating.
Every run is traceable. Each LLM completion and MCP tool call emits an OpenTelemetry span, and spans are persisted and replayable on a timeline in the dashboard.
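Why a governance-aware reference matters can be sketched in a few lines. This is an illustration only, not the contents of eval/oracle.py or eval/judge.py: governed_deals, oracle_pipeline, naive_score, the CAN_READ_DEALS set, the error shape, and the deal rows are all hypothetical, row scoping is omitted, and a plain comparison stands in for the LLM judge.

```python
from typing import Any

# Invented rows for illustration; not the project's seed data.
DEALS = [
    {"account": "Acme Corp", "amount": 40_000},
    {"account": "Acme Corp", "amount": 60_000},
    {"account": "Globex", "amount": 15_000},
]

# Assumption for this sketch: which roles may read deals at all.
CAN_READ_DEALS = {"admin", "sales_west"}


def governed_deals(role: str) -> dict[str, Any]:
    """Stand-in for the governed read path (hypothetical shape)."""
    if role not in CAN_READ_DEALS:
        return {"error": "access_denied", "role": role, "resource": "deals"}
    return {"role": role, "rows": DEALS}


def oracle_pipeline(role: str, account: str) -> dict[str, Any]:
    """Compute the reference answer through the same governed read."""
    result = governed_deals(role)
    if result.get("error") == "access_denied":
        # For this role, the correct answer is an honest refusal.
        return {"expect": "refusal"}
    total = sum(d["amount"] for d in result["rows"] if d["account"] == account)
    return {"expect": "value", "value": total}


def naive_score(answer: dict[str, Any], reference: dict[str, Any]) -> bool:
    """Crude stand-in for the LLM judge: compare a structured answer to the reference."""
    if reference["expect"] == "refusal":
        return answer.get("refused") is True
    return answer.get("refused") is not True and answer.get("value") == reference["value"]


if __name__ == "__main__":
    print(oracle_pipeline("admin", "Acme Corp"))
    print(oracle_pipeline("support", "Acme Corp"))
```

Under these assumptions, a support-role answer that refuses scores as passing, while one that reports a number fails even if the number matches the raw data.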

What the demo shows

Page	What it proves
Console	Live agent runs through the governed server, as any of 3 roles
Diff	The same question answered as admin, sales, and support. Admin gets $125,000 of pipeline; support gets a policy denial and an honest refusal
Run detail	Span timeline (LLM vs tool time), tool inputs/outputs with the enforcing role stamped on every result
Evals	12/12 cases passing across all governance primitives

[Figure: run detail with trace timeline]

Architecture

server/             The governed MCP server
  data.py           Seed dataset: one fictional company across 3 sources
                    (CRM accounts + deals, billing invoices, support tickets)
  rbac.py           Policy engine: roles + 3 governance primitives
                    (resource-level access, region row-scoping, field redaction)
  store.py          GovernedStore: the single choke point every read passes through
  mcp_server.py     MCP server (official SDK, stdio): 4 registry/dispatch tools,
                    role fixed by WARDEN_ROLE env, never by the model
agent/
  runner.py         Claude tool-calling loop over the real MCP server, role-bound
eval/
  oracle.py         Deterministic ground truth, computed THROUGH the governance
                    layer, so the reference itself respects policy
  judge.py          LLM-as-judge (a stronger model judges the agent's model),
                    anchored to the oracle reference to mitigate judge bias
  golden.py         12 cases covering every RBAC primitive + honesty-on-denial
tracing/
  otel.py           Real OpenTelemetry spans (GenAI semantic conventions) via a
                    custom in-process SpanProcessor
  store.py          Run persistence (runs.db): answers, tool calls, spans
dashboard/
  api.py            FastAPI over runs.db + a rate-limited live-run endpoint
web/                Next.js console: run list, trace replay, role diff, evals

Roles: admin (everything), sales_west (West-region CRM + tickets, no billing), support (tickets everywhere + basic accounts, financial tier hidden).
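The three primitives can be sketched as one function that every read passes through. This is a minimal illustration, not the contents of rbac.py or store.py: the POLICY table, the govern_read function, the error shape, and the sample rows and field names (region, tier) are assumptions made for the example, as is the choice to apply the West scope to every resource sales_west can read.

```python
from typing import Any

# Hypothetical policy table mirroring the three roles described above.
POLICY: dict[str, dict[str, Any]] = {
    "admin": {
        "resources": {"accounts", "deals", "invoices", "tickets"},
        "region": None,
        "redact": {},
    },
    "sales_west": {
        "resources": {"accounts", "deals", "tickets"},
        "region": "West",
        "redact": {},
    },
    "support": {
        "resources": {"accounts", "tickets"},
        "region": None,
        "redact": {"accounts": {"tier"}},
    },
}

# Illustrative rows; not the project's seed data.
DATA: dict[str, list[dict[str, Any]]] = {
    "accounts": [
        {"id": 1, "name": "Acme Corp", "region": "West", "tier": "enterprise"},
        {"id": 2, "name": "Globex", "region": "East", "tier": "starter"},
    ],
    "invoices": [{"id": 10, "account_id": 1, "region": "West", "amount": 5000}],
}


def govern_read(role: str, resource: str) -> dict[str, Any]:
    """Single choke point: apply all three primitives before returning rows."""
    policy = POLICY.get(role)
    # 1. Resource-level access: unknown roles and unlisted resources are denied.
    if policy is None or resource not in policy["resources"]:
        return {"error": "access_denied", "role": role, "resource": resource}
    rows = DATA.get(resource, [])
    # 2. Region row-scoping: keep only rows in the role's region, if it has one.
    if policy["region"] is not None:
        rows = [r for r in rows if r.get("region") == policy["region"]]
    # 3. Field redaction: drop hidden fields from every remaining row.
    hidden = policy["redact"].get(resource, set())
    rows = [{k: v for k, v in r.items() if k not in hidden} for r in rows]
    return {"role": role, "resource": resource, "rows": rows}


if __name__ == "__main__":
    for role in POLICY:
        print(role, govern_read(role, "accounts"))
```

Because the role is an argument supplied by the caller's session rather than by the model, nothing the model writes in a prompt or tool call changes which branch runs.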

Design choices worth stealing

Registry/dispatch tools, not one-tool-per-table. The server exposes list_resources, describe_resource, query_resource, get_record. Adding a data source changes the registry, not the tool surface, and the policy engine stays in one place.
The oracle goes through the governance layer. If the reference answer were computed against raw data, a correctly-denied answer would score as "wrong." Governance-aware ground truth is what makes "the agent honestly declined" a passing grade.
The judge is a stronger model than the agent (Opus judges Sonnet), and it is anchored to the oracle reference to cut self-preference bias.
Denials are structured, not stringly typed. Tools return an access_denied object; the eval then checks that the agent reported the limit honestly rather than guessing.
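The registry/dispatch shape and the structured denial can be sketched together. The four tool names come from the list above; everything else is hypothetical, including the Resource dataclass, the REGISTRY contents, the ToolSurface class, the allowed_roles check, the exact shape of the denial object, and the decision to list only resources the role can read. The real server is built on the MCP SDK, which this sketch does not use.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class Resource:
    description: str
    rows: list[dict[str, Any]]
    allowed_roles: set[str]


# Hypothetical registry: adding a data source means adding an entry here.
REGISTRY: dict[str, Resource] = {
    "tickets": Resource(
        "Support tickets",
        [{"id": 1, "subject": "Login fails"}],
        {"admin", "sales_west", "support"},
    ),
    "invoices": Resource("Billing invoices", [{"id": 10, "amount": 5000}], {"admin"}),
}


class ToolSurface:
    """Four fixed tools bound to one role for the whole session."""

    def __init__(self, role: str) -> None:
        self._role = role  # set by the session, never by a tool argument

    def _check(self, name: str) -> dict[str, Any] | None:
        if name not in REGISTRY:
            return {"error": "not_found", "resource": name}
        if self._role not in REGISTRY[name].allowed_roles:
            # Structured denial: a machine-readable object, not a prose string.
            return {"error": "access_denied", "role": self._role, "resource": name}
        return None

    def list_resources(self) -> list[str]:
        return sorted(n for n, r in REGISTRY.items() if self._role in r.allowed_roles)

    def describe_resource(self, name: str) -> dict[str, Any]:
        problem = self._check(name)
        if problem:
            return problem
        res = REGISTRY[name]
        fields = sorted({key for row in res.rows for key in row})
        return {"resource": name, "description": res.description, "fields": fields}

    def query_resource(self, name: str) -> dict[str, Any]:
        problem = self._check(name)
        if problem:
            return problem
        return {"role": self._role, "resource": name, "rows": REGISTRY[name].rows}

    def get_record(self, name: str, record_id: int) -> dict[str, Any]:
        problem = self._check(name)
        if problem:
            return problem
        for row in REGISTRY[name].rows:
            if row["id"] == record_id:
                return {"role": self._role, "resource": name, "record": row}
        return {"error": "not_found", "resource": name, "id": record_id}


if __name__ == "__main__":
    tools = ToolSurface("support")
    print(tools.list_resources())
    print(tools.query_resource("invoices"))
```

In this sketch, registering a new source is one more REGISTRY entry; the four methods and the single _check call stay unchanged, which is the property the first design choice describes.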

Run it locally

pip install -r requirements.txt
python seed.py                       # build the demo company (warden.db)
python -m tests.test_rbac            # 7/7 governance checks hold
export ANTHROPIC_API_KEY=sk-ant-...  # needed for agent + evals
python -m agent.runner --role support --trace "What account tier is Acme Corp?"
python -m eval.run_evals             # 12/12, writes eval/results.json
python -m tracing.seed_runs          # seed demo runs for the dashboard

# dashboard
uvicorn dashboard.api:app --port 8710
cd web && npm install && npm run dev   # http://localhost:3006

Stack

Python (FastAPI, official mcp SDK, OpenTelemetry SDK), Claude (Sonnet agent, Opus judge), SQLite, Next.js + Tailwind.

Built solo by Alex LaGuardia as a working answer to "how do you let an agent touch real data without trusting it blindly?"
