Glass Box Framework
Runtime constitutional verification for AI answers. Every claim carries a reasoning chain. Every score breaks down. Every verdict is traceable.
⭐️ Star this repo if you want runtime AI verification to become the default. Every star moves Glassbox up the search ranking on GitHub, the MCP Registry, and Smithery — which means more developers find this before they ship an AI feature without a Trust Card.
pip install glassbox-framework # Python
npm install -g @glassbox-framework/mcp # Node / MCP
brew install thebarmaeffect/glassbox/glassbox-mcp # macOS
What it is
The Glass Box Framework hands a (question, answer) pair to a runtime verification pipeline and returns a structured Trust Card containing:
- Claims — every atomic assertion in the answer, paired with a reasoning chain explaining why it's asserted, what would support it, and what would falsify it.
- Epistemic Confidence Score (ECS) — a transparent, weighted aggregate over five dimensions with a published formula and an always-visible per-dimension breakdown.
- Glassbox Court — seven adversarial probes (fabrication, source manipulation, bias injection, context attack, overconfidence, underspecification, constitutional violation).
- Constitution — your natural-language deployer intents compiled into structured runtime rules and evaluated against the answer.
- Verdict — `trust`/`caution`/`reject`, with the exact reasoning that derived it.
- Audit reference — a deterministic SHA-256 `log_id`; identical inputs reproduce the same identifier across runs and languages.
It is intentionally not a wrapper around a single LLM call — the reasoning chain on every claim, the formula on the ECS, and the determinism of the audit hash together form the "Glass Box" principle: no opaque scores.
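To make the "no opaque scores" idea concrete, here is a minimal sketch of a weighted aggregate that always returns its per-dimension breakdown next to the total. The dimension names, the equal weights, and the `weighted_score` helper are all hypothetical placeholders; the real ECS dimensions and formula are the ones the project publishes, not these.

```python
from typing import Dict

# Hypothetical placeholders: NOT the real ECS dimensions or weights.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "dimension_a": 0.2,
    "dimension_b": 0.2,
    "dimension_c": 0.2,
    "dimension_d": 0.2,
    "dimension_e": 0.2,
}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> dict:
    """Return a total plus each dimension's weighted contribution."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Keep every contribution visible so the total can be re-derived by hand.
    breakdown = {name: round(scores[name] * weights[name], 4) for name in weights}
    total = round(sum(scores[name] * weights[name] for name in weights), 4)
    return {"total": total, "breakdown": breakdown}


example = weighted_score(
    {
        "dimension_a": 0.9,
        "dimension_b": 0.5,
        "dimension_c": 0.7,
        "dimension_d": 0.4,
        "dimension_e": 0.6,
    },
    EXAMPLE_WEIGHTS,
)
print(example["total"])  # 0.62
```

Returning the breakdown alongside the total means a reader can recompute the score without trusting the aggregator.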
Quick start (Python)
from glassbox_framework import Glassbox

with Glassbox() as gb:
    card = gb.verify_answer(
        question="Can intermittent fasting cure type 2 diabetes?",
        answer="Yes ...",
        intents=[
            "Never make specific medical claims without citing peer-reviewed sources.",
            "Always recommend consultation with a licensed healthcare professional.",
        ],
    )
    print(card["verdict"])          # "reject"
    print(card["ecs"]["total"])     # 0.6032
    print(card["audit"]["log_id"])  # glassbox-85cc09903bd4... (deterministic)
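Once a Trust Card comes back, an application would typically branch on the verdict. The sketch below assumes only the `verdict` field and its three values (`trust`, `caution`, `reject`) described above; the `route_answer` helper and the action names it returns are hypothetical, and the sample card is hand-written rather than produced by the library.

```python
from typing import Any, Mapping


def route_answer(card: Mapping[str, Any]) -> str:
    """Map a Trust Card verdict to a hypothetical application action."""
    verdict = card["verdict"]
    if verdict == "trust":
        return "publish"
    if verdict == "caution":
        return "human_review"
    if verdict == "reject":
        return "block"
    # Fail loudly on anything unexpected instead of silently publishing.
    raise ValueError(f"unexpected verdict: {verdict!r}")


# Hand-written stand-in for a card; only the fields used here are included.
sample_card = {"verdict": "reject", "ecs": {"total": 0.6032}}
print(route_answer(sample_card))  # block
```

Raising on an unknown verdict keeps the gate closed by default if the card format ever changes.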
The six tools
| Tool | Purpose |
|---|---|
| `glassbox_verify_answer` | Full pipeline → Trust Card |
| `glassbox_extract_claims` | Atomic claims with reasoning chains |
| `glassbox_score_ecs` | ECS with full breakdown + formula |
| `glassbox_red_team` | Glassbox Court — 7 adversarial probes |
| `glassbox_generate_trust_card` | Assemble a Trust Card from prebuilt parts (no LLM call) |
| `glassbox_export_audit_report` | Full pipeline + deterministic SHA-256 audit log |
Full schemas, examples, and configuration: mcp/README.md. Python pip-specific docs: mcp/python/README.md.
Architecture (two-layer)
┌──────────────────────────────────────────────────────────┐
│ glassbox-framework (PyPI) Python client │
│ thin JSON-RPC stdio wrapper │
│ spawns ↓ │
├──────────────────────────────────────────────────────────┤
│ @glassbox-framework/mcp (npm) Node MCP server │
│ 6 tools, Zod-validated I/O │
│ ↳ verify_answer ↳ extract_claims ↳ score_ecs │
│ ↳ red_team ↳ generate_trust_card │
│ ↳ export_audit_report │
└──────────────────────────────────────────────────────────┘
The Python client makes zero LLM calls itself; it forwards arguments to the MCP server over stdio and renders the returned JSON. Set ANTHROPIC_API_KEY once and both layers use it.
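For readers writing a client in another language, the message a thin wrapper sends over stdio could look like the sketch below. It follows the MCP convention of a JSON-RPC 2.0 `tools/call` request written as a single newline-terminated line. The `build_tool_call` helper is hypothetical, and the argument names are assumed to match the Python quick start rather than taken from the server's schema. A real session must also complete the MCP `initialize` handshake first, which is omitted here.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialise one MCP tools/call request as a newline-delimited JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


line = build_tool_call(
    1,
    "glassbox_verify_answer",
    {
        "question": "Can intermittent fasting cure type 2 diabetes?",
        "answer": "Yes ...",
        "intents": ["Always recommend consultation with a licensed healthcare professional."],
    },
)
print(line, end="")
```

A client would write this line to the spawned server's stdin and read the matching response (same `id`) from its stdout.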
Use with Claude Desktop
{
  "mcpServers": {
    "glass-box": {
      "command": "npx",
      "args": ["-y", "@glassbox-framework/mcp"],
      "env": { "ANTHROPIC_API_KEY": "sk-ant-..." }
    }
  }
}
On macOS, this configuration goes in ~/Library/Application Support/Claude/claude_desktop_config.json.
Determinism
Audit log_ids are SHA-256 over canonicalised JSON of (inputs_hash, claims, ECS dimensions, red-team probe verdicts, constitution evaluations). Timestamps are recorded but never enter the hash, so identical inputs and identical engine outputs always produce the same log_id — across runs, machines, and even languages (the Python client → Node server → JSON canonicalisation produces byte-identical hashes).
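The idea can be illustrated with a short sketch: drop the timestamp, serialise the rest as canonical JSON (sorted keys, fixed separators), and hash the bytes. This is not the project's implementation. The `sketch_log_id` helper and the record field names are hypothetical, and the `glassbox-` prefix with a 24-character truncation only mirrors the shape of the example id shown in this README.

```python
import hashlib
import json
from typing import Any, Dict


def sketch_log_id(record: Dict[str, Any]) -> str:
    """Derive a deterministic id from a record, ignoring its timestamp."""
    # Timestamps are kept in the record but never enter the hash.
    hashed = {key: value for key, value in record.items() if key != "timestamp"}
    # Sorted keys and fixed separators make the byte stream order-independent.
    canonical = json.dumps(
        hashed, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Assumed format: prefix and truncation length are guesses from the README example.
    return "glassbox-" + digest[:24]


first = {
    "inputs_hash": "abc123",
    "claims": [{"text": "example claim"}],
    "timestamp": "2025-01-01T00:00:00Z",
}
second = {
    "timestamp": "2025-06-30T12:00:00Z",
    "claims": [{"text": "example claim"}],
    "inputs_hash": "abc123",
}
print(sketch_log_id(first) == sketch_log_id(second))  # True
```

Matching hashes across languages also depends on every runtime serialising values identically. Floats are a common trap: Python's `json.dumps(1.0)` yields `1.0`, while JavaScript's `JSON.stringify(1.0)` yields `1`.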
Verifiable example, no API key needed:
pip install glassbox-framework
python -c "
import json
from glassbox_framework import Glassbox

# Load the prebuilt demo inputs shipped with the repo
with open('mcp/demo/raw-inputs.json') as f:
    i = json.load(f)

with Glassbox() as gb:
    c = gb.generate_trust_card(
        question=i['question'], answer=i['answer'],
        claims=i['claims'], red_team=i['red_team'], ecs=i['ecs'],
        constitution=i['constitution'])
    print(c['audit']['log_id'])  # glassbox-85cc09903bd4b3f8022a4087
"
Project layout
mcp/ — the MCP server + Python client (this release)
├── src/ — TypeScript MCP server (6 tools)
├── python/ — Python pip package (glassbox-framework)
├── homebrew/ — Homebrew formula
├── assets/ — Launch video + reveal + title cards
├── demo/ — Live terminal demo with prebuilt Trust Card
├── Dockerfile — Container image
├── server.json — MCP Registry manifest
├── smithery.yaml — Smithery.ai manifest
├── LAUNCH.md — Launch kit
└── DISTRIBUTION.md — Every channel's status + commands
LICENSE — Apache 2.0
ROADMAP.md — Phase 5 (governor) plans for the broader framework
CONTRIBUTING.md
CHANGELOG.md
Contributing
Glassbox is open source under Apache 2.0 and actively wants forks and PRs. A few specific places we'd love help:
- More red-team probes — `mcp/src/engines/redteam.ts` has `// v2:` placeholders for `alignment_faking`, `reasoning_trace_deception`, `eval_awareness_gaming`, `agentic_misalignment`, and `sustained_jailbreak`. Each is a tractable PR — same shape as the existing 7 probes, just a different angle. See `.github/ISSUE_TEMPLATE/good_first_issue.md`.
- More language clients — currently Python (`glassbox-framework`) and Node (`@glassbox-framework/mcp`). Go, Rust, Ruby, Swift, Kotlin would all be welcome as thin JSON-RPC clients that spawn the existing MCP server.
- More integrations — Cursor / Cline / Continue / Roo Cline / Zed / Neovim — wherever MCP is read, Glassbox should be one paste away.
- Real-world Trust Card examples — submit (Q, A) pairs from your own AI workflows so the test suite covers more terrain.
Process:
- Pick a `good first issue` or open one with your idea
- Fork, branch, work — the PR template walks you through verification
- CI must pass (`.github/workflows/ci.yml`) — TS strict mode, Python wheel build, cross-language determinism on the canonical audit hash
- Open the PR; we aim for review within 48 hours
Code of conduct: Contributor Covenant 2.1. Be kind, stay on substance, no harassment, contact [email protected] for anything off-public-channel.
Star ⭐ this repo
The fastest way to help right now is to star the repo. Every star:
- Surfaces Glassbox higher in GitHub's MCP topic listings
- Pushes the project up on the MCP Registry and Smithery rankings
- Tells the next developer evaluating AI-safety tooling that this is the one with eyes on it
Author
Karthik Barma · MS Artificial Intelligence · Northeastern University.
Powered by Aura.
Issues + PRs: https://github.com/TheBarmaEffect/glassbox/issues