ProofTrail

Evidence-first browser automation with recovery and MCP.

For AI agents and human operators who need inspectable runs, retained evidence, and guided recovery.

ProofTrail is the browser-evidence and recovery layer, not a generic browser bot and not a hosted agent shell.

Current public distribution and ecosystem boundaries: DISTRIBUTION.md | INTEGRATIONS.md

30-Second Version

If you only want the shortest truthful product line, use this:

run one canonical browser workflow
keep one retained evidence bundle
recover before guesswork
expose the same governed surface through API and MCP

That is why this repo fits Codex, Claude Code, OpenHands, OpenCode, OpenClaw, and similar agent shells that need a browser-evidence layer instead of another prompt-only browser bot.

Quick paths:

Docs
Quickstart
ProofTrail for AI Agents
ProofTrail for Coding Agents and Agent Ecosystems
API Builder Quickstart
MCP Distribution Contract
Distribution Status
Integration Boundaries
ProofTrail MCP Skill

Current primary lane vs later lanes

If you only need the truthful packet order, keep it this simple:

Primary product lane
- canonical run -> retained evidence -> recovery/review
- this is the stable repo identity and the default public story
Current MCP lane that works now
- local checkout + stdio through pnpm mcp:start
- apps/mcp-server/ is the governed MCP side road for that local lane, not a new top-level product identity
Current public skill/discovery lanes
- the ClawHub skill page is live as a public discovery page for the repo-owned ProofTrail MCP skill packet
- the repo-owned skill packet under skills/prooftrail-mcp/ is materialized here, but no generic cross-host skill-registry publication is evidenced yet
- the OpenHands/extensions submission is a separate review-pending lane, not a live listing
Later / contract-only lanes
- npm package publication
- MCP Docker image publication
- Official MCP Registry listing
- vendor-specific plugin or official integration claims

Those later lanes can be documented now, but they must stay documented as later / contract-only / not yet live until a fresh upstream read-back exists.

ProofTrail storefront loop

The static storefront hero source still lives at assets/storefront/prooftrail-readme-hero.svg.

The storefront command-center screenshot artifact lives at assets/storefront/prooftrail-hero.png.

ProofTrail is for AI agents and human operators who need browser automation to stay inspectable, replayable, and recoverable after the first run.

Category Fit

ProofTrail is an evidence-first browser automation product:

run one canonical workflow
inspect one retained evidence bundle
recover with structured guidance
expose the same trusted surfaces to MCP clients and optional AI helpers

Why ProofTrail

One canonical path: start with just run, then inspect one retained evidence bundle instead of juggling ad-hoc scripts and shell fragments.
Recovery before guesswork: move from explanation to recovery to compare before you fall back to raw logs or helper-path debugging.
AI and MCP in the right place: use AI reconstruction and MCP as governed side roads after the first proof run, not as replacements for the deterministic mainline.
Strong AI-builder fit without fake heat: ProofTrail is a truthful browser evidence layer for Codex, Claude Code, OpenHands, OpenCode, OpenClaw, and other AI-agent workflows that need retained proof, recovery, and governed integration instead of prompt-only browser improvisation.

Desktop host-automation note:

desktop smoke / e2e / business / soak are now operator-manual lanes
they require UIQ_DESKTOP_AUTOMATION_MODE=operator-manual plus UIQ_DESKTOP_AUTOMATION_REASON=<auditable reason>
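
As a rough illustration of the two variables above, a wrapper script could refuse to start a desktop lane unless both are set. This is a minimal sketch, not the repo's own enforcement: the function name desktop_automation_allowed and the error messages are hypothetical, and only the two variable names and the operator-manual value come from this README.

```python
import os


def desktop_automation_allowed(env=None):
    """Return (ok, message) for a hypothetical pre-flight check.

    Only the variable names and the "operator-manual" value come from
    the README; everything else here is an illustrative assumption.
    """
    env = os.environ if env is None else env
    mode = env.get("UIQ_DESKTOP_AUTOMATION_MODE", "")
    reason = env.get("UIQ_DESKTOP_AUTOMATION_REASON", "").strip()
    if mode != "operator-manual":
        return False, "UIQ_DESKTOP_AUTOMATION_MODE must be operator-manual"
    if not reason:
        return False, "UIQ_DESKTOP_AUTOMATION_REASON must be a non-empty reason"
    return True, "ok"
```

Passing a plain dict instead of os.environ keeps the check easy to exercise without touching the real environment.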

Builder Entry

If you are integrating ProofTrail into another toolchain, use this order:

API Builder Quickstart
Universal API Reference
node --import tsx contracts/scripts/generate-client.ts --verify
ProofTrail MCP Server README

That sequence helps you separate:

the API contract layer
the generated-client freshness path
the governed MCP tool surface

The checked-in client under apps/web/src/api-gen/ is a repo-local generated helper, not a published SDK package.

MCP Install Surfaces

Use these four layers to avoid mixing a working local install with public discovery pages or unpublished package contracts.

Current / usable now
- local checkout + stdio through pnpm mcp:start
- optional UIQ_MCP_API_BASE_URL / UIQ_MCP_AUTOMATION_TOKEN when you want the MCP process to call a live backend
Live public skill page
- the ClawHub ProofTrail MCP page is live as a discovery surface for the skill packet
- that page does not turn ProofTrail into a hosted endpoint, official plugin, or generic skill-registry publication
Repo-owned skill packet and review lanes
- skills/prooftrail-mcp/ is the repo-owned install skill packet
- OpenHands/extensions is still review-pending
- no generic cross-host skill-registry listing is evidenced yet
Contract-only later lanes
- npm package: @prooftrail/mcp-server
- Docker image: ghcr.io/xiaojiou176-open/prooftrail-mcp-server:0.1.1
- Official MCP Registry stays blocked until the npm package is actually published

Those future-facing names are part of the public contract now, but they are not live install paths until the package/image is actually published.
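
For the lane that works today (local checkout plus stdio through pnpm mcp:start), a host script might assemble the command and environment like this. It is a sketch under assumptions: mcp_launch_spec is a hypothetical helper, and only the command and the two optional variable names are taken from this README.

```python
import os


def mcp_launch_spec(api_base_url=None, automation_token=None, base_env=None):
    """Build (command, env) for the local stdio MCP lane.

    The two UIQ_MCP_* variables are optional; they are only set when the
    caller wants the MCP process to talk to a live backend.
    """
    env = dict(os.environ if base_env is None else base_env)
    if api_base_url:
        env["UIQ_MCP_API_BASE_URL"] = api_base_url
    if automation_token:
        env["UIQ_MCP_AUTOMATION_TOKEN"] = automation_token
    return ["pnpm", "mcp:start"], env
```

The returned pair can be handed to subprocess.Popen(cmd, env=env, stdin=subprocess.PIPE, stdout=subprocess.PIPE) from inside the checkout. Any backend URL you pass in is your own value, not one defined here.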

If you are evaluating this repo for Codex, Claude Code, OpenHands, OpenCode, OpenClaw, or similar coding-agent workflows, keep the fit narrow and honest:

ProofTrail does not replace a coding agent
it fits as the browser execution, retained evidence, recovery, and governed MCP/API layer that a coding agent can call
the best public entry is still ProofTrail for AI Agents, then the builder/API and MCP pages

For Coding-Agent And Agent-Stack Workflows

If you found ProofTrail while searching for:

browser automation for Codex
browser automation for Claude Code
browser automation for OpenHands
browser automation for OpenCode
browser automation for OpenClaw
MCP browser automation for AI agents
API-first browser evidence for tool-using agents

read this repo in one very specific way:

ProofTrail is a browser-execution and evidence layer for agent shells such as Codex, Claude Code, OpenHands, OpenCode, OpenClaw, and other tool-using AI workflows. It is not claiming to be an official vendor-specific integration, plugin, or generic AI assistant shell.

The most truthful ecosystem fit today looks like this:

Ecosystem	Best public angle	Best first road
Claude Code	governed browser-evidence side road for a tool-using coding shell	ProofTrail for Coding Agents and Agent Ecosystems -> MCP for Browser Automation
Codex	browser-evidence substrate with API-first control and optional MCP	ProofTrail for Coding Agents and Agent Ecosystems -> API Builder Quickstart
OpenHands	browser-evidence subsystem behind a larger orchestration runtime	ProofTrail for AI Agents -> API Builder Quickstart
OpenCode	governed MCP browser surface behind the coding-agent shell	ProofTrail for Coding Agents and Agent Ecosystems -> MCP for Browser Automation
OpenClaw	browser workflow backend behind a multi-channel gateway or tool router	ProofTrail for Coding Agents and Agent Ecosystems -> API Builder Quickstart

The truthful bridge is:

ProofTrail for AI Agents
ProofTrail for Coding Agents and Agent Ecosystems
MCP for Browser Automation
API Builder Quickstart

This discovery layer is not claiming official vendor integrations, plugins, or a generic AI assistant shell.

That order keeps search intent and product truth aligned:

audience fit first
coding-agent fit second
governed MCP tool use third
direct API control after that

Ecosystem Fit At A Glance

If you only have ten seconds, use the map below like an airport departures board:

Claude Code and OpenCode are the clearest MCP-first fits today
Codex, OpenHands, and OpenClaw usually start API-first or hybrid
ProofTrail stays the browser-evidence and recovery layer in all cases

The ecosystem-fit visual source lives at assets/storefront/prooftrail-agent-ecosystem-map.svg.

Explore the Product Surface

If you want the shortest truthful way to understand where ProofTrail fits, use these six pages as the current public matrix:

ProofTrail for AI Agents
ProofTrail for Coding Agents and Agent Ecosystems
MCP for Browser Automation
AI Reconstruction Side Road
ProofTrail vs Generic Browser Agents
Evidence, Recovery, and Review Workspace

If your search intent sounds more like:

browser automation for Codex
browser automation for Claude Code
browser automation for OpenHands
browser automation for OpenCode
browser automation for OpenClaw
MCP browser automation for AI agents
browser evidence layer for coding agents

start with ProofTrail for AI Agents before dropping into the lower-level builder pages.

That sequence keeps the outward story honest:

audience fit first
coding-agent fit second
governed MCP and AI side roads next
alternatives framing after that
evidence/recovery/review loop as the deepest current product proof

Use the builder entry separately.

The outward matrix explains category fit and product shape. The builder entry explains contract-level integration once that product shape already makes sense.

If you are coming from the builder side instead of the operator side, pair that matrix with the API Builder Quickstart so the public story and the integration story stay connected.

First Practical Win

Choose the shortest path that matches what you want to confirm first:

If you want to produce one canonical run: Start with just setup && just run. You should get a new run directory under .runtime-cache/artifacts/runs/<runId>/ with manifest and proof reports.
If you want to know what good evidence should look like: Start with docs/reference/run-evidence-example.md. That page shows the concrete report shape a healthy run should produce.
If you want to follow the guided operator path: Start with docs/getting-started/human-first-10-min.md. That is the shortest human-readable route from fresh checkout to inspectable proof.

15-Minute Evaluation Path

If you are seeing ProofTrail for the first time, keep the first pass simple:

run just setup
run just run
inspect the resulting bundle under .runtime-cache/artifacts/runs/<runId>/
use the command center for the deeper follow-up:
- Quick Launch to repeat the canonical path
- Task Center to confirm the result and inspect retained evidence
- Recovery Center inside Task Center before raw logs or replay
- Flow Workshop only after you already have one clear result

Treat helper and workshop commands like the advanced bench. Keep them available, but do not use them as the default first step.

The evaluator path for a first pass is intentionally short:

run the canonical flow
confirm the visible result in Task Center
inspect the retained evidence bundle
use Recovery Center before raw logs or shell fallbacks
only then open sharing, compare, or deeper workshop tools

If you are evaluating through the local command center instead of only the CLI, use the same story in product form:

Quick Launch: start the canonical run first
Task Center: confirm the result and inspect the evidence state first
Recovery Center: use the recovery layer inside Task Center before raw logs or workshop replay
Flow Workshop: refine drafts or replay steps only after the first result already exists

What This Repo Actually Does

How do we make browser automation reproducible, inspectable, and recoverable?

ProofTrail gives you one public mainline for running a workflow, one evidence bundle for understanding what happened, and one shared repo for the backend, web command center, automation runner, and MCP adapter that support that flow.

Think of the product in two layers:

Primary layer: canonical run, evidence, recovery
Secondary layer: template reuse, compare, studio tuning, AI reconstruction, MCP integration

The canonical public mainline is:

run just setup
run just run
inspect .runtime-cache/artifacts/runs/<runId>/

just run is the canonical public mainline wrapper for pnpm uiq run --profile pr --target web.local.

just run-legacy remains available for lower-level workshop troubleshooting, but it is not the canonical public mainline.
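
After a mainline run, the third step is to find the new directory under .runtime-cache/artifacts/runs/. A minimal sketch of that lookup follows. It assumes the newest run is the directory with the latest modification time, which this README does not guarantee, and latest_run_dir is a hypothetical helper name.

```python
from pathlib import Path

# Path taken from the README; run from the repo root.
RUNS_ROOT = Path(".runtime-cache/artifacts/runs")


def latest_run_dir(runs_root=RUNS_ROOT):
    """Return the most recently modified run directory, or None.

    Assumption: modification time is a good-enough proxy for "newest run".
    """
    root = Path(runs_root)
    if not root.is_dir():
        return None
    run_dirs = [p for p in root.iterdir() if p.is_dir()]
    if not run_dirs:
        return None
    return max(run_dirs, key=lambda p: p.stat().st_mtime)
```

If your run IDs already sort chronologically, sorting by name would be a simpler choice than reading modification times.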

Why Teams Use It

Fewer mystery failures: every canonical run writes a manifest-anchored evidence bundle with summary, index, and proof reports instead of leaving you with scattered logs and screenshots.
Easier recovery: the web command center, run records, and flow workshop are built to help you inspect, replay, and repair workflows after something breaks.
One repo, one story: backend orchestration, operator UI, automation runner, and release proof surfaces live together, so docs and runtime truth can stay aligned.

Quickstart

Requirements:

Python 3.11+
Node.js 20+
pnpm
uv
just

Install dependencies and local tooling.

just setup

Run the canonical workflow.

just run

Inspect the resulting evidence bundle.

ls .runtime-cache/artifacts/runs

What good looks like:

a new run directory appears under .runtime-cache/artifacts/runs/<runId>/
the run contains manifest.json, reports/summary.json, reports/diagnostics.index.json, reports/log-index.json, reports/proof.coverage.json, reports/proof.stability.json, reports/proof.gaps.json, and reports/proof.repro.json
manifest.json points back to those proof artifacts through both manifest.proof and manifest.reports
the same orchestrator-first chain is reachable through pnpm uiq run --profile pr --target web.local
even when the PR gate fails, reports/summary.json still tells you why instead of leaving you with a silent shell failure
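
The file checklist above can be turned into a quick completeness check. This is a sketch rather than the repo's own validator: missing_evidence_files is a hypothetical helper, and it only tests that each listed file exists, not that its contents are valid.

```python
from pathlib import Path

# File names copied from the "What good looks like" list.
EXPECTED_FILES = (
    "manifest.json",
    "reports/summary.json",
    "reports/diagnostics.index.json",
    "reports/log-index.json",
    "reports/proof.coverage.json",
    "reports/proof.stability.json",
    "reports/proof.gaps.json",
    "reports/proof.repro.json",
)


def missing_evidence_files(run_dir):
    """Return the expected bundle files that are absent from run_dir."""
    run_dir = Path(run_dir)
    return [rel for rel in EXPECTED_FILES if not (run_dir / rel).is_file()]
```

An empty list means every expected file is present. A non-empty list names what to look for before trusting the bundle.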

If just run fails, start with the human-first 10 minute guide and the run evidence example before dropping to legacy helper paths.

If just run succeeds, the next stop is Task Center: confirm the result, read the evidence summary, and only then move into explain/share/recovery paths.

If you are already in the Web command center, keep the same order:

start from Quick Launch
move to Task Center to confirm the result and inspect evidence
use Recovery Center before diving into raw logs
open Flow Workshop only when you intentionally need the advanced draft or replay surfaces

After The First Successful Run

Once one canonical or operator-supported result already exists, the next value layer is no longer "how do I start?" It becomes "how do I reuse, compare, operate, and hand this off without losing trust?"

Use the product surfaces in this order:

Template reuse / readiness in Flow Workshop
- Ask: "Is this flow stable enough to reuse, or should it stay in workshop mode?"
- Treat readiness as a reuse verdict, not as a vanity score.
Compare in Task Center
- Ask: "How does this retained run differ from a baseline run?"
- Use compare to judge change, not to replace the canonical evidence bundle.
Profile / Target Studio
- Ask: "Which knobs are safe to tune, and what validation runs when I save?"
- Studio is a guarded operator surface, not a raw YAML editor.
AI reconstruction
- Ask: "Do I need help rebuilding a flow from artifacts?"
- Use it only after artifacts already exist; it is an optional advanced helper.
MCP
- Ask: "Do I need an external AI client to inspect runs or operate this repo safely?"
- Treat it as an integration side road, not as a replacement for just run.
Review Workspace
- Ask: "Do I need one review-ready packet before I hand this run to another maintainer?"
- Treat it as a local-first review packet, not as a hosted collaboration product.
Template Exchange
- Ask: "Do I need to move a reusable template contract into another checkout?"
- Use import/export/share for that handoff, not a marketplace mental model.

Wave 5 also makes the recovery boundary more explicit:

inspect_task-style actions are safe to suggest immediately
replay actions stay human-confirmed
OTP, provider-step, and manual-input actions stay manual-only
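
The three tiers above amount to a small policy table. The sketch below shows one way a client could encode it. The action-kind strings and the confirmation_for helper are hypothetical labels chosen for illustration, and falling back to the strictest tier for unknown kinds is an assumption of this sketch, not documented behavior.

```python
from enum import Enum


class Confirmation(Enum):
    SUGGEST = "safe to suggest immediately"
    HUMAN_CONFIRMED = "human-confirmed"
    MANUAL_ONLY = "manual-only"


# Hypothetical action-kind labels mapped to the tiers the README lists.
RECOVERY_POLICY = {
    "inspect_task": Confirmation.SUGGEST,
    "replay": Confirmation.HUMAN_CONFIRMED,
    "otp": Confirmation.MANUAL_ONLY,
    "provider_step": Confirmation.MANUAL_ONLY,
    "manual_input": Confirmation.MANUAL_ONLY,
}


def confirmation_for(action_kind):
    """Look up the tier; unknown kinds default to the strictest one."""
    return RECOVERY_POLICY.get(action_kind, Confirmation.MANUAL_ONLY)
```

Defaulting to manual-only means a new or misspelled action kind can never be auto-suggested by accident.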

Outward Product Story

Use this mental model when you explain ProofTrail to a new evaluator:

What it is: evidence-first browser automation with recovery and MCP
Who it helps: AI agents and human operators who need trustworthy browser workflows
Why it feels different: the product does not stop at “the automation ran”; it keeps the evidence, recovery path, and handoff surfaces attached to the run
Where AI fits: AI reconstruction helps after artifacts already exist
Where MCP fits: MCP exposes the same governed surfaces to external AI clients

Suitable / Not Suitable

Suitable for:

teams standardizing browser automation runs across operators and environments
maintainers who need inspectable evidence instead of ad-hoc shell output
workflows where replay, diagnostics, and recovery matter as much as first-run success

Not suitable for:

tiny one-off browser scripts where no shared evidence or recovery path is needed
teams unwilling to maintain a Python + Node workspace
people looking for a hosted SaaS with zero local setup

Validation and Governance

ProofTrail keeps the public story honest by separating runtime proof from governance checks.

Minimal success case
Run evidence example
Quality gates
Changelog
Release guide
Release supply-chain policy
Maintainer GitHub closure evidence: just github-closure-report

Public collaboration contract:

external pull requests stay on GitHub-hosted, low-risk governance and build lanes
live, external, and owner-secret workflows are manual-only and require the protected owner-approved-sensitive environment
macOS-only smoke and regression lanes use GitHub-hosted macos-latest; self-hosted / shared-pool are not part of the public collaboration contract

Maintainer Space Hygiene

ProofTrail treats disk cleanup as a governed maintenance path, not an ad-hoc "delete the biggest folder" exercise.

just space-report emits a repo-exclusive JSON report for runtime buckets, safe-clean residue, explicit reclaim candidates, protected totals, and the dedicated external pnpm layer
just space-clean-safe runs a default dry-run for the low-risk cleanup wave; use ./scripts/space-clean-safe.py --apply only when you want to execute the same safe-clean list
just space-clean-reclaim runs a default dry-run for large repo-exclusive reclaim targets such as the root .venv, isolated node_modules, and the repo-scoped pnpm store; use explicit --scope ... plus --apply only after the matching validation gate passes
just runtime-gc -- --dry-run previews retention-based cleanup for the review-class runtime buckets before you let the same policy delete old files
canonical run evidence under .runtime-cache/artifacts/runs/, runtime backups under .runtime-cache/backups/, and managed toolchains under .runtime-cache/toolchains/ are intentionally outside the first cleanup wave
empty run stub directories under .runtime-cache/artifacts/runs/ are the one explicit exception: they may enter the first safe-clean wave only when they are still empty and have no evidence files yet
the canonical Python runtime target is .runtime-cache/toolchains/python/.venv; the root .venv is a retired legacy surface and may only be reclaimed through space-clean-reclaim --scope root-venv after single-track validation
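
The empty-stub exception can be illustrated with a small predicate. This is not the logic inside scripts/space-clean-safe.py, which is not shown here. It is a sketch that treats a stub as a direct child of the runs directory with no entries at all, and is_empty_run_stub is a hypothetical name.

```python
from pathlib import Path


def is_empty_run_stub(path, runs_root=".runtime-cache/artifacts/runs"):
    """True only for an empty directory directly under runs_root.

    Assumption: "empty" means no files and no subdirectories at all.
    """
    path = Path(path)
    root = Path(runs_root)
    # Anything that is not a direct child directory of the runs root
    # is treated as protected evidence and never qualifies.
    if not path.is_dir() or path.parent.resolve() != root.resolve():
        return False
    return not any(path.iterdir())
```

Anything that fails the check stays outside the first cleanup wave, which matches the conservative default described in the list above.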

FAQ

Do I need the legacy helper path?

No. just run is the public default road. just run-legacy is only for lower-level workshop troubleshooting when you need to inspect helper-path behavior directly.

Where should I look after a run finishes?

Start with .runtime-cache/artifacts/runs/<runId>/manifest.json, then open reports/summary.json, the four reports/proof.*.json files, and the index files described in docs/reference/run-evidence-example.md.

If you are using the local command center, the matching product path is:

open Task Center
inspect the evidence state first: retained, partial, missing, or empty
use Failure Explainer to understand the current run
use Share Pack when you want a handoff-friendly summary
use Compare when you need a baseline-versus-candidate judgment
treat Promotion Candidate as a later release/showcase decision, not as the first diagnostic step

Is this repository already a full docs site?

Not yet. Today the GitHub README is the conversion page, and the docs surface is the supporting second layer. See docs/index.md for the current public docs map.

Repository Map

apps/api/ - backend API and orchestration services
apps/web/ - operator-facing web command center
apps/automation-runner/ - record, replay, and reconstruction pipeline
apps/mcp-server/ - MCP adapter
packages/ - shared orchestration and runtime packages
configs/ - environment, schema, and governance configuration
contracts/ - API contracts
scripts/ - repo entrypoints and CI helpers
docs/ - storefront-supporting public docs surface

Documentation

Docs map: docs/index.md
Public docs overview: docs/README.md
Architecture contract: docs/architecture.md
CLI guide: docs/cli.md
docs/localized/zh-CN/README.md

Security and Contribution

SECURITY.md
SUPPORT.md
CONTRIBUTING.md
CODE_OF_CONDUCT.md

License

This repository is released under the MIT License.

If ProofTrail saves you time during evaluation or implementation, star the repo so you can find the updates, release notes, and new evidence examples later.