xiaojiou176-open

ProofTrail

Auditable browser automation platform for repeatable runs, inspectable evidence, and recovery-ready workflows.

Evidence-first browser automation with recovery and MCP.

For AI agents and human operators who need inspectable runs, retained evidence, and guided recovery.

ProofTrail is the browser-evidence and recovery layer, not a generic browser bot and not a hosted agent shell.

Current public distribution and ecosystem boundaries: DISTRIBUTION.md | INTEGRATIONS.md

30-Second Version

If you only want the shortest truthful product line, use this:

  • run one canonical browser workflow
  • keep one retained evidence bundle
  • recover before guesswork
  • expose the same governed surface through API and MCP

That is why this repo fits Codex, Claude Code, OpenHands, OpenCode, OpenClaw, and similar agent shells that need a browser-evidence layer instead of another prompt-only browser bot.

Quick paths:

  • Docs
  • Quickstart
  • ProofTrail for AI Agents
  • ProofTrail for Coding Agents and Agent Ecosystems
  • API Builder Quickstart
  • MCP Distribution Contract
  • Distribution Status
  • Integration Boundaries
  • ProofTrail MCP Skill

Current primary lane vs later lanes

If you only need the truthful packet order, keep it this simple:

  • Primary product lane
    • canonical run -> retained evidence -> recovery/review
    • this is the stable repo identity and the default public story
  • Current MCP lane that works now
    • local checkout + stdio through pnpm mcp:start
    • apps/mcp-server/ is the governed MCP side road for that local lane, not a new top-level product identity
  • Current public skill/discovery lanes
    • the ClawHub skill page is live as a public discovery page for the repo-owned ProofTrail MCP skill packet
    • the repo-owned skill packet under skills/prooftrail-mcp/ is materialized here, but no generic cross-host skill-registry publication is evidenced yet
    • the OpenHands/extensions submission is a separate review-pending lane, not a live listing
  • Later / contract-only lanes
    • npm package publication
    • MCP Docker image publication
    • Official MCP Registry listing
    • vendor-specific plugin or official integration claims

Those later lanes can be documented now, but they must stay documented as later / contract-only / not yet live until a fresh upstream read-back exists.

ProofTrail storefront loop

The static storefront hero source still lives at assets/storefront/prooftrail-readme-hero.svg.

The storefront command-center screenshot artifact lives at assets/storefront/prooftrail-hero.png.

ProofTrail is for AI agents and human operators who need browser automation to stay inspectable, replayable, and recoverable after the first run.

Category Fit

ProofTrail is an evidence-first browser automation product:

  • run one canonical workflow
  • inspect one retained evidence bundle
  • recover with structured guidance
  • expose the same trusted surfaces to MCP clients and optional AI helpers

Why ProofTrail

  • One canonical path: start with just run, then inspect one retained evidence bundle instead of juggling ad-hoc scripts and shell fragments.
  • Recovery before guesswork: move from explanation to recovery to compare before you fall back to raw logs or helper-path debugging.
  • AI and MCP in the right place: use AI reconstruction and MCP as governed side roads after the first proof run, not as replacements for the deterministic mainline.
  • Strong AI-builder fit without fake heat: ProofTrail is a truthful browser evidence layer for Codex, Claude Code, OpenHands, OpenCode, OpenClaw, and other AI-agent workflows that need retained proof, recovery, and governed integration instead of prompt-only browser improvisation.

Desktop host-automation note:

  • desktop smoke / e2e / business / soak are now operator-manual lanes
  • they require UIQ_DESKTOP_AUTOMATION_MODE=operator-manual plus UIQ_DESKTOP_AUTOMATION_REASON=<auditable reason>
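
A minimal sketch of how a wrapper script could check those two variables before starting a desktop lane. The helper name desktop_lane_allowed is hypothetical and not part of ProofTrail; only the variable names and the operator-manual value come from this README, and the repo's own gate may apply further checks.

```python
import os
from typing import Mapping, Optional


def desktop_lane_allowed(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when both operator-manual variables are set.

    Hypothetical helper for illustration; not ProofTrail's actual gate.
    """
    env = os.environ if env is None else env
    mode = env.get("UIQ_DESKTOP_AUTOMATION_MODE", "")
    # An empty or whitespace-only reason is treated as missing.
    reason = env.get("UIQ_DESKTOP_AUTOMATION_REASON", "").strip()
    return mode == "operator-manual" and bool(reason)
```

Passing an explicit mapping instead of reading os.environ directly keeps the check easy to exercise without touching the real environment.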

Builder Entry

If you are integrating ProofTrail into another toolchain, use this order:

  1. API Builder Quickstart
  2. Universal API Reference
  3. node --import tsx contracts/scripts/generate-client.ts --verify
  4. ProofTrail MCP Server README

That sequence helps you separate:

  • the API contract layer
  • the generated-client freshness path
  • the governed MCP tool surface

The checked-in client under apps/web/src/api-gen/ is a repo-local generated helper, not a published SDK package.

MCP Install Surfaces

Use these four layers to avoid mixing a working local install with public discovery pages or unpublished package contracts.

  • Current / usable now
    • local checkout + stdio through pnpm mcp:start
    • optional UIQ_MCP_API_BASE_URL / UIQ_MCP_AUTOMATION_TOKEN when you want the MCP process to call a live backend
  • Live public skill page
    • the ClawHub ProofTrail MCP page is live as a discovery surface for the skill packet
    • that page does not turn ProofTrail into a hosted endpoint, official plugin, or generic skill-registry publication
  • Repo-owned skill packet and review lanes
    • skills/prooftrail-mcp/ is the repo-owned install skill packet
    • OpenHands/extensions is still review-pending
    • no generic cross-host skill-registry listing is evidenced yet
  • Contract-only later lanes
    • npm package: @prooftrail/mcp-server
    • Docker image: ghcr.io/xiaojiou176-open/prooftrail-mcp-server:0.1.1
    • Official MCP Registry stays blocked until the npm package is actually published

Those future-facing names are part of the public contract now, but they are not live install paths until the package/image is actually published.
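
For the lane that works today, the sketch below shows one way a host process might assemble the command and environment for the local stdio server. The function name build_mcp_launch is hypothetical, and the example URL and token are placeholders; only pnpm mcp:start and the two UIQ_MCP_* variable names come from this README.

```python
import os
from typing import Dict, List, Optional, Tuple


def build_mcp_launch(
    api_base_url: Optional[str] = None,
    automation_token: Optional[str] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (command, environment) for the local stdio MCP lane.

    Hypothetical helper: it only prepares the launch, it does not start it.
    """
    env = dict(os.environ)  # copy so the caller's environment is untouched
    # Both variables are optional; set them only when a live backend is wanted.
    if api_base_url:
        env["UIQ_MCP_API_BASE_URL"] = api_base_url
    if automation_token:
        env["UIQ_MCP_AUTOMATION_TOKEN"] = automation_token
    return ["pnpm", "mcp:start"], env
```

A caller would typically hand the returned pair to subprocess.Popen with stdin and stdout pipes, running from the repository checkout. How a given MCP client expects that process to be registered is outside this sketch.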

If you are evaluating this repo for Codex, Claude Code, OpenHands, OpenCode, OpenClaw, or similar coding-agent workflows, keep the fit narrow and honest:

  • ProofTrail does not replace a coding agent
  • it fits as the browser execution, retained evidence, recovery, and governed MCP/API layer that a coding agent can call
  • the best public entry is still ProofTrail for AI Agents, then the builder/API and MCP pages

For Coding-Agent And Agent-Stack Workflows

If you found ProofTrail while searching for:

  • browser automation for Codex
  • browser automation for Claude Code
  • browser automation for OpenHands
  • browser automation for OpenCode
  • browser automation for OpenClaw
  • MCP browser automation for AI agents
  • API-first browser evidence for tool-using agents

read this repo in one very specific way:

ProofTrail is a browser-execution and evidence layer for agent shells such as Codex, Claude Code, OpenHands, OpenCode, OpenClaw, and other tool-using AI workflows. It is not claiming to be an official vendor-specific integration, plugin, or generic AI assistant shell.

The most truthful ecosystem fit today looks like this:

| Ecosystem | Best public angle | Best first road |
| --- | --- | --- |
| Claude Code | governed browser-evidence side road for a tool-using coding shell | ProofTrail for Coding Agents and Agent Ecosystems -> MCP for Browser Automation |
| Codex | browser-evidence substrate with API-first control and optional MCP | ProofTrail for Coding Agents and Agent Ecosystems -> API Builder Quickstart |
| OpenHands | browser-evidence subsystem behind a larger orchestration runtime | ProofTrail for AI Agents -> API Builder Quickstart |
| OpenCode | governed MCP browser surface behind the coding-agent shell | ProofTrail for Coding Agents and Agent Ecosystems -> MCP for Browser Automation |
| OpenClaw | browser workflow backend behind a multi-channel gateway or tool router | ProofTrail for Coding Agents and Agent Ecosystems -> API Builder Quickstart |

The truthful bridge is:

  1. ProofTrail for AI Agents
  2. ProofTrail for Coding Agents and Agent Ecosystems
  3. MCP for Browser Automation
  4. API Builder Quickstart

This discovery layer is not claiming official vendor integrations, plugins, or a generic AI assistant shell.

That order keeps search intent and product truth aligned:

  • audience fit first
  • coding-agent fit second
  • governed MCP tool use third
  • direct API control after that

Ecosystem Fit At A Glance

If you only have ten seconds, use the map below like an airport departures board:

  • Claude Code and OpenCode are the clearest MCP-first fits today
  • Codex, OpenHands, and OpenClaw usually start API-first or hybrid
  • ProofTrail stays the browser-evidence and recovery layer in all cases

The ecosystem-fit visual source lives at assets/storefront/prooftrail-agent-ecosystem-map.svg.

Explore the Product Surface

If you want the shortest truthful way to understand where ProofTrail fits, use these six pages as the current public matrix:

  1. ProofTrail for AI Agents
  2. ProofTrail for Coding Agents and Agent Ecosystems
  3. MCP for Browser Automation
  4. AI Reconstruction Side Road
  5. ProofTrail vs Generic Browser Agents
  6. Evidence, Recovery, and Review Workspace

If your search intent sounds more like:

  • browser automation for Codex
  • browser automation for Claude Code
  • browser automation for OpenHands
  • browser automation for OpenCode
  • browser automation for OpenClaw
  • MCP browser automation for AI agents
  • browser evidence layer for coding agents

start with ProofTrail for AI Agents before dropping into the lower-level builder pages.

That sequence keeps the outward story honest:

  • audience fit first
  • coding-agent fit second
  • governed MCP and AI side roads next
  • alternatives framing after that
  • evidence/recovery/review loop as the deepest current product proof

Use the builder entry separately.

The outward matrix explains category fit and product shape. The builder entry explains contract-level integration once that product shape already makes sense.

If you are coming from the builder side instead of the operator side, pair that matrix with the API Builder Quickstart so the public story and the integration story stay connected.

First Practical Win

Choose the shortest path that matches what you want to confirm first:

  • If you want to produce one canonical run: Start with just setup && just run. You should get a new run directory under .runtime-cache/artifacts/runs/<runId>/ with manifest and proof reports.
  • If you want to know what good evidence should look like: Start with docs/reference/run-evidence-example.md. That page shows the concrete report shape a healthy run should produce.
  • If you want to follow the guided operator path: Start with docs/getting-started/human-first-10-min.md. That is the shortest human-readable route from fresh checkout to inspectable proof.

15-Minute Evaluation Path

If you are seeing ProofTrail for the first time, keep the first pass simple:

  1. run just setup
  2. run just run
  3. inspect the resulting bundle under .runtime-cache/artifacts/runs/<runId>/
  4. use the command center for the deeper follow-up:
    • Quick Launch to repeat the canonical path
    • Task Center to confirm the result and inspect retained evidence
    • Recovery Center inside Task Center before raw logs or replay
    • Flow Workshop only after you already have one clear result

Treat helper and workshop commands like the advanced bench. Keep them available, but do not use them as the default first step.

The evaluator path for a first pass is intentionally short:

  1. run the canonical flow
  2. confirm the visible result in Task Center
  3. inspect the retained evidence bundle
  4. use Recovery Center before raw logs or shell fallbacks
  5. only then open sharing, compare, or deeper workshop tools

If you are evaluating through the local command center instead of only the CLI, use the same story in product form:

  1. Quick Launch: start the canonical run first
  2. Task Center: confirm the result and inspect the evidence state first
  3. Recovery Center: use the recovery layer inside Task Center before raw logs or workshop replay
  4. Flow Workshop: refine drafts or replay steps only after the first result already exists

What This Repo Actually Does

How do we make browser automation reproducible, inspectable, and recoverable?

ProofTrail gives you one public mainline for running a workflow, one evidence bundle for understanding what happened, and one shared repo for the backend, web command center, automation runner, and MCP adapter that support that flow.

Think of the product in two layers:

  • Primary layer: canonical run, evidence, recovery
  • Secondary layer: template reuse, compare, studio tuning, AI reconstruction, MCP integration

The canonical public mainline is:

  1. run just setup
  2. run just run
  3. inspect .runtime-cache/artifacts/runs/<runId>/

just run is the canonical public mainline wrapper for pnpm uiq run --profile pr --target web.local.

just run-legacy remains available for lower-level workshop troubleshooting, but it is not the canonical public mainline.
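
Step 3 of the mainline needs a run directory to inspect. The sketch below picks the most recently modified directory under .runtime-cache/artifacts/runs/. The helper name latest_run_dir is hypothetical, and using modification time to find the newest run is an assumption of this sketch, since the run ID format is not described here.

```python
from pathlib import Path
from typing import Optional

RUNS_ROOT = Path(".runtime-cache/artifacts/runs")


def latest_run_dir(runs_root: Path = RUNS_ROOT) -> Optional[Path]:
    """Return the most recently modified run directory, or None if there is none.

    Hypothetical helper: assumes the newest mtime marks the latest run.
    """
    if not runs_root.is_dir():
        return None
    run_dirs = [p for p in runs_root.iterdir() if p.is_dir()]
    # default=None covers a runs root that exists but holds no run directories.
    return max(run_dirs, key=lambda p: p.stat().st_mtime, default=None)
```

Run it from the repository root after just run; a None result means no run directory exists yet.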

Why Teams Use It

  • Fewer mystery failures: every canonical run writes a manifest-anchored evidence bundle with summary, index, and proof reports instead of leaving you with scattered logs and screenshots.
  • Easier recovery: the web command center, run records, and flow workshop are built to help you inspect, replay, and repair workflows after something breaks.
  • One repo, one story: backend orchestration, operator UI, automation runner, and release proof surfaces live together, so docs and runtime truth can stay aligned.

Quickstart

Requirements:

  • Python 3.11+
  • Node.js 20+
  • pnpm
  • uv
  • just

  1. Install dependencies and local tooling.

         just setup

  2. Run the canonical workflow.

         just run

  3. Inspect the resulting evidence bundle.

         ls .runtime-cache/artifacts/runs

What good looks like:

  • a new run directory appears under .runtime-cache/artifacts/runs/<runId>/
  • the run contains manifest.json, reports/summary.json, reports/diagnostics.index.json, reports/log-index.json, reports/proof.coverage.json, reports/proof.stability.json, reports/proof.gaps.json, and reports/proof.repro.json
  • manifest.json points back to those proof artifacts through both manifest.proof and manifest.reports
  • the same orchestrator-first chain is reachable through pnpm uiq run --profile pr --target web.local
  • even when the PR gate fails, reports/summary.json still tells you why instead of leaving you with a silent shell failure
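
The file list above can be turned into a quick presence check. This is a minimal sketch, not a ProofTrail tool: the function name missing_evidence_files is hypothetical, and it only confirms that the listed files exist, not that their contents are valid.

```python
from pathlib import Path
from typing import List

# File names taken from the "What good looks like" list in this README.
EXPECTED_FILES = [
    "manifest.json",
    "reports/summary.json",
    "reports/diagnostics.index.json",
    "reports/log-index.json",
    "reports/proof.coverage.json",
    "reports/proof.stability.json",
    "reports/proof.gaps.json",
    "reports/proof.repro.json",
]


def missing_evidence_files(run_dir: Path) -> List[str]:
    """Return the expected files that are absent from one run directory."""
    return [rel for rel in EXPECTED_FILES if not (run_dir / rel).is_file()]
```

An empty list means every expected file is present in .runtime-cache/artifacts/runs/<runId>/; anything else names what to look for before trusting the bundle.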

If just run fails, start with the human-first 10 minute guide and the run evidence example before dropping to legacy helper paths.

If just run succeeds, the next stop is Task Center: confirm the result, read the evidence summary, and only then move into explain/share/recovery paths.

If you are already in the Web command center, keep the same order:

  1. start from Quick Launch
  2. move to Task Center to confirm the result and inspect evidence
  3. use Recovery Center before diving into raw logs
  4. open Flow Workshop only when you intentionally need the advanced draft or replay surfaces

After The First Successful Run

Once one canonical or operator-supported result already exists, the next value layer is no longer "how do I start?" It becomes "how do I reuse, compare, operate, and hand this off without losing trust?"

Use the product surfaces in this order:

  1. Template reuse / readiness in Flow Workshop
    • Ask: "Is this flow stable enough to reuse, or should it stay in workshop mode?"
    • Treat readiness as a reuse verdict, not as a vanity score.
  2. Compare in Task Center
    • Ask: "How does this retained run differ from a baseline run?"
    • Use compare to judge change, not to replace the canonical evidence bundle.
  3. Profile / Target Studio
    • Ask: "Which knobs are safe to tune, and what validation runs when I save?"
    • Studio is a guarded operator surface, not a raw YAML editor.
  4. AI reconstruction
    • Ask: "Do I need help rebuilding a flow from artifacts?"
    • Use it only after artifacts already exist; it is an optional advanced helper.
  5. MCP
    • Ask: "Do I need an external AI client to inspect runs or operate this repo safely?"
    • Treat it as an integration side road, not as a replacement for just run.
  6. Review Workspace
    • Ask: "Do I need one review-ready packet before I hand this run to another maintainer?"
    • Treat it as a local-first review packet, not as a hosted collaboration product.
  7. Template Exchange
    • Ask: "Do I need to move a reusable template contract into another checkout?"
    • Use import/export/share for that handoff, not a marketplace mental model.

Wave 5 also makes the recovery boundary more explicit:

  • inspect_task-style actions are safe to suggest immediately
  • replay actions stay human-confirmed
  • OTP, provider-step, and manual-input actions stay manual-only
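
The three tiers above can be modeled as a small lookup. This is an illustrative sketch only: every action identifier except inspect_task is a hypothetical placeholder, and falling back to the strictest tier for unknown actions is an assumption of the sketch, not documented ProofTrail behavior.

```python
from enum import Enum


class Approval(Enum):
    SUGGEST_IMMEDIATELY = "suggest-immediately"
    HUMAN_CONFIRMED = "human-confirmed"
    MANUAL_ONLY = "manual-only"


# Only "inspect_task" appears in this README; the other keys are placeholders.
ACTION_POLICY = {
    "inspect_task": Approval.SUGGEST_IMMEDIATELY,
    "replay": Approval.HUMAN_CONFIRMED,
    "otp": Approval.MANUAL_ONLY,
    "provider_step": Approval.MANUAL_ONLY,
    "manual_input": Approval.MANUAL_ONLY,
}


def approval_for(action: str) -> Approval:
    """Look up the approval tier; unknown actions get the strictest tier."""
    return ACTION_POLICY.get(action, Approval.MANUAL_ONLY)
```

Defaulting to manual-only keeps an unrecognized action from being suggested or replayed automatically.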

Outward Product Story

Use this mental model when you explain ProofTrail to a new evaluator:

  • What it is: evidence-first browser automation with recovery and MCP
  • Who it helps: AI agents and human operators who need trustworthy browser workflows
  • Why it feels different: the product does not stop at "the automation ran"; it keeps the evidence, recovery path, and handoff surfaces attached to the run
  • Where AI fits: AI reconstruction helps after artifacts already exist
  • Where MCP fits: MCP exposes the same governed surfaces to external AI clients

Suitable / Not Suitable

Suitable for:

  • teams standardizing browser automation runs across operators and environments
  • maintainers who need inspectable evidence instead of ad-hoc shell output
  • workflows where replay, diagnostics, and recovery matter as much as first-run success

Not suitable for:

  • tiny one-off browser scripts where no shared evidence or recovery path is needed
  • teams unwilling to maintain a Python + Node workspace
  • people looking for a hosted SaaS with zero local setup

Validation and Governance

ProofTrail keeps the public story honest by separating runtime proof from governance checks.

  • Minimal success case
  • Run evidence example
  • Quality gates
  • Changelog
  • Release guide
  • Release supply-chain policy
  • Maintainer GitHub closure evidence: just github-closure-report

Public collaboration contract:

  • external pull requests stay on GitHub-hosted, low-risk governance and build lanes
  • live, external, and owner-secret workflows are manual-only and require the protected owner-approved-sensitive environment
  • macOS-only smoke and regression lanes use GitHub-hosted macos-latest; self-hosted / shared-pool are not part of the public collaboration contract

Maintainer Space Hygiene

ProofTrail treats disk cleanup as a governed maintenance path, not an ad-hoc "delete the biggest folder" exercise.

  • just space-report emits a repo-exclusive JSON report for runtime buckets, safe-clean residue, explicit reclaim candidates, protected totals, and the dedicated external pnpm layer
  • just space-clean-safe runs a default dry-run for the low-risk cleanup wave; use ./scripts/space-clean-safe.py --apply only when you want to execute the same safe-clean list
  • just space-clean-reclaim runs a default dry-run for large repo-exclusive reclaim targets such as the root .venv, isolated node_modules, and the repo-scoped pnpm store; use explicit --scope ... plus --apply only after the matching validation gate passes
  • just runtime-gc -- --dry-run previews retention-based cleanup for the review-class runtime buckets before you let the same policy delete old files
  • canonical run evidence under .runtime-cache/artifacts/runs/, runtime backups under .runtime-cache/backups/, and managed toolchains under .runtime-cache/toolchains/ are intentionally outside the first cleanup wave
  • empty run stub directories under .runtime-cache/artifacts/runs/ are the one explicit exception: they may enter the first safe-clean wave only when they are still empty and have no evidence files yet
  • the canonical Python runtime target is .runtime-cache/toolchains/python/.venv; the root .venv is a retired legacy surface and may only be reclaimed through space-clean-reclaim --scope root-venv after single-track validation
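
To illustrate the empty-stub exception, the sketch below lists run directories that contain no entries at all. It is a read-only illustration with hypothetical helper names and deletes nothing; the repo's own space-clean commands remain the authority on what is actually eligible for cleanup.

```python
from pathlib import Path
from typing import List


def is_empty_run_stub(run_dir: Path) -> bool:
    """True when the directory exists and has no entries of any kind."""
    return run_dir.is_dir() and not any(run_dir.iterdir())


def empty_run_stubs(runs_root: Path) -> List[Path]:
    """List empty stub directories under the runs root (read-only)."""
    if not runs_root.is_dir():
        return []
    return sorted(p for p in runs_root.iterdir() if is_empty_run_stub(p))
```

Treating any entry, even an empty subdirectory, as disqualifying is a deliberately conservative reading of "still empty" chosen for this sketch.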

FAQ

Do I need the legacy helper path?

No. just run is the public default road. just run-legacy is only for lower-level workshop troubleshooting when you need to inspect helper-path behavior directly.

Where should I look after a run finishes?

Start with .runtime-cache/artifacts/runs/<runId>/manifest.json, then open reports/summary.json, the four reports/proof.*.json files, and the index files described in docs/reference/run-evidence-example.md.
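
As a starting point for that reading order, this sketch loads manifest.json and returns its proof and reports entries. It assumes both are top-level keys of a JSON object, which this README implies with manifest.proof and manifest.reports but does not spell out; the helper name manifest_pointers is hypothetical, and the shape of each value is left as-is.

```python
import json
from pathlib import Path
from typing import Any, Dict


def manifest_pointers(run_dir: Path) -> Dict[str, Any]:
    """Return the manifest's "proof" and "reports" entries (None if absent).

    Assumes manifest.json is a JSON object with those top-level keys.
    """
    manifest = json.loads((run_dir / "manifest.json").read_text(encoding="utf-8"))
    return {key: manifest.get(key) for key in ("proof", "reports")}
```

If either value comes back as None, treat the bundle as incomplete and compare it against docs/reference/run-evidence-example.md.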

If you are using the local command center, the matching product path is:

  1. open Task Center
  2. inspect the evidence state first: retained, partial, missing, or empty
  3. use Failure Explainer to understand the current run
  4. use Share Pack when you want a handoff-friendly summary
  5. use Compare when you need a baseline-versus-candidate judgment
  6. treat Promotion Candidate as a later release/showcase decision, not as the first diagnostic step

Is this repository already a full docs site?

Not yet. Today the GitHub README is the conversion page, and the docs surface is the supporting second layer. See docs/index.md for the current public docs map.

Repository Map

  • apps/api/ - backend API and orchestration services
  • apps/web/ - operator-facing web command center
  • apps/automation-runner/ - record, replay, and reconstruction pipeline
  • apps/mcp-server/ - MCP adapter
  • packages/ - shared orchestration and runtime packages
  • configs/ - environment, schema, and governance configuration
  • contracts/ - API contracts
  • scripts/ - repo entrypoints and CI helpers
  • docs/ - storefront-supporting public docs surface

Documentation

  • Docs map: docs/index.md
  • Public docs overview: docs/README.md
  • Architecture contract: docs/architecture.md
  • CLI guide: docs/cli.md
  • docs/localized/zh-CN/README.md

Security and Contribution

  • SECURITY.md
  • SUPPORT.md
  • CONTRIBUTING.md
  • CODE_OF_CONDUCT.md

License

This repository is released under the MIT License.

If ProofTrail saves you time during evaluation or implementation, star the repo so you can find the updates, release notes, and new evidence examples later.
