🌳 Tree-sitter Analyzer
The MCP code-intelligence server for AI agents: fewer tokens, fewer tool calls, 100 % local. Pre-indexed AST cache + 62 MCP tools + 13 curated agent skills + TOON-compressed output. Beats CodeGraph on a 6-repo head-to-head median (−11 % cost vs CodeGraph's −4 %), with a strict CLI superset. Now with BM25-ranked symbol search across all 62 tools: results are sorted by relevance, not file path.
Get Started
One-line install for Claude Code:
claude mcp add tree-sitter-analyzer \
--env TREE_SITTER_PROJECT_ROOT="$PWD" \
-- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp
Restart your agent, then say: "Set the project root to my repo and run codegraph_status."
Other agents (Cursor, Copilot, Cline, Continue, Claude Desktop, Roo Code): see Supported Agents.
Why Tree-sitter Analyzer
- Token-efficient by default. Every MCP response uses TOON, a tabular JSON variant that cuts payload by ~50-70 % vs raw JSON.
- Verdict envelopes. Every response carries verdict: SAFE | CAUTION | UNSAFE | INFO | WARN | ERROR | NOT_FOUND, so orchestrators branch on outcomes without re-prompting.
- Project health grading (A-F). No other open-source tool grades your whole project on size / complexity / coverage / duplication / dependencies / structure / git-hotspots in one call.
- 13 curated workflows (Skills). Pre-baked tool subsets for "find symbol", "trace call chain", "score health", "safe-to-edit before refactor", "PR review", etc.
- 5 layers of safety. safe_to_edit + modification_guard + constraint DSL + change_impact + verdict envelopes, designed so agents know before they touch.
- Beats the leading competitor (CodeGraph) on multiple head-to-head benchmarks. See below.
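As a minimal sketch of what "branching on outcomes" could look like on the orchestrator side: the verdict values and the envelope keys (verdict, agent_summary, data) are the ones this README names, but the helper next_action, the three action labels, and the sample envelope are assumptions made for illustration, not part of the server's API.

```python
# Hypothetical orchestrator-side helper; grouping and action names are illustrative.
PROCEED = {"SAFE", "INFO"}
REVIEW = {"CAUTION", "WARN"}
STOP = {"UNSAFE", "ERROR", "NOT_FOUND"}


def next_action(envelope: dict) -> str:
    """Map a response envelope's verdict to a coarse orchestration decision."""
    verdict = envelope.get("verdict")
    if verdict in PROCEED:
        return "proceed"
    if verdict in REVIEW:
        return "ask_for_review"
    # STOP verdicts, plus anything unknown or missing, are handled conservatively.
    return "stop"


# Made-up sample envelope, only to exercise the helper.
sample = {"verdict": "CAUTION", "agent_summary": "example summary", "data": {}}
print(next_action(sample))
```

Treating unknown verdicts as "stop" is a design choice of this sketch: a new verdict value then fails closed instead of silently letting an edit through.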
Benchmark Results
Headless Claude Code (Haiku 4.5) was asked one architecture question per repo. 3 arms: no-MCP / CodeGraph MCP / Tree-sitter Analyzer MCP. Single run per arm, so the results are indicative, not statistically settled.
| Codebase | Lang / files | Baseline | CodeGraph | TSA | Winner |
|---|---|---|---|---|---|
| Gin | Go / 99 | $0.164 | $0.094 (−43 %) | $0.080 (−51 %) | TSA ⭐ |
| Alamofire | Swift / 98 | $0.201 | $0.219 (+9 %) | $0.147 (−27 %) | TSA ⭐ |
| Excalidraw | TS / 603 | $0.204 | $0.179 (−12 %) | $0.212 (+4 %) | CodeGraph |
| Django | Py / 2 910 | $0.162 | $0.106 (−35 %) | $0.205 (+27 %) | CodeGraph |
| Tokio | Rust / 778 | $0.214 | $0.285 (+33 %) | $0.303 (+42 %) | both lose |
| OkHttp | Java / 596 | $0.169 | $0.200 (+18 %) | $0.178 (+5 %) | both lose |
| Median Δ vs baseline | | | −4 % | −11 % | TSA |
TSA wins outright on 2 of 6 repos, has the larger median cost saving (−11 % vs CodeGraph's −4 %), and matches CodeGraph's reported direction on every repo where the indexer-class tools should help.
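The per-repo percentages in the table are each arm's cost relative to the no-MCP baseline. A short sketch that recomputes them from the dollar figures in the table; the names COSTS and pct_delta are introduced here for illustration only.

```python
# Dollar costs copied from the benchmark table: (baseline, CodeGraph, TSA).
COSTS = {
    "Gin": (0.164, 0.094, 0.080),
    "Alamofire": (0.201, 0.219, 0.147),
    "Excalidraw": (0.204, 0.179, 0.212),
    "Django": (0.162, 0.106, 0.205),
    "Tokio": (0.214, 0.285, 0.303),
    "OkHttp": (0.169, 0.200, 0.178),
}


def pct_delta(cost: float, baseline: float) -> int:
    """Percent change vs baseline, rounded to the nearest whole percent."""
    return round((cost - baseline) / baseline * 100)


# Map each repo to (CodeGraph delta, TSA delta); negative means cheaper than baseline.
deltas = {
    repo: (pct_delta(codegraph, base), pct_delta(tsa, base))
    for repo, (base, codegraph, tsa) in COSTS.items()
}

for repo, (codegraph, tsa) in deltas.items():
    print(f"{repo:<11} CodeGraph {codegraph:+d} %   TSA {tsa:+d} %")
```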
Why the median diverges from CodeGraph's published −35 % claim: we used Haiku for cost control; they used Opus + 4-run median. See docs/internal/CODEGRAPH_BENCHMARK_FINAL_2026-05-24.md for raw envelopes + reproducer scripts.
Post-benchmark improvements (2026-05-30):
(1) BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank, a 133× speedup in semantic search.
(2) Min-max BM25 normalization: relevance_score now properly differentiates strong matches (1.0) from weak (0.0) across all search paths.
(3) semantic().sort(by='confidence') now works end-to-end.
These improvements were not in the benchmark run; repos with large symbol counts (Django, Excalidraw) should see improved token efficiency in re-runs.
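The two ideas in the post-benchmark notes (a cheap lexical pre-filter before cosine rerank, and min-max normalization of BM25 scores) can be sketched in a few lines. This is a generic textbook Okapi BM25 over toy data, not the project's implementation: the function names, the k1 and b defaults, the toy symbol tokens, and the 2-d "embedding" vectors are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized doc against the query tokens."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def min_max(scores):
    """Rescale so the best score is 1.0 and the weakest is 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)  # all tied: a choice made for this sketch
    return [(s - lo) / (hi - lo) for s in scores]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def search(query_tokens, query_vec, docs, vectors, keep):
    """Stage 1: keep the top-`keep` docs by BM25. Stage 2: rerank them by cosine."""
    lexical = bm25_scores(query_tokens, docs)
    shortlist = sorted(range(len(docs)), key=lambda i: lexical[i], reverse=True)[:keep]
    return sorted(shortlist, key=lambda i: cosine(query_vec, vectors[i]), reverse=True)


# Toy symbol names (tokenized) and made-up 2-d vectors.
symbols = [["parse", "config"], ["load", "config", "file"], ["render", "view"], ["parse", "args"]]
vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]

relevance = min_max(bm25_scores(["parse", "config"], symbols))
ranked = search(["parse", "config"], [1.0, 0.0], symbols, vectors, keep=3)
print(relevance, ranked)
```

The speedup comes from stage 2 only touching the shortlist: cosine similarity is computed for `keep` candidates instead of every symbol in the index.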
Key Features
Pre-indexed code intelligence (CodeGraph parity + superset)
| Capability | TSA tool | Status |
|---|---|---|
| Symbol search (FTS5 + BM25 ranked) | codegraph_symbol_search | ahead: results sorted by relevance score, not file path |
| Go-to-def / find-refs / call hierarchy in one call | codegraph_navigate | PRIMARY entry point |
| Bulk-fetch N related symbols + relationship map | codegraph_explore | parity |
| Function-level blast radius + risk score | codegraph_impact | parity + risk score |
| Who-calls-X / what-X-calls | codegraph_callers / codegraph_callees | parity |
| Index health at-a-glance (+ edge count) | codegraph_status | ahead: reports total_edges for graph density signal |
| Pre-built call graph cache | codegraph_autoindex / codegraph_full_index / codegraph_incremental_sync | parity |
| Tests affected by a change (CLI) | --affected FILE... | parity |
Tree-sitter Analyzer exclusive
| Capability | TSA tool | Note |
|---|---|---|
| BM25-ranked symbol search | all search tools | relevance_score on every result (min-max normalized: best=1.0, weakest=0.0); sort(by='confidence') in DSL |
| Semantic search (133× faster) | codegraph_query semantic() | BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank |
| Project A-F health grading | check_project_health | 7 dimensions (size/complexity/deps/coverage/duplication/structure/git-hotspot), no competitor offers this |
| TOON output | every tool, output_format: "toon" (default) | 50-70 % token saving |
| Verdict envelopes | every tool | SAFE/CAUTION/UNSAFE/INFO/WARN/ERROR/NOT_FOUND |
| Safe-to-edit gate | safe_to_edit + modification_guard | refuses high-risk edits before they happen |
| Architectural constraint DSL | check_constraints | "module A cannot import B", enforced |
| Code health (file-level) | check_file_health | block/long-method/smell detection |
| Class hierarchy | codegraph_class_hierarchy | type-inheritance tree |
| Dependency matrix | codegraph_dependency_matrix | module-coupling matrix |
| Dead code | codegraph_dead_code | transitive unreachable analysis |
| Complexity heatmap | codegraph_complexity_heatmap | per-fn cyclomatic + project view |
| AST-structural clone detection | codegraph_similarity | beyond text similarity |
| Mermaid call-graph export | codegraph_visualize | paste-ready in docs |
| UML Mermaid export | codegraph_uml | class / package / component / sequence diagrams |
| PR review | codegraph_pr_review | AST-diff + semantic classify + blast radius |
| agent_summary | every response | next-step hint baked into the envelope |
| Synapse cross-file resolver | internal | import-aware, beats regex guessing |
| Temporal activation | symbol_lineage | per-symbol git-modification frequency |
| One-shot file orientation | smart_context | health + exports + deps + edit-risk in one call (replaces 3-4 calls) |
| Architectural decision journal | decision_journal | persists reasoning across sessions; no competitor exposes this |
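To illustrate why a tabular encoding such as TOON is smaller than raw JSON for uniform result lists, here is a rough sketch that writes the field names once in a header instead of repeating them per row. It is not the project's encoder: to_tabular, the sample rows, and the exact header layout are assumptions, quoting and escaping are ignored, and character count is used only as a crude stand-in for tokens.

```python
import json

# Made-up symbol rows, shaped like a typical search result list.
rows = [
    {"name": "parse_config", "kind": "function", "line": 12},
    {"name": "load_file", "kind": "function", "line": 40},
    {"name": "Config", "kind": "class", "line": 3},
]


def to_tabular(key: str, items: list[dict]) -> str:
    """Emit one header with the field names, then one comma-separated line per row."""
    fields = list(items[0])
    header = f"{key}[{len(items)}]{{{','.join(fields)}}}:"
    lines = ["  " + ",".join(str(item[f]) for f in fields) for item in items]
    return "\n".join([header, *lines])


tabular = to_tabular("symbols", rows)
as_json = json.dumps(rows)

print(tabular)
print(len(tabular), "chars tabular vs", len(as_json), "chars JSON")
```

The saving grows with the number of rows, because the per-row cost drops to the values themselves while JSON repeats every key on every row.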
Skills (13 curated workflows)
CodeGraph has zero skills. We ship 13 under .claude/skills/tsa-*/:
tsa-landing, tsa-find, tsa-graph, tsa-structure, tsa-deps, tsa-index, tsa-health-watch, tsa-edit-safety, tsa-edit-then-verify, tsa-constraints, tsa-pr-review, tsa-refactor-queue, tsa-temporal.
Each skill ships an allowed-tools subset + procedure recipe + decision-surface schema, so the agent doesn't have to triage 62 tools on every question.
255 CLI flags
Strict superset of CodeGraph's 15-command CLI. Highlights:
tree-sitter-analyzer --table full <file> # method/signature/complexity table
tree-sitter-analyzer --partial-read --start-line N --end-line M <file>
tree-sitter-analyzer --project-health # A-F grade across the project
tree-sitter-analyzer --callers <symbol> # who-calls
tree-sitter-analyzer --codegraph-impact <fn> # blast radius + risk
tree-sitter-analyzer --affected <file...> # tests transitively affected
tree-sitter-analyzer --dead-code # transitive unreachable
tree-sitter-analyzer --check-constraints # architectural rules
tree-sitter-analyzer --safe-to-edit <file> # refuse if risky
tree-sitter-analyzer --uml class # Mermaid UML class diagram
See docs/CODEMAPS/cli.md for the full surface.
Quick Start
1. Install dependencies
# uv (required)
curl -LsSf https://astral.sh/uv/install.sh | sh # macOS / Linux
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" # Windows
# fd + ripgrep (required for search)
brew install fd ripgrep # macOS
winget install sharkdp.fd BurntSushi.ripgrep.MSVC # Windows
2. Install Tree-sitter Analyzer
uv add "tree-sitter-analyzer[all,mcp]"
3. Hook it into your agent
See Supported Agents. Most clients want this MCP server entry:
{
"mcpServers": {
"tree-sitter-analyzer": {
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
}
}
}
After restart: "Set the project root to my repo and call codegraph_status."
How It Works
Source code → tree-sitter parse → SQLite + FTS5 index (.ast-cache/index.db)
↓
codegraph_navigate / codegraph_explore / codegraph_callers / ...
↓
TOON-compressed envelope
(verdict + agent_summary + data)
↓
MCP client / CLI consumer
The index is built lazily on the first query and refreshed on file change via a content-hash diff (codegraph_incremental_sync). All 62 tools read from the same .ast-cache/, so a query and its follow-up share work.
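A content-hash diff of the kind described above can be sketched as follows. This shows the general technique only, not how codegraph_incremental_sync is implemented: the SHA-256 choice, the function names, and the restriction to .py files are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path, suffixes=(".py",)) -> dict[str, str]:
    """Map each matching file (relative path) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file() and p.suffix in suffixes
    }


def diff(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify files so that only added or changed ones need re-parsing."""
    return {
        "added": sorted(k for k in new if k not in old),
        "changed": sorted(k for k in new if k in old and old[k] != new[k]),
        "removed": sorted(k for k in old if k not in new),
    }


# Demo in a throwaway directory: edit one file, delete one, add one.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\n")
    (root / "b.py").write_text("y = 2\n")
    before = snapshot(root)
    (root / "a.py").write_text("x = 42\n")
    (root / "b.py").unlink()
    (root / "c.py").write_text("z = 3\n")
    result = diff(before, snapshot(root))

print(result)
```

Hashing content rather than comparing modification times means a file that is touched but not actually changed is not re-indexed.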
Supported Agents
Claude Code (recommended)
claude mcp add tree-sitter-analyzer \
--env TREE_SITTER_PROJECT_ROOT="$PWD" \
-- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp
Verify: claude mcp list. The 13 tsa-* skills auto-discover from .claude/skills/.
Claude Desktop
Edit claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/, Windows: %APPDATA%\Claude\, Linux: ~/.config/Claude/):
{
"mcpServers": {
"tree-sitter-analyzer": {
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
}
}
}
GitHub Copilot (VS Code)
Create .vscode/mcp.json (note: servers, not mcpServers):
{
"servers": {
"tree-sitter-analyzer": {
"type": "stdio",
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "${workspaceFolder}" }
}
}
}
Cursor / Cline / Continue / Roo Code
All read the same mcpServers schema as Claude Desktop. Cursor: Settings → MCP. Cline: MCP panel → Edit settings. Continue: ~/.continue/config.json under experimental.modelContextProtocolServers. Roo Code: MCP panel → Edit MCP Settings.
⚠️ TREE_SITTER_PROJECT_ROOT must be absolute. The server enforces a security boundary against escapes via SecurityBoundaryManager.
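The general idea behind a project-root boundary can be sketched with pathlib: resolve both paths, then check that the candidate still sits under the root. This is not the SecurityBoundaryManager implementation; the function name is_inside_root and its behavior are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def is_inside_root(root: str, candidate: str) -> bool:
    """Return True if `candidate`, taken relative to `root`, stays inside `root`."""
    root_path = Path(root)
    if not root_path.is_absolute():
        raise ValueError("project root must be an absolute path")
    resolved_root = root_path.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    resolved = (resolved_root / candidate).resolve()
    return resolved == resolved_root or resolved_root in resolved.parents


# The system temp directory stands in for a project root in this demo.
demo_root = tempfile.gettempdir()
print(is_inside_root(demo_root, "src/app.py"))
print(is_inside_root(demo_root, "../outside.txt"))
```

Requiring an absolute root removes any dependence on the server's working directory, which is one plausible reason for the rule above.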
Supported Languages
21 language plugins: 13 fully wired into the indexer (full symbol + call graph), 5 (data/markup) reachable via the single-file CLI path, and 3 scaffolds (plugin exists, indexer wiring pending). The 2026-05-24 patch unblocked Swift / Kotlin / Ruby / PHP / C#, which had been silently skipped for months.
| Tier | Languages |
|---|---|
| Full index + symbol + call graph | Python · Java · JavaScript · TypeScript · Go · Rust · C · C++ · C# · Swift · Kotlin · Ruby · PHP |
| Single-file analysis (CLI) | HTML · CSS · Markdown · SQL · YAML |
| Scaffold (plugin exists, indexer wiring pending) | bash · scala · json |
CodeGraph supports a similar set; the only popular code languages neither tool ships yet are Dart, Vue, Svelte, Lua (next-sprint backlog).
Configuration
Mostly nothing. The defaults are designed so you can hook it into your agent and forget:
- Output format: TOON. Override per-call with output_format: "json".
- Project root: TREE_SITTER_PROJECT_ROOT (env var, MCP) or --project-root (CLI).
- Cache location: <project>/.ast-cache/. Safe to delete; it auto-rebuilds.
- Optional: TREE_SITTER_OUTPUT_PATH for large-output write target.
Quality & Testing
| Metric | Value |
|---|---|
| Tests passed | 18,702 ✅ |
| Coverage | |
| Type safety | 100 % mypy |
| Platforms | macOS · Linux · Windows |
| Pre-commit gates | bandit · mypy · pyupgrade · detect-secrets · codemap-sync · smell-ratchet |
uv run pytest -q # full suite
uv run python check_quality.py --new-code-only # quality gate
Troubleshooting
| Symptom | Fix |
|---|---|
| unsupported language on .swift / .kt / .rb / .php / .cs | Update to ≥ 1.12.x; the 5-language gap was patched in commit 50e99a8f. |
| MCP server doesn't appear in client | TREE_SITTER_PROJECT_ROOT must be absolute; restart the client after config edit. |
| database is locked | Stop any other process holding .ast-cache/index.db; if persistent, rm -rf .ast-cache && tree-sitter-analyzer --autoindex. |
| Slow first call | First call builds the index. Subsequent calls are sub-second. Run --full-index upfront to amortise. |
| Agent picks the wrong tool | Use a tsa-* skill (/tsa-graph, /tsa-find, ...); each skill restricts the visible tool set to one workflow. |
Development
git clone https://github.com/aimasteracc/tree-sitter-analyzer.git
cd tree-sitter-analyzer
uv sync --extra all --extra mcp
uv run pytest -q
See docs/CONTRIBUTING.md for the development guide.