
Beacon

Turn Claude Code into Cursor. Semantic code search that understands your codebase — find code by meaning, not just string matching.

Quick Start · Usage · Models · Commands · Config · Examples

98.3% accuracy · 5x faster than grep · 20-query benchmark on a real codebase

Quick Start

1. Install Ollama (local embeddings, free)

brew install ollama
ollama serve &
ollama pull nomic-embed-text

2. Install the Beacon plugin

claude plugin marketplace add sagarmk/Claude-Code-Beacon-Plugin
claude plugin install beacon@claude-code-beacon-plugin

3. Start Claude Code

claude

That's it. On first session start, Beacon will:

  1. Install npm dependencies automatically (native modules like better-sqlite3 — takes a few seconds)
  2. Index your codebase in the background

No npm install, no manual setup. Just install and go.

Usage

After installing, Beacon indexes automatically on session start. Here are the essentials:

Force a full re-index

> /reindex

Deletes existing embeddings and rebuilds from scratch — useful after switching models or if the index gets stale.

Check index health

> /index
Beacon Index

● ● ● ● ●    nomic-embed-text · Ollama (local)
● ● ● ● ●    768 dims · 3.8 MB
● ● ● ● ●
● ● ● ● ●    Coverage: 100% (38/38 files)

              Indexed by extension
              ● .js  25 files
              ● .md  13 files

              Statistics
              Indexed files    38
              Total chunks     109
              Avg chunks/file  2.9
              Last sync        2 minutes ago

For a quick numeric summary:

> /index-status
{
  "files_indexed": 38,
  "total_chunks": 114,
  "last_sync": "2026-03-01T04:30:21.453Z",
  "embedding_model": "nomic-embed-text",
  "embedding_endpoint": "http://localhost:11434/v1"
}

Search your codebase

> /search-code "authentication flow"
[
  {
    "file": "src/middleware/auth.ts",
    "lines": "12-45",
    "similarity": "0.82",
    "score": "0.74",
    "preview": "export async function verifyAuth(req, res, next) {\n  const token = req.headers.authorization?.split(' ')[1];\n  ..."
  },
  {
    "file": "src/routes/login.ts",
    "lines": "8-32",
    "similarity": "0.78",
    "score": "0.65",
    "preview": "router.post('/login', async (req, res) => {\n  const { email, password } = req.body;\n  ..."
  }
]

Hybrid search combines semantic similarity (understands meaning), BM25 keyword matching, and identifier boosting — so searching "auth flow" finds code about authentication even if it never uses the word "auth".
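As a rough illustration, the combination might look like the following sketch. The weights mirror the defaults in config/beacon.default.json (weight_vector 0.4, weight_bm25 0.3, weight_rrf 0.3, identifier_boost 1.5); hybridScore is a hypothetical name, and the real implementation may combine signals differently:

```javascript
// Hypothetical sketch of hybrid score combination. Weights mirror the
// defaults in config/beacon.default.json; the actual implementation may
// normalize or combine the signals differently.
function hybridScore(vectorSim, bm25Score, rrfScore, matchesIdentifier) {
  const base = 0.4 * vectorSim + 0.3 * bm25Score + 0.3 * rrfScore;
  return matchesIdentifier ? base * 1.5 : base; // boost exact identifier hits
}
```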

Options: --top-k N (result count), --threshold F (minimum score), --path <dir> (scope to a directory), --no-hybrid (pure vector search).

Embedding Models

Beacon runs on open-source models by default — no API keys, no cloud costs, fully local via Ollama.

| Model | Dims | Context | Speed | Best for |
| --- | --- | --- | --- | --- |
| nomic-embed-text (default) | 768 | 8192 | Fast | General-purpose, great code search |
| mxbai-embed-large | 1024 | 512 | Fast | Higher accuracy, larger vectors |
| snowflake-arctic-embed:l | 1024 | 512 | Medium | Strong retrieval benchmarks |
| all-minilm | 384 | 512 | Very fast | Lightweight, low resource usage |

To switch models, pull with Ollama and update your config:

ollama pull mxbai-embed-large
// .claude/beacon.json
{
  "embedding": {
    "model": "mxbai-embed-large",
    "dimensions": 1024,
    "query_prefix": ""
  }
}

Then run /reindex to rebuild with the new model.

Cloud Providers

For cloud-hosted embeddings, create .claude/beacon.json in your repo:

OpenAI
export OPENAI_API_KEY="sk-..."
{
  "embedding": {
    "api_base": "https://api.openai.com/v1",
    "model": "text-embedding-3-small",
    "api_key_env": "OPENAI_API_KEY",
    "dimensions": 1536,
    "batch_size": 100,
    "query_prefix": ""
  }
}
Voyage AI
export VOYAGE_API_KEY="pa-..."
{
  "embedding": {
    "api_base": "https://api.voyageai.com/v1",
    "model": "voyage-code-3",
    "api_key_env": "VOYAGE_API_KEY",
    "dimensions": 1024,
    "batch_size": 50,
    "query_prefix": ""
  }
}
LiteLLM proxy (Vertex AI, Bedrock, Azure, etc.)
pip install litellm
litellm --model vertex_ai/text-embedding-004 --port 4000
{
  "embedding": {
    "api_base": "http://localhost:4000/v1",
    "model": "vertex_ai/text-embedding-004",
    "api_key_env": "LITELLM_API_KEY",
    "dimensions": 1024,
    "batch_size": 50,
    "query_prefix": ""
  }
}
Custom endpoint

Any server implementing the OpenAI /v1/embeddings API will work. Set api_base, model, dimensions, and optionally api_key_env in .claude/beacon.json.

Commands

Beacon indexes your codebase automatically on session start and re-embeds files as you edit — no manual steps needed.

Search

| Command | Description |
| --- | --- |
| /search-code | Hybrid code search — semantic + keyword + BM25 matching. Supports --path <dir> to scope results |

Index

| Command | Description |
| --- | --- |
| /index | Visual overview — files, chunks, coverage, provider |
| /index-status | Quick health check — file count, chunk count, last sync |
| /reindex | Force full re-index from scratch |
| /run-indexer | Manually trigger indexing |
| /terminate-indexer | Kill a running sync process |

Config

| Command | Description |
| --- | --- |
| /config | View and modify Beacon configuration |
| /blacklist | Prevent indexing of specific directories |
| /whitelist | Allow indexing in otherwise-blacklisted directories |

Beacon also provides a code-explorer agent and a semantic-search skill that Claude can invoke automatically.

Why Beacon?
  • Understands your questions — ask "where is the auth flow?" and get lib/auth.ts, not every file containing "auth"
  • Query expansion — a search for "auth" automatically finds code mentioning "authentication", "authorize", and "login"
  • Stays in sync automatically — hooks handle full index, incremental re-embedding on edits, and garbage collection
  • Resilient — retries with backoff on transient failures, auto-recovers from DB corruption, debounces GC
  • Works with any embedding provider — Ollama (local/free), OpenAI, Voyage AI, LiteLLM, or any OpenAI-compatible API
  • Gives Claude better context — slash commands, a code-explorer agent, and a grep-nudge hook for smarter search
How It Works

Beacon uses Claude Code hooks to stay in sync with your codebase:

| Hook | Trigger | What it does |
| --- | --- | --- |
| SessionStart | Every session | Ensures npm deps are installed (first run only), then full index or diff-based catch-up |
| PostToolUse | Write, Edit, MultiEdit | Re-embeds the changed file |
| PostToolUse | Bash | Garbage-collects embeddings for deleted files |
| PreCompact | Before context compaction | Injects index status so search capability survives compaction |
| PreToolUse | Grep | Intercepts grep and redirects to Beacon for semantic-style queries |

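For orientation, a Claude Code hook registration of this general shape looks like the sketch below. This is a hedged illustration of the settings format: Beacon wires up its own hooks when the plugin is installed, and the command path here is a placeholder.

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit|MultiEdit",
        "hooks": [
          { "type": "command", "command": "node path/to/reembed-changed-file.js" }
        ]
      }
    ]
  }
}
```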
Configuration

Default configuration (config/beacon.default.json):

{
  "embedding": {
    "api_base": "http://localhost:11434/v1",
    "model": "nomic-embed-text",
    "api_key_env": "",
    "dimensions": 768,
    "batch_size": 10,
    "query_prefix": "search_query: "
  },
  "chunking": {
    "strategy": "hybrid",
    "max_tokens": 512,
    "overlap_tokens": 50
  },
  "indexing": {
    "include": ["**/*.ts", "**/*.tsx", "**/*.js", "..."],
    "exclude": ["node_modules/**", "dist/**", "..."],
    "max_file_size_kb": 500,
    "auto_index": true,
    "max_files": 10000,
    "concurrency": 4
  },
  "search": {
    "top_k": 10,
    "similarity_threshold": 0.35,
    "hybrid": {
      "enabled": true,
      "weight_vector": 0.4,
      "weight_bm25": 0.3,
      "weight_rrf": 0.3,
      "doc_penalty": 0.5,
      "identifier_boost": 1.5,
      "debug": false
    }
  },
  "storage": {
    "path": ".claude/.beacon"
  }
}
| Option | Default | Description |
| --- | --- | --- |
| embedding.api_base | http://localhost:11434/v1 | Embedding API endpoint |
| embedding.model | nomic-embed-text | Embedding model name |
| embedding.dimensions | 768 | Vector dimensions (must match model) |
| embedding.query_prefix | search_query:  | Prefix prepended to search queries |
| indexing.include | Common code patterns | Glob patterns for files to index |
| indexing.exclude | node_modules, dist, etc. | Glob patterns to skip |
| indexing.max_file_size_kb | 500 | Skip files larger than this |
| indexing.auto_index | true | Auto-index on session start |
| indexing.concurrency | 4 | Number of files to index in parallel |
| search.top_k | 10 | Max results per query |
| search.similarity_threshold | 0.35 | Minimum similarity score |
| search.hybrid.enabled | true | Enable hybrid search (set false for pure vector) |

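The chunking defaults (512-token chunks with a 50-token overlap) amount to a sliding window over the token stream. A simplified sketch, treating tokens as a plain array and ignoring the syntax-aware part of the "hybrid" strategy (chunkTokens is a hypothetical name):

```javascript
// Sliding-window chunking sketch using the defaults above (max_tokens 512,
// overlap_tokens 50). Tokens are modeled as a plain array; the real
// "hybrid" strategy is also syntax-aware. chunkTokens is illustrative.
function chunkTokens(tokens, maxTokens = 512, overlap = 50) {
  const chunks = [];
  const step = maxTokens - overlap; // consecutive chunks share `overlap` tokens
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens));
    if (start + maxTokens >= tokens.length) break; // final window reached
  }
  return chunks;
}
```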
Per-repo overrides

Create .claude/beacon.json in any repo to override defaults. Values are deep-merged with the default config:

{
  "embedding": {
    "api_base": "https://api.openai.com/v1",
    "model": "text-embedding-3-small",
    "api_key_env": "OPENAI_API_KEY",
    "dimensions": 1536
  },
  "indexing": {
    "include": ["**/*.py"],
    "max_files": 5000
  }
}
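The deep-merge behavior can be pictured with a small sketch (deepMerge is illustrative, not Beacon's actual code): nested objects merge key by key, while scalars and arrays in the override replace the defaults wholesale.

```javascript
// Illustrative deep-merge of a per-repo override into the defaults;
// a sketch of the behavior described above, not Beacon's actual code.
function deepMerge(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const bothObjects =
      value && typeof value === 'object' && !Array.isArray(value) &&
      out[key] && typeof out[key] === 'object' && !Array.isArray(out[key]);
    // Nested objects merge recursively; scalars and arrays replace wholesale
    out[key] = bothObjects ? deepMerge(out[key], value) : value;
  }
  return out;
}
```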
Storage

Beacon stores its SQLite database at .claude/.beacon/embeddings.db (configurable via storage.path). This file is auto-generated and safe to delete — run /reindex to rebuild. The database uses sqlite-vec for vector search and FTS5 for keyword matching.

Troubleshooting

What if Ollama is down?

Beacon degrades gracefully when the embedding server is unreachable — it never blocks your session. Embedding requests automatically retry with backoff (1s, 4s) before giving up.

| Scenario | Behavior |
| --- | --- |
| Session start | Sync is skipped, error is logged, session continues normally |
| Search | Falls back to keyword-only (BM25) search — still returns results |
| File edits | Re-embedding fails silently, old embeddings are preserved |
| Status commands | Work normally (DB-only, no Ollama needed) |
| DB corruption | Auto-detected and rebuilt on next sync |

Start Ollama at any time and run /run-indexer to catch up.
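The retry-with-backoff behavior described above can be sketched as follows (withRetry is a hypothetical name; the documented delays are 1s and 4s):

```javascript
// Sketch of retry-with-backoff as described above; delays default to the
// 1s and 4s mentioned in the docs. withRetry is an illustrative helper,
// not Beacon's actual function.
async function withRetry(fn, delaysMs = [1000, 4000]) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= delaysMs.length) throw err; // out of retries: give up
      await new Promise((resolve) => setTimeout(resolve, delaysMs[attempt]));
    }
  }
}
```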

Manual indexing

| Command | What it does |
| --- | --- |
| /run-indexer | Manually trigger indexing — useful when auto_index is off or after starting Ollama late |
| /reindex | Force a full re-index from scratch (deletes existing embeddings first) |
| /terminate-indexer | Kill a stuck sync process and clean up lock state |

Checking index health

Run /index for a visual overview with a coverage bar, file list, and provider info. For a quick numeric summary, use /index-status — it shows file count, chunk count, and last sync time.

Things to look for:

  • Low coverage % — files may be excluded by glob patterns or may exceed max_file_size_kb
  • Sync status errors — usually means the embedding server was unreachable during the last sync
  • Stale sync warnings — the index hasn't been updated recently; run /run-indexer to refresh

Verifying search

Run /search-code with a test query to confirm search is working. If results include "FTS-only" in debug output, the embedding server is unreachable — search still works but without semantic matching (keyword/BM25 only).

Examples

See EXAMPLES.md for real-world use cases — intent-based search, codebase navigation, identifier tracking, and auto-sync — each with concrete before/after comparisons.

License

MIT
