simonsysun

SeekLink

SeekLink — hybrid semantic search for markdown vaults. Four-channel RRF fusion, MLX reranker, native CJK support. Fully local.

SeekLink is a local semantic search CLI and optional read-only MCP stdio server for Markdown vaults. It indexes a folder of .md files, searches with hybrid keyword + vector retrieval, and returns line-anchored results that humans and agents can read with simple shell commands.

It is built for personal knowledge bases, Obsidian-compatible vaults, bilingual English/Chinese notes, and local agent workflows. MCP clients such as Claude Code, Cursor, and VS Code can call the same read-only search/get/status/doctor surface through seeklink[mcp]. It is also a useful search layer for Markdown wiki patterns such as Andrej Karpathy's llm-wiki: an agent can search existing pages, read precise line windows, then update the wiki without sending the vault to a hosted service.

Everything runs locally. No API key. No cloud search service. No Obsidian plugin required.

Install

uv tool install seeklink
# or
pip install seeklink

For Apple Silicon reranking support, install the optional MLX extra:

uv tool install "seeklink[mlx]"
# or
pip install "seeklink[mlx]"

For Model Context Protocol (MCP) clients such as Claude Code, Cursor, or VS Code, install the optional MCP extra:

uv tool install "seeklink[mcp]"
# or
pip install "seeklink[mcp]"

SeekLink requires Python's sqlite3 module to be linked against SQLite 3.45 or newer with FTS5 enabled. seeklink status --vault PATH checks this and prints a clear error if the runtime SQLite is too old.
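
To check the requirement yourself before installing, something like the following works with only the standard library. It is a minimal sketch, not SeekLink's own check; the helper name sqlite_meets_requirements is made up for illustration.

```python
import sqlite3
from typing import Tuple

MIN_SQLITE = (3, 45, 0)  # minimum version stated in this README


def sqlite_meets_requirements() -> Tuple[bool, bool]:
    """Return (version_ok, fts5_ok) for the SQLite linked into this Python."""
    version_ok = sqlite3.sqlite_version_info >= MIN_SQLITE
    conn = sqlite3.connect(":memory:")
    try:
        # Creating an FTS5 virtual table fails when FTS5 is not compiled in.
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        fts5_ok = True
    except sqlite3.OperationalError:
        fts5_ok = False
    finally:
        conn.close()
    return version_ok, fts5_ok


if __name__ == "__main__":
    version_ok, fts5_ok = sqlite_meets_requirements()
    print(f"SQLite {sqlite3.sqlite_version}: version_ok={version_ok}, fts5_ok={fts5_ok}")
```

Run it with the same interpreter that will run SeekLink, since the SQLite version depends on how that Python was built.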

Quick Start

# 1. Build the index first.
seeklink index --vault /path/to/vault

# 2. Search it.
seeklink search "machine learning" --vault /path/to/vault

Daily use is simpler if you set a default vault:

export SEEKLINK_VAULT=/path/to/vault
seeklink index
seeklink search "agent memory systems"
seeklink get notes/agent-memory-patterns.md:1 -C 20

seeklink search and single-file seeklink index path/to/file.md use a resident daemon when --vault is not passed. The daemon keeps the embedder and optional reranker warm in memory; on macOS this appears as a local Python process. It is local-only, uses a Unix socket, and does not open a network port or call a cloud service. By default it exits after 15 minutes of inactivity.

Full-vault seeklink index runs in-process so progress stays on stderr and the final Done: summary stays on stdout. seeklink status and seeklink get always stay cold-start: status only reads SQLite metadata, and get reads the file directly from disk. Use --no-daemon, SEEKLINK_NO_DAEMON=1, or an explicit --vault PATH when a script needs a one-shot cold-start path.

MCP users follow the same first step: build the index with seeklink index --vault PATH before registering the MCP server.

Output

Text search output is stable:

  SCORE  PATH[:LINE]  TITLE
           <content preview, one line, up to 120 chars>
  • PATH is relative to the vault root.
  • LINE is 1-indexed and points to the best matching chunk in the current file.
  • Exit code is 0 for success, including no results; 1 for runtime vault/config/file errors detected by SeekLink; and 2 for command-line usage errors from argument parsing.
  • Scores are useful for sorting within one query. Do not compare scores across reranker-enabled and reranker-disabled runs.

Use JSON when an agent needs structured output:

seeklink search "agent memory systems" --vault PATH --json
seeklink status --vault PATH --json
seeklink doctor --vault PATH --json
seeklink daemon status --json
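
A script can combine the exit codes and the JSON flag roughly as follows. This is a sketch under stated assumptions: the function names are illustrative, the JSON object is returned as-is because its fields are not described here, and reading error text from stderr is an assumption.

```python
import json
import subprocess

# Exit-code meanings as listed in the Output section.
EXIT_MEANINGS = {
    0: "success (including no results)",
    1: "runtime vault/config/file error",
    2: "command-line usage error",
}


def describe_exit(code: int) -> str:
    """Translate a seeklink exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def search_json(query: str, vault: str, top_k: int = 10):
    """Run `seeklink search --json` and return the parsed JSON object.

    Requires seeklink on PATH and an index already built for `vault`.
    """
    cmd = ["seeklink", "search", query, "--vault", vault,
           "--top-k", str(top_k), "--json"]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # Assumption: diagnostics are written to stderr.
        raise RuntimeError(
            f"{describe_exit(proc.returncode)}: {proc.stderr.strip()}"
        )
    return json.loads(proc.stdout)
```

Because exit code 0 also covers empty result sets, a caller should inspect the parsed object rather than treat success as "found something".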

Common Commands

Search

seeklink search "query" --vault PATH [options]

Options:

--top-k N          Number of results. Default: 10.
--json             Emit one machine-readable JSON object.
--tags TAG [TAG]   Filter by tags. AND semantics.
--folder PREFIX    Filter by vault-relative folder prefix.
--rerank-k N|auto  Rerank candidate budget. Default: auto.
--no-rerank        Skip cross-encoder reranking for this query.
--no-daemon        Force an in-process search instead of using the daemon.
--title-weight F   Override title/alias/heading channel weight. Default: 1.5.

Get

Read a precise file window without using the database or daemon:

seeklink get notes/spaced-repetition.md
seeklink get notes/spaced-repetition.md:12
seeklink get notes/spaced-repetition.md:12 -l 40
seeklink get notes/spaced-repetition.md:12 -C 20

-l/--lines prints lines starting at LINE. -C/--context prints lines before and after LINE, grep-style. Path escapes such as ../.. are rejected.
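
The anchor and window semantics can be pictured with a small Python sketch. It illustrates the behavior described above and is not SeekLink's implementation; split_anchor, resolve_in_vault, and context_window are hypothetical names.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def split_anchor(anchor: str) -> Tuple[str, Optional[int]]:
    """Split 'notes/a.md:12' into ('notes/a.md', 12); no line gives None."""
    path, sep, line = anchor.rpartition(":")
    if sep and line.isdigit():
        return path, int(line)
    return anchor, None


def resolve_in_vault(vault: Path, rel: str) -> Path:
    """Resolve a vault-relative path and refuse anything outside the vault."""
    root = vault.resolve()
    target = (root / rel).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"path escapes vault: {rel}")
    return target


def context_window(lines: List[str], line: int, context: int) -> List[str]:
    """Return lines within `context` of the 1-indexed `line`, clamped to the file."""
    start = max(line - context, 1)
    end = min(line + context, len(lines))
    return lines[start - 1:end]
```

With context=20 this yields up to 41 lines centered on the anchor, fewer near the start or end of the file.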

Status

seeklink status --vault PATH
seeklink status --vault PATH --json

Status reports index counts, model names, index-configuration compatibility, SQLite WAL status, and freshness warnings. It does not load the embedding or reranking models.

Doctor

seeklink doctor --vault PATH
seeklink doctor --vault PATH --json

Doctor checks Python, SQLite, the local database, index compatibility, daemon state, and optional MLX availability. It does not download or load models, but may initialize the local SeekLink database/schema if missing.

MCP

The optional Model Context Protocol (MCP) adapter lets agent clients discover and call SeekLink's read-only tools directly. The CLI keeps working independently; MCP is another surface for the same retrieval path, not a replacement.

seeklink mcp --vault PATH

Install it with seeklink[mcp]. Build the index with the CLI first: seeklink index --vault PATH. The MCP adapter is read-only and exposes four tools: search, get, status, and doctor. It does not expose index, write notes, use HTTP/OAuth, or route through the Unix-socket daemon. Run one MCP server per vault.

search keeps its text summary compact with paths and line anchors; result previews stay in structured content for agents that need them. status and doctor may initialize or migrate the local SeekLink schema when an existing .seeklink/seeklink.db needs it, but they do not index or modify Markdown notes.

If your MCP client does not inherit your shell PATH, replace the bare seeklink command in the examples below with the absolute path printed by which seeklink.

Claude Code:

claude mcp add --transport stdio --scope project seeklink \
  -- seeklink mcp --vault /ABS/PATH/TO/VAULT

Cursor .cursor/mcp.json:

{
  "mcpServers": {
    "seeklink": {
      "type": "stdio",
      "command": "seeklink",
      "args": ["mcp", "--vault", "/ABS/PATH/TO/VAULT"]
    }
  }
}

VS Code .vscode/mcp.json:

{
  "servers": {
    "seeklink": {
      "type": "stdio",
      "command": "seeklink",
      "args": ["mcp", "--vault", "/ABS/PATH/TO/VAULT"]
    }
  }
}
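
If the client does not inherit your shell PATH, one way to produce a config with an absolute command path is to generate it. The sketch below mirrors the Cursor example above; the helper names are illustrative and not part of SeekLink.

```python
import json
import shutil
from typing import Optional


def mcp_server_entry(vault: str, command: Optional[str] = None) -> dict:
    """Build the stdio server entry used in the JSON examples above."""
    # Prefer an explicit command, then the absolute path found on PATH,
    # then the bare name as a last resort.
    resolved = command or shutil.which("seeklink") or "seeklink"
    return {
        "type": "stdio",
        "command": resolved,
        "args": ["mcp", "--vault", vault],
    }


def cursor_config(vault: str, command: Optional[str] = None) -> str:
    """Render the contents of .cursor/mcp.json for one vault."""
    config = {"mcpServers": {"seeklink": mcp_server_entry(vault, command)}}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(cursor_config("/ABS/PATH/TO/VAULT"))
```

For VS Code, the same entry goes under a top-level "servers" key instead of "mcpServers".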

Index

seeklink index --vault PATH
seeklink index path/to/file.md --vault PATH

Full-vault indexing skips unchanged files by content hash unless the stored index was built with a different embedder, vector dimension, or chunker configuration, in which case SeekLink rebuilds the derived index contents. Single-file indexing updates one Markdown file only when the existing index configuration is compatible.
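
The skip rule can be sketched as follows. This illustrates the idea only: SHA-256 is an assumption (the hash function is not named here), and files_to_reindex is a hypothetical helper, not SeekLink code.

```python
import hashlib
from typing import Dict, List


def content_hash(data: bytes) -> str:
    # Assumption: SHA-256; any stable content hash would serve the same purpose.
    return hashlib.sha256(data).hexdigest()


def files_to_reindex(
    current: Dict[str, bytes],
    stored_hashes: Dict[str, str],
    config_compatible: bool = True,
) -> List[str]:
    """Return the vault-relative paths that need re-indexing.

    An incompatible index configuration (embedder, dimension, chunker)
    invalidates everything; otherwise only new or changed files qualify.
    """
    if not config_compatible:
        return sorted(current)
    return sorted(
        path
        for path, data in current.items()
        if stored_hashes.get(path) != content_hash(data)
    )
```

Hashing content rather than comparing modification times means a file that was touched but not edited is still skipped.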

Daemon

seeklink daemon status
seeklink daemon stop
seeklink daemon restart
seeklink daemon pid
seeklink daemon run --vault PATH

You normally do not need to start the daemon manually. search and single-file index auto-spawn and auto-restart it when appropriate, then it exits after SEEKLINK_DAEMON_IDLE_TIMEOUT seconds of inactivity. The default is 900 seconds (15 minutes); set it to 0, off, false, or no to keep the daemon warm until stopped.

Full-vault index still runs in-process for progress output. Passing --vault to search or single-file index forces a one-shot cold-start path because the daemon is bound to one vault at startup. --no-daemon and SEEKLINK_NO_DAEMON=1 also force the same cold-start path. Use seeklink daemon status to inspect the warm process and seeklink daemon stop to release its memory immediately.
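
The timeout values can be read as in the sketch below. It is a reading of the rule above, not SeekLink's parser: case-insensitive matching and the handling of an empty value are assumptions, and idle_timeout_seconds is an illustrative name.

```python
import os
from typing import Mapping, Optional

DEFAULT_IDLE_TIMEOUT = 900  # seconds (15 minutes), the documented default
DISABLED_VALUES = {"0", "off", "false", "no"}


def idle_timeout_seconds(env: Optional[Mapping[str, str]] = None) -> Optional[int]:
    """Return the idle timeout in seconds, or None to stay warm until stopped."""
    env = os.environ if env is None else env
    raw = env.get("SEEKLINK_DAEMON_IDLE_TIMEOUT")
    if raw is None or not raw.strip():
        # Assumption: an unset or empty variable means the default applies.
        return DEFAULT_IDLE_TIMEOUT
    value = raw.strip().lower()  # assumption: matching ignores case
    if value in DISABLED_VALUES:
        return None
    return int(value)  # raises ValueError on anything else
```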

How Search Works

SeekLink fuses four channels with Reciprocal Rank Fusion:

| Channel | Purpose |
| --- | --- |
| BM25 / FTS5 | Exact words, code terms, acronyms, CJK lexical matches |
| Vector search | Semantic matches across different wording |
| Title / aliases / headings | Exact note and section lookup |
| Wikilink indegree | Small graph-quality prior from existing [[links]] |
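
For readers new to Reciprocal Rank Fusion, the general technique looks like the sketch below: each channel contributes weight / (k + rank) for every document it ranks, and the sums decide the final order. This is the textbook formula, not SeekLink's code. The constant k = 60 is the conventional value from the RRF literature and an assumption here, as are the channel names and the way the title weight is applied.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def rrf_fuse(
    rankings: Dict[str, List[str]],
    weights: Optional[Dict[str, float]] = None,
    k: int = 60,
) -> List[Tuple[str, float]]:
    """Fuse per-channel rankings (best first) into one scored list."""
    weights = weights or {}
    scores: Dict[str, float] = defaultdict(float)
    for channel, ranked_docs in rankings.items():
        weight = weights.get(channel, 1.0)
        for rank, doc in enumerate(ranked_docs, start=1):
            scores[doc] += weight / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    fused = rrf_fuse(
        {
            "bm25": ["a.md", "b.md", "c.md"],
            "vector": ["b.md", "a.md"],
            "title": ["b.md"],
            "indegree": ["c.md", "a.md"],
        },
        weights={"title": 1.5},  # mirrors the documented --title-weight default
    )
    for doc, score in fused:
        print(f"{score:.4f}  {doc}")
```

Because only ranks enter the formula, channels with incomparable raw scores (BM25 values, cosine similarities, link counts) can be combined without normalization.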

The default embedder is jinaai/jina-embeddings-v2-base-zh through fastembed. CJK full-text search uses a jieba FTS5 tokenizer when the local Python/SQLite build can safely register it; otherwise SeekLink falls back to SQLite's built-in trigram tokenizer instead of crashing.

The default vector dimension is 768. Advanced custom-embedder experiments can set SEEKLINK_EMBEDDING_DIM, but it must match the embedder output and requires a full seeklink index rebuild.

On Apple Silicon, SeekLink can rerank candidates with mlx-community/Qwen3-Reranker-0.6B-mxfp8 when installed with seeklink[mlx]. Reranking is local and optional; if MLX is unavailable, SeekLink falls back to first-stage hybrid RRF ranking. Use --no-rerank for one query or set SEEKLINK_RERANKER_MODEL="" to disable it globally.

Frontmatter

Markdown frontmatter is optional. When present, SeekLink uses it for tags and aliases:

---
tags: [ai, memory]
aliases: [LLM memory, agent memory]
---
  • tags support filtered search: seeklink search "memory" --tags ai
  • aliases are indexed for search and used when resolving wikilinks
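
To make the tag filter concrete, here is a deliberately small sketch that reads inline-list frontmatter like the example above and applies AND semantics. It handles only the key: [a, b] form and is not SeekLink's parser, which presumably accepts full YAML; both function names are illustrative.

```python
from typing import Dict, List


def parse_frontmatter(text: str) -> Dict[str, List[str]]:
    """Parse inline-list keys such as `tags: [ai, memory]` from frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: Dict[str, List[str]] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.startswith("[") and value.endswith("]"):
            items = [item.strip() for item in value[1:-1].split(",")]
            meta[key.strip()] = [item for item in items if item]
    return {}  # no closing delimiter: treat as no frontmatter


def matches_all_tags(note_tags: List[str], wanted: List[str]) -> bool:
    """AND semantics: every requested tag must be present on the note."""
    return set(wanted) <= set(note_tags)
```

Under AND semantics, a note tagged [ai, memory] matches --tags ai and --tags ai memory, but a note tagged only [ai] does not match --tags ai memory.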

Storage

SeekLink writes one SQLite database inside the vault:

/path/to/vault/.seeklink/seeklink.db

The database contains source metadata, chunks, FTS5 tables, sqlite-vec vectors, and a wikilink graph. Delete .seeklink/ and run seeklink index to rebuild.

Supported

| Area | Status |
| --- | --- |
| Python | 3.11, 3.12, 3.13, 3.14 |
| SQLite | Python sqlite3 linked against SQLite 3.45+ with FTS5 |
| OS | macOS and Linux |
| Windows | Not supported as a first-class path |
| File format | Markdown .md |
| Vault style | Plain folder or Obsidian-compatible vault |
| CJK | Native path via jieba, with trigram fallback on static SQLite builds |
| Reranker | Optional seeklink[mlx] extra on Apple Silicon; disabled elsewhere |
| Daemon | Single vault per machine |
| MCP | Optional seeklink[mcp] stdio adapter, one server per vault |

Not For

  • Hosted or synced multi-user search.
  • Non-Markdown sources without conversion.
  • A GUI or Obsidian plugin.
  • Sub-millisecond search over millions of notes.
  • Cloud embedding or reranking APIs.

Agent Notes

Agents can use SeekLink through ordinary subprocess calls:

seeklink status --vault PATH
seeklink index --vault PATH
seeklink search "query" --vault PATH --json
seeklink get PATH:LINE -C 20 --vault PATH

MCP clients can use the optional read-only adapter:

seeklink mcp --vault PATH

To make an agent choose SeekLink for a Markdown vault, add this to the project's AGENTS.md, CLAUDE.md, or editor rules:

When you need to search or inspect this Markdown vault, use SeekLink for
semantic retrieval:

1. Run `seeklink status --vault PATH --json`.
2. If no index exists or files changed, run `seeklink index --vault PATH`.
3. Run `seeklink search "QUERY" --vault PATH --json`.
4. Read exact context with `seeklink get PATH:LINE -C 20 --vault PATH`.

If SeekLink is registered as an MCP server in this client, prefer the
`search`, `get`, `status`, and `doctor` MCP tools over shelling out to the CLI.

Prefer SeekLink for conceptual, cross-language, tag/folder-filtered, or
Obsidian-style note searches. Use rg for exact literal searches.

For hot loops, the daemon exposes a length-prefixed JSON protocol over the Unix socket at ~/.rhizome/seeklink.sock. Most agents should prefer the CLI JSON surface unless they specifically need socket-level latency.

See llms.txt for the compact agent contract.

Evaluation

Search-quality tests live in tests/blind/; the method is documented in docs/blind-test.md. Release claims should be backed by the bundled fixture queries or by clearly labeled private-vault measurements.

Contributing

git clone https://github.com/simonsysun/seeklink
cd seeklink
uv sync --dev
uv run python -m pytest tests/ -q

Keep runtime dependencies small, keep public docs user-facing, and add a CHANGELOG.md entry for user-visible changes.

License

MIT