MarcellM01

TinySearch

Shrink the web for your local LLMs!

A tiny local-first web research engine for MCP agents.

TinySearch searches the web, reranks results, crawls the best pages, extracts the most relevant chunks, and returns a source-grounded prompt your LLM can answer from.

No hosted dashboard. No account system. No analytics. No scraped-data cache.

Just search -> crawl -> rerank -> grounded prompt.

Quick start

Run TinySearch as an MCP server over Streamable HTTP:

docker run --rm -p 8000:8000 -e MCP_TRANSPORT=streamable-http -e MCP_HOST=0.0.0.0 marcellm01/tinysearch:latest

Then connect your MCP client to:

{
  "mcpServers": {
    "tinysearch": {
      "url": "http://localhost:8000/mcp"
    }
  }
}

TinySearch exposes one MCP tool:

research(query)

Pass the user's question as-is. TinySearch searches, crawls, reranks, and returns the grounded prompt in the answer field.
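For readers who want to see what a call to this tool looks like on the wire, here is a minimal sketch of the JSON-RPC `tools/call` request an MCP client would send. The helper name `build_research_call` is hypothetical. In practice your MCP client library builds this message for you, after it has completed the session's initialize handshake.

```python
import json


def build_research_call(query: str, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request for the `research` tool (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "research",
            # Argument name taken from the research(query) signature above.
            "arguments": {"query": query},
        },
    }


payload = build_research_call("What happened in the latest NFL playoffs?")
print(json.dumps(payload, indent=2))
```

The query is passed through unchanged, matching the advice to send the user's question as-is.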

Why TinySearch?

  • Give local agents web research without wiring together a whole search stack.
  • Keep source URLs attached to the evidence your model sees.
  • Avoid dumping full webpages into context.
  • Use local ONNX embeddings or an OpenAI-compatible embedding API.
  • Run over MCP or a simple FastAPI endpoint.

TinySearch is built for local agents, prototypes, personal workflows, and small systems where source-grounded web research matters more than running a full search backend.

How it works

flowchart TB
    subgraph Row1["Search and choose pages"]
        direction LR
        A[User query] --> B[DuckDuckGo HTML search]
        B --> C[Filter HTTP results<br/>build title URL domain snippet docs]
        C --> D[Rank search docs<br/>dense + BM25 weighted RRF]
    end

    subgraph Row2["Crawl and build prompt"]
        direction LR
        E[Crawl kept URLs in parallel<br/>crawl4ai markdown] --> F[Truncate and chunk markdown]
        F --> G[Rank combined chunk pool<br/>dense + BM25 weighted RRF]
        G --> H[Dedupe chunks<br/>apply source quotas and fill]
        H --> I[Build source-grounded prompt]
    end

    Row1 --> Row2
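Both ranking steps in the flowchart combine a dense ranking and a BM25 ranking with weighted reciprocal rank fusion (RRF). The sketch below shows the general technique, not TinySearch's actual code: the function name `weighted_rrf`, the constant `k=60`, and the choice to give BM25 the weight `1 - dense_weight` are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_rrf(dense_ranking, bm25_ranking, dense_weight=0.5, k=60):
    """Fuse two ranked lists of document ids with weighted reciprocal rank fusion.

    Each list is ordered best-first. A document at 1-based rank r in a list
    contributes weight / (k + r) to its fused score.
    """
    scores = defaultdict(float)
    weighted_lists = (
        (dense_weight, dense_ranking),
        (1.0 - dense_weight, bm25_ranking),  # assumed: the two weights sum to 1
    )
    for weight, ranking in weighted_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by id to keep the output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense = ["a", "b", "c"]
bm25 = ["c", "a", "d"]
fused = weighted_rrf(dense, bm25, dense_weight=0.7)
print(fused)
```

A document that appears in both lists collects a contribution from each, which is why RRF tends to favor results that both retrievers agree on.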

TinySearch does not answer the question directly. It returns a structured prompt in the MCP tool's answer field, and your client model uses that prompt to produce the final cited response.

QUESTION
What happened in the latest NFL playoffs?

TODAY
2026-05-15

RESULTS
1. Title
   URL
   Relevant extracted text...

2. Title
   URL
   Relevant extracted text...

INSTRUCTIONS
Answer only from the results. Cite source URLs.
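The layout above can be reproduced with a few lines of string assembly. This is a sketch that mirrors the example prompt only; `build_prompt` and its `(title, url, text)` tuple input are hypothetical, and the real prompt builder may use different wording or extra fields.

```python
from datetime import date


def build_prompt(question, results, today=None):
    """Assemble a source-grounded prompt (hypothetical helper).

    `results` is a list of (title, url, text) tuples, already ranked best-first.
    """
    today = today or date.today()
    lines = ["QUESTION", question, "", "TODAY", today.isoformat(), "", "RESULTS"]
    for index, (title, url, text) in enumerate(results, start=1):
        # Keep the URL next to its excerpt so the model can cite it.
        lines += [f"{index}. {title}", f"   {url}", f"   {text}", ""]
    lines += ["INSTRUCTIONS", "Answer only from the results. Cite source URLs."]
    return "\n".join(lines)


prompt = build_prompt(
    "What happened in the latest NFL playoffs?",
    [("Title", "https://example.com/page", "Relevant extracted text...")],
    today=date(2026, 5, 15),
)
print(prompt)
```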

Run from source

Use this path if you want to inspect the code, edit TinySearch, or run it as a local stdio MCP server.

git clone https://github.com/MarcellM01/TinySearch
cd TinySearch

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

MCP clients spawn TinySearch from their config. Add it with absolute paths:

macOS / Linux:

{
  "mcpServers": {
    "tinysearch": {
      "command": "/absolute/path/to/TinySearch/.venv/bin/python",
      "args": [
        "/absolute/path/to/TinySearch/servers/mcp_server.py"
      ]
    }
  }
}

Windows:

{
  "mcpServers": {
    "tinysearch": {
      "command": "C:/absolute/path/to/TinySearch/.venv/Scripts/python.exe",
      "args": [
        "C:/absolute/path/to/TinySearch/servers/mcp_server.py"
      ]
    }
  }
}

Template config files live in mcp_templates/.

The repo also includes agentic_coding_templates/global-rules-recommended.md, a global-rules template for agentic coding tools such as Cline and Roo Code. These rules help coding agents call TinySearch only when web research is actually needed.

The server uses stdio by default, which is what Cursor and similar clients expect when they spawn python .../mcp_server.py. To run with sse or streamable-http, set MCP_TRANSPORT when starting the process. Do not put transport in configs/research_config.json.
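The transport rule can be pictured as a small environment lookup with a stdio fallback. This is a sketch of the behavior described above, not the server's code; `resolve_transport` is a hypothetical name, and rejecting unknown values is an assumption.

```python
import os

# The three transport names mentioned in this README.
VALID_TRANSPORTS = {"stdio", "sse", "streamable-http"}


def resolve_transport(env=None):
    """Return the transport named by MCP_TRANSPORT, defaulting to stdio."""
    env = os.environ if env is None else env
    transport = env.get("MCP_TRANSPORT", "stdio").strip().lower()
    if transport not in VALID_TRANSPORTS:
        # Assumption: fail loudly instead of silently falling back.
        raise ValueError(f"Unsupported MCP_TRANSPORT: {transport!r}")
    return transport


print(resolve_transport({}))
print(resolve_transport({"MCP_TRANSPORT": "streamable-http"}))
```

Reading the transport from the environment rather than from the JSON config keeps the same research_config.json usable for both spawned stdio processes and long-running HTTP containers.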

Docker

The quick start command runs TinySearch over Streamable HTTP on http://localhost:8000/mcp. Docker pulls marcellm01/tinysearch:latest automatically if the image is not already local.

With MCP_TRANSPORT=streamable-http, the image serves Streamable HTTP on /mcp and SSE on /mcp/sse. GET requests to /mcp without an mcp-session-id are treated as the legacy SSE stream. If a client still cannot connect, try MCP_TRANSPORT=sse alone or the stdio Docker setup below.
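The GET rule for /mcp can be read as a small decision function. The sketch below only illustrates the rule as stated in this section; `classify_mcp_request` and its return labels are hypothetical, and the image's actual routing code may differ.

```python
def classify_mcp_request(method, headers):
    """Decide how a request to /mcp is handled (hypothetical helper).

    `headers` is a plain dict; header names are compared case-insensitively.
    """
    header_names = {name.lower() for name in headers}
    if method.upper() == "GET" and "mcp-session-id" not in header_names:
        # No session id on a GET: treated as the legacy SSE stream.
        return "legacy-sse"
    return "streamable-http"


print(classify_mcp_request("GET", {}))
print(classify_mcp_request("GET", {"Mcp-Session-Id": "abc123"}))
print(classify_mcp_request("POST", {}))
```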

Persistent models and config

For repeated use, keep downloaded models in a Docker volume and mount your local config:

docker run --rm \
  -p 8000:8000 \
  -v tinysearch-models:/data/models \
  -v "$PWD/configs/research_config.json:/config/research_config.json:ro" \
  -e TINYSEARCH_CONFIG_PATH=/config/research_config.json \
  -e MCP_TRANSPORT=streamable-http \
  -e MCP_HOST=0.0.0.0 \
  marcellm01/tinysearch:latest

MCP over stdio

Use this mode for MCP clients that launch tools as local commands instead of connecting to a URL. Replace /absolute/path/to/TinySearch with this repo's absolute path:

{
  "mcpServers": {
    "tinysearch": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-v",
        "tinysearch-models:/data/models",
        "-v",
        "/absolute/path/to/TinySearch/configs/research_config.json:/config/research_config.json:ro",
        "-e",
        "TINYSEARCH_CONFIG_PATH=/config/research_config.json",
        "-e",
        "TINYSEARCH_MODELS_DIR=/data/models",
        "marcellm01/tinysearch:latest"
      ]
    }
  }
}

Edit configs/research_config.json to choose embedding_model (fast, balanced, quality, or a custom Hugging Face ONNX repo id). The named Docker volume keeps downloaded model bundles between launches.

Optional HTTP server

Useful when you want HTTP instead of MCP:

uvicorn servers.fastapi_server:app --reload

Endpoints:

  • GET /health
  • GET /web_search?query=...
  • POST /site_crawl
  • POST /research
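As a quick illustration of the GET endpoint, the sketch below builds a properly encoded /web_search URL. The base URL assumes uvicorn's default bind address of 127.0.0.1:8000, and `web_search_url` is a hypothetical helper. Request bodies for the POST endpoints are not documented here, so they are left out.

```python
from urllib.parse import urlencode


def web_search_url(query, base_url="http://127.0.0.1:8000"):
    """Build the URL for GET /web_search with the query safely encoded."""
    return f"{base_url}/web_search?{urlencode({'query': query})}"


url = web_search_url("local llm web search")
print(url)
```

You can pass the resulting URL to any HTTP client, for example `urllib.request.urlopen(url)`, once the server is running.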

Configuration

Tune research defaults in configs/research_config.json. Set TINYSEARCH_CONFIG_PATH to load a different JSON config file, which is the recommended Docker override pattern.
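The override pattern amounts to "use the path in TINYSEARCH_CONFIG_PATH if it is set, otherwise the default file". A minimal sketch, assuming a plain JSON file with no merging of defaults; `load_config` is a hypothetical name, not TinySearch's loader.

```python
import json
import os
import tempfile
from pathlib import Path

DEFAULT_CONFIG_PATH = Path("configs/research_config.json")


def load_config(env=None):
    """Load the research config, honoring TINYSEARCH_CONFIG_PATH when set."""
    env = os.environ if env is None else env
    path = Path(env.get("TINYSEARCH_CONFIG_PATH") or DEFAULT_CONFIG_PATH)
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


# Demo: write a throwaway config and load it through the override variable.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "research_config.json"
    config_path.write_text(json.dumps({"embedding_model": "fast"}), encoding="utf-8")
    config = load_config({"TINYSEARCH_CONFIG_PATH": str(config_path)})

print(config)
```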

The onnx embedding backend uses local ONNX bundles under models/. Starting the MCP server or FastAPI app downloads the configured embedding_model once from Hugging Face when embedding_backend is onnx.

Built-in local presets:

  • fast: onnx-models/all-MiniLM-L6-v2-onnx
  • balanced: BAAI/bge-small-en-v1.5
  • quality: BAAI/bge-base-en-v1.5

You can also set embedding_model to a custom Hugging Face ONNX repo id. Set TINYSEARCH_MODELS_DIR to move the model cache, or use TINYSEARCH_ONNX_MODEL_DIR when you need to point at one exact bundle directory.
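Preset names and custom repo ids can share one setting because a preset is just a lookup that falls through to the raw value. The sketch below shows that idea using the preset table above; `resolve_embedding_model` is a hypothetical helper and does no validation or downloading.

```python
# Preset names and repo ids as listed in this README.
PRESETS = {
    "fast": "onnx-models/all-MiniLM-L6-v2-onnx",
    "balanced": "BAAI/bge-small-en-v1.5",
    "quality": "BAAI/bge-base-en-v1.5",
}


def resolve_embedding_model(name):
    """Map a preset name to its repo id; treat anything else as a custom repo id."""
    return PRESETS.get(name, name)


print(resolve_embedding_model("balanced"))
print(resolve_embedding_model("my-org/my-onnx-model"))  # placeholder custom id
```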

Key settings:

  • Search: search_top_k, search_rrf_cutoff, search_dense_weight, search_max_results_to_keep
  • Chunks: chunk_rrf_cutoff, chunk_dense_weight, chunk_max_results_to_keep
  • Crawl: crawl_max_chunk_tokens, crawl_overlap_tokens, max_concurrent_crawls
  • Embeddings: embedding_backend, embedding_model, embedding_openai_env_file, max_concurrent_embedding_calls
  • Tokenizer: encoding_name
  • Dense input prefixes: dense_query_prefix, dense_document_prefix
  • Trace: trace_path
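To make the crawl settings concrete, here is a sketch of fixed-size chunking with overlap, the usual meaning of a maximum chunk size paired with an overlap size. It is an illustration under assumptions: `chunk_tokens` is a hypothetical function, whitespace-split words stand in for real tokenizer output, and TinySearch's own chunker may split differently.

```python
def chunk_tokens(tokens, max_chunk_tokens, overlap_tokens):
    """Split a token list into windows that share `overlap_tokens` with the previous one."""
    if max_chunk_tokens <= 0:
        raise ValueError("max_chunk_tokens must be positive")
    if not 0 <= overlap_tokens < max_chunk_tokens:
        raise ValueError("overlap_tokens must be >= 0 and smaller than max_chunk_tokens")
    step = max_chunk_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_chunk_tokens])
        if start + max_chunk_tokens >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


words = "a b c d e f g".split()  # stand-in for tokenizer output
chunks = chunk_tokens(words, max_chunk_tokens=4, overlap_tokens=1)
print(chunks)
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of some duplicated text for the later dedupe step to handle.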

For embedding_backend openai_compatible, add a .env file at the project root, or set embedding_openai_env_file, with:

OPENAI_BASE_URL=
OPENAI_API_KEY=
OPENAI_EMBEDDING_MODEL=

OPENAI_BASE_URL is optional for api.openai.com. EMBEDDING_MODEL and MODEL_NAME are accepted as aliases for OPENAI_EMBEDDING_MODEL.
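The alias behavior can be sketched as a first-match lookup over the three variable names. Several details here are assumptions rather than documented behavior: the precedence order (canonical name first), the fallback base URL of https://api.openai.com/v1, and the helper name `resolve_embedding_settings`.

```python
import os

# Canonical name first, then the two documented aliases (order is an assumption).
MODEL_KEYS = ("OPENAI_EMBEDDING_MODEL", "EMBEDDING_MODEL", "MODEL_NAME")


def resolve_embedding_settings(env=None):
    """Collect OpenAI-compatible embedding settings from environment-style values."""
    env = os.environ if env is None else env
    model = next((env[key] for key in MODEL_KEYS if env.get(key)), None)
    if model is None:
        raise KeyError(f"Set one of: {', '.join(MODEL_KEYS)}")
    return {
        # Assumed default when OPENAI_BASE_URL is empty or missing.
        "base_url": env.get("OPENAI_BASE_URL") or "https://api.openai.com/v1",
        "api_key": env.get("OPENAI_API_KEY", ""),
        "model": model,
    }


settings = resolve_embedding_settings(
    {"OPENAI_API_KEY": "sk-placeholder", "MODEL_NAME": "text-embedding-3-small"}
)
print(settings)
```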

The research pipeline requires dense embeddings. It raises if search_dense_weight or chunk_dense_weight is set to 0.
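A check like the one described can be expressed in a few lines. This sketch only mirrors the stated rule; `validate_dense_weights`, the use of ValueError, and the choice to skip keys that are absent are assumptions, and the real pipeline may raise a different exception.

```python
def validate_dense_weights(config):
    """Reject configs that switch off the dense side of either ranking step."""
    for key in ("search_dense_weight", "chunk_dense_weight"):
        # Assumption: a missing key falls back to a non-zero default elsewhere.
        if key in config and float(config[key]) == 0.0:
            raise ValueError(f"{key} must be non-zero: dense embeddings are required")


validate_dense_weights({"search_dense_weight": 0.7, "chunk_dense_weight": 0.5})

try:
    validate_dense_weights({"search_dense_weight": 0})
except ValueError as error:
    print(error)
```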

When not to use TinySearch

TinySearch is not a replacement for a commercial search API or a persistent crawler. It is probably not the right tool if you need:

  • guaranteed search coverage
  • large-scale indexing
  • long-term page caching
  • enterprise observability
  • production SLA-backed web search

TinySearch vs...

Tool type            | What it gives you                      | Tradeoff
Search API           | Search results                         | Usually hosted / paid
Full crawler / index | Persistent search backend              | More infrastructure
SearxNG              | Metasearch                             | Still needs setup and a ranking layer
TinySearch           | MCP research prompt with ranked chunks | Lightweight; not a full search engine

Entrypoints

  • pipelines.agentic_research.agentic_run: single-turn search, crawl, ranking, and prompt assembly
  • servers.mcp_server: MCP server for agent clients
  • servers.fastapi_server: optional HTTP API

Tests

Run the unittest suite:

python -m unittest discover tests

Contact

Using TinySearch or want to build on it?

Email me or reach me on Bluesky.

Privacy notes

TinySearch reads the pages it crawls and returns ranked excerpts to the calling client. It does not include credentials in the repo, and .env / trace output should stay local. If you enable openai_compatible embeddings, your embedding provider receives the text snippets sent for vectorization.

License

Source code in this repository is under the MIT License.

When embedding_backend is onnx, TinySearch may download the selected local ONNX embedding bundle at runtime from Hugging Face. Those weights are separate distributions under their model-card licenses; keep license and attribution notices if you ship or redistribute those files. Optional manual export for fast uses sentence-transformers/all-MiniLM-L6-v2 (Apache-2.0).

See NOTICE for Docker and third-party distribution notes.
