MCP Context Hub

Local MCP server (Node.js + TypeScript) for context optimization, RAG memory, semantic caching, and sub-MCP proxying. Designed to run on a machine with a GPU (RTX 3060 Ti) and Ollama, acting as a single MCP endpoint for Claude.

Architecture

           Claude (Remote)
                |
        HTTP POST/GET/DELETE + Bearer Token
                |
    +-----------v-----------+
    |   Express (:3100)     |
    |   Auth + IP Allowlist |
    +-----------+-----------+
                |
    +-----------v-----------+
    |  McpServer (SDK v1)   |
    |                       |
    |  Tools:               |
    |   context_pack        |
    |   memory_search       |
    |   memory_upsert       |
    |   context_compress    |
    |   proxy_call          |
    +-+------+------+-----+-+
      |      |      |     |
      v      v      v     v
  Ollama  SQLite  Cache  ProxyMgr
  Client  Vector  LRU    (stdio
  (chat   Store   +TTL    sub-MCP)
  +embed  +FTS5
  +fallback)

Features

  • context_pack — Combines semantic + text search, deduplication, and LLM synthesis into a structured context bundle (summary, facts, next actions); a sketch of the bundle shape follows this list
  • memory_search — Semantic similarity search over stored documents using vector embeddings
  • memory_upsert — Store documents with automatic chunking, embedding, and indexing
  • context_compress — Compress text into bullets, JSON, steps, or summary format to reduce token usage
  • proxy_call — Call tools on sub-MCP servers (e.g., filesystem) with optional post-processing (summarize, compress)
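
The structured bundle returned by context_pack can be pictured as a TypeScript shape like the one below. This is an illustrative sketch based on the description above; the field names are assumptions, not the server's exact output schema.

interface ContextBundle {
  summary: string;        // LLM-synthesized overview of the matched context
  facts: string[];        // deduplicated facts from semantic + full-text hits
  next_actions: string[]; // follow-up suggestions produced by the LLM
}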

Requirements

  • Node.js >= 20
  • Ollama with the following models:
    • llama3.1:8b-instruct-q4_K_M (primary chat)
    • qwen2.5:7b-instruct-q4_K_M (fallback chat)
    • nomic-embed-text:v1.5 (embeddings, 768 dims)
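
As a quick sanity check that the embedding model is installed and returns the expected 768-dimensional vectors, a few lines of TypeScript against Ollama's /api/embeddings endpoint suffice (a sketch; run as an ES module on Node 20+):

const base = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434";

// Request one embedding and verify its dimensionality.
const res = await fetch(`${base}/api/embeddings`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "nomic-embed-text:v1.5", prompt: "ping" }),
});
const { embedding } = (await res.json()) as { embedding: number[] };
console.log(embedding.length === 768 ? "OK: 768 dims" : `unexpected: ${embedding.length}`);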

Quick Start

# 1. Clone and install
git clone https://github.com/DiegoNogueiraDev/mcp-context-hub.git
cd mcp-context-hub
npm install

# 2. Pull Ollama models
ollama pull llama3.1:8b-instruct-q4_K_M
ollama pull qwen2.5:7b-instruct-q4_K_M
ollama pull nomic-embed-text:v1.5

# 3. Configure environment
cp .env.example .env
# Edit .env and set MCP_AUTH_TOKEN to a secure random value

# 4. Start the server
npm run dev

Or use the setup script:

chmod +x scripts/setup.sh
./scripts/setup.sh
npm run dev

Usage

Health Check

curl http://localhost:3100/health
# {"status":"healthy","timestamp":"..."}

MCP Protocol

The server uses the Streamable HTTP transport at /mcp. Initialize a session first:

# Initialize session
curl -X POST http://localhost:3100/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer <your-token>" \
  -d '{
    "jsonrpc": "2.0",
    "method": "initialize",
    "params": {
      "protocolVersion": "2025-03-26",
      "capabilities": {},
      "clientInfo": { "name": "my-client", "version": "1.0.0" }
    },
    "id": 1
  }'

Then call tools, sending the mcp-session-id header from the initialize response (along with the bearer token):

# Store a document
curl -X POST http://localhost:3100/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "mcp-session-id: <session-id>" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "memory_upsert",
      "arguments": {
        "document_id": "my-doc",
        "content": "Your document text here...",
        "scope": "project",
        "tags": ["example"]
      }
    },
    "id": 2
  }'

# Search memories
curl -X POST http://localhost:3100/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "mcp-session-id: <session-id>" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "memory_search",
      "arguments": {
        "query": "your search query",
        "top_k": 5
      }
    },
    "id": 3
  }'
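
The same calls can be made programmatically with the MCP TypeScript SDK, whose Streamable HTTP client performs the initialize handshake and tracks the session id for you. A minimal sketch, assuming MCP_AUTH_TOKEN is set in the environment:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to the hub; the transport resends the session header on follow-up calls.
const transport = new StreamableHTTPClientTransport(new URL("http://localhost:3100/mcp"), {
  requestInit: { headers: { Authorization: `Bearer ${process.env.MCP_AUTH_TOKEN}` } },
});
const client = new Client({ name: "my-client", version: "1.0.0" });
await client.connect(transport);

// Equivalent to the memory_search curl above.
const result = await client.callTool({
  name: "memory_search",
  arguments: { query: "your search query", top_k: 5 },
});
console.log(JSON.stringify(result, null, 2));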

Sub-MCP Proxy

Configure sub-MCP servers via the PROXY_SERVERS environment variable:

PROXY_SERVERS='{"filesystem":{"command":"node","args":["node_modules/@modelcontextprotocol/server-filesystem/dist/index.js","/tmp"]}}' npm run dev

Then call tools on them via proxy_call:

{
  "name": "proxy_call",
  "arguments": {
    "server": "filesystem",
    "tool": "read_file",
    "arguments": { "path": "/tmp/example.txt" },
    "post_process": "none"
  }
}
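
The PROXY_SERVERS value maps a server name (the server argument of proxy_call) to a stdio launch spec. Its shape, inferred from the example above, can be sketched as a TypeScript type (any additional supported fields are not shown):

type ProxyServers = Record<string, {
  command: string; // executable to spawn (e.g., "node")
  args?: string[]; // arguments passed to the sub-MCP process
}>;

// Parse the env var the same way the hub presumably does.
const proxyServers: ProxyServers = JSON.parse(process.env.PROXY_SERVERS ?? "{}");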

Configuration

All settings via environment variables (see .env.example):

Variable            Default                       Description
MCP_AUTH_TOKEN      (required)                    Bearer token for authentication
MCP_ALLOWED_IPS     127.0.0.1,::1                 Comma-separated allowed IPs
OLLAMA_BASE_URL     http://localhost:11434        Ollama API URL
PRIMARY_MODEL       llama3.1:8b-instruct-q4_K_M   Primary chat model
FALLBACK_MODEL      qwen2.5:7b-instruct-q4_K_M    Fallback chat model
EMBEDDING_MODEL     nomic-embed-text:v1.5         Embedding model
PORT                3100                          Server port
HOST                0.0.0.0                       Server host
DB_PATH             ./data/context-hub.db         SQLite database path
CACHE_TTL_MS        300000                        Cache TTL in ms (5 minutes)
CACHE_MAX_ENTRIES   100                           Max cache entries
LOG_LEVEL           info                          Log level (debug, info, warn, error)
PROXY_SERVERS       {}                            Sub-MCP server configs (JSON)
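
For illustration, env parsing in the spirit of src/config.ts might look like the Zod sketch below (a hypothetical excerpt covering a subset of the variables; the real schema may differ):

import { z } from "zod";

// Validate and coerce environment variables, applying the defaults above.
const EnvSchema = z.object({
  MCP_AUTH_TOKEN: z.string().min(1),
  MCP_ALLOWED_IPS: z.string().default("127.0.0.1,::1"),
  OLLAMA_BASE_URL: z.string().default("http://localhost:11434"),
  PORT: z.coerce.number().default(3100),
  CACHE_TTL_MS: z.coerce.number().default(300_000),
  CACHE_MAX_ENTRIES: z.coerce.number().default(100),
});

export const config = EnvSchema.parse(process.env);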

Commands

npm run dev          # Start dev server (HTTP on :3100)
npm run dev:stdio    # Start in stdio mode (for local MCP testing)
npm run build        # Compile TypeScript
npm start            # Run compiled output
npm test             # Run tests (31 tests, 6 files)
npm run typecheck    # Type-check without emitting
npm run health       # Run health check script

Project Structure

src/
  config.ts                  # Environment configuration
  index.ts                   # Entry point + graceful shutdown
  db/
    connection.ts            # SQLite singleton (WAL mode)
    migrations.ts            # Table definitions (documents, chunks, FTS5, audit)
    cosine.ts                # Cosine similarity + embedding serialization
  server/
    mcp-server.ts            # McpServer setup + tool registration
    transport.ts             # Express + Streamable HTTP transport
    session.ts               # Session management
  middleware/
    auth.ts                  # Bearer token validation
    ip-allowlist.ts          # IP restriction
    audit.ts                 # Tool call logging
  tools/
    schemas.ts               # Zod schemas for all tools
    context-pack.ts          # context_pack implementation
    memory-search.ts         # memory_search implementation
    memory-upsert.ts         # memory_upsert implementation
    context-compress.ts      # context_compress implementation
    proxy-call.ts            # proxy_call implementation
  services/
    ollama-client.ts         # Ollama API (chat + embed + fallback)
    sqlite-vector-store.ts   # Vector store (SQLite + brute-force cosine)
    text-search.ts           # FTS5 full-text search
    chunker.ts               # Recursive text splitter
    dedup.ts                 # Content hashing + Jaccard dedup
    semantic-cache.ts        # LRU + TTL in-memory cache
    proxy-manager.ts         # Sub-MCP stdio connections
  utils/
    logger.ts                # Pino structured logging
    metrics.ts               # In-memory call metrics
    retry.ts                 # Exponential backoff retry
    tokens.ts                # Token estimation
  types/
    index.ts                 # Type re-exports
    ollama.ts                # Ollama API types
    vector-store.ts          # VectorStore interface
tests/
  unit/                      # cosine, chunker, dedup, cache
  integration/               # sqlite vector store
  e2e/                       # Express server
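
As the tree above notes, the vector store scores queries with brute-force cosine similarity over embeddings serialized into SQLite. A sketch of the serialization and similarity math involved (function names are illustrative, not the actual src/db/cosine.ts API):

// Store each 768-dim embedding as a Float32 binary blob.
function serialize(vec: number[]): Buffer {
  return Buffer.from(new Float32Array(vec).buffer);
}

function deserialize(blob: Buffer): Float32Array {
  return new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4);
}

// Cosine similarity: dot product divided by the product of magnitudes.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}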

Tech Stack

  • Runtime: Node.js 20, TypeScript
  • MCP SDK: @modelcontextprotocol/sdk v1.26
  • HTTP: Express v5 + Streamable HTTP transport
  • Database: SQLite (better-sqlite3) with WAL mode, FTS5
  • Embeddings: Ollama nomic-embed-text:v1.5 (768 dimensions)
  • Chat: Ollama with automatic model fallback
  • Validation: Zod v4
  • Logging: Pino
  • Testing: Vitest

License

MIT
