🤖 Spec Assistant — MCP-Powered RAG

A local developer assistant that brings spec-driven development into your IDE. Ask natural-language questions about your technical specifications and run automated code compliance checks from VS Code, Claude Desktop, or any MCP-compatible client.

Architecture

Developer Question / Code Snippet
       │
       ▼
  MCP Server (stdio transport)
  ┌─────────────────────────────┐
  │  get_spec(query)            │  ◄── Custom tools exposed to client
  │  list_specs()               │
  │  validate_code(code, spec)  │
  └────────────┬────────────────┘
               │
       ┌───────▼─────────┐
       │  RAG Pipeline   │
       │                 │
       │  1. Embed query │──► sentence-transformers (local, all-MiniLM-L6-v2)
       │  2. Retrieve    │──► ChromaDB (local vector DB, cosine metric)
       │  3. Build prompt│
       │  4. Call LLM    │──► Ollama llama3.2 (local) or OpenAI
       └─────────────────┘
                │
        ┌───────▼────────┐
        │  specs/ folder │  ◄── Your Markdown (.md) or Text (.txt) files
        └────────────────┘
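
The four pipeline steps in the diagram can be sketched in a few lines. This is a toy illustration only, not the project's code: a bag-of-words counter stands in for sentence-transformers, a plain list stands in for ChromaDB, and `embed`, `retrieve`, `build_prompt`, and `answer` are hypothetical names chosen for this example.

```python
import math
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Step 2: rank stored chunks by similarity to the embedded query."""
    q = embed(query)  # Step 1: embed the query
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Step 3: combine retrieved chunks and the question into one prompt."""
    joined = "\n---\n".join(context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {query}"
    )


def answer(query: str, chunks: list[str], llm: Callable[[str], str]) -> str:
    """Step 4: send the prompt to whichever LLM callable is supplied."""
    return llm(build_prompt(query, retrieve(query, chunks)))
```

Passing the LLM as a callable keeps the sketch independent of any backend; a real client for Ollama or OpenAI would be wrapped in a function with the same shape.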

Quick Start

1. Prerequisites

Python 3.11+ installed on your system.
Ollama installed and running locally.
Pull the default local model using:
```
ollama pull llama3.2
```

2. Install Dependencies

Clone this repository, navigate to the directory, and install dependencies:

pip install -r requirements.txt

> [!NOTE]
> The first run may take a little longer because the local embedding model (all-MiniLM-L6-v2, ~90 MB) is downloaded and cached.

3. Environment Configuration

Copy the example environment configuration to create your local .env file:

# Windows
copy .env.example .env

# macOS/Linux
cp .env.example .env

The defaults work with a local Ollama server out of the box.

4. Import Specification Documents

Put your .md or .txt specification documents inside the specs/ directory. By default, the repository contains:

auth_spec.md — Authentication, authorization rules, and endpoints.
user_management_spec.md — User CRUD API specifications.
notification_spec.md — In-app, webhook, and email notification settings.

5. Index Specifications into Vector Store

Run the ingestion pipeline to parse documents, split them into chunks, compute vector embeddings, and save them to your local database:

python ingest.py
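
To make the chunking step concrete, here is a minimal sketch of a sliding-window splitter. It is not the code in ingestion/chunker.py: the function name, the word-based window, and the default sizes are all assumptions made for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word windows of `chunk_size` that overlap by `overlap` words.

    Sizes are counted in whitespace-separated words (an assumption; the real
    chunker may count characters or tokens instead).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end of the text, so the
        # tail is not emitted again as a shorter duplicate chunk.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.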

6. View Indexed Data

To inspect exactly what text chunks, documents, and metadata are indexed in your local vector database, run the helper database viewer script:

python view_db.py

Running the MCP Server

Run server in Dev Mode (with Inspector)

To test the MCP tools interactively, you can run the server using the MCP developer tool:

mcp dev mcp_server/server.py

This runs the server locally and launches a web-based MCP Inspector where you can invoke and test every tool in real time.

Integrations

Connect to Claude Desktop

Add the server configuration to your Claude Desktop configuration file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Add the following JSON configuration, replacing the absolute path with the path to your own copy of the repository:

{
  "mcpServers": {
    "spec-assistant": {
      "command": "python",
      "args": ["C:/absolute/path/to/spec-mcp-poc/mcp_server/server.py"],
      "env": {}
    }
  }
}
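
If you would rather not type the absolute path by hand, a small helper can print the entry for you. This is a convenience sketch, not part of the repository; `claude_desktop_entry` is a hypothetical name, and it assumes you run it from the repository root.

```python
import json
from pathlib import Path


def claude_desktop_entry(repo_dir: str) -> dict:
    """Build the mcpServers entry with an absolute path to server.py."""
    server_path = Path(repo_dir).resolve() / "mcp_server" / "server.py"
    return {
        "mcpServers": {
            "spec-assistant": {
                "command": "python",
                # Forward slashes also work on Windows and avoid JSON escaping.
                "args": [server_path.as_posix()],
                "env": {},
            }
        }
    }


# Print the block so it can be pasted into claude_desktop_config.json.
print(json.dumps(claude_desktop_entry("."), indent=2))
```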

Restart Claude Desktop, and you will see the new tools symbol in the composer window!

Connect to IDE Extension (e.g. Cline or Continue in VS Code)

Add this configuration block to your IDE extension configuration file:

{
  "spec-assistant": {
    "command": "python",
    "args": ["mcp_server/server.py"],
    "cwd": "C:/absolute/path/to/spec-mcp-poc"
  }
}

Available MCP Tools

| Tool Name | Parameters | Description |
|---|---|---|
| `get_spec` | `query` (str) | Answers natural-language questions about your specifications, using semantic vector search to ground the LLM's response. |
| `list_specs` | None | Returns a detailed list of all specifications currently parsed and indexed in the database. |
| `validate_code` | `code` (str), `spec_name` (str) | Validates a code snippet against a specification and returns a list of compliant features, violations, and a recommendation checklist. |
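
For readers new to MCP, the sketch below shows one way three tools with these signatures could be registered using `FastMCP` from the official `mcp` Python SDK. It is an assumption-laden outline, not the contents of mcp_server/server.py: the tool bodies are stubs, `build_server` is a hypothetical helper, and `list_specs` simply lists files in specs/ rather than querying the vector database.

```python
from pathlib import Path

SPECS_DIR = Path("specs")  # assumed location, matching the project layout


def get_spec(query: str) -> str:
    """Stub: a real implementation would run the RAG pipeline on the query."""
    return f"(stub) no answer generated for: {query}"


def list_specs() -> list[str]:
    """Stub: list spec files on disk as a stand-in for the indexed specs."""
    if not SPECS_DIR.is_dir():
        return []
    return sorted(p.name for p in SPECS_DIR.iterdir() if p.suffix in {".md", ".txt"})


def validate_code(code: str, spec_name: str) -> str:
    """Stub: a real implementation would ask the LLM for a compliance report."""
    return f"(stub) would check {len(code)} characters of code against {spec_name}"


def build_server():
    """Register the three functions as MCP tools (requires the `mcp` package)."""
    from mcp.server.fastmcp import FastMCP  # imported lazily so the stubs work without it

    server = FastMCP("spec-assistant")
    for tool in (get_spec, list_specs, validate_code):
        server.tool()(tool)  # same effect as decorating with @server.tool()
    return server
```

Calling `build_server().run(transport="stdio")` would then serve the tools over stdio, which is the transport the architecture diagram names.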

Project Structure

spec-mcp-poc/
├── specs/                    # 📄 Raw specification files (add yours here)
│   ├── auth_spec.md
│   ├── user_management_spec.md
│   └── notification_spec.md
├── ingestion/
│   ├── chunker.py            # Loads specs and splits them into sliding-window text chunks
│   ├── embedder.py           # Embeds chunks using sentence-transformers (local model)
│   └── indexer.py            # Interfaces with ChromaDB (creation, deletes, insertions)
├── rag/
│   ├── retriever.py          # Runs semantic search against the database using cosine distance
│   ├── prompt_builder.py     # Generates LLM chat prompts for retrieval and compliance checks
│   └── llm_client.py         # Routes requests to Ollama (local) or OpenAI (cloud)
├── mcp_server/
│   └── server.py             # MCP Server exposing standard tools over stdio
├── config.py                 # Central configurations and environment reader
├── ingest.py                 # Command-line entry point that runs the ingestion pipeline
├── view_db.py                # Helper utility script to view local vector database entries
├── requirements.txt          # Python dependencies
├── .gitignore                # Git ignored patterns
├── .env                      # Local environment configurations (ignored)
└── .env.example              # Template configuration

Switching to Cloud-based OpenAI (Optional)

If you prefer using OpenAI cloud endpoints instead of local Ollama, update your .env file configuration:

LLM_BACKEND=openai
OPENAI_API_KEY=sk-proj-your-actual-api-key
OPENAI_MODEL=gpt-4o-mini

No code modifications are required; the backend is selected from the LLM_BACKEND setting.
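
As a rough picture of how configuration-driven switching can work, the sketch below resolves the backend from environment variables. It is not the logic in config.py or rag/llm_client.py: `resolve_llm_settings` and the `OLLAMA_MODEL` variable are hypothetical, and defaulting to `ollama` is an assumption based on the local-first defaults described earlier.

```python
import os
from typing import Mapping, Optional


def resolve_llm_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Pick the LLM backend and model from environment-style settings."""
    env = os.environ if env is None else env
    backend = env.get("LLM_BACKEND", "ollama").strip().lower()

    if backend == "openai":
        api_key = env.get("OPENAI_API_KEY", "")
        if not api_key:
            # Fail early with a clear message instead of at request time.
            raise ValueError("LLM_BACKEND=openai requires OPENAI_API_KEY")
        return {
            "backend": "openai",
            "model": env.get("OPENAI_MODEL", "gpt-4o-mini"),
            "api_key": api_key,
        }

    if backend == "ollama":
        # OLLAMA_MODEL is a hypothetical variable name used for illustration.
        return {"backend": "ollama", "model": env.get("OLLAMA_MODEL", "llama3.2")}

    raise ValueError(f"Unknown LLM_BACKEND: {backend!r}")
```

Accepting a mapping instead of reading `os.environ` directly makes the function easy to exercise with a plain dict.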

Troubleshooting

| Problem | Potential Cause | Troubleshooting Action |
|---|---|---|
| `No indexed specs found` | Database has not been initialized. | Run `python ingest.py` to index specs. |
| `Connection refused` (Ollama) | Ollama service is not running. | Make sure the Ollama application is running, or run `ollama serve`. |
| `Model not found` | The model is missing in Ollama. | Run `ollama pull llama3.2` to download the model. |
| Slow execution during first run | Cold start downloads. | The local embedding model (`all-MiniLM-L6-v2`) is downloaded once and cached for future runs. |
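
The two Ollama rows can also be checked from a script. The sketch below is a hypothetical preflight helper, not part of the repository; it assumes Ollama is listening on its default address (`http://localhost:11434`) and reads the installed models from the `/api/tags` endpoint.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address


def installed_models(tags_payload: dict) -> set[str]:
    """Extract base model names (without the ':tag' suffix) from /api/tags JSON."""
    return {m["name"].split(":")[0] for m in tags_payload.get("models", [])}


def check_ollama(model: str = "llama3.2", base_url: str = OLLAMA_URL) -> str:
    """Return 'ok' or a hint that mirrors the troubleshooting table."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, timeouts, and non-JSON responses.
        return "Ollama is not reachable; start the app or run `ollama serve`."
    if model not in installed_models(payload):
        return f"Model missing; run `ollama pull {model}`."
    return "ok"
```

Running `print(check_ollama())` before starting the server gives a quick indication of which row of the table applies.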
