Spec Assistant: MCP-Powered RAG
A local developer assistant that brings spec-driven development into your IDE. Ask natural-language questions about your technical specifications and run automated code compliance checks from VS Code, Claude Desktop, or any MCP-compatible client.
Architecture
Developer Question / Code Snippet
              │
              ▼
MCP Server (stdio transport)
┌─────────────────────────────┐
│ get_spec(query)             │ ◄── Custom tools exposed to client
│ list_specs()                │
│ validate_code(code, spec)   │
└─────────────┬───────────────┘
              │
      ┌───────▼────────┐
      │  RAG Pipeline  │
      │                │
      │ 1. Embed query │──► sentence-transformers (local, all-MiniLM-L6-v2)
      │ 2. Retrieve    │──► ChromaDB (local vector DB, cosine metric)
      │ 3. Build prompt│
      │ 4. Call LLM    │──► Ollama llama3.2 (local) or OpenAI
      └───────┬────────┘
              │
      ┌───────▼────────┐
      │ specs/ folder  │ ◄── Your Markdown (.md) or Text (.txt) files
      └────────────────┘
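Steps 2 and 3 of the pipeline can be illustrated with a minimal, dependency-free sketch. This is not the project's code: the hand-written vectors stand in for sentence-transformers embeddings, the in-memory list stands in for ChromaDB, the chunk texts are placeholders rather than excerpts from the bundled specs, and the names cosine_distance, retrieve, and build_prompt are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat a zero vector as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(query_vec, index, top_k=2):
    """Return the top_k (distance, chunk_text) pairs closest to the query."""
    scored = [(cosine_distance(query_vec, vec), text) for text, vec in index]
    return sorted(scored, key=lambda pair: pair[0])[:top_k]


def build_prompt(question, hits):
    """Place the retrieved chunks in front of the question as context."""
    context = "\n".join(f"- {text}" for _, text in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy index of (chunk_text, embedding) pairs. Real embeddings from
# all-MiniLM-L6-v2 have 384 dimensions; three are used here for readability.
INDEX = [
    ("Placeholder chunk about token lifetimes.", [0.9, 0.1, 0.0]),
    ("Placeholder chunk about creating users.", [0.1, 0.9, 0.0]),
    ("Placeholder chunk about webhook retries.", [0.0, 0.1, 0.9]),
]

if __name__ == "__main__":
    hits = retrieve([1.0, 0.0, 0.0], INDEX, top_k=2)
    print(build_prompt("How long do tokens live?", hits))
```

The prompt returned by build_prompt is what step 4 would send to the LLM backend.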
Quick Start
1. Prerequisites
- Python 3.11+ installed on your system.
- Ollama installed and running locally.
- Pull the default local model using:
ollama pull llama3.2
2. Install Dependencies
Clone this repository, navigate to the directory, and install dependencies:
pip install -r requirements.txt
[!NOTE] The first run may take a moment because it downloads the local embedding model (all-MiniLM-L6-v2, ~90 MB).
3. Environment Configuration
Copy the example environment configuration to create your local .env file:
# Windows
copy .env.example .env
# macOS/Linux
cp .env.example .env
The defaults work with a local Ollama server out of the box.
4. Import Specification Documents
Put your .md or .txt specification documents inside the specs/ directory. By default, the repository contains:
- auth_spec.md: Authentication, authorization rules, and endpoints.
- user_management_spec.md: User CRUD API specifications.
- notification_spec.md: In-app, webhook, and email notification settings.
5. Index Specifications into Vector Store
Run the ingestion pipeline to parse documents, split them into chunks, compute vector embeddings, and save them to your local database:
python ingest.py
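The chunking step can be pictured with a short sketch of a sliding window over characters. It is an illustration under assumptions, not the contents of ingestion/chunker.py: the function name, the character-based windows, and the default sizes are all hypothetical, and the real chunker may split on tokens or sentences instead.

```python
def sliding_window_chunks(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters, so a sentence cut at
    one boundary still appears whole in a neighbouring chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in sliding_window_chunks("abcdefghij", chunk_size=4, overlap=2):
        print(chunk)
```

A larger overlap gives the retriever more context around each boundary at the cost of more chunks to embed and store.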
6. View Indexed Data
To inspect exactly what text chunks, documents, and metadata are indexed in your local vector database, run the helper database viewer script:
python view_db.py
Running the MCP Server
Run server in Dev Mode (with Inspector)
To test the MCP tools interactively, you can run the server using the MCP developer tool:
mcp dev mcp_server/server.py
This runs the server locally and launches a web-based MCP Inspector where you can invoke and test all tools in real-time.
Integrations
Connect to Claude Desktop
Add the server configuration to your Claude Desktop configuration file:
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Add the following JSON configuration, replacing the absolute path with the path to your own checkout:
{
  "mcpServers": {
    "spec-assistant": {
      "command": "python",
      "args": ["C:/absolute/path/to/spec-mcp-poc/mcp_server/server.py"],
      "env": {}
    }
  }
}
Restart Claude Desktop. The tools icon should then appear in the composer window.
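If the configuration file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on the JSON text; the helper name add_mcp_server is hypothetical, and reading and writing the actual file is left to the caller.

```python
import json


def add_mcp_server(config_text, name, command, args):
    """Return config_text with one server entry added under "mcpServers".

    Existing entries are kept; an entry with the same name is overwritten.
    An empty config_text is treated as an empty configuration.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args), "env": {}}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    existing = '{"mcpServers": {"other-server": {"command": "node"}}}'
    print(add_mcp_server(
        existing,
        "spec-assistant",
        "python",
        ["C:/absolute/path/to/spec-mcp-poc/mcp_server/server.py"],
    ))
```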
Connect to IDE Extension (e.g. Cline or Continue in VS Code)
Add this configuration block to your IDE extension configuration file:
{
  "spec-assistant": {
    "command": "python",
    "args": ["mcp_server/server.py"],
    "cwd": "C:/absolute/path/to/spec-mcp-poc"
  }
}
Available MCP Tools
| Tool Name | Parameters | Description |
|---|---|---|
| get_spec | query (str) | Ask natural-language questions about specifications. Uses semantic vector search to augment your LLM's response. |
| list_specs | None | Returns a detailed list of all specifications currently parsed and indexed in the database. |
| validate_code | code (str), spec_name (str) | Validates code snippets against a specification and returns a list of compliant features, a list of violations, and a recommendation checklist. |
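MCP clients invoke these tools by sending JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such requests for get_spec and validate_code, using the parameter names from the table. The helper name make_tool_call, the example query, and the example snippet are hypothetical.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Serialize one MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, which the stdio transport relies on:
    # each message occupies exactly one line.
    return json.dumps(message)


if __name__ == "__main__":
    print(make_tool_call(1, "get_spec", {"query": "How are sessions expired?"}))
    print(make_tool_call(
        2,
        "validate_code",
        {"code": "def login(user): ...", "spec_name": "auth_spec.md"},
    ))
```

A real client also performs the MCP initialize handshake before calling tools. MCP clients such as Claude Desktop and the MCP Inspector handle both steps for you.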
Project Structure
spec-mcp-poc/
├── specs/                  # Raw specification files (add yours here)
│   ├── auth_spec.md
│   ├── user_management_spec.md
│   └── notification_spec.md
├── ingestion/
│   ├── chunker.py          # Loads specs and splits them into sliding-window text chunks
│   ├── embedder.py         # Embeds chunks using sentence-transformers (local model)
│   └── indexer.py          # Interfaces with ChromaDB (creation, deletes, insertions)
├── rag/
│   ├── retriever.py        # Performs semantic search querying database via cosine distance
│   ├── prompt_builder.py   # Generates LLM chat prompts for retrieval and compliance checks
│   └── llm_client.py       # Routes requests to Ollama (local) or OpenAI (cloud)
├── mcp_server/
│   └── server.py           # MCP Server exposing standard tools over stdio
├── config.py               # Central configurations and environment reader
├── ingest.py               # Entry point command line pipeline to run ingestion
├── view_db.py              # Helper utility script to view local vector database entries
├── requirements.txt        # Python dependencies
├── .gitignore              # Git ignored patterns
├── .env                    # Local environment configurations (ignored)
└── .env.example            # Template configuration
Switching to Cloud-based OpenAI (Optional)
If you prefer using OpenAI cloud endpoints instead of local Ollama, update your .env file configuration:
LLM_BACKEND=openai
OPENAI_API_KEY=sk-proj-your-actual-api-key
OPENAI_MODEL=gpt-4o-mini
No code changes are required; the LLM backend is selected from the LLM_BACKEND setting.
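Backend selection of this kind usually comes down to reading a few environment variables. The sketch below shows the idea; it is not the project's config.py. LLM_BACKEND, OPENAI_API_KEY, and OPENAI_MODEL come from the settings above, while the function name, the OLLAMA_MODEL variable, and the "ollama" default value are assumptions made for illustration.

```python
import os


def load_llm_settings(env=None):
    """Pick the LLM backend and model from environment-style settings."""
    env = os.environ if env is None else env
    # Assumption: an unset LLM_BACKEND means the local Ollama backend.
    backend = env.get("LLM_BACKEND", "ollama").strip().lower()

    if backend == "openai":
        api_key = env.get("OPENAI_API_KEY", "")
        if not api_key:
            raise ValueError("OPENAI_API_KEY must be set when LLM_BACKEND=openai")
        return {
            "backend": "openai",
            "model": env.get("OPENAI_MODEL", "gpt-4o-mini"),
            "api_key": api_key,
        }
    if backend == "ollama":
        # OLLAMA_MODEL is a hypothetical variable name, not taken from the project.
        return {"backend": "ollama", "model": env.get("OLLAMA_MODEL", "llama3.2")}
    raise ValueError(f"Unknown LLM_BACKEND: {backend!r}")


if __name__ == "__main__":
    print(load_llm_settings({"LLM_BACKEND": "openai", "OPENAI_API_KEY": "sk-test"}))
```

Failing fast on a missing key or an unknown backend name surfaces configuration mistakes at startup rather than on the first tool call.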
Troubleshooting
| Problem | Potential Cause | Troubleshooting Action |
|---|---|---|
| No indexed specs found | Database has not been initialized. | Run python ingest.py to index specs. |
| Connection refused (Ollama) | Ollama service is not running. | Make sure the Ollama application is running, or run ollama serve. |
| Model not found | The model is missing in Ollama. | Run ollama pull llama3.2 to download the model. |
| Slow execution during first run | Cold start downloads. | The local embedding model (all-MiniLM-L6-v2) is downloaded once and cached for future runs. |
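To tell a "Connection refused" problem apart from a missing model, it can help to check whether the Ollama HTTP API answers at all. The sketch below queries Ollama's /api/tags endpoint, which lists locally available models, on the default port 11434. The function name is hypothetical, and the base URL is an assumption that should match whatever your .env points to.

```python
import urllib.error
import urllib.request


def ollama_is_reachable(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the Ollama API answers GET /api/tags with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    if ollama_is_reachable():
        print("Ollama is up.")
    else:
        print("Ollama is not reachable; start the app or run: ollama serve")
```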