mcp-embedding-search

A Model Context Protocol (MCP) server that queries a Turso database containing embeddings and transcript segments. This tool allows users to search for relevant transcript segments by asking questions, without generating new embeddings.

Features

🔍 Vector similarity search for transcript segments
📊 Relevance scoring based on cosine similarity
📝 Complete transcript metadata (episode title, timestamps)
⚙️ Configurable search parameters (limit, minimum score)
🔄 Efficient database connection pooling
🛡️ Comprehensive error handling
📈 Performance optimized for quick responses

Configuration

This server requires configuration through your MCP client. Here are examples for different environments:

Cline Configuration

Add this to your Cline MCP settings:

{
	"mcpServers": {
		"mcp-embedding-search": {
			"command": "node",
			"args": ["/path/to/mcp-embedding-search/dist/index.js"],
			"env": {
				"TURSO_URL": "your-turso-database-url",
				"TURSO_AUTH_TOKEN": "your-turso-auth-token"
			}
		}
	}
}

Claude Desktop Configuration

Add this to your Claude Desktop configuration:

{
	"mcpServers": {
		"mcp-embedding-search": {
			"command": "node",
			"args": ["/path/to/mcp-embedding-search/dist/index.js"],
			"env": {
				"TURSO_URL": "your-turso-database-url",
				"TURSO_AUTH_TOKEN": "your-turso-auth-token"
			}
		}
	}
}
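
Both configurations pass the same two environment variables, TURSO_URL and TURSO_AUTH_TOKEN, to the server process. As a rough illustration of how a server might check them at startup, here is a minimal sketch. The TursoConfig interface and the loadTursoConfig function are hypothetical names introduced for this example and are not taken from the project's source.

```typescript
// Hypothetical config shape; the field names are assumptions for this sketch.
interface TursoConfig {
  url: string;
  authToken: string;
}

// Reads the two variables named in the MCP client configuration and
// fails early with a descriptive message if either one is missing or empty.
function loadTursoConfig(env: Record<string, string | undefined>): TursoConfig {
  const url = env["TURSO_URL"];
  const authToken = env["TURSO_AUTH_TOKEN"];

  const missing: string[] = [];
  if (!url) missing.push("TURSO_URL");
  if (!authToken) missing.push("TURSO_AUTH_TOKEN");

  if (!url || !authToken) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return { url, authToken };
}
```

In a Node process this would typically be called as loadTursoConfig(process.env). Taking the environment as a parameter keeps the function easy to test.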

API

The server implements one MCP tool:

search_embeddings

Search for relevant transcript segments using vector similarity.

Parameters:

question (string, required): The query text to search for
limit (number, optional): Number of results to return (default: 5, max: 50)
min_score (number, optional): Minimum similarity threshold (default: 0.5, range: 0-1)
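
The defaults and bounds listed above could be applied along the following lines. This is a sketch only: the interface and function names are hypothetical, and two policies are assumptions not stated in this README, namely rejecting out-of-range values instead of clamping them and treating 1 as the smallest valid limit.

```typescript
// Hypothetical argument shape mirroring the documented parameters.
interface SearchArgs {
  question: string;
  limit?: number;
  min_score?: number;
}

interface NormalizedSearchArgs {
  question: string;
  limit: number;
  min_score: number;
}

// Applies the documented defaults (limit 5, min_score 0.5) and enforces
// the documented bounds (limit at most 50, min_score between 0 and 1).
// Assumption: invalid values are rejected, and limit must be at least 1.
function normalizeSearchArgs(args: SearchArgs): NormalizedSearchArgs {
  const question = args.question.trim();
  if (question.length === 0) {
    throw new Error("question must be a non-empty string");
  }

  const limit = args.limit ?? 5;
  if (!Number.isInteger(limit) || limit < 1 || limit > 50) {
    throw new Error("limit must be an integer between 1 and 50");
  }

  const min_score = args.min_score ?? 0.5;
  // Written as a negated range check so that NaN is rejected too.
  if (!(min_score >= 0 && min_score <= 1)) {
    throw new Error("min_score must be between 0 and 1");
  }

  return { question, limit, min_score };
}
```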

Response format:

[
	{
		"episode_title": "Episode Title",
		"segment_text": "Transcript segment content...",
		"start_time": 123.45,
		"end_time": 167.89,
		"similarity": 0.85
	}
	// Additional results...
]
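
A client consuming this response may want a typed view of it. The sketch below models one result and validates a parsed payload. The SearchResult interface uses the field names from the response format above, while isSearchResult and parseSearchResults are hypothetical helpers. It assumes the payload arrives as strict JSON, so the illustrative comment line in the sample would not be present.

```typescript
// Field names taken from the documented response format.
interface SearchResult {
  episode_title: string;
  segment_text: string;
  start_time: number;
  end_time: number;
  similarity: number;
}

// Hypothetical type guard: checks that a value has every documented field
// with the expected primitive type.
function isSearchResult(value: unknown): value is SearchResult {
  if (typeof value !== "object" || value === null) return false;
  const r = value as Record<string, unknown>;
  return (
    typeof r["episode_title"] === "string" &&
    typeof r["segment_text"] === "string" &&
    typeof r["start_time"] === "number" &&
    typeof r["end_time"] === "number" &&
    typeof r["similarity"] === "number"
  );
}

// Parses a JSON string and throws if it is not an array of search results.
function parseSearchResults(json: string): SearchResult[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed) || !parsed.every(isSearchResult)) {
    throw new Error("unexpected search_embeddings response shape");
  }
  return parsed as SearchResult[];
}
```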

Database Schema

This tool expects a Turso database with the following schema:

CREATE TABLE embeddings (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  transcript_id INTEGER NOT NULL,
  embedding TEXT NOT NULL,
  FOREIGN KEY(transcript_id) REFERENCES transcripts(id)
);

CREATE TABLE transcripts (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  episode_title TEXT NOT NULL,
  segment_text TEXT NOT NULL,
  start_time REAL NOT NULL,
  end_time REAL NOT NULL
);

The embedding column should contain vector embeddings that can be used with the vector_distance_cos function.
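
To make the scoring concrete, the following sketch computes cosine similarity in plain TypeScript. It is not the server's implementation, which relies on the database function. It assumes each embedding is stored as JSON array text such as "[0.1, 0.2, 0.3]" (the README only says the column is TEXT), and that cosine distance is defined as 1 minus cosine similarity. All function names here are hypothetical.

```typescript
// Assumption: the TEXT column holds a JSON array of numbers.
function parseEmbedding(text: string): number[] {
  const parsed: unknown = JSON.parse(text);
  if (!Array.isArray(parsed) || !parsed.every((x) => typeof x === "number")) {
    throw new Error("embedding must be a JSON array of numbers");
  }
  return parsed as number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance under the assumed definition: 1 - similarity.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}
```

Under that definition, a min_score of 0.5 corresponds to keeping rows whose cosine distance is at most 0.5.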

Development

Setup

Clone the repository
Install dependencies:

npm install

Build the project:

npm run build

Run in development mode:

npm run dev

Publishing

The project uses changesets for version management. To publish:

Create a changeset:

npm run changeset

Version the package:

npm run version

Publish to npm:

npm run release

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see the LICENSE file for details.

Acknowledgments

Built on the Model Context Protocol
Designed for efficient vector similarity search in transcript databases
