syntheticgio

ncbi-datasets-mcp

An MCP server for the NCBI Datasets command-line tools that lets you search and fetch NCBI Taxonomy, Gene, Genome, and Viral data. Note: this project is not affiliated with NCBI or the NCBI Datasets team.

NOTE: This is a user-provided tool. It is not affiliated with NCBI or NCBI Datasets.

An MCP server that gives Claude access to NCBI Datasets v2 — search genome assembly metadata, retrieve taxonomy records, and download data packages without leaving your conversation.

Tools

Tool                          Transport  Description
ensure_cli                    -          Install the NCBI CLI tools (run once, or set NCBI_AUTO_INSTALL=true)
genome_summary_by_taxon       REST       Search genome assemblies by organism name or tax ID
genome_summary_by_accession   REST       Fetch assembly metadata for known accessions
genome_download_by_taxon      CLI        Download a genome package by taxon
genome_download_by_accession  CLI        Download a genome package by accession
rehydrate_genome_package      CLI        Fetch sequence files for a dehydrated package
dataformat_genome_tsv         CLI        Convert a genome JSONL data report to TSV
taxonomy_summary              REST       Get lineage, rank, and names for a taxon
taxonomy_download             CLI        Download a taxonomy package
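
To give a feel for what a REST-backed summary tool might send, here is a minimal sketch that builds a genome dataset-report request for one taxon. The base URL, the dataset_report path, the page_size parameter, and the api-key header are assumptions drawn from the public NCBI Datasets v2 REST API, not from this project's source. The helper names genome_report_url and request_headers are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode

# Assumed base URL of the public NCBI Datasets v2 REST API.
BASE_URL = "https://api.ncbi.nlm.nih.gov/datasets/v2"


def genome_report_url(taxon: str, max_results: int = 20) -> str:
    """Build a dataset-report URL for one organism name or tax ID (hypothetical helper)."""
    # Percent-encode the taxon so names with spaces are safe inside the path.
    path = f"/genome/taxon/{quote(taxon, safe='')}/dataset_report"
    return f"{BASE_URL}{path}?{urlencode({'page_size': max_results})}"


def request_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers, adding the API key only when one is configured."""
    headers = {"accept": "application/json"}
    if api_key:
        headers["api-key"] = api_key  # assumed header name
    return headers


if __name__ == "__main__":
    print(genome_report_url("Homo sapiens"))
    print(request_headers())
```

The default of 20 results mirrors the NCBI_MAX_RESULTS default listed under Configuration.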

Installation

Option 1 — Desktop Extension (recommended for Claude Desktop users)

  1. Download ncbi-datasets.mcpb from the Releases page.
  2. Double-click the file and click Install in Claude Desktop.
  3. Optionally enter your NCBI API key and download directory.

The NCBI CLI tools are downloaded automatically on first use (NCBI_AUTO_INSTALL=true is set by default in the extension).

Option 2 — JSON config (Claude Desktop / Claude Code)

Add to claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "ncbi-datasets": {
      "command": "uvx",
      "args": ["ncbi-datasets-mcp"],
      "env": {
        "NCBI_API_KEY": "your_key_here",
        "NCBI_DOWNLOAD_DIR": "/path/to/downloads",
        "NCBI_AUTO_INSTALL": "true"
      }
    }
  }
}

Requires uv (curl -LsSf https://astral.sh/uv/install.sh | sh).
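
If claude_desktop_config.json already lists other servers, the entry above needs to be merged into the file rather than pasted over it. The following is a rough sketch of one way to do that. add_server and update_config_file are hypothetical helpers and are not part of this package.

```python
import json
from pathlib import Path

# The server entry from the JSON config above.
NCBI_ENTRY = {
    "command": "uvx",
    "args": ["ncbi-datasets-mcp"],
    "env": {
        "NCBI_API_KEY": "your_key_here",
        "NCBI_DOWNLOAD_DIR": "/path/to/downloads",
        "NCBI_AUTO_INSTALL": "true",
    },
}


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the ncbi-datasets entry into the config file at path (hypothetical helper)."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(existing, "ncbi-datasets", NCBI_ENTRY)
    path.write_text(json.dumps(updated, indent=2))
```

Back up the config file before running anything like this, since the sketch rewrites the whole file.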

Configuration

Variable              Default                    Description
NCBI_API_KEY          (none)                     NCBI API key; raises the rate limit to 10 req/s
NCBI_DOWNLOAD_DIR     ~/Downloads/ncbi_datasets  Default download location
NCBI_AUTO_INSTALL     false                      Auto-install CLI tools on startup
NCBI_MAX_RESULTS      20                         Cap for summary tool result counts
NCBI_REQUEST_TIMEOUT  300                        Seconds before a download times out
NCBI_CLI_PATH         (auto)                     Override path to datasets binary
NCBI_DATAFORMAT_PATH  (auto)                     Override path to dataformat binary
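
The variables and defaults above can be pictured as a small settings object. The project itself uses pydantic-settings (see Architecture). The standard-library sketch below only illustrates the defaults and is not the project's config.py. The Settings and load_settings names, and the set of strings treated as true, are assumptions.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    download_dir: Path
    auto_install: bool
    max_results: int
    request_timeout: int
    cli_path: Optional[str]
    dataformat_path: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the NCBI_* variables from env, falling back to the documented defaults."""
    truthy = {"1", "true", "yes"}  # assumed spellings of "true"
    return Settings(
        api_key=env.get("NCBI_API_KEY") or None,
        download_dir=Path(env.get("NCBI_DOWNLOAD_DIR", "~/Downloads/ncbi_datasets")).expanduser(),
        auto_install=env.get("NCBI_AUTO_INSTALL", "false").strip().lower() in truthy,
        max_results=int(env.get("NCBI_MAX_RESULTS", "20")),
        request_timeout=int(env.get("NCBI_REQUEST_TIMEOUT", "300")),
        cli_path=env.get("NCBI_CLI_PATH") or None,  # None means auto-detect
        dataformat_path=env.get("NCBI_DATAFORMAT_PATH") or None,
    )
```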

Development

# Install with dev extras
pip install -e ".[dev]"

# Run unit tests
pytest

# Run the tests marked integration (live network calls)
pytest -m integration

# Regenerate enums from the current NCBI OpenAPI spec
python scripts/gen_enums.py

# Run the server locally (stdio transport)
ncbi-datasets-mcp

Architecture

src/ncbi_datasets_mcp/
  server.py           FastMCP app — tool registrations only
  config.py           Pydantic-settings env config
  cli/
    locator.py        Find datasets/dataformat (config → PATH → cache)
    installer.py      Download binaries from NCBI FTP
    runner.py         Async subprocess wrapper
  rest/
    client.py         httpx client for metadata/summary endpoints
  domains/
    _generated_enums.py  Vendored enums from OpenAPI spec
    common.py         Shared utilities (output dir, filename sanitising)
    genome.py         Genome CLI arg builders + response shaping
    taxonomy.py       Taxonomy CLI arg builders
  models/
    responses.py      Shared DownloadResult dataclass
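
The lookup order noted for locator.py (config → PATH → cache) could look roughly like the sketch below. It illustrates the order only and is not the project's code. locate_binary is a hypothetical name, the cache directory is passed in because its real location is not documented here, and falling through when the override path is missing is an assumption.

```python
import shutil
from pathlib import Path
from typing import Optional


def locate_binary(name: str, override: Optional[str], cache_dir: Path) -> Optional[Path]:
    """Find a CLI binary: explicit override first, then PATH, then a local cache."""
    # 1. Explicit override, e.g. from NCBI_CLI_PATH or NCBI_DATAFORMAT_PATH.
    if override:
        candidate = Path(override).expanduser()
        if candidate.is_file():
            return candidate
        # Assumption: a missing override falls through instead of raising.

    # 2. Anything already installed on PATH.
    found = shutil.which(name)
    if found:
        return Path(found)

    # 3. A copy downloaded earlier into the cache directory.
    cached = cache_dir / name
    if cached.is_file():
        return cached

    return None  # caller decides whether to install or report an error
```

Returning None rather than raising leaves the caller free to trigger the installer when NCBI_AUTO_INSTALL is enabled.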

Summary tools do no file I/O and go through the REST API. Download and format-conversion tools run the NCBI CLI binaries.
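
As a sketch of the CLI side, the snippet below pairs a hypothetical argument builder with an async subprocess wrapper that enforces a timeout. The names genome_download_args, run_cli, and CommandResult are illustrative and are not taken from this project. The --filename and --dehydrated flags are assumed from the NCBI datasets CLI.

```python
import asyncio
import sys
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CommandResult:
    returncode: int
    stdout: str
    stderr: str


def genome_download_args(accession: str, out_zip: str, dehydrated: bool = False) -> List[str]:
    """Build argv for a genome download by accession (hypothetical builder)."""
    argv = ["datasets", "download", "genome", "accession", accession, "--filename", out_zip]
    if dehydrated:
        argv.append("--dehydrated")  # fetch metadata now, sequence files later
    return argv


async def run_cli(argv: Sequence[str], timeout: float = 300.0) -> CommandResult:
    """Run a command without blocking the event loop and capture its output."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()  # do not leave a stalled download running
        await proc.wait()
        raise
    return CommandResult(proc.returncode, out.decode(), err.decode())


if __name__ == "__main__":
    # Demo with the Python interpreter so the sketch runs without the NCBI tools.
    result = asyncio.run(run_cli([sys.executable, "--version"]))
    print(result.returncode, result.stdout.strip() or result.stderr.strip())
```

The 300-second default matches NCBI_REQUEST_TIMEOUT. A dehydrated download is the kind of package that rehydrate_genome_package would later fill in.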

Cite

If you use NCBI Datasets in your research, please cite:

NCBI Datasets. National Center for Biotechnology Information. https://www.ncbi.nlm.nih.gov/datasets/

License

MIT
