huangjianhuster

Biomolecule Modeling MCP Server

An MCP server for biomolecule modeling

An MCP (Model Context Protocol) server that lets AI assistants (Claude, etc.) clean PDB structures and run initial-stage energy relaxation — the preparation step before production MD or coarse-graining (e.g. Martini) pipelines.

Quickstart (Claude Code)

The easiest way to use this server is via uvx — no manual dependency management needed:

claude mcp add "biomolecule-modeling" --scope user -- \
  uvx --from git+https://github.com/YOUR_USERNAME/biomolecule-modeling-mcp biomodeling

That's it. uvx installs the package and all its Python dependencies into an isolated environment automatically. After running this command, restart Claude Code and the server will be listed under active MCP servers.

OpenMM note: uvx installs the PyPI wheel of OpenMM (CPU-only). If you need GPU acceleration, install OpenMM via conda first (conda install -c conda-forge openmm pdbfixer) and use the manual registration method instead.

What It Does

The server exposes a set of tools that guide an LLM through a structured workflow:

query_pdb_structure          # inspect chains, sequences, gaps, structural breaks
  └─ split_pdb               # separate protein / nucleic acid / ligands
       └─ fix_pdb_structure  # clean structure (PDBFixer): residues, hydrogens, heavy atoms
            └─ assemble_pdb_structures  # recombine components after separate fixing
                 └─ relax_pdb_structure # energy minimise ± short restrained MD (OpenMM)

At any point get_workflow_report returns a Markdown summary of every decision the LLM made, with its reasoning, which can be saved to disk.
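As an illustration of the idea only, here is a minimal sketch of how a log of tool calls and their notes could be rendered to Markdown. The `ToolCall` record, the `render_report` function, and the heading layout are all hypothetical; the server's actual report format is not shown in this README.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One logged tool invocation (hypothetical record, not the server's type)."""
    tool: str   # tool name, e.g. "fix_pdb_structure"
    notes: str  # the free-text reasoning passed via the notes argument


def render_report(calls):
    """Render logged calls as a Markdown document, one section per call."""
    lines = ["# Workflow report", ""]
    for index, call in enumerate(calls, start=1):
        lines.append(f"## Step {index}: {call.tool}")
        lines.append("")
        lines.append(call.notes or "(no notes recorded)")
        lines.append("")
    return "\n".join(lines)


report = render_report([
    ToolCall("query_pdb_structure", "Initial inspection before cleaning."),
    ToolCall("fix_pdb_structure", "Removing waters, adding hydrogens at pH 7.0."),
])
```

Keeping the notes alongside each call is what lets the report explain why a step was taken, not just that it ran.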

Tools

| Tool | Description |
| --- | --- |
| query_pdb_structure | Chain-level inspection: type, sequence, residue range, numbering gaps, backbone breaks |
| split_pdb | Split a multi-component PDB into per-type files (protein, nucleic, ligands) |
| fix_pdb_structure | PDBFixer wrapper: remove heterogens, replace non-standard residues, add missing atoms/hydrogens |
| assemble_pdb_structures | Merge multiple PDB files back into one, with conflict resolution |
| relax_pdb_structure | OpenMM energy minimisation + optional short backbone-restrained MD |
| query_ff_presets | List available force-field presets and their parameters |
| get_workflow_report | Return and/or save a Markdown report of all tool calls and LLM reasoning |
| reset_workflow_report | Clear the session log for a new structure |
| set_working_directory | Change the working directory for relative path resolution |

Force-Field Presets

| Preset | Solvent | Notes |
| --- | --- | --- |
| amber14_implicit | GBn2 implicit | Default. No periodic box needed. Fast. |
| amber14_explicit | TIP3P-FB explicit | Requires pre-solvated periodic box |
| amber99sb_implicit | OBC2 implicit | Older AMBER, implicit solvent |
| charmm36_explicit | CHARMM TIP3P explicit | Requires pre-solvated periodic box |

Manual Installation

Use this method if you want GPU-accelerated OpenMM or prefer a local checkout.

1. Install OpenMM (conda recommended for GPU support)

conda install -c conda-forge openmm pdbfixer

2. Clone and install dependencies

git clone https://github.com/YOUR_USERNAME/biomolecule-modeling-mcp
cd biomolecule-modeling-mcp
uv sync

3. Register with Claude Code

claude mcp add "biomolecule-modeling" --scope user -- \
  /path/to/uv run --directory /path/to/biomolecule-modeling-mcp python main.py

Replace /path/to/uv with the output of which uv, and pass the absolute path to the cloned repo to --directory.

Important: use --directory (not --project). --project sets the virtualenv but leaves cwd wherever the caller is, which breaks relative utils/ imports.

Standalone scripts (no MCP client needed)

uv run utils/pdb_query.py structure.pdb        # inspect structure
uv run utils/pdb_fix.py input.pdb              # fix structure
uv run utils/openmm_relax.py input.pdb         # minimise structure

Example Sessions

The following examples show real natural-language prompts you can give Claude, and the sequence of tool calls the server will drive in response.

Case 1 — Download, inspect, clean, and relax a PDB from RCSB

Your prompt to Claude:

Download PDB 1UBQ, query its structure information, clean the protein part, and relax the structure.

What Claude does:

Step 1 — Download the file

Claude fetches 1UBQ.pdb from RCSB (https://files.rcsb.org/download/1UBQ.pdb) and saves it locally.
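For readers who want to reproduce the download step outside an MCP session, a minimal sketch using only the standard library follows. The function names are hypothetical and the ID check assumes a classic four-character PDB ID; only the URL pattern comes from the text above.

```python
from pathlib import Path
from urllib.request import urlopen

RCSB_DOWNLOAD_ROOT = "https://files.rcsb.org/download"


def rcsb_pdb_url(pdb_id):
    """Build the RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"expected a 4-character PDB ID, got {pdb_id!r}")
    return f"{RCSB_DOWNLOAD_ROOT}/{pdb_id}.pdb"


def download_pdb(pdb_id, out_dir="."):
    """Fetch the PDB file and write it to out_dir; returns the saved path."""
    url = rcsb_pdb_url(pdb_id)
    out_path = Path(out_dir) / url.rsplit("/", 1)[1]
    with urlopen(url, timeout=30) as response:
        out_path.write_bytes(response.read())
    return out_path


url = rcsb_pdb_url("1ubq")
```

Calling `download_pdb("1UBQ")` would save `1UBQ.pdb` in the current directory; it needs network access, so it is not invoked here.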

Step 2 — Inspect the structure

query_pdb_structure(
    pdb_file="1UBQ.pdb",
    notes="Initial inspection before cleaning. Need to identify chain types, "
          "any non-standard residues, and structural gaps."
)

Tool output (abbreviated):
chains:
  A — protein, 76 residues (MET1–GLY76), no numbering gaps, no structural breaks
heterogens: HOH (58 water molecules)
non_standard_residues: []
missing_residues: []

Claude's reasoning: Single protein chain, no gaps, no non-standard residues. Waters present — will remove them. No ligands to worry about; can go straight to fix_pdb_structure.
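The heterogen summary that drives this decision can be approximated with a few lines of standard-library Python. This is a sketch under the assumption of fixed-column PDB records, not the server's implementation (which the project layout attributes to BioPython); `count_heterogens` is a hypothetical name and the sample lines are made up for illustration.

```python
from collections import Counter


def count_heterogens(pdb_lines):
    """Count distinct HETATM residues per residue name (e.g. HOH, MG, ATP)."""
    seen = set()
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # columns 18-20: residue name
        chain_id = line[21:22]          # column 22: chain identifier
        res_seq = line[22:27]           # columns 23-27: number + insertion code
        seen.add((res_name, chain_id, res_seq))
    return Counter(name for name, _, _ in seen)


sample = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "HETATM  603  O   HOH A  77      45.747  30.081  19.708  1.00 12.43           O",
    "HETATM  604  O   HOH A  78      19.168  31.868  17.050  1.00 12.65           O",
    "HETATM  700  PG  ATP A 500      10.000  10.000  10.000  1.00 20.00           P",
    "HETATM  701  O1G ATP A 500      11.000  10.000  10.000  1.00 20.00           O",
]
counts = count_heterogens(sample)  # two waters, one ATP (two atoms, one residue)
```

Counting residues rather than atoms is what makes the result comparable to a line such as "HOH (58 water molecules)".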

Step 3 — Clean the structure

fix_pdb_structure(
    input_pdb="1UBQ.pdb",
    output_pdb="1UBQ_fixed.pdb",
    remove_heterogens=True,      # removes the 58 water molecules
    add_hydrogens=True,
    ph=7.0,
    fix_missing_residues=False,  # no gaps — nothing to fix
    notes="Removing waters, adding hydrogens at pH 7.0. No non-standard residues "
          "or gaps detected so no other fixing steps needed."
)

Tool output (abbreviated):
output_pdb: 1UBQ_fixed.pdb
n_residues: 76
n_atoms: 1231
heterogens_removed: 58
nonstandard_residues: []
missing_atoms_added: 0
hydrogens_added: true

Step 4 — Relax the structure

relax_pdb_structure(
    input_pdb="1UBQ_fixed.pdb",
    output_pdb="1UBQ_relaxed.pdb",
    preset="amber14_implicit",
    run_md=False,
    notes="Minimisation-only run. Structure is clean and small (76 residues); "
          "a full MD run is not needed for clash removal."
)

Tool output (abbreviated):
output_pdb: 1UBQ_relaxed.pdb
preset_used: amber14_implicit
n_atoms: 1231
energy_initial_kJ_mol: -18423.4
energy_final_kJ_mol:   -24817.9

Claude's summary: Energy dropped ~6400 kJ/mol during minimisation — typical for a crystal structure after hydrogen addition. Output 1UBQ_relaxed.pdb is ready for downstream MD or coarse-graining.
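The "~6400 kJ/mol" figure is simply the difference between the two reported energies. As a worked check, using the values from the tool output above (the helper name is hypothetical):

```python
def energy_change(initial_kj_mol, final_kj_mol):
    """Return final minus initial energy in kJ/mol, rounded to 0.1."""
    return round(final_kj_mol - initial_kj_mol, 1)


delta = energy_change(-18423.4, -24817.9)  # -6394.5 kJ/mol
```

A negative change means the minimiser lowered the potential energy, which is the expected direction.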

Case 2 — Clean a local multi-chain structure, keep only selected chains

Your prompt to Claude:

Query the PDB structure under ./structures/complex.pdb, clean it to keep only chain A and chain B, and relax the structure.

What Claude does:

Step 1 — Inspect the structure

query_pdb_structure(
    pdb_file="./structures/complex.pdb",
    notes="Inspecting the multi-chain complex before chain selection and cleaning."
)

Tool output (abbreviated):
chains:
  A — protein, 312 residues (SER5–LEU316), 2 numbering gaps (at 47–51, 203–208)
  B — protein, 289 residues (MET1–LYS289), no gaps
  C — protein, 145 residues (ALA1–GLU145), no gaps
  D — nucleic (DNA), 24 residues
heterogens: MG (3 ions), HOH (412 waters), ATP (1 molecule)
non_standard_residues: [MSE → MET in chain A (4 occurrences)]

Claude's reasoning: User wants chains A and B only. Chain A has two numbering gaps — these are likely short disordered loops in the crystal; will not attempt loop modelling. MSE (selenomethionine) in chain A will be converted to MET. ATP and ions must be removed before OpenMM relaxation.
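To make the notion of a numbering gap concrete, here is a minimal sketch that finds missing residue ranges from a list of observed residue numbers. It is not the server's code; `numbering_gaps` is a hypothetical helper, and the example reads the reported ranges 47–51 and 203–208 as inclusive, which is an assumption.

```python
def numbering_gaps(residue_numbers):
    """Return (first_missing, last_missing) tuples for each break in numbering."""
    ordered = sorted(set(residue_numbers))
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
    return gaps


# Chain numbered 5..316 with residues 47-51 and 203-208 unobserved.
observed = [n for n in range(5, 317)
            if not (47 <= n <= 51 or 203 <= n <= 208)]
gaps = numbering_gaps(observed)                          # [(47, 51), (203, 208)]
gap_lengths = [last - first + 1 for first, last in gaps]  # [5, 6]
```

Note that a numbering gap is not always a structural break: some files skip numbers by convention, which is why the query tool reports backbone breaks separately.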

Step 2 — Clean, keeping only chains A and B

fix_pdb_structure(
    input_pdb="./structures/complex.pdb",
    output_pdb="./structures/complex_AB_fixed.pdb",
    chains_to_keep=["A", "B"],
    replace_nonstandard_residues=True,   # converts MSE → MET
    remove_heterogens=True,              # removes ATP, Mg²⁺, waters
    add_missing_heavy_atoms=True,
    add_hydrogens=True,
    ph=7.0,
    fix_missing_residues=False,          # gaps are 5–6 residues; skipping loop modelling
    notes="Keeping chains A and B only per user request. Converting 4 × MSE → MET "
          "in chain A. Removing ATP, Mg ions, and waters; these cannot be "
          "parameterised with standard AMBER presets. Not fixing the two loops "
          "in chain A (gaps of 5–6 residues) to avoid poor loop geometry."
)

Tool output (abbreviated):
output_pdb: complex_AB_fixed.pdb
n_residues: 601   (312 + 289)
n_atoms: 9847
chains_kept: [A, B]
nonstandard_residues: [MSE→MET ×4 in chain A]
heterogens_removed: 416   (ATP ×1, MG ×3, HOH ×412)
missing_atoms_added: 12
hydrogens_added: true

Claude's warning to user: "Chain A still has two short gaps at residues 47–51 and 203–208 (missing from the crystal structure). These regions will be absent in the relaxed output. If you need complete loops for your downstream simulation, let me know and I can attempt loop modelling — though the geometry for gaps this size may be rough."

Step 3 — Relax the two-chain complex

relax_pdb_structure(
    input_pdb="./structures/complex_AB_fixed.pdb",
    output_pdb="./structures/complex_AB_relaxed.pdb",
    preset="amber14_implicit",
    run_md=True,
    restrain_backbone=True,
    restraint_k=1000.0,
    production_steps=25000,
    notes="Running short restrained MD (25 000 steps ≈ 100 ps) to relax side chains "
          "around the 12 newly added heavy atoms and the two gap termini in chain A. "
          "Backbone restrained at 1000 kJ/mol/nm² to preserve experimental fold."
)

Tool output (abbreviated):
output_pdb: complex_AB_relaxed.pdb
preset_used: amber14_implicit
n_atoms: 9847
energy_initial_kJ_mol: -71203.1
energy_post_min_kJ_mol: -89441.6
energy_post_md_kJ_mol:  -94308.2
energy_final_kJ_mol:    -95112.4

Claude's summary: Energy decreased steadily through minimisation → MD → final minimisation. The structure is converged and ready. Output: complex_AB_relaxed.pdb.
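The notes in the relax call equate 25 000 steps with roughly 100 ps, but the integrator timestep is not stated anywhere in this README. The arithmetic below shows what that figure implies; the helper names are hypothetical.

```python
def implied_timestep_fs(total_time_ps, n_steps):
    """Timestep in femtoseconds implied by a total time and a step count."""
    return total_time_ps * 1000.0 / n_steps


def simulated_time_ps(n_steps, timestep_fs):
    """Simulated time in picoseconds for a step count and timestep."""
    return n_steps * timestep_fs / 1000.0


implied = implied_timestep_fs(100.0, 25_000)    # 4.0 fs
at_two_fs = simulated_time_ps(25_000, 2.0)      # 50.0 ps
```

So 100 ps corresponds to a 4 fs step; with a more conventional 2 fs step the same 25 000 steps would cover 50 ps. A 4 fs step is typically paired with bond constraints and hydrogen mass repartitioning, and whether the server configures these is an open question here, so check the timestep in utils/openmm_relax.py before relying on the 100 ps figure.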

Project Structure

biomolecule-modeling-mcp/
├── main.py                  # MCP server entry point; all @mcp.tool() definitions
├── pyproject.toml           # dependencies + CLI entry point (biomodeling)
├── utils/
│   ├── pdb_query.py         # Structure inspection (BioPython)
│   ├── pdb_fix.py           # PDBFixer wrapper
│   ├── pdb_splitter.py      # Split multi-component PDB by chain type
│   ├── pdb_assemble.py      # Merge PDB files
│   └── openmm_relax.py      # OpenMM energy minimisation + restrained MD
└── data/                    # Example / test PDB files

Key Gotchas

  • Never fix large loops by default. fix_missing_residues=False is the safe default; loops > 5–10 residues produce poor geometry with PDBFixer.
  • Ligands break standard relaxation. Non-standard HETATM residues must be removed (or separately parameterised) before running OpenMM with AMBER/CHARMM presets.
  • OpenMM >= 8.x implicit solvent. The implicit solvent XML (e.g. implicit/gbn2.xml) goes into ForceField(), not createSystem().
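The last point can be sketched as follows. The mapping from preset names to XML files is an assumption for illustration (the README does not list the files each preset loads), and `build_implicit_system` is a hypothetical helper; the OpenMM calls themselves (`PDBFile`, `ForceField`, `createSystem`) are the standard application-layer API.

```python
# Assumed preset -> OpenMM force-field file mapping (not taken from the server).
IMPLICIT_PRESET_FILES = {
    "amber14_implicit": ("amber14-all.xml", "implicit/gbn2.xml"),
    "amber99sb_implicit": ("amber99sb.xml", "implicit/obc2.xml"),
}


def build_implicit_system(pdb_path, preset="amber14_implicit"):
    """Create an implicit-solvent System for an already-fixed PDB file."""
    # Imported lazily so the mapping above is usable without OpenMM installed.
    from openmm import app

    pdb = app.PDBFile(pdb_path)
    # The implicit-solvent XML is passed to ForceField() with the protein XML.
    forcefield = app.ForceField(*IMPLICIT_PRESET_FILES[preset])
    # createSystem() receives no implicit-solvent argument; with no periodic
    # box, NoCutoff is the usual nonbonded choice.
    system = forcefield.createSystem(
        pdb.topology,
        nonbondedMethod=app.NoCutoff,
        constraints=app.HBonds,
    )
    return pdb, system
```

With OpenMM installed, `build_implicit_system("1UBQ_fixed.pdb")` would return the parsed structure and a `System` ready to hand to an integrator and `Simulation`. The input must already contain hydrogens and only standard residues, as the other gotchas describe.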

License

MIT
