Biomolecule Modeling MCP Server
An MCP (Model Context Protocol) server that lets AI assistants (Claude, etc.) clean PDB structures and run initial-stage energy relaxation — the preparation step before production MD or coarse-graining (e.g. Martini) pipelines.
Quickstart (Claude Code)
The easiest way to use this server is via uvx — no manual dependency management needed:
claude mcp add "biomolecule-modeling" --scope user -- \
uvx --from git+https://github.com/YOUR_USERNAME/biomolecule-modeling-mcp biomodeling
That's it. uvx installs the package and all its Python dependencies into an isolated environment automatically. After running this command, restart Claude Code and the server will be listed under active MCP servers.
OpenMM note:
uvx installs the PyPI wheel of OpenMM (CPU-only). If you need GPU acceleration, install OpenMM via conda first (conda install -c conda-forge openmm pdbfixer) and use the manual registration method instead.
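To see which OpenMM platforms an environment actually provides, a small check like the one below may help. It is a minimal sketch and not part of this repository: the function name is hypothetical, and it only relies on OpenMM's `Platform.getNumPlatforms()` and `Platform.getPlatform(i).getName()`.

```python
def available_openmm_platforms():
    """Return the names of the OpenMM platforms visible to this interpreter.

    An empty list means OpenMM could not be imported in this environment.
    (Hypothetical helper for illustration; not part of the server.)
    """
    try:
        from openmm import Platform
    except ImportError:
        return []
    return [
        Platform.getPlatform(i).getName()
        for i in range(Platform.getNumPlatforms())
    ]


names = available_openmm_platforms()
print("OpenMM platforms:", ", ".join(names) or "none (OpenMM not importable)")
if not {"CUDA", "OpenCL"} & set(names):
    print("No GPU platform found; relaxation would run on the CPU.")
```

Run it with the same interpreter that will launch the server, since the uvx environment and a conda environment can report different platforms.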
What It Does
The server exposes a set of tools that guide an LLM through a structured workflow:
query_pdb_structure # inspect chains, sequences, gaps, structural breaks
└─ split_pdb # separate protein / nucleic acid / ligands
└─ fix_pdb_structure # clean structure (PDBFixer): residues, hydrogens, heavy atoms
└─ assemble_pdb_structures # recombine components after separate fixing
└─ relax_pdb_structure # energy minimise ± short restrained MD (OpenMM)
At any point, get_workflow_report returns a Markdown summary of every tool call and the reasoning the LLM gave for it. The report can also be saved to disk.
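The reporting idea can be pictured as a session log that stores each tool name with its notes and renders them as Markdown. The sketch below is illustrative only: `WorkflowLog` and its methods are hypothetical names, and the server's real data model and report layout may differ.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowLog:
    """Hypothetical in-memory session log; not the server's actual class."""

    entries: list = field(default_factory=list)

    def record(self, tool: str, notes: str) -> None:
        # Store the tool name together with the LLM's stated reasoning.
        self.entries.append((tool, notes))

    def to_markdown(self) -> str:
        # Render one numbered Markdown item per recorded tool call.
        lines = ["# Workflow report", ""]
        for i, (tool, notes) in enumerate(self.entries, start=1):
            lines.append(f"{i}. **{tool}**: {notes}")
        return "\n".join(lines)

    def reset(self) -> None:
        # Start a fresh log, e.g. before working on a new structure.
        self.entries.clear()


log = WorkflowLog()
log.record("query_pdb_structure", "Initial inspection before cleaning.")
log.record("fix_pdb_structure", "Removing waters, adding hydrogens at pH 7.0.")
report = log.to_markdown()
print(report)
```

In this picture, the `notes` argument that appears in every tool call in the example sessions is what ends up in the report.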
Tools
| Tool | Description |
|---|---|
| query_pdb_structure | Chain-level inspection: type, sequence, residue range, numbering gaps, backbone breaks |
| split_pdb | Split a multi-component PDB into per-type files (protein, nucleic, ligands) |
| fix_pdb_structure | PDBFixer wrapper: remove heterogens, replace non-standard residues, add missing atoms/hydrogens |
| assemble_pdb_structures | Merge multiple PDB files back into one, with conflict resolution |
| relax_pdb_structure | OpenMM energy minimisation + optional short backbone-restrained MD |
| query_ff_presets | List available force-field presets and their parameters |
| get_workflow_report | Return and/or save a Markdown report of all tool calls and LLM reasoning |
| reset_workflow_report | Clear the session log for a new structure |
| set_working_directory | Change the working directory for relative path resolution |
Force-Field Presets
| Preset | Solvent | Notes |
|---|---|---|
| amber14_implicit | GBn2 implicit | Default. No periodic box needed. Fast. |
| amber14_explicit | TIP3P-FB explicit | Requires pre-solvated periodic box |
| amber99sb_implicit | OBC2 implicit | Older AMBER, implicit solvent |
| charmm36_explicit | CHARMM TIP3P explicit | Requires pre-solvated periodic box |
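One way to picture these presets is as a lookup from preset name to the XML files handed to OpenMM's `ForceField`. The mapping below is an assumption made for illustration: the file names ship with OpenMM, but the pairings (in particular amber99sb.xml with implicit/obc2.xml) are a guess, and the server's actual preset definitions may differ. `PRESET_FILES`, `needs_periodic_box`, and `build_system` are hypothetical names.

```python
# Hypothetical preset -> force-field XML mapping (an assumption, not the
# server's actual table). The implicit-solvent XML is passed to ForceField().
PRESET_FILES = {
    "amber14_implicit": ("amber14-all.xml", "implicit/gbn2.xml"),
    "amber14_explicit": ("amber14-all.xml", "amber14/tip3pfb.xml"),
    "amber99sb_implicit": ("amber99sb.xml", "implicit/obc2.xml"),  # pairing is a guess
    "charmm36_explicit": ("charmm36.xml", "charmm36/water.xml"),
}


def needs_periodic_box(preset: str) -> bool:
    """Explicit-solvent presets expect a pre-solvated periodic box."""
    if preset not in PRESET_FILES:
        raise ValueError(f"unknown preset: {preset!r}")
    return preset.endswith("_explicit")


def build_system(preset: str, topology):
    """Create an OpenMM System for a preset (requires OpenMM at call time)."""
    from openmm import app  # imported lazily so the table above works without OpenMM

    forcefield = app.ForceField(*PRESET_FILES[preset])
    # PME needs a periodic box; implicit solvent runs without any cutoff.
    method = app.PME if needs_periodic_box(preset) else app.NoCutoff
    return forcefield.createSystem(
        topology, nonbondedMethod=method, constraints=app.HBonds
    )


print(needs_periodic_box("amber14_implicit"))
print(needs_periodic_box("charmm36_explicit"))
```

Keeping the periodic-box requirement next to the file list lets a caller reject an unsolvated input before any OpenMM objects are built.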
Manual Installation
Use this method if you want GPU-accelerated OpenMM or prefer a local checkout.
1. Install OpenMM (conda recommended for GPU support)
conda install -c conda-forge openmm pdbfixer
2. Clone and install dependencies
git clone https://github.com/YOUR_USERNAME/biomolecule-modeling-mcp
cd biomolecule-modeling-mcp
uv sync
3. Register with Claude Code
claude mcp add "biomolecule-modeling" --scope user -- \
/path/to/uv run --directory /path/to/biomolecule-modeling-mcp python main.py
Replace /path/to/uv with the output of which uv, and use the absolute path to the cloned repo.
Important: use --directory (not --project). --project sets the virtualenv but leaves cwd wherever the caller is, which breaks relative utils/ imports.
Standalone scripts (no MCP client needed)
uv run utils/pdb_query.py structure.pdb # inspect structure
uv run utils/pdb_fix.py input.pdb # fix structure
uv run utils/openmm_relax.py input.pdb # minimise structure
Example Sessions
The following examples show real natural-language prompts you can give Claude, and the sequence of tool calls the server will drive in response.
Case 1 — Download, inspect, clean, and relax a PDB from RCSB
Your prompt to Claude:
Download PDB 1UBQ, query its structure information, clean the protein part, and relax the structure.
What Claude does:
Step 1 — Download the file
Claude fetches 1UBQ.pdb from RCSB (https://files.rcsb.org/download/1UBQ.pdb) and saves it locally.
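If you prefer to fetch the file yourself, the download step can be reproduced with the standard library. This is a minimal sketch: the URL pattern is the one shown above, while `rcsb_pdb_url` and `download_pdb` are hypothetical helper names, and the ID check assumes classic four-character PDB IDs.

```python
import urllib.request
from pathlib import Path


def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


def download_pdb(pdb_id: str, dest_dir: str = ".") -> Path:
    """Download the PDB file into dest_dir and return the local path."""
    url = rcsb_pdb_url(pdb_id)
    dest = Path(dest_dir) / url.rsplit("/", 1)[1]
    urllib.request.urlretrieve(url, dest)  # performs a network request
    return dest


print(rcsb_pdb_url("1ubq"))
```

Calling `download_pdb("1ubq")` would save 1UBQ.pdb in the current directory, ready to be passed to query_pdb_structure.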
Step 2 — Inspect the structure
query_pdb_structure(
pdb_file="1UBQ.pdb",
notes="Initial inspection before cleaning. Need to identify chain types, "
"any non-standard residues, and structural gaps."
)
Tool output (abbreviated)
chains:
A — protein, 76 residues (MET1–GLY76), no numbering gaps, no structural breaks
heterogens: HOH (58 water molecules)
non_standard_residues: []
missing_residues: []
Claude's reasoning: Single protein chain, no gaps, no non-standard residues. Waters present — will remove them. No ligands to worry about; can go straight to fix_pdb_structure.
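The "numbering gaps" reported by the inspection step can be understood as jumps in consecutive residue numbers within a chain. The sketch below shows that idea on raw PDB text using the fixed-column ATOM record layout. It is not the server's implementation (which the project layout describes as BioPython-based); it ignores insertion codes, and `numbering_gaps` is a hypothetical name.

```python
from collections import defaultdict


def numbering_gaps(pdb_text: str) -> dict:
    """Map chain ID -> list of (last_present, next_present) residue numbers."""
    seen = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain = line[21]            # column 22: chain identifier
        resseq = int(line[22:26])   # columns 23-26: residue sequence number
        if not seen[chain] or seen[chain][-1] != resseq:
            seen[chain].append(resseq)
    return {
        chain: [(a, b) for a, b in zip(nums, nums[1:]) if b - a > 1]
        for chain, nums in seen.items()
    }


def _ca_record(serial: int, chain: str, resseq: int) -> str:
    # Minimal, column-correct ATOM record for a CA atom (toy data only).
    return f"ATOM  {serial:5d}  CA  ALA {chain}{resseq:4d}       0.000   0.000   0.000"


toy = [("A", n) for n in (1, 2, 3, 7, 8)] + [("B", n) for n in (1, 2)]
sample = "\n".join(
    _ca_record(i, chain, n) for i, (chain, n) in enumerate(toy, start=1)
)
gaps = numbering_gaps(sample)
print(gaps)
```

A numbering gap is only a hint: a real structural break also shows up as an unusually long distance between consecutive backbone atoms, which is why the tool reports the two separately.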
Step 3 — Clean the structure
fix_pdb_structure(
input_pdb="1UBQ.pdb",
output_pdb="1UBQ_fixed.pdb",
remove_heterogens=True, # removes the 58 water molecules
add_hydrogens=True,
ph=7.0,
fix_missing_residues=False, # no gaps — nothing to fix
notes="Removing waters, adding hydrogens at pH 7.0. No non-standard residues "
"or gaps detected so no other fixing steps needed."
)
Tool output (abbreviated)
output_pdb: 1UBQ_fixed.pdb
n_residues: 76
n_atoms: 1231
heterogens_removed: 58
nonstandard_residues: []
missing_atoms_added: 0
hydrogens_added: true
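For readers who want to see what such a cleaning step looks like against PDBFixer directly, the following is a minimal sketch of the usual PDBFixer call sequence. It is not the code in utils/pdb_fix.py: `fix_structure` and its parameters are hypothetical, and it needs pdbfixer and OpenMM installed when called.

```python
def fix_structure(input_pdb, output_pdb, ph=7.0, keep_water=False,
                  fix_missing_residues=False):
    """Clean a PDB file with PDBFixer (illustrative sketch only)."""
    # Imported lazily so this module loads even without pdbfixer/OpenMM.
    from pdbfixer import PDBFixer
    from openmm.app import PDBFile

    fixer = PDBFixer(filename=input_pdb)

    fixer.findMissingResidues()
    if not fix_missing_residues:
        # Drop the detected gaps so no loops are built.
        fixer.missingResidues = {}

    fixer.findNonstandardResidues()
    fixer.replaceNonstandardResidues()            # e.g. MSE -> MET
    fixer.removeHeterogens(keepWater=keep_water)  # ligands, ions, waters

    fixer.findMissingAtoms()
    fixer.addMissingAtoms()
    fixer.addMissingHydrogens(ph)

    with open(output_pdb, "w") as handle:
        PDBFile.writeFile(fixer.topology, fixer.positions, handle)
    return output_pdb
```

Under these assumptions, `fix_structure("1UBQ.pdb", "1UBQ_fixed.pdb")` would correspond to the call shown in this step: waters removed, hydrogens added at pH 7.0, and no loop building.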
Step 4 — Relax the structure
relax_pdb_structure(
input_pdb="1UBQ_fixed.pdb",
output_pdb="1UBQ_relaxed.pdb",
preset="amber14_implicit",
run_md=False,
notes="Minimisation-only run. Structure is clean and small (76 residues); "
"a full MD run is not needed for clash removal."
)
Tool output (abbreviated)
output_pdb: 1UBQ_relaxed.pdb
preset_used: amber14_implicit
n_atoms: 1231
energy_initial_kJ_mol: -18423.4
energy_final_kJ_mol: -24817.9
Claude's summary: Energy dropped ~6400 kJ/mol during minimisation — typical for a crystal structure after hydrogen addition. Output 1UBQ_relaxed.pdb is ready for downstream MD or coarse-graining.
Case 2 — Clean a local multi-chain structure, keep only selected chains
Your prompt to Claude:
Query the PDB structure under ./structures/complex.pdb, clean it to keep only chain A and chain B, and relax the structure.
What Claude does:
Step 1 — Inspect the structure
query_pdb_structure(
pdb_file="./structures/complex.pdb",
notes="Inspecting the multi-chain complex before chain selection and cleaning."
)
Tool output (abbreviated)
chains:
A — protein, 312 residues (SER5–LEU316), 2 numbering gaps (at 47–51, 203–208)
B — protein, 289 residues (MET1–LYS289), no gaps
C — protein, 145 residues (ALA1–GLU145), no gaps
D — nucleic (DNA), 24 residues
heterogens: MG (3 ions), HOH (412 waters), ATP (1 molecule)
non_standard_residues: [MSE → MET in chain A (4 occurrences)]
Claude's reasoning: User wants chains A and B only. Chain A has two numbering gaps — these are likely short disordered loops in the crystal; will not attempt loop modelling. MSE (selenomethionine) in chain A will be converted to MET. ATP and ions must be removed before OpenMM relaxation.
Step 2 — Clean, keeping only chains A and B
fix_pdb_structure(
input_pdb="./structures/complex.pdb",
output_pdb="./structures/complex_AB_fixed.pdb",
chains_to_keep=["A", "B"],
replace_nonstandard_residues=True, # converts MSE → MET
remove_heterogens=True, # removes ATP, Mg²⁺, waters
add_missing_heavy_atoms=True,
add_hydrogens=True,
ph=7.0,
fix_missing_residues=False, # gaps are 4–5 residues; skipping loop modelling
notes="Keeping chains A and B only per user request. Converting 4 × MSE → MET "
"in chain A. Removing ATP, Mg ions, and waters — these cannot be "
"parameterised with standard AMBER presets. Not fixing the two loops "
"in chain A (gaps of 4–5 residues) to avoid poor loop geometry."
)
Tool output (abbreviated)
output_pdb: complex_AB_fixed.pdb
n_residues: 601 (312 + 289)
n_atoms: 9847
chains_kept: [A, B]
nonstandard_residues: [MSE→MET ×4 in chain A]
heterogens_removed: 416 (ATP ×1, MG ×3, HOH ×412)
missing_atoms_added: 12
hydrogens_added: true
Claude's warning to user: "Chain A still has two short gaps at residues 47–51 and 203–208 (missing from the crystal structure). These regions will be absent in the relaxed output. If you need complete loops for your downstream simulation, let me know and I can attempt loop modelling — though the geometry for gaps this size may be rough."
Step 3 — Relax the two-chain complex
relax_pdb_structure(
input_pdb="./structures/complex_AB_fixed.pdb",
output_pdb="./structures/complex_AB_relaxed.pdb",
preset="amber14_implicit",
run_md=True,
restrain_backbone=True,
restraint_k=1000.0,
production_steps=25000,
notes="Running short restrained MD (25 000 steps ≈ 100 ps) to relax side chains "
"around the 12 newly added heavy atoms and the two gap termini in chain A. "
"Backbone restrained at 1000 kJ/mol/nm² to preserve experimental fold."
)
Tool output (abbreviated)
output_pdb: complex_AB_relaxed.pdb
preset_used: amber14_implicit
n_atoms: 9847
energy_initial_kJ_mol: -71203.1
energy_post_min_kJ_mol: -89441.6
energy_post_md_kJ_mol: -94308.2
energy_final_kJ_mol: -95112.4
Claude's summary: Energy decreased steadily through minimisation → MD → final minimisation. The structure is converged and ready. Output: complex_AB_relaxed.pdb.
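The arithmetic behind this summary can be checked from the reported numbers. The snippet below computes the energy change for each stage and the timestep implied by the notes in the relaxation call (25 000 steps ≈ 100 ps). The timestep is an inference from those two figures, not a documented setting.

```python
# Energies (kJ/mol) copied from the tool output above.
stages = [
    ("initial", -71203.1),
    ("post_min", -89441.6),
    ("post_md", -94308.2),
    ("final", -95112.4),
]

# Change in energy on entering each stage, rounded to 0.1 kJ/mol.
deltas = [
    (name_b, round(e_b - e_a, 1))
    for (_, e_a), (name_b, e_b) in zip(stages, stages[1:])
]
total = round(stages[-1][1] - stages[0][1], 1)
monotonic = all(delta < 0 for _, delta in deltas)

# Timestep implied by "25 000 steps ≈ 100 ps" (inferred, not documented).
implied_timestep_fs = 100 * 1000 / 25000

print(deltas)
print(total, monotonic, implied_timestep_fs)
```

Most of the drop comes from the first minimisation, and the final minimisation contributes less than 1000 kJ/mol, which is the pattern the summary describes.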
Project Structure
biomolecule-modeling-mcp/
├── main.py # MCP server entry point; all @mcp.tool() definitions
├── pyproject.toml # dependencies + CLI entry point (biomodeling)
├── utils/
│ ├── pdb_query.py # Structure inspection (BioPython)
│ ├── pdb_fix.py # PDBFixer wrapper
│ ├── pdb_splitter.py # Split multi-component PDB by chain type
│ ├── pdb_assemble.py # Merge PDB files
│ └── openmm_relax.py # OpenMM energy minimisation + restrained MD
└── data/ # Example / test PDB files
Key Gotchas
- Never fix large loops by default. fix_missing_residues=False is the safe default; loops > 5–10 residues produce poor geometry with PDBFixer.
- Ligands break standard relaxation. Non-standard HETATM residues must be removed (or separately parameterised) before running OpenMM with AMBER/CHARMM presets.
- OpenMM >= 8.x implicit solvent. The implicit solvent XML (e.g. implicit/gbn2.xml) goes into ForceField(), not createSystem().
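For the ligand gotcha, a quick pre-flight scan for non-water HETATM residues can show what would have to be removed or parameterised before relaxation. This is a minimal sketch that reads raw PDB text by column position: `hetero_residue_counts` is a hypothetical helper, and treating HOH and WAT as the water names is an assumption.

```python
from collections import Counter

WATER_NAMES = {"HOH", "WAT"}  # assumed water residue names


def hetero_residue_counts(pdb_text: str) -> dict:
    """Count non-water HETATM residues by residue name."""
    residues = set()
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        resname = line[17:20].strip()   # columns 18-20: residue name
        if resname in WATER_NAMES:
            continue
        # (name, chain, residue number + insertion code) identifies one residue.
        residues.add((resname, line[21:22], line[22:27]))
    return dict(Counter(name for name, _, _ in residues))


def _het_record(serial, atom, resname, chain, resseq):
    # Minimal, column-correct HETATM record (toy data only).
    return f"HETATM{serial:5d} {atom:<4s} {resname:>3s} {chain}{resseq:4d}"


sample = "\n".join([
    _het_record(1, "PG", "ATP", "A", 401),
    _het_record(2, "O1G", "ATP", "A", 401),
    _het_record(3, "MG", "MG", "A", 402),
    _het_record(4, "MG", "MG", "A", 403),
    _het_record(5, "O", "HOH", "A", 501),
])
counts = hetero_residue_counts(sample)
print(counts)
```

Modified amino acids such as MSE are also stored as HETATM records, so they would appear in this count even though the fix step can convert them to standard residues.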
License
MIT