Tripitaka MCP Server
An MCP Server for searching and citing content from the Pāli Tipiṭaka. Gives AI agents (such as Claude or Cursor) the ability to look up suttas, quote the teachings, and compare translations across languages.
This project is offered as Dhamma Dāna: 100% free, non-commercial only. License details: LICENSE (code) + NOTICE.md (data)
Features
- Full Tipiṭaka coverage at parity with SuttaCentral: all three baskets indexed (~444K segments), namely Sutta (Pāli + Sujato English), Vinaya (Pāli + Brahmali English), and Abhidhamma (Pāli only; upstream `bilara-data` has no English for any Abhidhamma book). Live counts via `list_structure`.
- Hybrid Search: highest precision by combining keyword and semantic search through Reciprocal Rank Fusion (RRF). Ready to use.
- Keyword Search: trigram fuzzy matching with cross-language alignment.
- Semantic Search: meaning-based search via vector similarity (pgvector).
- Translation Comparison: view and compare renderings across editions, aligned at the segment level.
- Dictionary Bridge: built-in dictionary of 20,000+ entries (P. A. Payutto, PTS, DPPN).
- Get Sutta & Reference: fetch sutta content by ID (e.g. `mn1`, `pli-tv-bu-vb-pj1`, `patthana1.1`) and generate properly formatted academic citations.
- Pāli word analyzer: strip inflectional suffixes to find the root form when dictionary lookup misses (`bhikkhūnaṁ` → `bhikkhu`).
- Cross-reference URLs in every response: clickable deep links to SuttaCentral (Pāli + Sujato English + segment anchor) plus 84000.org volume routing for Thai users. AI clients can surface these so users verify the source in one click.
- Dual transport: both legacy SSE (`/sse`) and canonical Streamable HTTP (`/mcp`, MCP spec 2025-03-26).
- MCP Resources: `tripitaka://structure`, `tripitaka://sutta/{id}`, `tripitaka://word/{w}` for clients that pin context as resources.
- Curated reference pages at `/topics/*`: six markdown pages covering canon structure, getting-started + tool selection, places (Mahājanapada + holy sites + cosmology), 10 foundational themes with locus classicus, ~30 major figures, and a phase-based timeline of the Buddha's 45-year mission. Sutta IDs verified against live data; AI clients can fetch a page in one shot instead of running 30+ tool calls.
- Claude skill: `skills/tipitaka-research.md` ships a ready-to-install workflow file that activates a multi-step research pattern (clarify → verify coverage → search → drill in → cite) on Claude Desktop / Claude Code.
- Postman Ready: ships with a Postman collection for testing the API.
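For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the general technique in Python. It is not the server's implementation: the constant `k=60` is a commonly used default assumed here, and the segment IDs are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative segment IDs only, not real search output.
keyword_hits = ["mn118:15.1", "sn54.1:1.3", "mn10:4.1"]
semantic_hits = ["mn10:4.1", "mn118:15.1", "dn22:2.1"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
# ['mn118:15.1', 'mn10:4.1', 'sn54.1:1.3', 'dn22:2.1']
```

Items that rank well in both lists rise to the top, while items found by only one method still appear further down. This is why fusion tends to be more forgiving than either search alone.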
Tech Stack
| Technology | Role |
|---|---|
| Python + FastMCP | MCP Server |
| PostgreSQL + pgvector | Database + Vector Search |
| sentence-transformers | Embeddings for semantic search |
| Docker Compose | Infrastructure |
Quick Start
No setup: connect to the public Dhamma Dāna server
The maintainers run a free public instance at tripitaka-mcp.com.
| Endpoint | Use |
|---|---|
| https://mcp.tripitaka-mcp.com/mcp | Streamable HTTP (MCP spec 2025-03-26) |
| https://mcp.tripitaka-mcp.com/sse | Legacy SSE (older clients) |
Connect Claude Desktop in three steps (no install, no Docker, no GPU; you just need Node.js):
1. Find your absolute npx path. Claude Desktop doesn't read your shell profile, so a bare npx won't resolve. Open a terminal:
which npx
# example: /Users/you/.nvm/versions/node/v22.14.0/bin/npx
2. Open claude_desktop_config.json (~/Library/Application Support/Claude/ on macOS, %APPDATA%\Claude\ on Windows) and add the entry below. Replace YOUR_NPX_PATH with the output from step 1, and YOUR_NODE_BIN_DIR with that path's parent directory:
{
"mcpServers": {
"tripitaka": {
"command": "YOUR_NPX_PATH",
"args": ["-y", "mcp-remote", "https://mcp.tripitaka-mcp.com/mcp"],
"env": { "PATH": "YOUR_NODE_BIN_DIR:/usr/local/bin:/usr/bin:/bin" }
}
}
}
3. Quit Claude Desktop completely (⌘Q on macOS, tray → Quit on Windows) and reopen. The indicator in the bottom-left should show tripitaka with 10 tools available.
First connection takes 5-10 seconds while `npx` downloads `mcp-remote` on demand, so give Claude Desktop a moment after restart before assuming it failed.
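If you would rather not edit the placeholders by hand, the entry from step 2 can be generated. The sketch below is a hypothetical helper that is not shipped with the repo. It mirrors the macOS/Linux example above, so the `:` separators in `PATH` would need adjusting on Windows.

```python
import json
import os
import shutil

# Public Streamable HTTP endpoint from the table above.
SERVER_URL = "https://mcp.tripitaka-mcp.com/mcp"


def build_config(npx_path, server_url=SERVER_URL):
    """Return the mcpServers entry for an absolute npx path."""
    node_bin_dir = os.path.dirname(npx_path)  # YOUR_NODE_BIN_DIR in step 2
    return {
        "mcpServers": {
            "tripitaka": {
                "command": npx_path,
                "args": ["-y", "mcp-remote", server_url],
                "env": {"PATH": f"{node_bin_dir}:/usr/local/bin:/usr/bin:/bin"},
            }
        }
    }


npx = shutil.which("npx")  # same lookup as `which npx` in step 1
if npx:
    print(json.dumps(build_config(npx), indent=2))
else:
    print("npx not found on PATH; install Node.js first")
```

Run it from the shell where `npx` works, then merge the printed entry into claude_desktop_config.json.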
Once connected, try asking Claude things like:
- "What does the Buddha teach about mindfulness of breathing? Quote the relevant passages from MN 118."
- "Show me the full text of the Karaṇīyamettasutta in Pāli and English."
- "What does the Pāli word sati mean according to the Payutto dictionary?"
- "Find suttas where the Buddha discusses anger."
Claude will pick the right tool, fetch the canonical Pāli, and surface clickable links back to SuttaCentral for verification.
The hosted server is rate-limited (10 req/10s + 60 req/min per IP) and offered for personal study, research, and dhamma practice. See NOTICE.md before redistributing or using commercially.
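Scripts that call the hosted server in a loop can stay under both limits with a small client-side throttle. The sketch below uses a sliding window and assumes the server counts requests in a similar way, which this README does not specify. The `Throttle` class is hypothetical and not part of the project.

```python
from collections import deque


class Throttle:
    """Client-side sliding-window throttle for (max_requests, seconds) limits."""

    def __init__(self, limits=((10, 10.0), (60, 60.0))):
        self.limits = limits
        self.longest = max(window for _, window in limits)
        self.sent = deque()  # timestamps of recent requests, oldest first

    def delay(self, now):
        """Seconds to wait at time `now` before a request fits every limit."""
        # Forget requests older than the longest window.
        while self.sent and now - self.sent[0] >= self.longest:
            self.sent.popleft()
        wait = 0.0
        for max_requests, window in self.limits:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= max_requests:
                # A slot frees up once this timestamp leaves the window.
                blocking = recent[len(recent) - max_requests]
                wait = max(wait, blocking + window - now)
        return wait

    def record(self, now):
        """Note that a request was sent at time `now`."""
        self.sent.append(now)


throttle = Throttle()
for i in range(10):
    throttle.record(now=i * 0.1)  # ten requests within the first second
print(throttle.delay(now=1.0))    # prints 9.0: the 10-per-10s window is full
```

In real use, pass `time.monotonic()` as `now`, sleep for `delay(...)` seconds, then call `record(...)` just before sending each request.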
Fastest local path: use the installer (recommended for non-developers)
git clone https://github.com/dhamma-seeker/tripitaka-mcp.git
cd tripitaka-mcp
./scripts/install.sh
The installer downloads a prepared database dump from Hugging Face (dhamma-seeker/tripitaka-mcp-dump) and restores it automatically, cutting setup time from 2-4 hours (loading data + generating embeddings) down to ~5 minutes. (If a local dump file already exists, the local copy is used instead.)
The installer will:
- Verify that `docker`, `compose`, `openssl`, and `curl` are installed
- Generate `.env` with random passwords (for both the admin and the readonly user)
- Download the dump from Hugging Face (if not already local)
- Start the DB and restore the dump
- Set up the readonly role and runtime timeouts
- Print a ready-to-paste Claude Desktop config
Options:
./scripts/install.sh --dump PATH # use an existing dump file
./scripts/install.sh --dump-url URL # override the dump source
./scripts/install.sh --no-dump # skip restore (load data yourself later)
Manual setup (for developers)
1. Clone & Setup
git clone https://github.com/dhamma-seeker/tripitaka-mcp.git
cd tripitaka-mcp
cp .env.example .env
# Set POSTGRES_PASSWORD in .env to a random password
2. Start Database
docker compose up db -d
3. Install Dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
4. Initialize Database & Load Data
# 1. Seed metadata (pitaka, nikāya)
python scripts/seed_metadata.py
# 2. Download & load Sutta Piṭaka data from SuttaCentral
python scripts/data_loader.py
# 3. Load Thai CC0 translations (Dhīranando & Jayasāro)
python scripts/load_thai_cc0.py
# 4. Load dictionaries (DPD, PTS, DPPN, and the Payutto dictionary)
python scripts/load_dictionary.py
# 5. Generate embeddings for semantic / hybrid search
python scripts/generate_embeddings.py
5. Run MCP Server
python main.py
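The data-loading commands in step 4 must run in order, and the embedding step is slow, so it can help to chain them and stop at the first failure. The runner below is a hypothetical convenience sketch, not a script shipped in the repo. It only uses the script paths listed in step 4.

```python
import subprocess
import sys

# Loader scripts from step 4, in the order given there.
STEPS = [
    "scripts/seed_metadata.py",
    "scripts/data_loader.py",
    "scripts/load_thai_cc0.py",
    "scripts/load_dictionary.py",
    "scripts/generate_embeddings.py",
]


def run_pipeline(steps=STEPS, dry_run=False):
    """Run each loader with the current interpreter, stopping on failure."""
    commands = [[sys.executable, step] for step in steps]
    for command in commands:
        print("$", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
    return commands


run_pipeline(dry_run=True)  # prints the plan without touching the database
```

Call `run_pipeline()` from the repo root with the virtualenv active to execute the steps for real.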
Testing with Postman
The project supports Postman testing in SSE mode:
- Run the server with: `MCP_TRANSPORT=sse python main.py`
- Import postman_collection.json into Postman
- Invoke the tools directly
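When invoking tools directly, each call is a JSON-RPC 2.0 request using MCP's `tools/call` method. The sketch below only builds the request body. The argument name `sutta_id` is a guess for illustration, so check the tool's input schema (or the Postman collection) for the real parameter names. The `initialize` exchange that starts an MCP session is omitted.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request IDs: 1, 2, 3, ...


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request body for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# `sutta_id` is a hypothetical argument name, not confirmed by this README.
body = tool_call("get_sutta", {"sutta_id": "mn1"})
print(json.dumps(body, indent=2))
```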
Production Deployment
To deploy to production without reloading the data or regenerating the embeddings, restore from a database dump; this is the recommended path.
docker compose -f docker-compose.prod.yml up -d --build
The production stack runs 3 services:
- `db`: PostgreSQL + pgvector (internal only, no exposed port)
- `mcp-server`: FastMCP (runs as a readonly user, read-only FS, `cap_drop: ALL`)
- `caddy`: reverse proxy + Let's Encrypt + rate limit (10 req/10s and 60 req/1 min per IP)
For an extra hardening layer, front Caddy with Cloudflare (DNS proxy + rate-limit rules + DDoS protection on the free tier).
Full details: DEPLOYMENT.md
Connecting to Claude Desktop
The repo ships claude_desktop_config.example.json with three ready-to-use entries. Copy whichever fits your setup into claude_desktop_config.json (~/Library/Application Support/Claude/ on macOS, %APPDATA%\Claude\ on Windows), then edit the absolute paths:
| Entry | When to use | Transport |
|---|---|---|
| tripitaka-local | You ran the installer locally on the same machine as Claude Desktop | stdio (no network) |
| tripitaka-remote | You self-hosted the server on a VPS and want the modern transport | Streamable HTTP (/mcp) |
| tripitaka-remote-sse | Your client doesn't support Streamable HTTP yet | Legacy SSE (/sse) |
The remote entries route through mcp-remote: Claude Desktop → npx bridge → remote MCP. The example file has annotated comments explaining each field; remove the _comment keys before saving.
Heads-up for nvm users: `command` and `env.PATH` need absolute node paths, because Claude Desktop doesn't read your shell profile. Find the right paths with `which npx` / `which python` while your normal shell is active.
Optional: install the research skill
For Claude Desktop / Claude Code users, copying the bundled skill activates the multi-step research workflow automatically:
mkdir -p ~/.claude/skills
cp skills/tipitaka-research.md ~/.claude/skills/
# Restart Claude Desktop (Cmd+Q then reopen) to pick up the skill
Details in skills/README.md.
MCP Tools (10 total)
| Tool | Description |
|---|---|
| `search_hybrid` | (Recommended for concept search) Combined keyword + semantic via RRF. Best when looking for "discourses about X". |
| `search_by_keyword` | Trigram keyword search. Best for exact word lookups (appamāda, ānāpānassati). |
| `search_semantic` | Pure vector similarity. Usually you want `search_hybrid` instead. |
| `get_sutta` | Fetch a full sutta by ID (e.g. mn1, dn22, dhp1-20). Returns every segment with cross-reference URLs. |
| `get_reference` | Generate a properly formatted academic citation with all source URLs. |
| `compare_translations` | Compare renderings of a single segment across editions. |
| `list_structure` | Show the Tipiṭaka structure with segment-count coverage per nikāya. |
| `list_editions` | List Thai/English translation editions currently loaded. |
| `get_word_definition` | Pāli dictionary lookup (PTS, DPPN, and the Payutto Thai dictionary). |
| `parse_pali_word` | Strip Pāli suffixes to recover the root form when `get_word_definition` misses (bhikkhūnaṁ → bhikkhu). |
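To make the idea behind `parse_pali_word` concrete, here is a toy suffix stripper. The four rules are hand-picked for illustration and cover only a few noun endings. The server's actual rule set is not shown in this README and is assumed to be far more complete.

```python
import unicodedata

# Toy rules only: (inflected ending, stem ending). Not the server's table.
SUFFIX_RULES = [
    ("ūnaṁ", "u"),  # genitive/dative plural, u-stems: bhikkhūnaṁ -> bhikkhu
    ("ānaṁ", "a"),  # genitive/dative plural, a-stems: buddhānaṁ -> buddha
    ("assa", "a"),  # genitive/dative singular, a-stems: buddhassa -> buddha
    ("ena", "a"),   # instrumental singular, a-stems: buddhena -> buddha
]


def candidate_stems(word):
    """Return possible dictionary forms for an inflected Pāli word."""
    # Normalize case, Unicode composition, and the two niggahīta spellings.
    word = unicodedata.normalize("NFC", word.lower()).replace("ṃ", "ṁ")
    stems = []
    for suffix, ending in SUFFIX_RULES:
        suffix = unicodedata.normalize("NFC", suffix)
        if word.endswith(suffix) and len(word) > len(suffix):
            stems.append(word[: -len(suffix)] + ending)
    return stems


print(candidate_stems("bhikkhūnaṁ"))  # ['bhikkhu']
print(candidate_stems("buddhassa"))   # ['buddha']
```

A real analyzer has to deal with ambiguous endings, so returning a list of candidates and checking each against the dictionary is a common design.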
⚠️ Note on search_semantic
The vector index is built only on text_pali (SuttaCentral's bilara-data does not yet include Thai translations) using a multilingual MiniLM model that is not specifically trained on Pāli. As a result:
- Pāli / English queries: accurate (good cross-lingual alignment)
- Thai queries: loose matches, not recommended
- For exact keywords like `appamāda`, `search_by_keyword` is more precise
- For general-purpose search, `search_hybrid` (keyword + semantic) tolerates this limitation best
Upgrading to a Pāli-trained embedding model (e.g. bge-m3) plus embedding the Thai edition is on the roadmap.
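As background for the note above, vector search ranks segments by how closely their embedding points in the same direction as the query embedding. The sketch below uses cosine similarity on made-up 3-dimensional vectors. Real embeddings have hundreds of dimensions, and the ranking runs inside pgvector rather than in Python. Which distance metric the server uses is an assumption here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query_vec, segment_vecs):
    """Return segment IDs ordered from most to least similar to the query."""
    return sorted(
        segment_vecs,
        key=lambda seg_id: cosine_similarity(query_vec, segment_vecs[seg_id]),
        reverse=True,
    )


# Made-up vectors and labels, for illustration only.
segments = {
    "close": [0.9, 0.1, 0.0],
    "unrelated": [0.0, 1.0, 0.0],
    "partial": [0.5, 0.5, 0.0],
}
print(rank_by_similarity([1.0, 0.0, 0.0], segments))
# ['close', 'partial', 'unrelated']
```

A model that embeds a language poorly places its queries in less meaningful positions, which is one way to picture the loose matches described above for Thai queries.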
Project Structure
tripitaka-mcp/
├── main.py # Main MCP Server (10 tools + 3 resources)
├── db/
│   ├── connection.py # Database connection pool
│   └── schema.py # Schema (supports translation table)
├── embedding/
│   └── model.py # SentenceTransformer wrapper
├── scripts/
│   ├── install.sh # One-shot installer (HF dump → DB)
│   ├── deploy.sh # Deploy / restart on a VPS
│   ├── backup.sh # pg_dump → S3-compatible store
│   ├── dump_and_publish.sh # Verify embeddings → pg_dump → upload to HuggingFace
│   ├── seed_metadata.py # Seed pitaka/nikāya metadata
│   ├── data_loader.py # Load Sutta Piṭaka (Pāli + Sujato English)
│   ├── load_vinaya.py # Vinaya loader (Vibhaṅga + Pātimokkha + Khandhaka + Parivāra, Brahmali EN)
│   ├── load_abhidhamma.py # Abhidhamma loader (7 books, Pāli only; bilara has no EN)
│   ├── load_thai_cc0.py # Thai translation loader
│   ├── load_dictionary.py # Load dictionary data
│   ├── scrape_payutto.py # Web scraper for the Payutto dictionary
│   ├── generate_embeddings.py # Generate vector embeddings
│   ├── run_embedding_with_retry.sh # Resilient wrapper around embedding generation (retries on DB drop)
│   ├── check_embedding_progress.py # Live progress snapshot (or --watch mode) for the embedding job
│   ├── smoke_test.sh # Endpoint smoke test (TLS + /sse + /mcp + /health)
│   └── test_full_sutta.py # Full-content smoke test (22 size-tiered suttas across all 3 piṭakas)
├── topics/ # Static markdown pages served at /topics/*
│   ├── README.md # Index of available topic pages
│   ├── tipitaka-overview.md # Canon structure + coverage
│   ├── getting-started.md # Connection paths, tool selection, prompt patterns
│   ├── places.md # Geography of the suttas (Mahājanapada, holy sites, cosmology)
│   ├── themes.md # 10 foundational teachings + locus classicus
│   └── people.md # ~30 major figures (chief disciples, lay supporters, kings)
├── skills/ # Portable Claude skills for AI clients
│   ├── README.md # How to install
│   └── tipitaka-research.md # Multi-step research workflow
├── infra/ # Reverse proxy + deploy config
│   ├── Caddyfile # Caddy: TLS, rate limit, /topics, /sse, /mcp
│   ├── Dockerfile.caddy # Caddy + caddy-ratelimit plugin
│   ├── cloud-init.yml # VPS bootstrap
│   └── *.tf # Terraform (provider-agnostic)
├── docs/
│   └── CAPACITY.md # Capacity planning per VPS spec
├── claude_desktop_config.example.json
├── docker-compose.yml # Dev (single mcp-server)
├── docker-compose.prod.yml # Prod (db + 2 mcp-server + caddy)
├── Dockerfile
└── requirements.txt
Data Sources & License
This project aggregates data from multiple sources under different licenses. Please read NOTICE.md in full before redistributing.
| Source | License | Note |
|---|---|---|
| Source code | MIT | Free to use, fork, modify |
| SuttaCentral bilara-data | CC0 | Public domain |
| Thai translations (Dhīranando, Jayasāro) | CC0 | Via SuttaCentral |
| Dictionary of Buddhism by Somdet Phra Buddhaghosacariya (P. A. Payutto) | Dhamma Dāna | ⚠️ Non-commercial use only |
| PTS / DPPN / Dhammika Dictionaries | Public Domain / CC | - |
⚠️ If you plan to fork or redistribute
- Use in free / dhamma-dāna / educational projects: allowed
- Run on your own machine / personal use: allowed
- Do not use in any paid product or service (because of the Payutto dictionary)
- Do not modify the dictionary content
For commercial use: remove the dictionary component, or contact Wat Nyanavesakavan for permission.
Credits & Attribution
See CREDITS.md for contributor details and NOTICE.md for license terms.
Gratitude to:
- Somdet Phra Buddhaghosacariya (P. A. Payutto) + Wat Nyanavesakavan
- SuttaCentral and the Thai & English translators
- 84000.org
Sādhu. May the sharing of this Dhamma bring benefit and happiness to all beings.