mcp_poc
A PDF-to-Markdown converter structured with the Model-View-Controller (MVC) pattern and exposed through the Model Context Protocol (MCP).
Architecture
The project separates concerns into Controller (server), Model, and View layers. A client connects to the server over MCP's stdio transport.
flowchart TB
subgraph Client["Client Layer"]
C[mcp_client.py<br/>Stdio Client Connector]
end
subgraph Server["Server Layer (Controller)"]
S[mcp_server.py<br/>FastMCP Server]
T[convert_pdf Tool]
P[summarize_markdown Prompt]
end
subgraph Model["Model Layer"]
M[PyMuPDF / fitz<br/>PDF Text Extraction]
end
subgraph View["View Layer"]
V[Markdown Formatter<br/>## Page N]
end
C -->|stdio JSON-RPC| S
S --> T
T --> M
M --> V
V --> T
T --> S
S -->|Tool Result| C
File Responsibilities
| File | Role | Description |
|---|---|---|
| `mcp_server.py` | Controller | Exposes `convert_pdf` as an MCP tool and `summarize_markdown` as a prompt. Orchestrates Model and View logic. |
| `mcp_client.py` | Connector | Manages the stdio client session used to call the server's tools from external processes or LLMs. |
| `main.py` | Entry Point | Can run the server in stdio mode (`--server`) or execute direct local conversions for testing. |
| `test_integration.py` | Test | End-to-end integration test that launches the server via stdio and verifies tool execution. |
Quick Start
1. Install Dependencies
python -m venv venv
# Windows
.\venv\Scripts\pip install -r requirements.txt
# macOS / Linux
source venv/bin/activate && pip install -r requirements.txt
2. Run a Direct Conversion (Local Test)
.\venv\Scripts\python main.py data/sample.pdf
3. Run the MCP Server (for Claude / IDE integration)
.\venv\Scripts\python main.py --server
4. Run Integration Test (Client → Server)
.\venv\Scripts\python test_integration.py
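The client-to-server round trip that the integration test exercises could look roughly like the sketch below. It assumes the official `mcp` Python SDK, that the server is launched with `main.py --server` from the project root, and that the tool takes an argument named `pdf_path`. The argument name and both helper functions are hypothetical and not taken from the project's code.

```python
import sys


def result_text(content) -> str:
    """Concatenate the text parts of a tool result's content list."""
    # Non-text parts (e.g. images) have no `text` attribute and are skipped.
    return "".join(getattr(part, "text", "") for part in content)


async def call_convert_pdf(pdf_path: str) -> str:
    """Spawn the server over stdio and call the convert_pdf tool once."""
    # Imported lazily so result_text() can be used without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(
        command=sys.executable,          # the current Python interpreter
        args=["main.py", "--server"],    # launch command from this README
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # "pdf_path" is an assumed parameter name for the tool.
            result = await session.call_tool("convert_pdf", {"pdf_path": pdf_path})
            return result_text(result.content)
```

To try it, run `asyncio.run(call_convert_pdf("data/sample.pdf"))` from the project root after `import asyncio`.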
Project Structure
mcp_poc/
├── .env # Environment configuration
├── .gitignore # Git ignore rules
├── README.md # This file
├── main.py # Application entry point
├── mcp_client.py # MCP stdio client connector
├── mcp_server.py # MCP server with tools & prompts
├── requirements.txt # Python dependencies
├── test_integration.py # Integration test
└── data/
    └── sample.pdf # Test PDF
How It Works
- Client (`mcp_client.py`) spawns the server as a subprocess and communicates over stdio.
- Server (`mcp_server.py`) receives a `convert_pdf` tool call.
- Model (`fitz` / PyMuPDF) opens the PDF and extracts raw text page by page.
- View formats the extracted text as Markdown with `## Page N` headers.
- The formatted Markdown is returned to the client via MCP.
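The Model and View steps above can be sketched in a few lines. This is a minimal illustration rather than the project's actual code: the function names `extract_pages`, `format_markdown`, and `convert` are hypothetical, and the exact spacing between headers and text is an assumption.

```python
from typing import Iterable


def extract_pages(pdf_path: str) -> list[str]:
    """Model: return the plain text of each page, in document order."""
    import fitz  # PyMuPDF, imported lazily so the formatter works without it

    with fitz.open(pdf_path) as doc:
        return [page.get_text() for page in doc]


def format_markdown(pages: Iterable[str]) -> str:
    """View: place each page's text under a '## Page N' header (1-based)."""
    sections = [
        f"## Page {number}\n\n{text.strip()}"
        for number, text in enumerate(pages, start=1)
    ]
    return "\n\n".join(sections)


def convert(pdf_path: str) -> str:
    """Controller-side glue: extract the text, then format it."""
    return format_markdown(extract_pages(pdf_path))
```

Keeping the formatter free of any PDF dependency means it can be unit-tested with plain strings, while `extract_pages` is the only piece that would change if the Model layer were swapped out.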
Extending
- OCR Support: Swap the Model layer to use `marker-pdf` or an external OCR API for scanned documents.
- Additional Tools: Add more `@mcp.tool()` definitions in `mcp_server.py` for image extraction, metadata parsing, etc.
- Alternative Transports: Replace `stdio` with `sse` or `http` in `mcp.run(...)` for remote deployments.
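As an illustration of the "Additional Tools" item, an extra metadata tool might be registered as in the sketch below. It assumes the `FastMCP` class from the official `mcp` Python SDK and PyMuPDF's `Document.metadata` dictionary. The tool name `pdf_metadata`, the helper `clean_metadata`, the server name, and the `build_server` factory are all hypothetical.

```python
def clean_metadata(raw: dict) -> dict:
    """Drop empty or missing metadata fields so the tool result stays compact."""
    return {key: value for key, value in raw.items() if value}


def build_server():
    """Create a FastMCP server with one extra, illustrative tool."""
    # Lazy imports: only needed when the server is actually built.
    from mcp.server.fastmcp import FastMCP
    import fitz  # PyMuPDF

    mcp = FastMCP("mcp_poc")  # server name is an assumption

    @mcp.tool()
    def pdf_metadata(pdf_path: str) -> dict:
        """Return the non-empty metadata fields of a PDF (title, author, ...)."""
        with fitz.open(pdf_path) as doc:
            return clean_metadata(doc.metadata or {})

    return mcp
```

Calling `build_server().run(transport="stdio")` would serve the tool over stdio. Passing a different `transport` value is where an alternative transport would be selected, subject to what the installed SDK version accepts.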