wasp-mcp

Web Agent Semantic Protocol — MCP Server

wasp-mcp is a Model Context Protocol server that lets Claude (or any MCP client) query arbitrary webpages with token-efficient, structure-aware retrieval. Instead of dumping raw HTML into the context window, WASP builds a lightweight structural index (the manifest) from a page's headings, then fetches content only for the sections relevant to a query.

The result: answers grounded in real page content at a fraction of the token cost of naive scraping.

See the WASP Whitepaper for the full protocol specification.

How It Works

Every webpage has two useful layers:

Structure — headings and section anchors that form a table of contents. Small, cheap to index.
Content — the text under each heading. Expensive to send in full; most is irrelevant to any given query.

WASP exploits this split with a two-tier pipeline:

Tier 1 — get_manifest(url)
  ↓ Try GET /.well-known/wasp.json (site-native manifest, 3 s timeout)
  ↓ Fall back: fetch HTML → parse headings → generate manifest client-side
  → Returns: structured index (headings, anchors, depth, token estimates)

Tier 2 — fetch_chunk(url, anchor)
  ↓ Resolve anchor → DOM element (getElementById → querySelector → fuzzy match)
  ↓ Extract section text via Range API / heading-sibling walk
  → Returns: plain-text body of that section only

query_page(url, query)
  ↓ get_manifest → score chunks by keyword match → fetch_chunk for top results
  ↓ Build numbered [1. Heading] context → call Claude API → inline [N] citations
  → Returns: { answer, sources[] }

A naive full-page scrape of a typical faculty profile costs ~16,700 tokens. The same query via WASP costs ~2,700, roughly a 6× reduction.
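The Tier 1 fallback (parse headings, generate a manifest) can be sketched roughly as below. This is an illustration under stated assumptions, not the project's code: it scans raw HTML with a regex where a real implementation would more likely use a DOM parser, `buildManifestChunks` and `estimateTokens` are hypothetical names, and the 4-characters-per-token estimate is a rough heuristic applied to the heading text only. The chunk fields and the `#wasp-NNN` anchor format follow the example manifest in this README.

```typescript
// Hypothetical sketch: derive manifest chunks from <h1>..<h6> tags in raw HTML.
interface Chunk {
  id: string;
  heading: string;
  anchor: string;
  depth: number;
  tokens: number;
  order: number;
}

// Assumption: roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.max(1, Math.floor(text.length / 4));
}

function buildManifestChunks(html: string): Chunk[] {
  const headingPattern = /<h([1-6])\b[^>]*>([\s\S]*?)<\/h\1\s*>/gi;
  const chunks: Chunk[] = [];
  let match: RegExpExecArray | null;
  while ((match = headingPattern.exec(html)) !== null) {
    // Strip nested tags and collapse whitespace inside the heading.
    const heading = (match[2] ?? "").replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    if (heading.length === 0) continue;
    const order = chunks.length + 1;
    const suffix = String(order).padStart(3, "0");
    chunks.push({
      id: `chunk_${suffix}`,
      heading,
      anchor: `#wasp-${suffix}`,
      depth: Number(match[1]),
      tokens: estimateTokens(heading),
      order,
    });
  }
  return chunks;
}
```

Because the anchors are synthetic, a generator along these lines would also need to remember which heading each `#wasp-NNN` maps to so that `fetch_chunk` can find it later.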

Install

Requirements: Node.js ≥ 18, an Anthropic API key.

git clone https://github.com/seanfeeney/wasp-mcp
cd wasp-mcp
npm install
npm run build

Set your API key:

export ANTHROPIC_API_KEY=sk-ant-...

Run the server (stdio transport, for Claude Desktop / Claude Code):

node dist/index.js

Add to Claude Code

Add wasp-mcp as a local MCP server in your Claude Code project config:

claude mcp add wasp -- node /absolute/path/to/wasp-mcp/dist/index.js

Or add it manually to the project's .mcp.json:

{
  "mcpServers": {
    "wasp": {
      "command": "node",
      "args": ["/absolute/path/to/wasp-mcp/dist/index.js"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Restart Claude Code after saving. Confirm the server is live:

/mcp

MCP Tools

`get_manifest`

Fetches the structural index for a URL. Tries the site's own /.well-known/wasp.json first; falls back to client-side DOM generation from the fetched HTML.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `url` | string | yes | Fully-qualified URL of the page |

Example

get_manifest("https://engineering.tamu.edu/cse/profiles/aklappenecker.html")

{
  "wasp": "1.0",
  "url": "https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
  "title": "Andreas Klappenecker — Texas A&M CSE",
  "summary": "Faculty profile for Andreas Klappenecker.",
  "keywords": ["quantum computing", "cryptography", "image processing"],
  "chunks": [
    { "id": "chunk_001", "heading": "Andreas Klappenecker", "anchor": "#wasp-001", "depth": 1, "tokens": 5, "order": 1 },
    { "id": "chunk_002", "heading": "Research Interests",   "anchor": "#wasp-002", "depth": 2, "tokens": 4, "order": 2 },
    { "id": "chunk_003", "heading": "Selected Publications","anchor": "#wasp-003", "depth": 2, "tokens": 5, "order": 3 }
  ],
  "generated": "client"
}
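The discovery order described for `get_manifest` (site-native manifest first, client-side generation second) might look something like the sketch below. The 3-second timeout comes from the pipeline description. `discoverManifest`, `wellKnownUrl`, `FetchLike`, and the `generateFromHtml` callback are hypothetical names, and the `Manifest` type lists only a few fields from the example above. The fetch function is injected so the sketch can run without a network.

```typescript
// Hypothetical sketch of Tier 1 discovery; not the project's actual code.
interface Manifest {
  wasp: string;
  url: string;
  chunks: unknown[];
  generated?: string;
}

// Minimal fetch-like signature (an assumption made for testability).
type FetchLike = (
  url: string,
  init: { signal: AbortSignal }
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Resolve /.well-known/wasp.json against the page's origin.
function wellKnownUrl(pageUrl: string): string {
  return new URL("/.well-known/wasp.json", pageUrl).toString();
}

async function discoverManifest(
  pageUrl: string,
  generateFromHtml: (url: string) => Promise<Manifest>,
  fetchImpl: FetchLike,
  timeoutMs = 3000
): Promise<Manifest> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(wellKnownUrl(pageUrl), { signal: controller.signal });
    // A real implementation should validate the JSON shape instead of casting.
    if (res.ok) return (await res.json()) as Manifest;
  } catch {
    // Timeout, network error, or invalid JSON: fall through to the fallback.
  } finally {
    clearTimeout(timer);
  }
  return generateFromHtml(pageUrl);
}
```

In Node.js 18 or later the global `fetch` should fit the `FetchLike` shape, so a call could pass it directly as the third argument.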

`fetch_chunk`

Retrieves the plain-text body of a single section identified by its anchor. Anchor resolution uses a three-stage fallback: getElementById → querySelector → fuzzy heading match.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `url` | string | yes | Page URL (used for cache lookup; re-fetches if not cached) |
| `anchor` | string | yes | CSS anchor string from the manifest (e.g. `"#research-interests"`) |

Example

fetch_chunk(
  "https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
  "#wasp-002"
)

Quantum computing, image processing, cryptography.
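The three-stage anchor fallback can be illustrated with a small sketch. It assumes nothing about the real `chunks.ts`: `resolveAnchor`, `DocLike`, and `ElementLike` are hypothetical names, and the fuzzy stage here is a simple normalized substring match chosen for illustration. `DocLike` is intended as a subset of the DOM `Document` interface, so a browser document, a DOM library, or a test stub could be passed in.

```typescript
// Hypothetical sketch of three-stage anchor resolution.
interface ElementLike {
  textContent: string | null;
}

interface DocLike {
  getElementById(id: string): ElementLike | null;
  querySelector(selector: string): ElementLike | null;
  querySelectorAll(selector: string): ArrayLike<ElementLike>;
}

// Lowercase and reduce to space-separated alphanumeric words.
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

function resolveAnchor(doc: DocLike, anchor: string): ElementLike | null {
  const id = anchor.startsWith("#") ? anchor.slice(1) : anchor;

  // Stage 1: exact id lookup.
  const byId = doc.getElementById(id);
  if (byId) return byId;

  // Stage 2: treat the anchor as a CSS selector (invalid selectors throw).
  try {
    const bySelector = doc.querySelector(anchor);
    if (bySelector) return bySelector;
  } catch {
    // Not a valid selector: continue to the fuzzy stage.
  }

  // Stage 3: first heading whose normalized text contains the anchor's words.
  const wanted = normalize(id);
  if (wanted.length === 0) return null;
  const headings = Array.from(doc.querySelectorAll("h1, h2, h3, h4, h5, h6"));
  return headings.find((h) => normalize(h.textContent ?? "").includes(wanted)) ?? null;
}
```

With this matching rule, an anchor such as `"#research-interests"` would still resolve to a heading reading "Research Interests" even when no element carries that id.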

`query_page`

Full end-to-end retrieval: builds the manifest, scores chunks against the query, fetches relevant section bodies, calls the selected LLM provider (Claude by default), and returns a cited answer.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `url` | string | yes | Page to query |
| `query` | string | yes | Natural-language question |
| `provider` | string | no | `"claude"` (default) \| `"openai"` \| `"ollama"` |

Example

query_page(
  "https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
  "What are this professor's research interests?"
)

{
  "answer": "Professor Klappenecker's research interests are quantum computing [1], image processing [1], and cryptography [1].",
  "sources": [
    { "heading": "Research Interests", "anchor": "#wasp-002" }
  ]
}
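The scoring and context-building steps of `query_page` can be sketched as follows. This is a minimal illustration, not the project's `retrieval.ts`: `scoreChunks`, `buildContext`, the stop-word list, and the top-3 cutoff are all assumptions, and scoring here looks only at heading text. The `[N. Heading]` numbering follows the pipeline description, which is what lets the model emit inline `[N]` citations.

```typescript
// Hypothetical sketch of keyword scoring and numbered context building.
interface ScoredChunk {
  heading: string;
  anchor: string;
  score: number;
}

// Assumption: a tiny stop-word list, just enough for the example query.
const STOP_WORDS = new Set(["a", "an", "the", "is", "are", "of", "what", "this", "that", "to", "in"]);

function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0 && !STOP_WORDS.has(w));
}

// Score each chunk by the number of distinct query terms in its heading.
function scoreChunks(
  chunks: { heading: string; anchor: string }[],
  query: string,
  topK = 3
): ScoredChunk[] {
  const terms = new Set(tokenize(query));
  return chunks
    .map((c) => {
      const words = new Set(tokenize(c.heading));
      let score = 0;
      terms.forEach((t) => {
        if (words.has(t)) score += 1;
      });
      return { heading: c.heading, anchor: c.anchor, score };
    })
    .filter((c) => c.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Number the selected sections so the model can cite them as [N].
function buildContext(sections: { heading: string; body: string }[]): string {
  return sections.map((s, i) => `[${i + 1}. ${s.heading}]\n${s.body}`).join("\n\n");
}
```

Run against the three chunks of the example manifest with the question above, this scoring keeps only "Research Interests", which matches the single entry in `sources`.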

Token Efficiency

| Approach | Tokens sent to LLM | Example page |
| --- | --- | --- |
| Raw HTML scrape | ~16,700 | TAMU faculty profile |
| WASP `query_page` | ~2,700 | same page, same query |
| Reduction | 6.1× | |

Token savings grow with page length. A 50,000-token documentation page may see a 20–40× reduction when only 2–3 sections are relevant.
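To make the arithmetic concrete, here is a small sketch of how a reduction factor could be computed. The cost model (manifest plus selected section bodies) is an assumption that ignores prompt and answer overhead, and the 800-token manifest and 400-token sections are illustrative numbers, not measurements. `reductionFactor` and `waspCost` are hypothetical helper names.

```typescript
// Ratio of tokens a raw scrape would send to tokens WASP sends.
function reductionFactor(rawTokens: number, waspTokens: number): number {
  if (waspTokens <= 0) throw new RangeError("waspTokens must be positive");
  return rawTokens / waspTokens;
}

// Assumed cost model: the manifest plus the bodies of the selected sections.
function waspCost(manifestTokens: number, sectionTokens: number[]): number {
  return sectionTokens.reduce((sum, t) => sum + t, manifestTokens);
}

// Rounded figures from the table above, so the ratio is approximate.
const profileReduction = reductionFactor(16700, 2700);

// Illustrative only: a 50,000-token page, an 800-token manifest,
// and three relevant sections of 400 tokens each.
const docsReduction = reductionFactor(50000, waspCost(800, [400, 400, 400]));

console.log(profileReduction.toFixed(1), docsReduction.toFixed(1));
```

The point of the second calculation is that the WASP cost depends on the sections selected, not on the page size, so the ratio rises as pages get longer.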

Project Structure

wasp-mcp/
  index.ts        MCP server entry — registers tools
  manifest.ts     get_manifest() — discovery + DOM generation
  chunks.ts       fetch_chunk() — anchor resolution + text extraction
  retrieval.ts    query_page() — scoring, enrichment, LLM call
  providers.ts    claude / openai / ollama provider adapters
  cache.ts        In-memory URL → { manifest, html } cache with TTL
  types.ts        Shared TypeScript types
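
As an illustration of the kind of structure `cache.ts` describes (an in-memory map with a TTL), a minimal sketch might look like this. `TtlCache` and its method names are hypothetical, the clock is injectable purely so expiry can be tested, and eviction is lazy (on read) by assumption.

```typescript
// Hypothetical sketch of an in-memory cache with per-entry expiry.
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` defaults to the wall clock; tests can pass a fake clock instead.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // Lazily evict expired entries on read.
      return undefined;
    }
    return entry.value;
  }
}
```

Keyed by URL with a value such as `{ manifest, html }`, a cache like this would let `fetch_chunk` reuse the HTML that `get_manifest` already downloaded.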
