@robot-resources/scraper-mcp

MCP server for Robot Resources Scraper — web content to clean markdown with 70-80% token reduction

MCP server for Scraper — context compression for AI agents.

What is Robot Resources?

Human Resources, but for your AI agents.

Robot Resources gives AI agents two superpowers:

  • Router — Routes each LLM call to the cheapest capable model. 60-90% cost savings across OpenAI, Anthropic, and Google.
  • Scraper — Compresses web pages to clean markdown. 70-80% fewer tokens per page.

Both run locally. Your API keys never leave your machine. Free, unlimited, no tiers.

Install the full suite

npx robot-resources

One command sets up everything. Learn more at robotresources.ai

About this MCP server

This package gives AI agents two tools to compress web content into token-efficient markdown via the Model Context Protocol: single-page compression and multi-page BFS crawling.

Installation

npx @robot-resources/scraper-mcp

Or install globally:

npm install -g @robot-resources/scraper-mcp

Claude Desktop Configuration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "scraper": {
      "command": "npx",
      "args": ["-y", "@robot-resources/scraper-mcp"]
    }
  }
}

Tools

scraper_compress_url

Compress a single web page into markdown with 70-80% fewer tokens.

Parameters:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| url | string | yes | | URL to compress |
| mode | string | no | 'auto' | 'fast', 'stealth', 'render', or 'auto' |
| timeout | number | no | 10000 | Fetch timeout in milliseconds |
| maxRetries | number | no | 3 | Max retry attempts (0-10) |

Example prompt: "Compress https://docs.example.com/getting-started"
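
For clients that call the tool directly rather than through a prompt, the request might be assembled roughly as follows. This is a minimal sketch: the `tools/call` envelope follows the Model Context Protocol, while `buildCompressRequest`, its types, and the URL check are hypothetical helpers for illustration and are not part of this package. The defaults and the 0-10 retry range come from the parameter table above.

```typescript
// Hypothetical helper types; not exported by @robot-resources/scraper-mcp.
type FetchMode = "fast" | "stealth" | "render" | "auto";

interface CompressArgs {
  url: string;
  mode?: FetchMode;
  timeout?: number;
  maxRetries?: number;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Builds an MCP tools/call request for scraper_compress_url,
// filling in the documented defaults and checking the documented range.
function buildCompressRequest(id: number, args: CompressArgs): ToolCallRequest {
  // Assumption: only absolute http(s) URLs are useful here.
  if (!/^https?:\/\//.test(args.url)) {
    throw new TypeError("url must start with http:// or https://");
  }

  const maxRetries = args.maxRetries ?? 3;
  if (!Number.isInteger(maxRetries) || maxRetries < 0 || maxRetries > 10) {
    throw new RangeError("maxRetries must be an integer from 0 to 10");
  }

  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "scraper_compress_url",
      arguments: {
        url: args.url,
        mode: args.mode ?? "auto",
        timeout: args.timeout ?? 10000,
        maxRetries,
      },
    },
  };
}
```

Calling `buildCompressRequest(1, { url: "https://docs.example.com/getting-started" })` yields a request with `mode` set to `'auto'`, `timeout` to `10000`, and `maxRetries` to `3`.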

scraper_crawl_url

Crawl multiple pages from a starting URL using BFS link discovery.

Parameters:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| url | string | yes | | Starting URL to crawl |
| maxPages | number | no | 10 | Max pages to crawl (1-100) |
| maxDepth | number | no | 2 | Max link depth (0-5) |
| mode | string | no | 'auto' | 'fast', 'stealth', 'render', or 'auto' |
| include | string[] | no | | URL patterns to include (glob) |
| exclude | string[] | no | | URL patterns to exclude (glob) |
| timeout | number | no | 10000 | Per-page timeout in milliseconds |

Example prompt: "Crawl the docs at https://docs.example.com with max 20 pages"
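
To illustrate how `maxPages` and `maxDepth` bound a BFS crawl, here is a minimal sketch over an in-memory link graph. It is not the package's implementation: `bfsCrawl`, `LinkGraph`, and `CrawlOptions` are hypothetical names, the graph stands in for fetching pages and extracting links, and `include`/`exclude` filtering is left out.

```typescript
// Hypothetical stand-in for "fetch a page and list the links on it".
type LinkGraph = Record<string, string[]>;

interface CrawlOptions {
  maxPages?: number; // documented default: 10
  maxDepth?: number; // documented default: 2
}

// Returns the URLs in the order a breadth-first crawl would visit them.
function bfsCrawl(start: string, links: LinkGraph, opts: CrawlOptions = {}): string[] {
  const maxPages = opts.maxPages ?? 10;
  const maxDepth = opts.maxDepth ?? 2;

  const seen = new Set<string>([start]);
  const queue: Array<{ url: string; depth: number }> = [{ url: start, depth: 0 }];
  const order: string[] = [];

  while (queue.length > 0 && order.length < maxPages) {
    const { url, depth } = queue.shift()!;
    order.push(url);

    // Pages at the depth limit are visited, but their links are not followed.
    if (depth >= maxDepth) continue;

    for (const next of links[url] ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push({ url: next, depth: depth + 1 });
      }
    }
  }
  return order;
}

// Example graph: "/deep" sits three links away from the start page.
const site: LinkGraph = {
  "/": ["/a", "/b"],
  "/a": ["/a1", "/"],
  "/b": ["/b1"],
  "/a1": ["/deep"],
  "/b1": [],
  "/deep": [],
};

const visited = bfsCrawl("/", site); // defaults: maxPages 10, maxDepth 2
```

With the defaults, `visited` is `["/", "/a", "/b", "/a1", "/b1"]`: every page within two links of the start, in breadth-first order, with `/deep` skipped because it lies at depth 3.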

Fetch Modes

| Mode | How | Use when |
|------|-----|----------|
| 'fast' | Plain HTTP | Default sites, APIs, docs |
| 'stealth' | TLS fingerprint impersonation | Anti-bot protected sites |
| 'render' | Headless browser (Playwright) | JS-rendered SPAs |
| 'auto' | Fast → stealth fallback on 403/challenge | Unknown sites (default) |

The 'stealth' mode requires impit and the 'render' mode requires playwright; both are peer dependencies of @robot-resources/scraper.
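
The 'auto' fallback described in the table can be pictured with a small sketch. This is an illustration under stated assumptions, not the package's code: `fetchAuto`, `Fetcher`, `FetchResult`, and `isBlocked` are hypothetical names, the two fetchers are injected so the logic runs without network access, and a plain 403 check stands in for whatever challenge detection the real scraper performs.

```typescript
interface FetchResult {
  status: number;
  body: string;
}

// Hypothetical fetcher signature; real 'fast' and 'stealth' fetchers would do network I/O.
type Fetcher = (url: string) => Promise<FetchResult>;

// Assumption: only HTTP 403 counts as "blocked" in this sketch.
function isBlocked(res: FetchResult): boolean {
  return res.status === 403;
}

// Tries the fast fetcher first and falls back to stealth only when blocked.
async function fetchAuto(
  url: string,
  fast: Fetcher,
  stealth: Fetcher,
): Promise<{ mode: "fast" | "stealth"; result: FetchResult }> {
  const first = await fast(url);
  if (!isBlocked(first)) {
    return { mode: "fast", result: first };
  }
  return { mode: "stealth", result: await stealth(url) };
}
```

Injecting the fetchers keeps the decision logic separate from transport details, so it can be exercised with stubs such as `async () => ({ status: 403, body: "" })`.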

Requirements

  • Node.js 18+

License

MIT
