MCP WebScout
A Model Context Protocol (MCP) server that provides DuckDuckGo web search and web page content extraction via Crawl4AI, with optional LLM-powered extraction through DeepSeek.
Features
- search: Search the web using DuckDuckGo
- fetch: Advanced web fetching with Crawl4AI and LLM extraction
System Requirements
| Requirement | Version | Notes |
|---|---|---|
| Python | >= 3.10 | Required runtime environment |
| pip | latest | Package manager (included with Python) |
| Playwright | latest | Required by Crawl4AI for browser automation |
| DeepSeek API Key | - | Required for LLM extraction mode |
| Proxy (optional) | - | Needed only for users in mainland China |
Python Dependencies (14 packages; core packages listed below)
| Package | Version | Purpose |
|---|---|---|
| mcp | >=1.0.0 | MCP protocol implementation |
| duckduckgo-search | >=3.0.0 | DuckDuckGo search API |
| requests | >=2.32.0 | HTTP requests |
| beautifulsoup4 | >=4.12.0 | HTML parsing |
| openai | >=1.30.0 | OpenAI-compatible API client, used to call DeepSeek |
| crawl4ai | >=0.5.0 | Advanced web scraping |
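The requirements in the tables above can be checked from Python before the server is started. The snippet below is a minimal sketch, not part of mcp-webscout: `python_ok` and `missing_packages` are hypothetical helper names, and the check only confirms that each distribution is installed, not that it meets the minimum version listed.

```python
import sys
from importlib import metadata
from typing import Iterable, List, Tuple

# Distribution names copied from the dependency table above.
REQUIRED = [
    "mcp",
    "duckduckgo-search",
    "requests",
    "beautifulsoup4",
    "openai",
    "crawl4ai",
]


def python_ok(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True when the interpreter is Python 3.10 or newer."""
    return tuple(version) >= (3, 10)


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the distributions from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Calling `missing_packages()` inside the activated virtual environment should return an empty list once installation has succeeded.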
Quick Start
Get started in 5 steps:
1. Clone and Set Up Environment
git clone <repository>
cd mcp-webscout
python -m venv .venv
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate
2. Install Dependencies
pip install -e ".[dev]"
3. Install Playwright Browsers
playwright install chromium
4. Configure Environment Variables
cp .env.example .env
Edit .env and add your configuration:
# Required for LLM extraction
DEEPSEEK_API_KEY=sk-your-actual-key-here
# Required for mainland China users
PROXY_URL=http://127.0.0.1:7890
USE_PROXY=true
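As a rough illustration of how these three settings might be interpreted, the sketch below reads them from an environment-like mapping. The `Settings` dataclass and `load_settings` function are hypothetical names used only for this example. The set of accepted truthy strings is an assumption, and the actual server may parse these values differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    """Hypothetical container for the values configured in .env."""

    deepseek_api_key: Optional[str]
    proxy_url: Optional[str]
    use_proxy: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables from an environment-like mapping."""
    # Assumption: "true", "1", and "yes" (any case) all enable the proxy.
    wants_proxy = env.get("USE_PROXY", "false").strip().lower() in {"true", "1", "yes"}
    proxy_url = env.get("PROXY_URL") or None
    return Settings(
        deepseek_api_key=env.get("DEEPSEEK_API_KEY") or None,
        # Keep the proxy URL only when the proxy is switched on.
        proxy_url=proxy_url if wants_proxy else None,
        # The proxy is usable only when it is enabled and a URL is present.
        use_proxy=wants_proxy and proxy_url is not None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to try out without touching the real environment.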
5. Verify Installation
# Run tests
pytest tests/ -v
# Test the server
python -m mcp_webscout --help
Detailed Configuration
For detailed environment setup instructions, see ENV_SETUP.md.
Usage
As a Command
mcp-webscout
As a Python Module
python -m mcp_webscout
With Claude Desktop
Add to your claude_desktop_config.json:
Basic Configuration
{
  "mcpServers": {
    "webscout": {
      "command": "mcp-webscout"
    }
  }
}
With Environment Variables (Recommended)
{
  "mcpServers": {
    "webscout": {
      "command": "mcp-webscout",
      "env": {
        "DEEPSEEK_API_KEY": "sk-your-key-here",
        "PROXY_URL": "http://127.0.0.1:7890",
        "USE_PROXY": "true",
        "DEFAULT_MAX_LENGTH": "5000",
        "PYTHONUTF8": "1"
      }
    }
  }
}
Windows Configuration
{
  "mcpServers": {
    "webscout": {
      "command": "python",
      "args": ["-m", "mcp_webscout"],
      "env": {
        "DEEPSEEK_API_KEY": "sk-your-key-here",
        "PROXY_URL": "http://127.0.0.1:7890",
        "USE_PROXY": "true",
        "PYTHONUTF8": "1"
      }
    }
  }
}
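If claude_desktop_config.json already lists other servers, the webscout entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge using only the standard library. `build_server_entry` and `merge_into_config` are hypothetical helpers written for this example, and the `other` server in the sample data is made up.

```python
import json
from typing import Dict, Optional


def build_server_entry(env: Optional[Dict[str, str]] = None, windows: bool = False) -> dict:
    """Return an mcpServers entry shaped like the JSON examples above."""
    if windows:
        # Windows variant: launch the module through the Python interpreter.
        entry: dict = {"command": "python", "args": ["-m", "mcp_webscout"]}
    else:
        entry = {"command": "mcp-webscout"}
    if env:
        entry["env"] = dict(env)
    return entry


def merge_into_config(config: dict, entry: dict, name: str = "webscout") -> dict:
    """Add or replace one server entry without touching the other servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already defines another (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
entry = build_server_entry({"DEEPSEEK_API_KEY": "sk-your-key-here", "PYTHONUTF8": "1"}, windows=True)
print(json.dumps(merge_into_config(existing, entry), indent=2))
```

The merge returns a new dictionary and leaves the input untouched, so the original file contents can be kept as a backup before writing the result.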
Tools
search
Search the web using DuckDuckGo.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query |
| max_results | integer | No | Maximum results (1-10, default: 5) |
Returns:
Formatted search results with titles, URLs, and snippets.
Example:
{
  "query": "Python programming",
  "max_results": 3
}
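The documented constraints on the search arguments can be applied on the client side before a request is sent. The helper below is a hypothetical sketch, not part of mcp-webscout. Clamping out-of-range values into 1-10 is an assumption of this example, and the server may reject such values instead.

```python
from typing import Any, Dict, Optional


def build_search_args(query: str, max_results: Optional[int] = None) -> Dict[str, Any]:
    """Build an arguments object for the search tool."""
    query = query.strip()
    if not query:
        raise ValueError("query must be a non-empty string")
    if max_results is None:
        max_results = 5  # documented default
    # Assumption: out-of-range values are clamped into the documented 1-10 range.
    max_results = max(1, min(10, int(max_results)))
    return {"query": query, "max_results": max_results}
```

For instance, `build_search_args("Python programming", 3)` produces the same object as the example above.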
fetch
Advanced web fetching with Crawl4AI and LLM extraction.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to fetch |
| mode | string | No | Extraction mode: simple, llm (default: simple) |
| prompt | string | No | Custom extraction prompt for LLM mode |
| max_length | integer | No | Maximum characters (default: 5000) |
| use_proxy | boolean | No | Use proxy (default: true) |
Returns:
Fetched and optionally extracted content.
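Example:

A fetch request can be assembled from the parameter table above in the same way as a search request. The sketch below is a hypothetical client-side helper, not the server's own code. Dropping `prompt` outside `llm` mode and the URL scheme check are assumptions of this example, and `https://example.com` is a placeholder URL.

```python
from typing import Any, Dict, Optional

VALID_MODES = ("simple", "llm")


def build_fetch_args(
    url: str,
    mode: str = "simple",
    prompt: Optional[str] = None,
    max_length: int = 5000,
    use_proxy: bool = True,
) -> Dict[str, Any]:
    """Build an arguments object for the fetch tool using the documented defaults."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    args: Dict[str, Any] = {
        "url": url,
        "mode": mode,
        "max_length": max_length,
        "use_proxy": use_proxy,
    }
    # Assumption: the custom prompt is only meaningful in llm mode.
    if mode == "llm" and prompt:
        args["prompt"] = prompt
    return args


print(build_fetch_args("https://example.com", mode="llm", prompt="Summarize the main points"))
```

Note that `llm` mode also needs DEEPSEEK_API_KEY to be configured, as listed under System Requirements.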