# MCP Conceal
An MCP proxy that pseudo-anonymizes PII before data reaches external AI providers like Claude, ChatGPT, or Gemini.
```mermaid
sequenceDiagram
    participant C as AI Client (Claude)
    participant P as MCP Conceal
    participant S as Your MCP Server
    C->>P: Request
    P->>S: Request
    S->>P: Response with PII
    P->>P: PII Detection
    P->>P: Pseudo-Anonymization
    P->>P: Consistent Mapping
    P->>C: Sanitized Response
```
MCP Conceal performs pseudo-anonymization rather than redaction, preserving the semantic meaning and data relationships that AI analysis depends on. For example, `john.doe@acme.com` might become `michael.torres@newport.com`, maintaining structure while protecting sensitive information.
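To make this concrete, here is a hypothetical tool response before and after the proxy (the field names and values are illustrative, not taken from the project). Before:

```json
{"customer": "John Doe", "email": "john.doe@acme.com", "billing_contact": "john.doe@acme.com"}
```

After:

```json
{"customer": "Michael Torres", "email": "michael.torres@newport.com", "billing_contact": "michael.torres@newport.com"}
```

Because both fields held the same real address, both receive the same fake address, so cross-field relationships survive anonymization.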
## Installation

### Download Pre-built Binary
1. Visit the [Releases page](https://github.com/gbrigandi/mcp-server-conceal/releases)
2. Download the binary for your platform:

   | Platform | Binary |
   |----------|--------|
   | Linux x64 | `mcp-server-conceal-linux-amd64` |
   | macOS Intel | `mcp-server-conceal-macos-amd64` |
   | macOS Apple Silicon | `mcp-server-conceal-macos-aarch64` |
   | Windows x64 | `mcp-server-conceal-windows-amd64.exe` |

3. Make the binary executable (Linux/macOS):

   ```bash
   chmod +x mcp-server-conceal-*
   ```

4. Add it to your PATH:
   - Linux/macOS: `mv mcp-server-conceal-* /usr/local/bin/mcp-server-conceal`
   - Windows: move the binary to a directory already on your PATH, or add its directory to PATH
### Building from Source

```bash
git clone https://github.com/gbrigandi/mcp-server-conceal
cd mcp-server-conceal
cargo build --release
```

The binary is produced at `target/release/mcp-server-conceal`.
## Quick Start

### Prerequisites

Install [Ollama](https://ollama.ai) for LLM-based PII detection:

1. Install Ollama: [ollama.ai](https://ollama.ai)
2. Pull the model:

   ```bash
   ollama pull llama3.2:3b
   ```

3. Verify Ollama is reachable:

   ```bash
   curl http://localhost:11434/api/version
   ```
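If Ollama is running, the version endpoint returns a small JSON document; the exact version number will differ on your machine:

```json
{"version":"0.6.2"}
```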
### Basic Usage

Create a minimal `mcp-server-conceal.toml`:
```toml
[detection]
mode = "regex_llm"

[llm]
model = "llama3.2:3b"
endpoint = "http://localhost:11434"
```
See the Configuration section for all available options.
Run as a proxy:

```bash
mcp-server-conceal \
  --target-command python3 \
  --target-args "my-mcp-server.py" \
  --config mcp-server-conceal.toml
```
## Configuration
Complete configuration reference:
```toml
[detection]
mode = "regex_llm"          # Detection strategy: regex, llm, regex_llm
enabled = true
confidence_threshold = 0.8  # Detection confidence threshold (0.0-1.0)

[detection.patterns]
email = "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Z|a-z]{2,}\\b"
phone = "\\b(?:\\+?1[-\\.\\s]?)?(?:\\(?[0-9]{3}\\)?[-\\.\\s]?)?[0-9]{3}[-\\.\\s]?[0-9]{4}\\b"
ssn = "\\b\\d{3}-\\d{2}-\\d{4}\\b"
credit_card = "\\b\\d{4}[-\\s]?\\d{4}[-\\s]?\\d{4}[-\\s]?\\d{4}\\b"
ip_address = "\\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b"
url = "https?://[^\\s/$.?#].[^\\s]*"

[faker]
locale = "en_US"    # Locale for generating realistic fake PII data
seed = 12345        # Seed ensures consistent anonymization across restarts
consistency = true  # Same real PII always maps to same fake data

[mapping]
database_path = "mappings.db"  # SQLite database storing real-to-fake mappings
retention_days = 90            # Delete old mappings after N days

[llm]
model = "llama3.2:3b"          # Ollama model for PII detection
endpoint = "http://localhost:11434"
timeout_seconds = 180
prompt_template = "default"    # Template for PII detection prompts

[llm_cache]
enabled = true                 # Cache LLM detection results for performance
database_path = "llm_cache.db"
max_text_length = 2000
```
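If you customize `[detection.patterns]`, it helps to sanity-check a pattern before deploying it. The sketch below uses the Rust `regex` crate with the default email pattern (unescaped from its TOML form); whether this matches the engine the proxy uses internally is an assumption:

```rust
// Sanity-check a detection pattern before adding it to the config.
// Requires the `regex` crate (https://crates.io/crates/regex).
use regex::Regex;

fn main() {
    // The default email pattern from [detection.patterns], unescaped from TOML.
    let email = Regex::new(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b")
        .expect("pattern should compile");

    assert!(email.is_match("contact john.doe@acme.com for details"));
    assert!(!email.is_match("no address here"));
    println!("pattern compiles and matches as expected");
}
```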
### Configuration Guidance

**Detection settings:**
- `confidence_threshold`: Lower values (0.6) catch more PII but increase false positives; higher values (0.9) are more precise but may miss some PII.
- `mode`: Choose based on your latency vs. accuracy requirements (see Detection Modes below).

**Faker settings:**
- `locale`: Use `"en_US"` for American names/addresses, `"en_GB"` for British, etc. Affects the realism of generated fake data.
- `seed`: Keep consistent across deployments so the same real data maps to the same fake data.
- `consistency`: Always leave `true` to maintain data relationships (see the sketch after this list).

**Mapping settings:**
- `retention_days`: Balance data consistency against storage. Shorter periods (30 days) reduce storage but may cause inconsistent anonymization for recurring data.
- `database_path`: Use absolute paths in production to avoid database location issues.
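Here is a minimal sketch of the consistency idea, assuming a lookup table keyed by the real value; the actual project persists its mappings in the SQLite database configured above, and the types and names below are hypothetical:

```rust
use std::collections::HashMap;

/// Hypothetical sketch: the first time a real value is seen, a fake value is
/// generated and stored; every later lookup returns the same fake value.
struct MappingStore {
    map: HashMap<String, String>,
    counter: u64,
}

impl MappingStore {
    fn fake_for(&mut self, real: &str) -> String {
        if let Some(fake) = self.map.get(real) {
            return fake.clone();
        }
        self.counter += 1;
        let fake = format!("user{}@example.com", self.counter);
        self.map.insert(real.to_string(), fake.clone());
        fake
    }
}

fn main() {
    let mut store = MappingStore { map: HashMap::new(), counter: 0 };
    let first = store.fake_for("john.doe@acme.com");
    let second = store.fake_for("john.doe@acme.com");
    assert_eq!(first, second); // same real PII always maps to the same fake value
}
```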
## Detection Modes

Choose a detection strategy based on your performance requirements and data complexity:

### RegexLlm (Default)

**Best for production environments** - combines speed and accuracy:

- **Phase 1:** Fast regex catches common patterns (emails, phones, SSNs)
- **Phase 2:** The LLM analyzes the remaining text for complex PII
- **Use when:** You need comprehensive detection with reasonable performance
- **Performance:** ~100-500 ms per request, depending on text size
- **Configure:** `mode = "regex_llm"`
### Regex Only

**Best for high-volume, latency-sensitive applications:**

- Uses only pattern matching - no AI analysis
- **Use when:** You have well-defined PII patterns and need <10 ms responses
- **Trade-off:** May miss contextual PII like "my account number is ABC123"
- **Configure:** `mode = "regex"`
### LLM Only

**Best for complex, unstructured data:**

- AI-powered detection catches nuanced PII patterns
- **Use when:** Accuracy matters more than speed
- **Performance:** ~200-1000 ms per request
- **Configure:** `mode = "llm"`
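For intuition, here is a rough sketch of the two-phase `regex_llm` flow described above. This is not the project's actual code; `detect_with_llm` and all other names are hypothetical stand-ins:

```rust
use regex::Regex;

/// Hypothetical stand-in for the phase-2 Ollama call. In the real proxy this
/// would prompt the configured model and parse its findings.
fn detect_with_llm(text: &str) -> Vec<String> {
    let _ = text;
    Vec::new()
}

/// Phase 1: run regex patterns over the raw text. Phase 2: hand the text with
/// regex hits masked out to the LLM, so it only hunts for what regex missed.
fn detect(text: &str, patterns: &[Regex]) -> Vec<String> {
    let mut findings = Vec::new();
    let mut remaining = text.to_string();
    for pattern in patterns {
        for m in pattern.find_iter(text) {
            findings.push(m.as_str().to_string());
        }
        remaining = pattern.replace_all(&remaining, "[MASKED]").into_owned();
    }
    findings.extend(detect_with_llm(&remaining));
    findings
}

fn main() {
    let patterns = vec![Regex::new(r"\b\d{3}-\d{2}-\d{4}\b").unwrap()]; // SSN
    let hits = detect("SSN 123-45-6789 on file", &patterns);
    assert_eq!(hits, vec!["123-45-6789".to_string()]);
}
```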
## Advanced Usage

### Claude Desktop Integration

Configure Claude Desktop to run an MCP server through the proxy:
```json
{
  "mcpServers": {
    "database": {
      "command": "mcp-server-conceal",
      "args": [
        "--target-command", "python3",
        "--target-args", "database-server.py --host localhost",
        "--config", "/path/to/mcp-server-conceal.toml"
      ],
      "env": {
        "DATABASE_URL": "postgresql://localhost/mydb"
      }
    }
  }
}
```
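This block goes in your `claude_desktop_config.json`, typically found at `~/Library/Application Support/Claude/claude_desktop_config.json` on macOS and `%APPDATA%\Claude\claude_desktop_config.json` on Windows.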
### Custom LLM Prompts

Customize detection prompts for specific domains.

Template locations:

- Linux: `~/.local/share/mcp-server-conceal/prompts/`
- macOS: `~/Library/Application Support/com.mcp-server-conceal.mcp-server-conceal/prompts/`
- Windows: `%LOCALAPPDATA%\com\mcp-server-conceal\mcp-server-conceal\data\prompts\`
Usage:

1. Run MCP Conceal once to auto-generate `default.md` in the prompts directory:

   ```bash
   mcp-server-conceal --target-command echo --target-args "test" --config mcp-server-conceal.toml
   ```

2. Copy it: `cp default.md healthcare.md`
3. Edit the template for domain-specific PII patterns
4. Configure: `prompt_template = "healthcare"`
### Environment Variables

Pass environment variables to the target process:

```bash
mcp-server-conceal \
  --target-command node \
  --target-args "server.js" \
  --target-cwd "/path/to/server" \
  --target-env "DATABASE_URL=postgresql://localhost/mydb" \
  --target-env "API_KEY=secret123" \
  --config mcp-server-conceal.toml
```
## Troubleshooting

Enable debug logging:

```bash
RUST_LOG=debug mcp-server-conceal \
  --target-command python3 \
  --target-args server.py \
  --config mcp-server-conceal.toml
```
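Assuming the standard `env_logger`/`tracing` setup implied by `RUST_LOG`, debug output is written to stderr, so it does not interfere with the MCP protocol stream on stdout.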
Common issues:

- Invalid regex patterns in the configuration
- Ollama connectivity problems
- Database file permissions
- Missing prompt templates
## Security

**Mapping database:** Contains sensitive real-to-fake mappings. Secure it with appropriate file permissions (e.g., `chmod 600 mappings.db`).

**LLM integration:** Run Ollama on trusted infrastructure when using LLM-based detection modes.
## Contributing

Contributions are welcome! Follow these steps to get started:

### Development Setup

Prerequisites:

- Install Rust: https://rustup.rs/
- Minimum supported Rust version: 1.70+

Clone and set up:

```bash
git clone https://github.com/gbrigandi/mcp-server-conceal
cd mcp-server-conceal
```

Build and test in development mode:

```bash
cargo build
cargo test
```

Install development tools:

```bash
rustup component add clippy rustfmt
```

Run with debug logging:

```bash
RUST_LOG=debug cargo run -- --target-command cat --target-args test.txt --config mcp-server-conceal.toml
```
### Testing

- Unit tests: `cargo test`
- Integration tests: `cargo test --test integration_test`
- Linting: `cargo clippy`
- Formatting: `cargo fmt`
### Submitting Changes

1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Make your changes and add tests
4. Ensure all tests pass: `cargo test`
5. Format the code: `cargo fmt`
6. Submit a pull request with a clear description
## License

MIT License - see the LICENSE file for details.