Revolutionary Screen Monitor MCP Server
A revolutionary Model Context Protocol (MCP) server that gives AI real-time vision and enhanced UI intelligence. This isn't just screen capture - it lets AI "see" and understand what is on your screen.
NEW in v2.1.0: Enhanced Smart Click with 75% success rate, menu detection, and fuzzy matching!
WHY ScreenMonitorMCP?
- First & Only: Real-time continuous screen monitoring feature
- AI Intelligence: AI that understands UI elements and can interact with them
- Intelligent: Smart UI detection and interaction capabilities
- Natural: AI that understands commands like "Click the save button"
REVOLUTIONARY FEATURES
Real-Time Continuous Monitoring
- AI's Eyes Never Close: 2-5 FPS continuous screen monitoring
- Smart Change Detection: Distinguishes between small, major, and critical changes
- Proactive Analysis: AI automatically analyzes important changes
- Adaptive Performance: Smart frame rate adjustment
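The change levels and frame rate adjustment listed above could be driven by the fraction of pixels that differ between two consecutive frames. The sketch below is an assumption about how that might work, not the server's actual implementation: `change_ratio`, `classify_change`, `next_fps`, and both threshold values are hypothetical names and numbers chosen for illustration. Only the 2-5 FPS range comes from this README.

```python
from typing import Sequence

# Hypothetical thresholds (fraction of changed pixels); not taken from the server.
MAJOR_THRESHOLD = 0.10
CRITICAL_THRESHOLD = 0.50


def change_ratio(prev: Sequence[int], curr: Sequence[int], tolerance: int = 10) -> float:
    """Return the fraction of grayscale pixels that differ by more than `tolerance`."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of pixels")
    if not prev:
        return 0.0
    changed = sum(1 for a, b in zip(prev, curr) if abs(a - b) > tolerance)
    return changed / len(prev)


def classify_change(ratio: float) -> str:
    """Map a change ratio to one of the three levels named in this README."""
    if ratio >= CRITICAL_THRESHOLD:
        return "critical"
    if ratio >= MAJOR_THRESHOLD:
        return "major"
    return "small"


def next_fps(current_fps: float, ratio: float,
             min_fps: float = 2.0, max_fps: float = 5.0) -> float:
    """Raise the frame rate when the screen is busy, lower it when it is idle."""
    step = 1.0 if ratio >= MAJOR_THRESHOLD else -1.0
    return max(min_fps, min(max_fps, current_fps + step))


if __name__ == "__main__":
    before = [0] * 100
    after = [0] * 80 + [255] * 20  # one fifth of the pixels changed
    ratio = change_ratio(before, after)
    print(classify_change(ratio), next_fps(3.0, ratio))
```

Frames are plain lists of grayscale values here to keep the example dependency-free; a real capture pipeline would more likely compare image arrays.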
UI Element Intelligence
- Computer Vision UI Detection: Automatically recognizes buttons, forms, menus
- OCR Text Extraction: Reads text from anywhere on the screen
- Smart Click System: Natural language commands like "Click the save button"
- Interaction Mapping: AI knows exactly where and how to interact
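To illustrate how a natural language command might be resolved to a detected UI element, here is a minimal sketch of fuzzy label matching using the standard library's `difflib.SequenceMatcher`. It assumes the element labels have already been extracted (for example by OCR). `normalize`, `best_match`, the filler word list, and the 0.6 cutoff are hypothetical and are not taken from the server's code.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence

# Hypothetical filler words removed before comparison.
FILLER_WORDS = {"click", "the", "button", "on", "a"}


def normalize(text: str) -> str:
    """Lowercase the text and drop filler words such as 'click' or 'button'."""
    words = [w for w in text.lower().split() if w not in FILLER_WORDS]
    return " ".join(words)


def best_match(command: str, labels: Sequence[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the label most similar to the command, or None if nothing reaches the cutoff."""
    target = normalize(command)
    best_label: Optional[str] = None
    best_score = 0.0
    for label in labels:
        score = SequenceMatcher(None, target, normalize(label)).ratio()
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= cutoff else None


if __name__ == "__main__":
    labels = ["File", "Edit", "Save", "Save As"]
    print(best_match("Click the save button", labels))
```

Returning `None` below the cutoff lets the caller refuse to click rather than guess, which matters when a wrong click has side effects.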
Application Monitoring
- Context Awareness: Detects which application is currently active
- Change Detection: Monitors application-specific changes and events
- Event Broadcasting: Relays application events to AI clients
- Multi-Application Support: Works with any application (Blender, VSCode, browsers, etc.)
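Event broadcasting as described above can be pictured as a small publish/subscribe registry keyed by application name. The following is a sketch under that assumption only. `EventRegistry` and its methods are hypothetical and do not describe how the server stores or delivers events.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# A handler receives (app_name, event_type, event_data).
Handler = Callable[[str, str, Dict[str, Any]], None]


class EventRegistry:
    """Hypothetical in-memory registry: app name -> list of subscriber callbacks."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def register(self, app_name: str, handler: Handler) -> None:
        """Subscribe a handler to events for one application."""
        self._handlers[app_name].append(handler)

    def broadcast(self, app_name: str, event_type: str, event_data: Dict[str, Any]) -> int:
        """Deliver the event to every handler registered for the app; return how many ran."""
        handlers = self._handlers.get(app_name, [])
        for handler in handlers:
            handler(app_name, event_type, event_data)
        return len(handlers)


if __name__ == "__main__":
    registry = EventRegistry()
    registry.register("Blender", lambda app, kind, data: print(app, kind, data))
    registry.broadcast("Blender", "scene_change", {"objects_modified": ["Cube", "Camera"]})
```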
REVOLUTIONARY MCP TOOLS
Real-Time Monitoring Tools
- start_continuous_monitoring(): Starts AI's continuous vision capability
- stop_continuous_monitoring(): Stops continuous monitoring
- get_monitoring_status(): Real-time status information and statistics
- get_recent_changes(): Recently detected screen changes
UI Intelligence Tools
- analyze_ui_elements(): Recognizes and maps all UI elements on screen
- smart_click(): Smart clicking with natural language commands ("Click the save button")
- extract_text_from_screen(): OCR text extraction from screen
Application Monitoring Tools
- get_active_application(): Get currently active application context
- register_application_events(): Register for application-specific events
- broadcast_application_change(): Broadcast application changes to AI clients
Traditional Tools
- capture_and_analyze(): Screen capture and AI analysis (enhanced)
- list_tools(): MCP-compliant listing of all tools (categorized, with detailed information)
USAGE SCENARIOS
Real-Time Monitoring
# Start AI's continuous vision capability
await start_continuous_monitoring(fps=3, change_threshold=0.1)
# Check monitoring status
status = await get_monitoring_status()
# View recent changes
changes = await get_recent_changes(limit=5)
Enhanced UI Intelligence (NEW)
# Analyze all UI elements on screen (now with menu detection!)
ui_analysis = await analyze_ui_elements()
# Smart clicking with natural language (75% success rate!)
await smart_click("File")  # Works!
await smart_click("Save button")  # Enhanced matching!
# Extract text from screen with OCR
text_data = await extract_text_from_screen()
Application Monitoring
# Get active application context
app_context = await get_active_application()
# Register for application events
await register_application_events(app_name="Blender")
# Monitor application changes
changes = await get_recent_changes(limit=5)
INSTALLATION
1. Prepare Project Files
# Navigate to project directory
cd ScreenMonitorMCP
# Install required libraries
pip install -r requirements.txt
2. Configure Environment Variables
Edit the .env file:
# Server Configuration
HOST=127.0.0.1
PORT=7777
API_KEY=your_secret_key
# AI Configuration
OPENAI_API_KEY=your_openai_api_key
OPENAI_BASE_URL=https://api.openai.com/v1
DEFAULT_OPENAI_MODEL=gpt-4o
DEFAULT_MAX_TOKENS=1000
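To make the format of the .env file concrete, here is a minimal standard-library sketch of how such KEY=VALUE settings could be parsed and read with defaults. It is an illustration only: the server may well use a dedicated library for this, and `parse_env` and `get_int` are hypothetical helpers, not part of ScreenMonitorMCP.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def get_int(settings: Dict[str, str], key: str, default: int) -> int:
    """Read an integer setting, falling back to `default` if missing or malformed."""
    try:
        return int(settings.get(key, default))
    except ValueError:
        return default


SAMPLE_ENV = """
# Server Configuration
HOST=127.0.0.1
PORT=7777
# AI Configuration
DEFAULT_OPENAI_MODEL=gpt-4o
DEFAULT_MAX_TOKENS=1000
"""

if __name__ == "__main__":
    settings = parse_env(SAMPLE_ENV)
    print(settings["HOST"], get_int(settings, "PORT", 8000))
```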
3. Standalone Testing (Optional)
# Test the server
python main.py
# Test revolutionary features
python test_revolutionary_features.py
MCP CLIENT SETUP
Claude Desktop / MCP Client Configuration
Add the following JSON to your MCP client's configuration file:
Simple Configuration (Recommended)
{
  "mcpServers": {
    "screenMonitorMCP": {
      "command": "python",
      "args": ["/path/to/ScreenMonitorMCP/main.py"],
      "cwd": "/path/to/ScreenMonitorMCP"
    }
  }
}
Advanced Configuration
{
  "mcpServers": {
    "screenMonitorMCP": {
      "command": "python",
      "args": [
        "/path/to/ScreenMonitorMCP/main.py"
      ],
      "cwd": "/path/to/ScreenMonitorMCP",
      "env": {
        "OPENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Secure Configuration
{
  "mcpServers": {
    "screenMonitorMCP": {
      "command": "python",
      "args": [
        "/path/to/ScreenMonitorMCP/main.py",
        "--api-key", "your-secret-key"
      ],
      "cwd": "/path/to/ScreenMonitorMCP"
    }
  }
}
Windows Example
{
  "mcpServers": {
    "screenMonitorMCP": {
      "command": "python",
      "args": ["C:/path/to/ScreenMonitorMCP/main.py"],
      "cwd": "C:/path/to/ScreenMonitorMCP"
    }
  }
}
Important Notes
- File Path: Update the /path/to/ScreenMonitorMCP/main.py path to match your project directory
- Python Path: Make sure Python is in PATH, or use the full path, e.g. "C:/Python311/python.exe"
- Working Directory: The cwd parameter is important so that the .env file is read correctly
- API Keys: All settings are read automatically from the .env file
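Hand-editing paths is where these configurations most often go wrong, so one option is to generate the JSON instead. The sketch below builds the simple configuration shown earlier from a project directory. `build_client_config` is a hypothetical helper, not something the project ships. It uses forward slashes throughout, as the Windows example above does.

```python
import json
import sys
from pathlib import Path


def build_client_config(project_dir: str, python_exe: str = sys.executable) -> dict:
    """Build the mcpServers entry for ScreenMonitorMCP from a project directory."""
    project = Path(project_dir)
    return {
        "mcpServers": {
            "screenMonitorMCP": {
                "command": python_exe,
                # as_posix() emits forward slashes, which are valid in JSON on any OS.
                "args": [(project / "main.py").as_posix()],
                "cwd": project.as_posix(),
            }
        }
    }


if __name__ == "__main__":
    config = build_client_config("/path/to/ScreenMonitorMCP")
    print(json.dumps(config, indent=2))
```

Defaulting `command` to `sys.executable` pins the configuration to the interpreter that ran the script, which sidesteps the "Python is not in PATH" case.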
USAGE EXAMPLES
Starting Real-Time Monitoring
# Start AI's continuous vision capability
result = await start_continuous_monitoring(
    fps=3,
    change_threshold=0.1,
    smart_detection=True
)
# Check monitoring status
status = await get_monitoring_status()
# View recent changes
changes = await get_recent_changes(limit=10)
# Stop monitoring
await stop_continuous_monitoring()
Using UI Intelligence
# Analyze all UI elements on screen
ui_elements = await analyze_ui_elements(
    detect_buttons=True,
    extract_text=True,
    confidence_threshold=0.7
)
# Smart clicking with natural language
await smart_click("Click the save button", dry_run=False)
# Extract text from specific region
text_data = await extract_text_from_screen(
    region={"x": 100, "y": 100, "width": 500, "height": 300}
)
Application Monitoring
# Start application monitoring
await start_application_monitoring()
# Get active application context
app_context = await get_active_application()
# Register Blender for monitoring
await register_application_events(
    app_name="Blender",
    event_types=["scene_change", "object_modification"]
)
# Monitor application changes
changes = await get_recent_application_events(limit=10)
# Broadcast Blender scene change
await broadcast_application_change(
    app_name="Blender",
    event_type="scene_change",
    event_data={"objects_modified": ["Cube", "Camera"]}
)
BLENDER INTEGRATION EXAMPLE
With this system, you can relay real-time changes from Blender to your AI client (like Claude Desktop):
Step 1: Start ScreenMonitorMCP
# Add ScreenMonitorMCP to your Claude Desktop config
python main.py
Step 2: Activate Application Monitoring
# Run these commands in Claude Desktop:
await start_application_monitoring()
await register_application_events("Blender")
Step 3: Work in Blender
- Open Blender and make changes to your scene
- ScreenMonitorMCP automatically detects window focus changes
- Your AI client knows you're working in Blender
Step 4: Send Custom Events (Future Feature)
# From within your Blender script:
await broadcast_application_change(
    app_name="Blender",
    event_type="object_added",
    event_data={"object_name": "Suzanne", "object_type": "MESH"}
)
Traditional Screen Capture
# Enhanced screen capture and analysis
result = await capture_and_analyze(
    capture_mode="all",
    analysis_prompt="What do you see on this screen?",
    max_tokens=1500  # AI models can now use more tokens for detailed analysis
)
# List all tools
tools = await list_tools()
REVOLUTIONARY CAPABILITIES
This MCP server gives AI the following capabilities:
- Continuous Vision: AI can monitor the screen non-stop
- Enhanced Smart Understanding: Recognizes UI elements and interacts with them (75% success rate!)
- Advanced Natural Interaction: Understands commands like "File", "Save button" with fuzzy matching
- Menu Intelligence: Detects menu bars, menu items, and UI hierarchies
- Multi-Strategy Matching: Fuzzy text matching, position-based scoring, and semantic understanding
- Proactive Help: Offers help before you need it
- Application Awareness: Monitors and broadcasts application events
TROUBLESHOOTING
Common Issues and Solutions
Unicode/Encoding Error (Windows)
UnicodeEncodeError: 'charmap' codec can't encode character
Solution: This error is fixed! The server automatically uses UTF-8 encoding.
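For context, this error appears when text such as an emoji is written to a stream whose encoding is a legacy Windows code page. The sketch below reproduces the cause and shows one common way to switch standard output to UTF-8. It is not necessarily what the server does internally, and `can_encode` and `force_utf8_stdio` are hypothetical helper names.

```python
import sys


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def force_utf8_stdio() -> None:
    """Switch stdout/stderr to UTF-8 where the stream supports reconfigure() (Python 3.7+)."""
    for stream in (sys.stdout, sys.stderr):
        reconfigure = getattr(stream, "reconfigure", None)
        if callable(reconfigure):
            reconfigure(encoding="utf-8", errors="replace")


if __name__ == "__main__":
    rocket = "\U0001F680"
    # cp1252 is a typical 'charmap' codec on Windows consoles.
    print("cp1252 can encode emoji:", can_encode(rocket, "cp1252"))
    force_utf8_stdio()
    print("utf-8 can encode emoji:", can_encode(rocket, "utf-8"))
```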
JSON Configuration Error
// Wrong
{
  "command": "python",
  "args": ["path/to/main.py",]  // Trailing comma is wrong
}
// Correct
{
  "command": "python",
  "args": ["path/to/main.py"]
}
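A quick way to catch this kind of mistake before restarting the client is to run the configuration through Python's standard `json` module, which rejects trailing commas. The `//` annotations in the snippet above are for the reader only and are not valid JSON either. `json_error` below is a hypothetical helper written for this illustration.

```python
import json
from typing import Optional


def json_error(text: str) -> Optional[str]:
    """Return None if `text` is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


if __name__ == "__main__":
    wrong = '{"command": "python", "args": ["path/to/main.py",]}'
    correct = '{"command": "python", "args": ["path/to/main.py"]}'
    print("wrong:  ", json_error(wrong))
    print("correct:", json_error(correct))
```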
Python Path Issue
{
  "command": "C:/Python311/python.exe",  // Use full path
  "args": ["C:/path/to/ScreenMonitorMCP/main.py"]
}
Missing Dependencies
cd ScreenMonitorMCP
pip install -r requirements.txt
OCR Issues
# Install Tesseract (optional)
# EasyOCR installs automatically
MCP Connection Closed Error
MCP error -32000: Connection closed
Solution: Check file paths and add the cwd parameter.
LICENSE
This project is licensed under the MIT License.
Revolutionary MCP server that gives AI real "eyes"!
Next-generation AI-human interaction starts here!