windows-computer-use-mcp

A tool for agents, and an agent itself.

You	It
Use it as an MCP server	Claude, Cursor, DeepSeek call `automation_click`, `automation_screenshot`, `automation_ocr` — 22 tools
Use it as an autonomous agent	Give it a goal: `automation_mission(run="install app, verify UI, screenshot result")` — it plans, executes, retries, and reports
Use it as a webapp	`start.ps1` opens a React dashboard at http://127.0.0.1:10788 with HITL, crawler, logging
Use it as a desktop app	The NSIS installer bundles everything into one binary — no Python, no uv, no git needed

Exhibit A: 100 Tauri/NSIS installers, one unattended run, $2 in LLM costs. Install, screenshot, verify, report — zero human intervention. That is what agentic Windows automation looks like at scale.

Built on pywinauto. Read docs/SAFETY.md before production use.

Quick Start
Features
Documentation
Ports
License

Quick Start

Method	Command / Config
MCP stdio (Cursor, Claude Desktop)	`{ "mcpServers": { "windows-computer-use": { "command": "uv", "args": ["--directory", "<PATH>", "run", "windows-computer-use-mcp"] } } }`
HTTP streamable (any MCP HTTP client)	`{ "mcpServers": { "windows-computer-use": { "url": "http://127.0.0.1:10789/mcp" } } }`
Web operator UI	`.\start.ps1` → http://127.0.0.1:10788
Desktop app (NSIS installer)	Download from Releases — zero deps

See INSTALL.md for detailed setup. Run `just demo` for examples.
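The two MCP client entries in the table above differ only in transport. As a minimal sketch, the snippet below builds both as Python dicts and prints the JSON a client config would hold. The helper names (`stdio_entry`, `http_entry`, `mcp_config`) are hypothetical and not part of this project; `<PATH>` stays a placeholder for your checkout directory.

```python
import json

SERVER_NAME = "windows-computer-use"


def stdio_entry(repo_path: str) -> dict:
    """Entry for stdio clients (Cursor, Claude Desktop): launch the server via uv."""
    return {
        "command": "uv",
        "args": ["--directory", repo_path, "run", "windows-computer-use-mcp"],
    }


def http_entry(host: str = "127.0.0.1", port: int = 10789) -> dict:
    """Entry for HTTP streamable clients: point at the already running backend."""
    return {"url": f"http://{host}:{port}/mcp"}


def mcp_config(entry: dict) -> dict:
    """Wrap one server entry in the top-level mcpServers mapping."""
    return {"mcpServers": {SERVER_NAME: entry}}


if __name__ == "__main__":
    # <PATH> is left as a placeholder on purpose; substitute your own checkout.
    print(json.dumps(mcp_config(stdio_entry("<PATH>")), indent=2))
    print(json.dumps(mcp_config(http_entry()), indent=2))
```

The stdio entry lets the client spawn the server itself, while the HTTP entry assumes the backend on port 10789 is already up.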

Features

Window Management — find, activate, maximize, minimize, position, close
Mouse & Keyboard — click, drag, type, hotkeys, app shortcuts
UI Elements — inspect, click, read text, verify state via UIA / Win32
Visual Intelligence — screenshots, OCR, template matching
Autonomous Missions — give it a goal, it plans and executes with retry + verification
Macro Recording — record any UI sequence, replay, verify outcomes
Multi-App Workflows — chain actions across Notepad, Calc, Paint, or any Windows app
Telemetry — every action logged to SQLite; query failure patterns by tool
Adaptive Location — auto-cascades through title/auto_id/control_id/class/OCR to find elements
Face Recognition — optional, off by default
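The Adaptive Location entry describes a fallback cascade: try the cheapest identifier first and move on only when it misses. The sketch below illustrates that idea in plain Python. It is an assumption about the general shape, not the project's actual implementation: `locate`, `CASCADE_ORDER`, and the stub locators are hypothetical names, and real lookups would go through pywinauto or OCR instead of lambdas.

```python
from typing import Callable, Optional

# Order taken from the feature list; the strategy functions are placeholders.
CASCADE_ORDER = ["title", "auto_id", "control_id", "class", "ocr"]

Locator = Callable[[str], Optional[dict]]


def locate(query: str, strategies: dict[str, Locator]) -> Optional[dict]:
    """Try each strategy in cascade order and return the first hit."""
    for name in CASCADE_ORDER:
        finder = strategies.get(name)
        if finder is None:
            continue  # strategy not available in this setup
        try:
            match = finder(query)
        except LookupError:
            match = None  # treat a failed lookup as a miss and fall through
        if match is not None:
            return {"strategy": name, **match}
    return None


if __name__ == "__main__":
    def by_auto_id(query: str) -> Optional[dict]:
        raise LookupError(query)  # simulate a lookup that errors out

    # Stub locators: title misses, auto_id raises, class finds the element.
    stubs = {
        "title": lambda q: None,
        "auto_id": by_auto_id,
        "class": lambda q: {"handle": 42},
        "ocr": lambda q: {"handle": 99},
    }
    print(locate("Save", stubs))
```

Recording which strategy matched makes it possible to see how often the slower fallbacks such as OCR are actually needed.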

Documentation

Doc	Content
INSTALL.md	Setup: desktop app, uv, MCP config
docs/README.md	Full documentation hub
docs/py-stack.md	Python dependency deep dive
docs/composing-with-playwright.md	Browser automation with Playwright MCP
docs/ocr.md	OCR system — Tesseract setup, limitations, competition
docs/cua-nsis-certification.md	Dogfooding: using the tool to test its own NSIS installer
docs/SAFETY.md	HITL, kill switch, opt-in features
docs/TOOLS.md	Portmanteau tool reference
tests/README.md	Test suite guide and e2e setup
examples/README.md	Runnable demos
mcpb/README.md	MCPB bundle packaging
web_sota/README.md	Operator UI build/dev guide
CHANGELOG.md	Release history

Ports

Port	Service
10788	Frontend — Vite operator UI
10789	Backend — FastAPI + FastMCP HTTP
stdio	MCP transport (port-free)
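When the operator UI or the HTTP endpoint does not respond, a quick check is whether anything is listening on the two ports above. The following is a small sketch using only the standard library; `port_open` and `SERVICES` are illustrative names, not part of this project.

```python
import socket

# Ports from the table above; the labels are only used for display.
SERVICES = {
    10788: "Frontend (Vite operator UI)",
    10789: "Backend (FastAPI + FastMCP HTTP)",
}


def port_open(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port, label in SERVICES.items():
        state = "listening" if port_open(port) else "not reachable"
        print(f"{port}  {label}: {state}")
```

The stdio transport has no port, so it cannot be probed this way.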

Related

Repo	What it does
autohotkey-mcp	Raw input recording/replay via AHK
browser-mcp	Playwright browser control — for webapps, HTML DOM, websites
virtualization-mcp	Sandbox / VM isolation
windows-operations-mcp	Registry, services, accounts

Browser vs desktop: This server drives Win32 / UI Automation. For HTML/DOM and websites, pair with browser-mcp (Playwright). Both MCPs can run side by side — use one profile that loads both and let the LLM pick the right tool for the target.

Fleet standards: mcp-central-docs.
