MCP server for simplest GUI agent 🕹️
pyautogui_mcp_server packages a Streamable HTTP MCP server for running Python code with pyautogui instrumentation.
It is designed for GUI automation workflows where plain pyautogui execution is not enough. The package adds MCP-friendly output handling and richer screenshots.
✨ What this package adds
Compared with running raw pyautogui calls directly, this package does extra work in the following areas:
- Shared Python globals across tool calls.
- Captured stdout, stderr, and final expression results in one MCP response stream.
- Inline screenshot delivery as MCP image content instead of requiring manual file handling.
- Annotated mouse-operation previews that show the target point or path before the action runs.
- Screenshot normalization so captured images line up better with logical screen coordinates.
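The first two items, shared globals and captured output, can be illustrated with a minimal sketch. This is not the package's actual implementation; `SHARED_GLOBALS` and `run_snippet` are hypothetical names used only to show one way such behavior could be built with the standard library.

```python
import ast
import contextlib
import io

# Hypothetical namespace that every call reuses, so names persist between snippets.
SHARED_GLOBALS = {"__name__": "__main__"}


def run_snippet(source):
    """Run source in SHARED_GLOBALS and return (stdout, stderr, final_result_repr)."""
    tree = ast.parse(source, mode="exec")
    last_expr = None
    # If the snippet ends with a bare expression, split it off so its value can be reported.
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last_expr = ast.Expression(tree.body.pop().value)

    out, err = io.StringIO(), io.StringIO()
    result = None
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        exec(compile(tree, "<snippet>", "exec"), SHARED_GLOBALS)
        if last_expr is not None:
            result = eval(compile(last_expr, "<snippet>", "eval"), SHARED_GLOBALS)
    return out.getvalue(), err.getvalue(), None if result is None else repr(result)


# Two separate calls share state: x is defined in the first and read in the second.
print(run_snippet("x = 40\nprint('set x')"))  # ('set x\n', '', None)
print(run_snippet("x + 2"))                   # ('', '', '42')
```

Keeping one dictionary alive between calls is what lets a later snippet reuse variables, imports, and helper functions defined by an earlier one.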
🛠️ Tool response example
<stdout>
Cut the right rope by dragging left to right through it.
</stdout>
<pyautogui-mcp.dragTo x=860 y=430 duration=0.2 button='left'
time_offset="T+1.1s" pyautogui.size=(1440, 900)>
</pyautogui-mcp.dragTo>
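The `pyautogui.size=(1440, 900)` attribute in the example gives the screen size in the coordinates that pyautogui works with. On a HiDPI display a raw screenshot can contain more pixels than that, which is the kind of mismatch screenshot normalization addresses. The sketch below shows the underlying arithmetic, assuming a 2x capture of 2880x1800 pixels; `to_logical` is a hypothetical helper, not part of the package.

```python
def to_logical(px, py, screenshot_size, logical_size):
    """Map a pixel position in a raw screenshot to logical screen coordinates."""
    shot_w, shot_h = screenshot_size
    log_w, log_h = logical_size
    return round(px * log_w / shot_w), round(py * log_h / shot_h)


# Assumed 2x capture of the 1440x900 screen from the example above.
print(to_logical(1720, 860, (2880, 1800), (1440, 900)))  # (860, 430)
```

If the screenshot is resized to the logical size before it is returned, a position read off the image can be passed to pyautogui without this conversion.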
📦 Installation
pip install pyautogui_mcp_server
For local development:
pip install -e ".[dev]"
Run the MCP server
Use the module entrypoint:
python -m pyautogui_mcp_server --host 127.0.0.1 --port 9300
Or use the installed console script:
pyautogui-mcp-server --port 9300
Show CLI help:
python -m pyautogui_mcp_server --help
The server exposes a run_python_with_pyautogui MCP tool that executes Python with instrumented pyautogui behavior.
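As a rough illustration, an MCP client invokes a tool with a JSON-RPC `tools/call` request. The sketch below only builds such a payload for `run_python_with_pyautogui`. The argument name `code` is an assumption, so check the tool's input schema as reported by the server. A real client also has to initialize an MCP session first, which an MCP client library normally handles.

```python
import json

# Python source to be executed by the tool; the values mirror the response example above.
code = (
    "import pyautogui\n"
    "print('Cut the right rope by dragging left to right through it.')\n"
    "pyautogui.dragTo(860, 430, duration=0.2, button='left')\n"
)

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "run_python_with_pyautogui",
        "arguments": {"code": code},  # "code" is an assumed argument name
    },
}

print(json.dumps(request, indent=2))
```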
License
MIT