> [!IMPORTANT]
> [2025-03-18] We released a technical preview of a new desktop app, Agent TARS: a multimodal AI agent that operates the browser by visually interpreting web pages, and integrates seamlessly with the command line and file system.
# UI-TARS Desktop
UI-TARS Desktop is a GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Paper | Hugging Face Models | Discord | ModelScope | Desktop Application | Midscene (use in browser)
## Showcases
| Instruction | Video |
| --- | --- |
| Get the current weather in SF using the web browser | |
| Send a tweet with the content "hello world" | |
## News
- [2025-02-20] Introduced the UI-TARS SDK, a powerful cross-platform toolkit for building GUI automation agents.
- [2025-01-23] We updated the Cloud Deployment section in the Chinese-language GUI model deployment guide with new information about the ModelScope platform. You can now use ModelScope for deployment.
## Features
- Natural language control powered by a Vision-Language Model
- Screenshot and visual recognition support
- Precise mouse and keyboard control
- Cross-platform support (Windows/macOS)
- Real-time feedback and status display
- Private and secure: fully local processing
## Quick Start
See Quick Start.
## Deployment
See Deployment.
## Contributing
See CONTRIBUTING.md.
## SDK (Experimental)

See @ui-tars/sdk.
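As a rough illustration of how the SDK is meant to be used, here is a minimal sketch. It assumes a UI-TARS model served behind an OpenAI-compatible endpoint; the endpoint URL, API key, and model name are placeholders, and `GUIAgent`/`NutJSOperator` follow the SDK's published examples — consult the @ui-tars/sdk documentation for the authoritative API.

```typescript
// Hypothetical sketch: run one natural-language GUI task via the SDK.
// Requires a live UI-TARS model endpoint; all config values are placeholders.
import { GUIAgent } from '@ui-tars/sdk';
import { NutJSOperator } from '@ui-tars/operator-nut-js';

const agent = new GUIAgent({
  model: {
    baseURL: 'http://localhost:8000/v1', // placeholder endpoint
    apiKey: 'your-api-key',              // placeholder credential
    model: 'ui-tars-7b',                 // placeholder model name
  },
  operator: new NutJSOperator(),         // drives the local mouse and keyboard
  onData: ({ data }) => console.log(data),
  onError: ({ error }) => console.error(error),
});

await agent.run('Get the current weather in SF using the web browser');
```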
## License
UI-TARS Desktop is licensed under the Apache License 2.0.
## Citation
If you find our paper and code useful in your research, please consider giving a star :star: and a citation :pencil:.
```bibtex
@article{qin2025ui,
  title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
  author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
  journal={arXiv preprint arXiv:2501.12326},
  year={2025}
}
```