
OmniMCP
by openadaptai
AI-driven UI interaction and visual perception using Microsoft OmniParser and MCP.
What it does
OmniMCP bridges the gap between LLMs and complex user interfaces. By leveraging Microsoft's OmniParser, it allows AI models to visually perceive the screen, identify UI elements, and execute precise mouse and keyboard actions to complete goals autonomously.
Tools
- Visual Perception: Analyzes screenshots to identify and label interactive UI components.
- LLM Planner: Generates a sequence of actions based on current visual state and goal.
- Agent Executor: Orchestrates the perceive-plan-act loop for continuous task execution.
- Input Controller: Performs physical interactions via pynput for mouse and keyboard control.
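The four tools above form a perceive-plan-act loop. A minimal sketch of that loop, assuming hypothetical stand-ins (`perceive`, `plan`, `act`, `UIElement`) rather than the actual OmniMCP API:

```python
from dataclasses import dataclass

@dataclass
class UIElement:
    label: str
    x: int
    y: int

def perceive(screenshot):
    """Stand-in for OmniParser: return labeled interactive elements."""
    return [UIElement("Submit", 120, 340)]

def plan(goal, elements, history):
    """Stand-in for the LLM planner: choose the next action."""
    if history:  # in this toy example, one action completes the goal
        return ("done", None)
    for el in elements:
        if el.label.lower() in goal.lower():
            return ("click", el)
    return ("done", None)

def act(action):
    """Stand-in for the pynput-based input controller."""
    kind, el = action
    return f"clicked {el.label} at ({el.x}, {el.y})"

def run_agent(goal, max_steps=3):
    """Perceive-plan-act loop: repeat until the planner reports done."""
    history = []
    for _ in range(max_steps):
        action = plan(goal, perceive(screenshot=None), history)
        if action[0] == "done":
            break
        history.append(act(action))
    return history
```

In the real system, `perceive` would run OmniParser on a screenshot, `plan` would call an LLM, and `act` would drive pynput; the control flow, however, follows this shape.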
Installation
{
  "mcpServers": {
    "omnimcp": {
      "command": "python",
      "args": ["/path/to/OmniMCP/cli.py"]
    }
  }
}
Supported hosts
- Claude Desktop
- Linux (X11/Wayland)
Quick install
git clone https://github.com/OpenAdaptAI/OmniMCP.git && cd OmniMCP && ./install.sh
Information
- Pricing: free
- Published: 5/1/2026