
OmniMCP
by openadaptai
AI-driven UI interaction and visual perception using Microsoft OmniParser and MCP.
What it does
OmniMCP bridges the gap between LLMs and complex user interfaces. By leveraging Microsoft's OmniParser, it allows AI models to visually perceive the screen, identify UI elements, and execute precise mouse and keyboard actions to complete goals autonomously.
Tools
- Visual Perception: Analyzes screenshots to identify and label interactive UI components.
- LLM Planner: Generates a sequence of actions based on current visual state and goal.
- Agent Executor: Orchestrates the perceive-plan-act loop for continuous task execution.
- Input Controller: Performs physical interactions via pynput for mouse and keyboard control.
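The four tools above form a perceive-plan-act loop. A minimal sketch of that loop, assuming hypothetical stand-ins (`perceive`, `plan`, `act`, `UIElement`) rather than the actual OmniMCP API:

```python
from dataclasses import dataclass

@dataclass
class UIElement:
    label: str
    x: int
    y: int

def perceive(screenshot):
    """Stand-in for OmniParser: return labeled interactive elements."""
    return [UIElement("Submit", 120, 340)]

def plan(goal, elements, history):
    """Stand-in for the LLM planner: choose the next action."""
    if history:  # in this toy example, one action completes the goal
        return ("done", None)
    for el in elements:
        if el.label.lower() in goal.lower():
            return ("click", el)
    return ("done", None)

def act(action):
    """Stand-in for the pynput-based input controller."""
    kind, el = action
    return f"clicked {el.label} at ({el.x}, {el.y})"

def run_agent(goal, max_steps=3):
    """Perceive-plan-act loop: repeat until the planner reports done."""
    history = []
    for _ in range(max_steps):
        action = plan(goal, perceive(screenshot=None), history)
        if action[0] == "done":
            break
        history.append(act(action))
    return history
```

In the real system, `perceive` would run OmniParser on a screenshot, `plan` would call an LLM, and `act` would drive pynput; the control flow, however, follows this shape.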
Installation
{
  "mcpServers": {
    "omnimcp": {
      "command": "python",
      "args": ["/path/to/OmniMCP/cli.py"]
    }
  }
}
Supported hosts
- Claude Desktop
- Linux (X11/Wayland)
Quick install
git clone https://github.com/OpenAdaptAI/OmniMCP.git && cd OmniMCP && ./install.sh
Information
- Pricing: free
- Published: 5/1/2026