
OmniGrip
by zibo-chen
Cross-platform computer control MCP server enabling LLM-driven GUI automation with vision, OCR, and input simulation.
What it does
OmniGrip lets a language model operate a desktop environment directly. It bridges the LLM and the operating system so the model can see the screen, read text via OCR, and perform mouse and keyboard actions on macOS, Windows, and Linux.
Tools
- take_screenshot: Captures the current display as a JPEG for visual analysis.
- mouse_click: Performs left, right, or middle clicks at specific coordinates.
- keyboard_type: Types Unicode text into the active application.
- get_ocr_data: Extracts all text from the screen with precise coordinates.
- list_windows: Retrieves a list of all visible system windows.
- focus_window: Brings a specific window to the foreground by its ID.
- clipboard_read/clipboard_write: Manages system clipboard content.
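As a rough illustration of how a host invokes one of these tools, the sketch below builds the JSON-RPC 2.0 request that the MCP `tools/call` method expects. An MCP client such as Claude Desktop normally constructs this message itself. The argument names (`x`, `y`, `button`) are assumptions for illustration only; this listing does not document the tool schemas, so check the server's own tool definitions for the real parameter names.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: the names x, y, and button are assumed,
# not taken from OmniGrip's actual tool schema.
click = make_tool_call(1, "mouse_click", {"x": 640, "y": 360, "button": "left"})

# Serialize the request as it would be sent over the MCP transport.
print(json.dumps(click, indent=2))
```

The same helper works for any tool in the list by changing the tool name and arguments.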
Installation
Build from source using Rust:
git clone https://github.com/zibo-chen/OmniGrip.git
cd OmniGrip
cargo build --release
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "omni-grip": {
      "command": "/path/to/omni-grip",
      "args": []
    }
  }
}
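If claude_desktop_config.json already lists other servers, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows, assuming the config has been loaded into a dictionary. The helper name `add_server` and the `other-server` entry are hypothetical, and `/path/to/omni-grip` is the same placeholder used above, to be replaced with the location of your built binary.

```python
import json


def add_server(config, name, command, args=None):
    """Return a copy of config with one MCP server entry added or replaced."""
    merged = dict(config)
    # Copy the existing server map so the caller's dictionary is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args or [])}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server already present.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}

# "/path/to/omni-grip" is a placeholder path, not a real location.
updated = add_server(existing, "omni-grip", "/path/to/omni-grip")

print(json.dumps(updated, indent=2))
```

Writing the result back with `json.dump` keeps any existing entries in place alongside the new one.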
Supported hosts
- claude
Quick install
cargo build --release
Information
- Pricing: free
- Published: 5/30/2026






