
Human MCP
by mrgoonie
Give AI agents human-like senses: visual analysis, image/video generation, speech synthesis, browser automation, and advanced reasoning, with 29 MCP tools in one server.
What it does
Human MCP is a comprehensive MCP server that equips AI coding agents with multimodal capabilities modelled on human senses. It connects to Google Gemini, Minimax, ZhipuAI, and ElevenLabs APIs to deliver visual analysis, creative content generation, speech synthesis, and structured reasoning — all exposed as standard MCP tools.
Use it to debug UI screenshots, generate images or videos from prompts, narrate code explanations, automate browser screenshots, and run systematic reasoning chains — directly from your AI agent without leaving the chat.
Tools
- eyes_analyze — Analyse images, videos, and GIFs for UI bugs, errors, and accessibility issues
- eyes_compare — Detect visual differences between two images
- eyes_read_document — Extract text and tables from PDF, DOCX, XLSX, PPTX, and more
- eyes_summarize_document — Generate structured summaries from documents
- gemini_gen_image — Text-to-image generation via Gemini Imagen API
- gemini_gen_video / gemini_image_to_video — Video generation and animation via Veo 3.0
- minimax_gen_music / elevenlabs_gen_music — AI music generation with vocals
- elevenlabs_gen_sfx — Sound effect generation from text descriptions
- gemini_inpaint_image / gemini_outpaint_image / gemini_style_transfer_image / gemini_compose_images / gemini_edit_image — AI-powered image editing operations
- jimp_crop_image / jimp_resize_image / jimp_rotate_image / jimp_mask_image — Fast local image processing via Jimp
- rmbg_remove_background — AI background removal with three quality levels
- playwright_screenshot_fullpage / playwright_screenshot_viewport / playwright_screenshot_element — Automated web screenshots via Playwright
- mouth_speak / mouth_narrate / mouth_explain / mouth_customize — Text-to-speech and code narration via Gemini, Minimax, and ElevenLabs
- mcp__reasoning__sequentialthinking / brain_analyze_simple / brain_patterns_info / brain_reflect_enhanced — Structured reasoning, pattern analysis, and meta-cognitive reflection
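Because these are exposed as standard MCP tools, a client invokes them with a JSON-RPC 2.0 "tools/call" request. Below is a minimal sketch of building such a request for eyes_compare. The helper makeToolCall and the argument names (source1, source2) are hypothetical placeholders, not taken from Human MCP; the real input schema should be read from the server's "tools/list" response.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function makeToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names are assumptions; check the tool's inputSchema via "tools/list".
const request = makeToolCall("eyes_compare", {
  source1: "before.png",
  source2: "after.png",
});
```

In practice an MCP host such as Claude Desktop or Cursor builds and sends these requests itself; the sketch only illustrates what travels over the wire.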
Installation
{
  "mcpServers": {
    "human-mcp": {
      "command": "npx",
      "args": ["@goonnguyen/human-mcp"],
      "env": {
        "GOOGLE_GEMINI_API_KEY": "your_gemini_api_key_here"
      }
    }
  }
}
Optionally add ELEVENLABS_API_KEY, MINIMAX_API_KEY, or ZHIPUAI_API_KEY to unlock additional providers.
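The configuration above, extended with whichever optional keys are available, could also be generated programmatically. This is a minimal sketch: buildHumanMcpConfig is a hypothetical helper, not part of Human MCP, and treating GOOGLE_GEMINI_API_KEY as required is an assumption drawn from the example config.

```typescript
// Hypothetical helper (not part of Human MCP): builds the "mcpServers" entry,
// adding optional provider keys only when they are set.
type Env = Record<string, string | undefined>;

interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

const REQUIRED_KEY = "GOOGLE_GEMINI_API_KEY"; // assumed required, per the example config
const OPTIONAL_KEYS = ["ELEVENLABS_API_KEY", "MINIMAX_API_KEY", "ZHIPUAI_API_KEY"];

function buildHumanMcpConfig(source: Env): { mcpServers: Record<string, McpServerEntry> } {
  const gemini = source[REQUIRED_KEY];
  if (!gemini) {
    throw new Error(`${REQUIRED_KEY} is required`);
  }
  const env: Record<string, string> = { [REQUIRED_KEY]: gemini };
  for (const key of OPTIONAL_KEYS) {
    const value = source[key];
    if (value) env[key] = value; // skip providers that are not configured
  }
  return {
    mcpServers: {
      "human-mcp": { command: "npx", args: ["@goonnguyen/human-mcp"], env },
    },
  };
}
```

Under Node.js, passing process.env to buildHumanMcpConfig and serializing the result with JSON.stringify(config, null, 2) would produce a block ready to paste into the host's MCP configuration.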
Supported hosts
Claude Desktop, VS Code Copilot, Cursor, Windsurf — all confirmed in the README.
Quick install
npx @goonnguyen/human-mcp