MCPBench
by modelscope
Comprehensive evaluation benchmark for MCP servers focusing on task accuracy, latency, and token usage.
What it does
MCPBench is an evaluation framework designed to rigorously test and compare the performance of Model Context Protocol (MCP) servers. It provides a standardized way to measure how different servers handle specific task categories, helping developers optimize for accuracy and efficiency.
Tools
As a benchmarking framework, MCPBench evaluates the following capabilities of target servers:
- Web Search Evaluation: Tests accuracy and latency for web-based retrieval tasks.
- Database Query Evaluation: Benchmarks the ability to interact with and query databases.
- GAIA Evaluation: Tests general AI assistant capabilities in complex real-world scenarios.
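For each of these categories, the benchmark boils per-task results down to the headline metrics named above (task accuracy, latency, and token usage). As a minimal sketch of that aggregation step, the snippet below averages a list of per-task records; the record fields (`correct`, `latency_s`, `tokens`) are illustrative assumptions, not MCPBench's actual result schema.

```python
# Hypothetical aggregation of per-task benchmark records into summary
# metrics (accuracy, average latency, average token usage).
# Field names are assumptions for illustration only.
from statistics import mean

def aggregate(results):
    """results: list of dicts with 'correct', 'latency_s', 'tokens' keys."""
    return {
        "accuracy": mean(1.0 if r["correct"] else 0.0 for r in results),
        "avg_latency_s": mean(r["latency_s"] for r in results),
        "avg_tokens": mean(r["tokens"] for r in results),
    }

results = [
    {"correct": True, "latency_s": 1.2, "tokens": 480},
    {"correct": False, "latency_s": 0.8, "tokens": 350},
]
print(aggregate(results))
```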
Installation
To run the benchmark, clone the repository and set up the environment:
conda create -n mcpbench python=3.11 -y
conda activate mcpbench
pip install -r requirements.txt
Configure your servers in the configs folder and run the evaluation scripts (e.g., sh evaluation_websearch.sh your_config.json).
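The exact config schema isn't shown here; a plausible sketch of a server entry is below, but every field name is an assumption for illustration — check the repository's configs folder for the real format.

```json
{
  "servers": [
    {
      "name": "example-websearch-server",
      "command": "npx example-search-server",
      "args": ["--port", "8080"]
    }
  ]
}
```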
Supported hosts
- claude
Quick install
pip install -r requirements.txt
Information
- Pricing: free
- Published: 4/15/2026