
AgentEvals
Supports UI
by agentevals-dev
Framework-agnostic AI agent evaluation using OpenTelemetry traces to score performance and inference quality without re-execution.
0 stars
Works in: Claude
Exposes: Tools
What it does
AgentEvals reads AI agent execution traces recorded with OpenTelemetry (OTel) and scores agent behavior deterministically. Developers can benchmark agents before production by analyzing the tool trajectories and response quality captured in existing traces, without slow and expensive re-runs.
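To make the idea of scoring a tool trajectory from an existing trace concrete, here is a minimal sketch. It is not AgentEvals' implementation: the span dictionary shape, the `tool.name` attribute key, and both scoring functions are hypothetical stand-ins chosen for illustration.

```python
from typing import Iterable

# Hypothetical attribute key; real traces may name the tool differently.
TOOL_ATTR = "tool.name"


def tool_trajectory(spans: Iterable[dict]) -> list:
    """Return tool names ordered by span start time, skipping non-tool spans."""
    tool_spans = [s for s in spans if s.get("attributes", {}).get(TOOL_ATTR)]
    tool_spans.sort(key=lambda s: s["start_time"])
    return [s["attributes"][TOOL_ATTR] for s in tool_spans]


def exact_match_score(actual: list, expected: list) -> float:
    """1.0 when the agent called exactly the expected tools in order, else 0.0."""
    return 1.0 if actual == expected else 0.0


def in_order_score(actual: list, expected: list) -> float:
    """Fraction of the expected sequence matched, in order, before the first
    expected call that never shows up in the actual trajectory."""
    if not expected:
        return 1.0
    matched = 0
    for name in actual:
        if matched < len(expected) and name == expected[matched]:
            matched += 1
    return matched / len(expected)


# Made-up spans standing in for an already-recorded trace.
spans = [
    {"name": "call_llm", "start_time": 1, "attributes": {}},
    {"name": "execute_tool", "start_time": 3, "attributes": {TOOL_ATTR: "book_flight"}},
    {"name": "execute_tool", "start_time": 2, "attributes": {TOOL_ATTR: "search_flights"}},
]

actual = tool_trajectory(spans)
print(actual)
print(exact_match_score(actual, ["search_flights", "book_flight"]))
```

Because the score depends only on the recorded spans, running it twice on the same trace gives the same result, and the agent is never invoked again.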
Tools
- list_metrics: Displays all available built-in and community evaluation metrics.
- evaluate_traces: Processes local OTLP or Jaeger trace files to generate scores.
- list_sessions: Lists active streaming sessions for real-time evaluation.
- summarize_session: Provides a structured summary of an agent session's tool calls.
- evaluate_sessions: Scores live sessions against a defined golden reference set.
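MCP hosts invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message in Python. The helper name `tools_call_request` is hypothetical, and the empty argument object for `list_metrics` is an assumption, since this listing does not document any tool's parameters.

```python
import json
from itertools import count
from typing import Optional

# Simple incrementing request ids, as JSON-RPC requires an id per request.
_request_ids = count(1)


def tools_call_request(tool_name: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP `tools/call` request for the given tool name."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }
    return json.dumps(request)


# Assumption: list_metrics needs no arguments.
print(tools_call_request("list_metrics"))
```

In practice the host (Claude Desktop or Claude Code) constructs these messages itself; the sketch only shows what travels over the wire.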
Installation
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "agentevals": {
      "command": "agentevals",
      "args": ["mcp"]
    }
  }
}
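If the config file already lists other MCP servers, pasting the snippet over it would drop them. The following sketch merges the entry instead. The function name `add_agentevals_server` is hypothetical, and the location of claude_desktop_config.json varies by platform, so the path is left as a parameter rather than guessed.

```python
import json
from pathlib import Path


def add_agentevals_server(config_path: Path) -> dict:
    """Add the agentevals entry to an MCP config file, keeping existing servers."""
    text = config_path.read_text() if config_path.exists() else ""
    # Treat a missing or empty file as an empty config.
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # Same command and args as the snippet above.
    servers["agentevals"] = {"command": "agentevals", "args": ["mcp"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Call it with the path to your own config file, for example `add_agentevals_server(Path("claude_desktop_config.json"))`, then restart the host so it picks up the change.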
Supported hosts
- Claude Desktop
- Claude Code
Quick install
pip install agentevals-cli
Information
- Pricing: free
- Published: 6/18/2026
- Stars: 0





