Anthropic's balanced Claude Sonnet: a mix of intelligence, speed, and cost suited to production AI applications.
h2oGPTe Agent v1.6.44 (claude-sonnet-4.5 + gpt-5, h2o.ai)
Claude 4.5 Sonnet high reasoning — mini-SWE-agent v2
Claude Code + GBOX MCP (GBOX AI, Oct 2025)
Claude-Sonnet-4-5-20250929 FC — rank #2
Claude Sonnet 4.5 — rank #7, reasoning enabled, Sierra eval
claude-sonnet-4-6 — rank #1, max_steps=100, Anthropic Mar 2026
OpenHands-Versa + Claude Sonnet 4 — rank #3, 33.1% resolved
Claude 4.5 Sonnet — rank #6, calibration error 65.0%
Claude 4.6 Sonnet Thinking Medium Effort — rank #5
Claude-Sonnet-4 Thinking — rank #13, easy=98.4 med=72.9 hard=31.4
claude_3_7_sonnet_inspect — p50=60.4 min autonomous work, avg_score=0.558
Claude Code + Claude Sonnet 4.5 (Sep 2025) — rank #2, $68.33