Anthropic's most intelligent Claude model — best for complex agentic workflows, coding, and long-horizon reasoning tasks.
CustomGPT Agent v2 (Claude Opus 4.6 + Sonnet 4.6)
Claude 4.5 Opus high reasoning — mini-SWE-agent v2
Claude-Opus-4-5-20251101 FC — rank #1
Claude Opus 4.5 — rank #1, high reasoning, Sierra eval
Claude 4.6 Opus Thinking High Effort — rank #3
Claude-Opus-4 Thinking — rank #12, easy=98.8, med=78.3, hard=31.4
claude_opus_4_6_inspect — p50=718.8 min autonomous work, avg_score=0.789
Claude Code + Claude Opus 4.5 — rank #1, 95.5% manual, $87.16
USACO Episodic+Semantic + Claude Opus 4.1 High — rank #3, $267.72