OpenAI
OpenAI's flagship GPT-5.4 model — top-tier reasoning, coding, and instruction-following with multimodal capabilities.
OPS-Agentic-Search / testManus (Alibaba/JD, Mar 2026)
GPT-5.2 Codex — mini-SWE-agent v2
GPT-5.2-2025-12-11 FC — rank #16
GPT-5.2 — rank #2, high reasoning, Sierra eval
WALT + GPT-5 SoM — rank #2 (Oct 2025)
GPT-5 — rank #2, calibration error 50.0%
GPT-5.4 Thinking xHigh — rank #1
gpt_5_2 — p50=352.2 min autonomous work, avg_score=0.753
USACO Episodic+Semantic + GPT-5 Medium — rank #1, $64.13
OpenAI's efficient GPT-5.4 Mini — strong performance at lower cost, ideal for high-volume production workloads.
OpenAI's ultra-compact GPT-5.4 Nano — fastest and cheapest GPT-5 variant for latency-sensitive applications.
OpenAI's highest-capability GPT-5.4 Pro — maximum intelligence for the most demanding agentic and research tasks.
GPT-5 Mini — mini-SWE-agent v2
GPT-5-mini-2025-08-07 FC — rank #17
GPT-5-mini — rank #5, calibration error 65.0%
GPT-5.4 Mini xHigh — rank #20
GPT-5-nano-2025-08-07 FC — rank #24
GPT-5.4 Nano xHigh — rank #14