xAI
xAI's Grok 4 flagship — frontier reasoning and multimodal model with a 256K context window and tool-use capabilities.
exo-unoptimized (grok-4-fast + gemini-2.5-flash)
Grok-4-0709 Prompt — rank #9
Grok 4 — rank #3, calibration error 56.4%
Grok 4 — rank #24
xAI's ultra-fast Grok 4.1 Fast — 2M context at a fraction of the cost, optimised for high-throughput production.
xAI's Grok 4.20 with 2M token context — advanced reasoning and multi-agent capabilities for complex agentic workflows.
xAI's Grok 4.20 Multi-Agent — specialised for orchestrating complex multi-agent workflows with 2M token context.
exo-vanilla-react (grok 4 fast + gemini 2.5 flash)
Grok-4-1-fast-reasoning FC — rank #5
Grok 4.1 Fast — rank #34
exo-unoptimized (grok 4 fast + gemini 2.5 flash)
Grok 4.20 Beta — rank #19