Google's flagship Gemini 3 Pro — frontier multimodal intelligence with a 2M token context for complex reasoning.
Langfun Agent v4.0 single agent (Google)
Gemini 3 Pro — mini-SWE-agent v2
ColorBrowserAgent (Dec 2025) — proxy via Gemini 3 Pro
Gemini-3-Pro-Preview Prompt — rank #3
Gemini 3 Pro — rank #5, high reasoning, Sierra eval
Gemini 3 Pro — rank #1, calibration error 57.2%
Gemini 3 Pro Preview High — rank #9
gemini_3_pro — p50=224.3 min autonomous work, avg_score=0.710