Alibaba's 235B MoE Qwen3 flagship with 22B active params — top open-source performance on reasoning, coding, and math.
agent-testv2 (GLM 4.5 + GLM 4.5 Air + Gemini + GPT)
Qwen3-235B-A22B-Instruct-2507 Prompt — rank #23
Qwen3-235B-A22B-Thinking-2507 — rank #44
Qwen3-235B-A22B, easy=98.8 med=86.8 hard=50.8