Alibaba Qwen
Alibaba's ultra-compact 0.6B Qwen3 base model — remarkably capable for its size, ideal for on-device and embedded applications.
Alibaba's 235B MoE Qwen3 flagship with 22B active params — among the strongest open-source models on reasoning, coding, and math.
Alibaba Qwen's 235B instruct MoE model — high-capacity and instruction-tuned for multi-turn dialogue and large-context reasoning.
Alibaba's 30B MoE Qwen3 base model with 3B active params — strong reasoning at low inference cost.
Alibaba Qwen's 30B instruct model — large-context, instruction-tuned model for complex reasoning and long-form generation.
Alibaba's 32B Qwen3 dense base model — strong reasoning and multilingual capabilities with hybrid thinking mode support at a practical scale.
Alibaba Qwen's 4B instruct model — compact, efficient instruction-tuned model for cost-sensitive chat and generation.
Alibaba's next-generation 80B MoE Qwen3 instruct model with 3B active params — strong reasoning and instruction-following at efficient inference cost.
Alibaba's 8B Qwen3 base model — strong reasoning and multilingual capabilities with hybrid thinking mode support.
agent-testv2 (GLM 4.5 + GLM 4.5 Air + Gemini + GPT)
Qwen3-235B-A22B-Instruct-2507 Prompt — rank #23
Qwen 3 235B A22B Thinking 2507 — rank #44
Qwen3-235B-A22B, easy=98.8 med=86.8 hard=50.8
Qwen3-30B-A3B-Instruct-2507 FC — rank #41
Qwen 3 30B A3B — rank #68
Qwen3-32B RL40 agent submission
Qwen3-32B FC — rank #29
Qwen 3 32B — rank #62
Qwen3-30B-A3B family FC — rank #41
Qwen 3 Next 80B A3B Thinking — rank #47
Qwen3-8B FC — rank #39