DeepSeek's R1 reasoning capabilities distilled into a 32B Qwen model: strong math and coding performance at a fraction of the cost of full R1.
164K tokens | Free / Open weights | Transformer | MIT
Benchmarks
GAIA
DeepSeek-R1-Distill-Qwen-32B: 57.5%
DeepSeek-R1-Distill-Llama-8B: 27.6%
DeepSeek-R1-Distill-Qwen-7B: 25.6%
MetaAgent_v0.2.1 (o3) w/ 32B distill