DeepSeek AI
DeepSeek R1 Distill
DeepSeek's R1 reasoning distilled into a tiny 1.5B model — chain-of-thought reasoning at ultra-low compute for edge deployment.
DeepSeek's R1 reasoning capabilities distilled into a 32B Qwen model — strong math and coding at a fraction of full R1 cost.
DeepSeek's R1 reasoning distilled into Llama 70B — frontier-level chain-of-thought at 70B scale on a Llama architecture.
DeepSeek's R1 reasoning distilled into a 7B Qwen base — solid chain-of-thought at a highly deployable size.
DeepSeek's R1 reasoning distilled into an 8B Qwen3 base — strong chain-of-thought at a compact, deployable size.
DeepSeek's R1 reasoning distilled into Llama 3 8B — strong chain-of-thought on a Llama base for broad ecosystem compatibility.