DeepSeek AI · DeepSeek R1
DeepSeek's reasoning model, trained with pure reinforcement learning (RL) and no supervised fine-tuning (SFT). It demonstrates that chain-of-thought reasoning can emerge from RL alone.
No benchmark scores available yet for this model.