🦁 Liger GRPO meets TRL
Liger Kernel's GRPO loss is now integrated with TRL, making Group Relative Policy Optimization (GRPO) more memory-efficient for LLM training. This lets developers scale reinforcement-learning fine-tuning with substantially lower peak memory usage.
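
For readers new to GRPO, the "group relative" part can be sketched in a few lines: several completions are sampled for the same prompt, and each completion's reward is normalized against the mean and standard deviation of its own group. The snippet below is a minimal illustration under stated assumptions, not the TRL or Liger implementation; the function name `group_relative_advantages`, the `eps` value, and the use of the population standard deviation are all choices made here for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against its own group's mean and std.

    `rewards` holds the scalar rewards of all completions sampled for
    one prompt. `eps` (an assumed value) guards against division by
    zero when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored 1.0 (correct) or 0.0 (incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average get a positive advantage and the rest get a negative one, so no separate value model is needed to provide a baseline.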

