AI21 Labs
AI21 Labs' 52B-parameter hybrid SSM-Transformer model: Mamba + MoE architecture with a 256K-token context window and high throughput.