Microsoft
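Microsoft's 2B-parameter "1-bit" language model. The BitNet b1.58 architecture stores each weight as one of three values (-1, 0, +1), about 1.58 bits per weight, which enables efficient LLM inference with far less memory and energy than full-precision models of similar size.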
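As a rough illustration of where the memory savings come from, the sketch below quantizes a handful of weights to ternary values and estimates weight storage at different bit widths. It is a minimal sketch, not the model's actual code: the function names `absmean_ternary` and `weight_memory_gb` are hypothetical, the absmean scaling scheme is assumed from the published BitNet b1.58 description, and the estimate uses a nominal 2e9 parameters and counts weights only (no activations, embeddings, or packing overhead).

```python
import math


def absmean_ternary(weights, eps=1e-8):
    """Quantize floats to {-1, 0, +1} using one shared absmean scale.

    Assumed scheme: divide by the mean absolute value, round to the
    nearest integer, then clip to the range [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def weight_memory_gb(num_params, bits_per_weight):
    """Idealized weight storage in gigabytes (10^9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


# A ternary weight carries log2(3) bits of information, about 1.58.
BITS_TERNARY = math.log2(3)

# Hypothetical toy weights, chosen only for illustration.
quantized, scale = absmean_ternary([0.9, -0.05, -1.3, 0.4])
print(quantized)  # [1, 0, -1, 1]

# Nominal 2e9 parameters: 16-bit weights vs. ideal ternary packing.
print(round(weight_memory_gb(2e9, 16), 2))            # 4.0
print(round(weight_memory_gb(2e9, BITS_TERNARY), 2))  # 0.4
```

Under these assumptions, the weights shrink by roughly a factor of ten relative to 16-bit storage. Real deployments typically pack ternary values into 2 bits or similar layouts, so the practical figure is somewhat higher than the ideal 1.58 bits per weight.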