Mistral AI
Mistral AI's 8x22B MoE base model (141B total parameters, 39B active per token): significantly stronger than 8x7B, matching GPT-3.5 on most benchmarks.
Mixtral-8x22B (GAIA baseline era)
Mixtral-8x22B proxy (BFCL era)
Mistral AI's landmark 8x7B MoE instruct model: a sparse mixture-of-experts with GPT-3.5-level performance at a fraction of the compute.
🧟🧟 (Mistral-Small-3.1-24B) — close family
Mixtral 8x7B baseline (Jan 2024)
Mixtral proxy (BFCL v3 era)
Mixtral-8x7b + BLIP-2-T5XL AxTree+Caption (Jan 2024)