Meta · Llama 3.2
Meta's 11B multimodal Llama 3.2 — the first vision-capable Llama model, supporting image understanding with 128K context.
No benchmark scores available yet for this model.