vMLX
by jjang-ai
Local AI engine for Apple Silicon providing OpenAI and Anthropic compatible APIs for LLMs, VLMs, and Image Gen.
What it does
vMLX is a high-performance local AI inference engine optimized for Apple Silicon (M1-M4). It runs LLMs, VLMs, and Flux image models entirely on-device, offering a private, secure alternative to cloud APIs while speaking the OpenAI and Anthropic wire formats.
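For example, because the server speaks the OpenAI wire format, the official OpenAI Python client can be pointed at it directly. This is a minimal sketch, assuming a server on localhost:8080; the listing does not document vMLX's actual default address.

```python
# Hypothetical sketch: talk to a local vMLX server through the official
# OpenAI Python client. The base URL/port and dummy key are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

resp = client.chat.completions.create(
    model="mlx-community/Qwen3-8B-4bit",
    messages=[{"role": "user", "content": "Summarize what runs on-device here."}],
)
print(resp.choices[0].message.content)
```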
Tools
- Local LLM Serving: Run any mlx-community model with continuous batching and a paged KV cache.
- Image Generation: Local Flux Schnell/Dev and Z-Image Turbo generation and editing (see the sketch after this list).
- Distributed Inference: Split large models across multiple Macs via Thunderbolt or Ethernet.
- JANG Quantization: Adaptive mixed-precision quantization for better quality at low bit widths.
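A minimal sketch of local image generation over the OpenAI images wire format the listing advertises. The endpoint path, port, and the flux-schnell model identifier are assumptions drawn from the OpenAI spec, not from vMLX documentation.

```python
# Hypothetical sketch: request a local Flux image via the OpenAI images
# wire format. Endpoint path, port, and parameter names are assumptions.
import base64
import requests

resp = requests.post(
    "http://localhost:8080/v1/images/generations",  # assumed endpoint
    json={
        "model": "flux-schnell",  # assumed model identifier
        "prompt": "a watercolor map of Seoul",
        "size": "1024x1024",
        "response_format": "b64_json",
    },
    timeout=300,
)
resp.raise_for_status()
image_bytes = base64.b64decode(resp.json()["data"][0]["b64_json"])
with open("out.png", "wb") as f:
    f.write(image_bytes)
```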
Installation
Install via uv:
brew install uv
uv tool install vmlx
vmlx serve mlx-community/Qwen3-8B-4bit
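With the server running, the advertised Anthropic compatibility suggests the official anthropic client should work the same way. A minimal sketch, again assuming localhost:8080:

```python
# Hypothetical sketch: same local server, Anthropic Messages wire format.
# The base URL/port and dummy key are assumptions.
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:8080",  # assumed local endpoint
    api_key="not-needed",              # local servers typically ignore the key
)

msg = client.messages.create(
    model="mlx-community/Qwen3-8B-4bit",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello from the Anthropic wire format."}],
)
print(msg.content[0].text)
```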
Supported hosts
- Claude Desktop
- Cursor
- Codex
- Gemini-CLI
Quick install
uv tool install vmlx
Information
- Pricing: free
- Published: 4/14/2026