
Flama
by vortico
Production framework for Predictive and Generative AI, serving models as APIs with native MCP support.
What it does
Flama is a high-performance framework for turning a predictive or generative AI model into a production-ready API with a single line of code. It serves models built with scikit-learn, TensorFlow, or PyTorch, as well as LLMs, from a unified, portable .flm format, and exposes them through OpenAI-, Anthropic-, and Ollama-compatible endpoints.
Tools
Flama lets you create custom MCP tools with a simple Python decorator. Once an MCP server is added to the Flama app, any function marked with @app.mcp.tool is automatically exposed as an MCP tool, with a JSON Schema derived from its type hints.
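To illustrate what "a JSON Schema derived from its type hints" can look like, here is a minimal standard-library sketch. It is not Flama's implementation: the helper schema_from_hints, the type mapping, and the example function add are all hypothetical, and the sketch only covers a few primitive types.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Illustrative subset of Python annotations mapped to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def schema_from_hints(func: Callable[..., Any]) -> dict:
    """Build a JSON Schema object describing the parameters of func."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters belong in the input schema
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = hints.get(name)
        # Unknown or missing annotations get an empty (unconstrained) schema.
        if annotation in _JSON_TYPES:
            properties[name] = {"type": _JSON_TYPES[annotation]}
        else:
            properties[name] = {}
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def add(a: int, b: int = 0) -> int:
    """Hypothetical tool: add two integers."""
    return a + b


schema = schema_from_hints(add)
print(schema)
```

A decorator-based registry would typically run a step like this once, at registration time, and publish the resulting schema alongside the tool name so that MCP clients know which arguments to send.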
Installation
To install Flama and its generative AI capabilities:
pip install "flama[llm,pydantic]"
To serve a model as an MCP server:
flama serve --model file=your_model.flm,url=/,name=model_name
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "flama": {
      "command": "flama",
      "args": ["serve", "--model", "file=your_model.flm,url=/,name=model_name"]
    }
  }
}
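If claude_desktop_config.json already lists other servers, the "flama" entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that on an in-memory dictionary. The helper add_flama_server and the "other" server entry are hypothetical, and reading or writing the actual file is left out because its location depends on the platform.

```python
import json
from copy import deepcopy


def add_flama_server(config: dict, model_file: str, name: str) -> dict:
    """Return a copy of config with a "flama" entry under "mcpServers"."""
    updated = deepcopy(config)  # leave the caller's dictionary untouched
    servers = updated.setdefault("mcpServers", {})
    servers["flama"] = {
        "command": "flama",
        # Same arguments as the flama serve command shown above.
        "args": ["serve", "--model", f"file={model_file},url=/,name={name}"],
    }
    return updated


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
merged = add_flama_server(existing, "your_model.flm", "model_name")
print(json.dumps(merged, indent=2))
```

Working on a copy keeps the original configuration available in case the merged result needs to be reviewed before it is written back.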
Supported hosts
- claude
- cursor
- vscode-copilot
Quick install
pip install "flama[full]"
Information
- Pricing: free
- Published: 6/28/2026
- Stars: 0
Choose your AI client and follow the steps below.
Cursor
Add as MCP server in settings
Claude Desktop
Add flama serve command to claude_desktop_config.json
VS Code Copilot
Add to settings.json github.copilot.chat.mcp.servers





