
FastAPI-BitNet
By grctest
FastAPI-based MCP server for Microsoft's BitNet inference framework, enabling programmatic control of llama.cpp instances.
What it does
FastAPI-BitNet provides a high-performance bridge between the Model Context Protocol (MCP) and Microsoft's BitNet inference framework. It allows AI agents to programmatically launch, manage, and interact with llama-cli and llama-server processes using BitNet's 1-bit LLM architectures via a REST API.
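Before wiring an agent to the API, it helps to confirm the server is answering. A minimal connectivity sketch, assuming the server from the Quick install section is running on port 8080 (only the `/mcp` endpoint is confirmed by this document; treat everything else as unknown):

```python
from urllib import request, error

def server_reachable(url: str = "http://127.0.0.1:8080/mcp",
                     timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP at the given URL."""
    try:
        with request.urlopen(url, timeout=timeout):
            return True
    except error.HTTPError:
        return True   # server answered, even if it rejects a plain GET
    except (error.URLError, OSError):
        return False  # nothing listening, or host unreachable

print(server_reachable())
```

Note that an HTTP error status still counts as "reachable" here: any response at all means the process behind port 8080 is up.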
Tools
- session_management: Start, stop, and monitor persistent BitNet chat sessions.
- batch_operations: Initialize and interact with multiple model instances in a single call.
- interactive_chat: Send prompts to running sessions and receive cleaned model responses.
- model_benchmarking: Run benchmarks and calculate perplexity on GGUF models.
- resource_estimation: Estimate server capacity based on system RAM and CPU threads.
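As a rough illustration of the resource_estimation idea, the sketch below caps instance count by both RAM and CPU threads. The 1.1 GB model footprint and 2 threads per instance are illustrative assumptions, not values taken from FastAPI-BitNet:

```python
# Capacity is bounded by whichever budget runs out first: RAM or threads.
# model_ram_gb and threads_per_instance are hypothetical defaults.

def estimate_capacity(ram_gb: float, cpu_threads: int,
                      model_ram_gb: float = 1.1,
                      threads_per_instance: int = 2) -> int:
    """Return how many server instances fit within both budgets."""
    by_ram = int(ram_gb // model_ram_gb)
    by_threads = cpu_threads // threads_per_instance
    return max(0, min(by_ram, by_threads))

print(estimate_capacity(ram_gb=16, cpu_threads=8))  # thread-bound: 4
```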
Installation
Add the following to your claude_desktop_config.json:
{
  "mcpServers": {
    "fastapi-bitnet": {
      "url": "http://127.0.0.1:8080/mcp"
    }
  }
}
Note: The server must be running via Docker or Uvicorn on port 8080.
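If you prefer to patch the config file programmatically rather than edit it by hand, a minimal sketch (Claude Desktop keeps claude_desktop_config.json in an OS-specific directory, so the relative path below is a stand-in you must adjust):

```python
import json
from pathlib import Path

# Stand-in path; the real file lives in an OS-specific Claude Desktop directory.
config_path = Path("claude_desktop_config.json")

# Merge the entry without clobbering any other configured MCP servers.
config = json.loads(config_path.read_text()) if config_path.exists() else {}
config.setdefault("mcpServers", {})["fastapi-bitnet"] = {
    "url": "http://127.0.0.1:8080/mcp"
}
config_path.write_text(json.dumps(config, indent=2))
print(config["mcpServers"]["fastapi-bitnet"]["url"])
```

Using `setdefault` preserves existing entries under `mcpServers`, so re-running the snippet is safe.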
Supported hosts
Confirmed working with VS Code Copilot and Claude Desktop.
Quick install
docker run -d --name ai_container -p 8080:8080 fastapi_bitnet
Information
- Pricing: free
- Published: 4/25/2026