
Houtini LM
by houtini-ai
Offload bounded LLM tasks from Claude Code to local or cloud LLMs to save tokens and avoid rate limits.
What it does
Houtini LM connects Claude Code to local LLM servers (LM Studio, Ollama) or OpenAI-compatible cloud APIs (DeepSeek, Groq, Cerebras, OpenRouter). It allows Claude to delegate "grunt work"—like generating boilerplate, drafting commit messages, and performing code reviews—to cheaper or free models while keeping high-level architecture and planning on the frontier model.
Tools
- chat: General task offloading with planning triggers to nudge Claude into delegating work.
- custom_prompt: A three-part prompt (system, context, instruction) designed to reduce context bleed.
- code_task: Specialized tool for code analysis, bug finding, and test generation.
- code_task_files: Analyzes multiple files directly from disk without flooding the MCP client's context window.
- embed: Generates text embeddings via OpenAI-compatible endpoints.
- discover: Health check and real-time performance readout (tok/s and TTFT).
- list_models: Lists all available models on the server with detailed capability profiles.
- stats: Displays cumulative token savings and per-model performance history.
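Like all MCP tools, these are invoked by the client over JSON-RPC via a tools/call request. A minimal sketch of delegating a task to the chat tool — the argument names ("prompt", "model") and the model identifier are assumptions for illustration, not the tool's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "chat",
    "arguments": {
      "prompt": "Draft a commit message for the attached diff",
      "model": "qwen2.5-coder-7b"
    }
  }
}
```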
Installation
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "houtini-lm": {
      "command": "npx",
      "args": ["-y", "@houtini/lm"],
      "env": {
        "HOUTINI_LM_ENDPOINT_URL": "http://localhost:1234"
      }
    }
  }
}
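The endpoint above is LM Studio's default local server port (1234). To target Ollama instead, a sketch of the same env block pointing at Ollama's default port — whether Houtini LM expects the bare host or a /v1 path suffix here is an assumption to verify against the server's docs:

```json
"env": {
  "HOUTINI_LM_ENDPOINT_URL": "http://localhost:11434"
}
```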
Supported hosts
- claude
Quick install
npx -y @houtini/lm
Information
- Pricing: free
- Published