YouTube Transcription Agent
by gilbertsahumada
MCP server that transcribes and summarizes YouTube videos using OpenAI Whisper, with x402 micropayment support for A2A calls.
What it does
This MCP server lets your AI assistant transcribe and summarize YouTube videos. It downloads audio via yt-dlp, runs it through OpenAI Whisper for transcription, and returns timestamped text or concise summaries. It also exposes an Agent-to-Agent (A2A) HTTP interface with x402 micropayments on Base Sepolia, so other AI agents can pay per request.
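As an illustration of the "timestamped text" step, here is a minimal sketch of how transcription segments could be rendered as readable lines. The `Segment` shape (start, end, text) and the helper names `formatTimestamp` and `formatTranscript` are assumptions made for this example; the server's actual output format is not documented here and may differ.

```typescript
// Hypothetical segment shape: start/end in seconds plus the spoken text.
// Modeled loosely on segment-level transcription output; the server's real types may differ.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Convert a number of seconds to HH:MM:SS (fractions are truncated, negatives clamp to 0).
function formatTimestamp(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const hh = Math.floor(s / 3600);
  const mm = Math.floor((s % 3600) / 60);
  const ss = s % 60;
  return [hh, mm, ss].map((n) => String(n).padStart(2, "0")).join(":");
}

// Render one line per segment in the form "[start - end] text".
function formatTranscript(segments: Segment[]): string {
  return segments
    .map(
      (seg) =>
        `[${formatTimestamp(seg.start)} - ${formatTimestamp(seg.end)}] ${seg.text.trim()}`,
    )
    .join("\n");
}

console.log(
  formatTranscript([
    { start: 0, end: 4.2, text: " Welcome to the video. " },
    { start: 4.2, end: 9.8, text: "Today we look at MCP servers." },
  ]),
);
```

Keeping the formatting separate from the download and transcription steps makes it easy to swap in another layout, such as SRT-style cues, without touching the rest of the pipeline.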
Tools
- transcribe_video — Downloads YouTube audio and transcribes it with timestamps using the Whisper API
- summarize_video — Transcribes the video and generates a summary with key points
- chat — General conversation with the agent
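For reference, an MCP host invokes tools like these by sending a JSON-RPC 2.0 `tools/call` request to the server. The sketch below builds such a request for `transcribe_video`. The argument name `url` and the helper `buildToolCall` are assumptions for illustration only; the tool's real input schema is whatever the server reports via `tools/list`.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "url" is an assumed argument name; VIDEO_ID is a placeholder, not a real video.
const request = buildToolCall(1, "transcribe_video", {
  url: "https://www.youtube.com/watch?v=VIDEO_ID",
});

// Over the stdio transport, each message is sent as a single line of JSON.
console.log(JSON.stringify(request));
```

In practice the host (for example Claude Desktop) builds and sends these messages itself, so a sketch like this is mainly useful for debugging the server by hand.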
Installation
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "youtube-transcriber": {
      "command": "npx",
      "args": ["tsx", "/path/to/src/mcp-server.ts"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
Requires system dependencies: ffmpeg and deno (for yt-dlp).
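Before starting the server, it can help to confirm that both system dependencies are reachable on your PATH. The following is a minimal sketch of such a preflight check, not part of the project itself; `isOnPath` and `findMissing` are hypothetical helper names, and the check simply runs each tool's version command.

```typescript
import { spawnSync } from "node:child_process";

// Returns true if `<command> <versionFlag>` can be started and exits with code 0.
function isOnPath(command: string, versionFlag: string): boolean {
  const result = spawnSync(command, [versionFlag], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

interface Dependency {
  command: string;
  versionFlag: string;
}

// Returns the commands that fail the check. The checker is injectable so the
// logic can be exercised without the tools being installed.
function findMissing(
  deps: Dependency[],
  check: (command: string, versionFlag: string) => boolean = isOnPath,
): string[] {
  return deps.filter((d) => !check(d.command, d.versionFlag)).map((d) => d.command);
}

const missing = findMissing([
  { command: "ffmpeg", versionFlag: "-version" },
  { command: "deno", versionFlag: "--version" },
]);

if (missing.length > 0) {
  console.error(`Missing system dependencies: ${missing.join(", ")}`);
} else {
  console.log("ffmpeg and deno are available.");
}
```

Saved to a file, the check can be run with `npx tsx`, the same runner the server uses.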
Supported hosts
Claude Desktop and Cursor, both over the MCP stdio interface (as confirmed in the README). The A2A interface is protocol-agnostic and intended for agent-to-agent use.
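For the pay-per-request A2A path, the general x402 pattern is that an unpaid request receives HTTP 402 with payment requirements, and the client then retries with a payment attached. The sketch below shows that retry loop only, under several assumptions: `callWithPayment` and `signPayment` are hypothetical names, the endpoint URL and request body are placeholders, and the `X-PAYMENT` header name is an assumption that may differ depending on the x402 version the server implements. Actual payment signing on Base Sepolia is deliberately left out.

```typescript
// Minimal response and fetch shapes so the sketch can be driven by a stub or by a
// thin wrapper around the global fetch.
interface MinimalResponse {
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<MinimalResponse>;

// Sends a request; if the server answers 402, asks signPayment (hypothetical) to turn
// the returned payment requirements into an encoded payment, then retries once.
async function callWithPayment(
  fetchFn: FetchLike,
  url: string,
  body: unknown,
  signPayment: (requirements: unknown) => Promise<string>,
  paymentHeader = "X-PAYMENT", // assumed header name; check the server's x402 version
): Promise<MinimalResponse> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  const payload = JSON.stringify(body);

  const first = await fetchFn(url, { method: "POST", headers, body: payload });
  if (first.status !== 402) {
    return first; // No payment required, or an unrelated error: hand it back as-is.
  }

  const requirements = await first.json();
  const payment = await signPayment(requirements);

  return fetchFn(url, {
    method: "POST",
    headers: { ...headers, [paymentHeader]: payment },
    body: payload,
  });
}
```

The function retries at most once, so a second 402 (for example, after a rejected payment) is returned to the caller rather than triggering another payment attempt.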
Quick install
npx tsx src/mcp-server.ts
Information
- Pricing
- freemium
- Published
- 4/10/2026
- Updated







