
from claude-code-toolkit60
Practical guide to building production Retrieval-Augmented Generation (RAG) systems: vector DB selection, chunking strategies, embedding model choices, retrieva
Provides a hands-on, production-focused blueprint for building Retrieval-Augmented Generation (RAG) systems. Covers selecting and configuring vector databases (Qdrant, Pinecone, Chroma, Weaviate, Milvus), chunking strategies (fixed, semantic, hierarchical, sliding window), embedding model trade-offs (OpenAI, Sentence Transformers, Cohere), retrieval optimizations (hybrid search, reranking, metadata filtering), and production practices like caching, async ingestion, and monitoring. Includes code snippets and decision trees to guide practical implementation and deployment.
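As an illustration of the sliding-window chunking strategy named above, here is a minimal sketch. It is not taken from the skill itself: the function name `sliding_window_chunks`, the word-based splitting, and the default `size` and `overlap` values are all assumptions chosen for the example. Production pipelines often count tokens rather than words.

```python
from typing import List


def sliding_window_chunks(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into chunks of up to `size` words, with consecutive
    chunks sharing `overlap` words.

    Hypothetical helper for illustration; words stand in for tokens.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    words = text.split()
    step = size - overlap  # how far the window advances each iteration
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the window has reached the end of the text, so the
        # tail is not emitted again as a shorter, redundant chunk.
        if start + size >= len(words):
            break
    return chunks
```

The overlap trades index size for recall: a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.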
Use this skill when you are designing or debugging a semantic search / RAG pipeline: choosing a vector DB, deciding chunking and embedding strategies, optimizing retrieval quality, implementing hybrid dense+sparse search, or building production ingestion and monitoring. It's aimed at engineers building search, Q&A, or assistant systems that need reliable, scalable retrieval.
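One common way to combine dense and sparse results in hybrid search is reciprocal rank fusion (RRF). The sketch below assumes that technique; the listing does not say which fusion method the skill uses. The function name and the sample document IDs are hypothetical, and `k=60` is a conventional default rather than a value from the skill.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1); documents are returned by descending total score,
    with ties broken alphabetically for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense (vector) and a sparse (keyword) retriever.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Because RRF uses only ranks, it avoids having to normalize cosine similarities against keyword scores, which live on different scales.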
Best suited for code-focused and engineering agent runtimes that can run Python snippets and interact with vector DBs (Claude Code, Copilot/Codex-style agents, other code-capable assistants).
Comprehensive RAG implementation reference guide covering vector DB selection, chunking strategies, embedding models, retrieval optimization, and production patterns. No bundled scripts to test. SKILL.md is well-written with practical code examples and decision trees, though everything is in a single monolithic file with no scripts/ or references/ separation.
Clean reference skill with no security concerns. Well-written content covering the full RAG stack. Main architectural weakness is monolithic structure — everything inline with no scripts/ or references/ directories. Code examples are practical and include good docstrings.