
from claude-code-toolkit
Practical guide to building production Retrieval-Augmented Generation (RAG) systems: vector DB selection, chunking strategies, embedding model choices, retrieval optimization, and production deployment practices.
Provides a hands-on, production-focused blueprint for building Retrieval-Augmented Generation (RAG) systems. Covers selecting and configuring vector databases (Qdrant, Pinecone, Chroma, Weaviate, Milvus), chunking strategies (fixed, semantic, hierarchical, sliding window), embedding model trade-offs (OpenAI, Sentence Transformers, Cohere), retrieval optimizations (hybrid search, reranking, metadata filtering), and production practices like caching, async ingestion, and monitoring. Includes code snippets and decision trees to guide practical implementation and deployment.
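To give a flavor of the chunking strategies mentioned above, here is a minimal sliding-window chunker sketch; the character-based window, chunk size, and overlap are illustrative assumptions (production pipelines typically chunk on tokens or sentence boundaries instead):

```python
def sliding_window_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    chunk_size and overlap are illustrative defaults, not recommendations;
    real systems usually measure size in tokens and respect sentence
    boundaries so that chunks stay semantically coherent.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        # Stop once the window has reached the end of the text,
        # so we don't emit a redundant tail-only fragment.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap ensures that a sentence falling on a chunk boundary still appears whole in at least one chunk, which tends to improve recall at the cost of some index redundancy.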
Use this skill when you are designing or debugging a semantic search / RAG pipeline: choosing a vector DB, deciding chunking and embedding strategies, optimizing retrieval quality, implementing hybrid dense+sparse search, or building production ingestion and monitoring. It's aimed at engineers building search, Q&A, or assistant systems that need reliable, scalable retrieval.
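As one common way to combine dense and sparse result lists in the hybrid search mentioned above, a minimal reciprocal rank fusion (RRF) sketch is shown below; the function name and the k=60 constant are conventional choices, not something this skill prescribes:

```python
def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids (e.g. dense + BM25) via RRF.

    Each document's fused score is the sum of 1 / (k + rank) over every
    list it appears in; k=60 is the customary constant and worth tuning.
    """
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

RRF only needs ranks, not raw scores, so it sidesteps the problem of dense cosine similarities and sparse BM25 scores living on incompatible scales.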
Best suited for code-focused and engineering agent runtimes that can run Python snippets and interact with vector DBs (Claude Code, Copilot/Codex-style agents, other code-capable assistants).
This skill has not been reviewed by our automated audit pipeline yet.