Building RAG Systems
Learn to ground LLMs in external knowledge using retrieval, chunking, and reranking.
Some steps have prerequisites outside this path; follow the prep links inside those steps before continuing.
- Step 1
Embeddings & Semantic Search
Interactive. Learn how embeddings turn text into vectors and enable semantic search by finding meaning-based similarity instead of keyword matches.
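The core idea of this step can be sketched with cosine similarity over toy vectors. The numbers below are hand-picked stand-ins; a real system would get them from a trained embedding model (e.g. a sentence-transformers encoder):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by the
    # product of their lengths; 1.0 means "same direction" (same meaning).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" -- stand-ins for model output.
docs = {
    "how to reset a password": [0.9, 0.1, 0.0],
    "recovering account access": [0.8, 0.2, 0.1],
    "best hiking trails nearby": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.15, 0.05]  # pretend embedding of "I forgot my login"

# Semantic search: rank documents by similarity of meaning, not by
# whether they share keywords with the query.
ranked = sorted(docs, key=lambda d: cosine_similarity(query_vec, docs[d]),
                reverse=True)
print(ranked[0])
```

Note that the top result shares no words with "I forgot my login"; the match comes entirely from vector similarity.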
intermediate · 9 min read · Prep recommended: Tokenization
- Step 2
Vector Databases & Approximate Nearest Neighbors (ANN)
Interactive. Learn what vector databases store, why nearest-neighbor search must be approximate at scale, and how ANN indexes (like HNSW and IVF) make retrieval fast.
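A minimal IVF-style sketch of why ANN search is approximate: vectors are bucketed by their nearest centroid, and a query probes only the closest bucket instead of scanning everything. The two centroids here are fixed by hand; real IVF learns them with k-means:

```python
import math

centroids = [[0.0, 0.0], [10.0, 10.0]]  # hand-picked; IVF would learn these
buckets = {0: [], 1: []}

def nearest_centroid(vec):
    return min(range(len(centroids)),
               key=lambda i: math.dist(vec, centroids[i]))

def add(vec, payload):
    # Index time: assign each vector to its nearest centroid's bucket.
    buckets[nearest_centroid(vec)].append((vec, payload))

def search(query, k=1):
    # Query time: probe only one bucket. This is the speed/recall
    # trade-off -- a vector that landed in another bucket can never be
    # returned, which is what makes the search approximate.
    candidates = buckets[nearest_centroid(query)]
    ranked = sorted(candidates, key=lambda vp: math.dist(query, vp[0]))
    return [payload for _, payload in ranked[:k]]

add([0.2, 0.1], "doc-a")
add([9.8, 10.1], "doc-b")
print(search([0.0, 0.3]))  # probes only bucket 0
```

Real indexes mitigate the recall loss by probing several buckets (`nprobe` in IVF terms) or by using graph-based structures like HNSW.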
intermediate · 10 min read
- Step 3
Retrieval-Augmented Generation (RAG)
Learn how RAG lets an LLM answer questions using relevant external documents fetched at query time.
intermediate · 8 min read
- Step 4
Chunking & Indexing Strategies for RAG
Interactive. Learn how to split documents into retrievable chunks, attach the right metadata, and index content so RAG retrieves the right context reliably.
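A minimal sketch of both ideas: fixed-size chunks with overlap (so a sentence cut at a boundary still appears whole in at least one chunk), and a metadata record per chunk so retrieval can filter and cite sources. Production chunkers usually split on tokens, sentences, or document structure rather than raw characters:

```python
def chunk_text(text, size=40, overlap=10):
    # Fixed-size character windows; consecutive chunks share `overlap`
    # characters so content at a cut point is never lost.
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last window already reached the end of the text
    return chunks

# Attach metadata to every chunk before indexing: the retriever can
# then filter by document and the generator can cite its source.
doc_id, title = "policy-7", "Return policy"  # illustrative values
text = ("Items may be returned within 30 days. "
        "Refunds go to the original payment method.")
records = [
    {"doc_id": doc_id, "title": title, "chunk_no": i, "text": c}
    for i, c in enumerate(chunk_text(text))
]
print(len(records), records[0])
```

Each record, not the bare string, is what gets embedded and stored in the vector index built in Step 2.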
intermediate · 11 min read
- Step 5
Reranking & Hybrid Retrieval
Interactive. Learn why two-stage retrieval and keyword+vector fusion improve relevance in real-world RAG systems.
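The keyword+vector fusion half of this step can be sketched with Reciprocal Rank Fusion (RRF), a standard way to merge ranked lists without having to calibrate their raw scores against each other; the input rankings below are illustrative placeholders:

```python
def rrf(rankings, k=60):
    # Reciprocal Rank Fusion: each ranked list contributes 1/(k + rank)
    # per document, and the contributions are summed. Documents that
    # rank well in *both* lists rise to the top; k=60 is the commonly
    # used damping constant.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_ranking = ["d3", "d1", "d2"]  # e.g. from BM25
vector_ranking = ["d1", "d2", "d3"]   # e.g. from ANN search (Step 2)
print(rrf([keyword_ranking, vector_ranking]))
```

In a two-stage setup, the fused list would then go to a heavier second-stage reranker (typically a cross-encoder) that rescores only these top candidates.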
intermediate · 10 min read
- Step 6
Context Windows & Prompt Budgeting
Interactive. Build a practical mental model for context limits and how to allocate tokens for better cost, speed, and answer quality.
intermediate · 9 min read
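The budgeting idea in the final step can be sketched as greedy packing: reserve tokens for the fixed prompt and the answer, then fill what remains with retrieved chunks, best first. Token counts here are assumed precomputed; real code would measure them with the model's tokenizer:

```python
def fit_chunks(ranked_chunks, context_limit, prompt_tokens, answer_reserve):
    # Budget = total context window minus the fixed prompt and the
    # space reserved for the model's answer. Chunks arrive best-first
    # (from the reranker in Step 5), so skipping one that doesn't fit
    # never displaces a better one.
    budget = context_limit - prompt_tokens - answer_reserve
    chosen, used = [], 0
    for text, tokens in ranked_chunks:
        if used + tokens > budget:
            continue  # too big for the remaining budget; try the next
        chosen.append(text)
        used += tokens
    return chosen

# Illustrative numbers: a 1200-token window, 200-token prompt scaffold,
# 300 tokens kept free for the answer.
ranked = [("chunk-a", 300), ("chunk-b", 500), ("chunk-c", 250)]
print(fit_chunks(ranked, context_limit=1200, prompt_tokens=200,
                 answer_reserve=300))
```

Note that `chunk-b` is skipped even though it outranks `chunk-c`: spending the whole budget on one chunk is usually worse than fitting two relevant ones, which is exactly the allocation trade-off this step examines.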