From Zero-Shot to Expert: A Deep Dive into Retrieval Domain Adaptation
Overview
Part 1: The Foundations of Relevance
- Information Retrieval (IR) Fundamentals
- Lexical vs. Semantic Search
- Embeddings: The Language of Meaning
Part 2: The Architectural Spectrum of Modern Retrieval
- Bi-Encoders: The Foundation of Scalable Semantic Search
- Cross-Encoders: The Gold Standard for Precision Reranking
- A Deep Dive into BM25: The Science of Statistical Relevance
- SPLADE: Teaching a Lexical Retriever to Think Semantically
- ColBERT: Fine-Grained Interaction for Semantic Precision
Part 3: Assembling the State-of-the-Art Pipeline
- The Hybrid Retriever: A Spectrum of Understanding
- Reciprocal Rank Fusion (RRF): The Democratic Ensemble
- The Modern Retriever: A Visual Guide
Part 4: The Art and Science of Model Training
- Key IR Evaluation Metrics
- The BEIR Framework for Reproducible Research
- The Need for Domain Adaptation
- The Engine of DAPT: A Deep Dive into Masked Language Modeling
- Training the 'Teacher': Supervised Fine-Tuning for the Cross-Encoder
- The Art of Mimicry: The Mechanics of Knowledge Distillation for Retrieval