- RAG pipeline
- fine-tuning vs prompting
- reranking
- LLM evals
NLP / LLMs
LLMs, RAG, agents, and evals
Pretraining, fine-tuning, SFT, RLHF, DPO, LoRA, quantization, RAG, vector databases, reranking, tool use, agents, safety, and evals.
shell, backend needed later
RAG pipeline workbench with chunking, embeddings, reranking, hallucination tests, and cost/latency tradeoffs.
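As a rough illustration of what the workbench stages might look like, here is a minimal sketch of chunking, retrieval, and reranking in plain Python. Everything in it is an assumption made for illustration: the function names (`chunk_words`, `embed`, `retrieve`, `rerank`) are hypothetical, the "embedding" is a bag-of-words count rather than a learned vector, and the reranker is a simple term-coverage score standing in for a real cross-encoder.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=8, overlap=2):
    """Split text into overlapping word windows (a stand-in for real chunking)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy 'embedding': lowercase bag-of-words counts, not a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=3):
    """First stage: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def rerank(query, candidates):
    """Toy second stage: order by the fraction of query terms a chunk covers."""
    q_terms = set(embed(query))

    def coverage(chunk):
        if not q_terms:
            return 0.0
        return len(q_terms & set(embed(chunk))) / len(q_terms)

    return sorted(candidates, key=coverage, reverse=True)
```

A typical call would chunk a document with `chunk_words`, pass the chunks to `retrieve` with a small `k`, and then hand the survivors to `rerank`. Swapping `embed` or `rerank` for a real model is where the cost and latency tradeoffs would start to show.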
- What is the core job of "LLMs, RAG, agents, and evals"?
- Which common mistake would break a production implementation of this topic?
- Which inputs or limits must be validated before the interactive feature ships?
- What is the smallest test that proves the future implementation behaves correctly?
- When does this module really need backend compute, and when is a UI simulation enough?
- Start with one focused feature, not a full course inside one page.
- All public inputs must be typed, bounded, and covered by reject-case tests.
- If a model, dataset, or job is added, document source, license, limits, and fallback.
- The interaction must explain the topic rather than serve as decoration.
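The rule that public inputs be typed, bounded, and covered by reject-case tests can be sketched as follows. This is only one possible shape, under stated assumptions: `RetrievalRequest`, its fields, and the limits `MAX_QUERY_CHARS` and `MAX_TOP_K` are hypothetical names and values chosen for illustration, not part of any existing implementation.

```python
from dataclasses import dataclass

# Hypothetical limits; real values would come from the module's own budget.
MAX_QUERY_CHARS = 500
MAX_TOP_K = 20


@dataclass(frozen=True)
class RetrievalRequest:
    """A typed, bounded public input for a retrieval call."""

    query: str
    top_k: int = 5

    def __post_init__(self):
        if not isinstance(self.query, str):
            raise TypeError("query must be a string")
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if len(self.query) > MAX_QUERY_CHARS:
            raise ValueError(f"query longer than {MAX_QUERY_CHARS} characters")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.top_k, bool) or not isinstance(self.top_k, int):
            raise TypeError("top_k must be an int")
        if not 1 <= self.top_k <= MAX_TOP_K:
            raise ValueError(f"top_k must be between 1 and {MAX_TOP_K}")


# Inputs that must be refused; each one is a reject-case test.
REJECT_CASES = [
    {"query": ""},
    {"query": "   "},
    {"query": "x" * (MAX_QUERY_CHARS + 1)},
    {"query": 42},
    {"query": "ok", "top_k": 0},
    {"query": "ok", "top_k": MAX_TOP_K + 1},
    {"query": "ok", "top_k": True},
    {"query": "ok", "top_k": "3"},
]


def is_rejected(kwargs):
    """Return True if constructing a request from kwargs raises a validation error."""
    try:
        RetrievalRequest(**kwargs)
    except (TypeError, ValueError):
        return True
    return False
```

A test suite would then assert that `is_rejected` returns `True` for every entry in `REJECT_CASES` and `False` for at least one valid request, so that a loosened bound shows up as a failing test rather than a production surprise.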