Detailed Calendar
Required and additional readings, updated (bi)weekly. Additional readings are optional.
Introduction to Language Models
Lecture 1: Introduction Aug 21
Required Readings
- None
Lecture 2: n-gram LMs I Aug 23
Required Readings
- Jurafsky and Martin, Chap 3.1-3.3
Lecture 3: n-gram LMs II Aug 28
Required Readings
- Jurafsky and Martin, Chap 3.4-3.7
Additional Readings
- Mitchell, Chap 2, Estimating Probabilities
Early Neural Language Models
Lecture 4: Word Embeddings Aug 30
Required Readings
- Jurafsky and Martin, Chap 6.1-6.7
Lecture 5: Word Embeddings II Sep 6
Required Readings
- Jurafsky and Martin, Chap 6.8-6.12
Additional Readings
- Mikolov et al., ICLR 2013. Efficient Estimation of Word Representations in Vector Space
- Mikolov et al., NeurIPS 2013. Distributed Representations of Words and Phrases and their Compositionality
- Jay Alammar. The Illustrated Word2vec
Lecture 6: Logistic Regression I Sep 11
Required Readings
- Jurafsky and Martin, Chap 5
Lecture 7: Logistic Regression II Sep 13
Required Readings
- Jurafsky and Martin, Chap 5
Lecture 8: Feedforward Neural Network Language Models Sep 18
Required Readings
- Jurafsky and Martin, Chap 7
Lecture 9: Recurrent Neural Network Language Models Sep 20
Required Readings
- Jurafsky and Martin, Chap 9.1-9.2
Modern Neural Language Models
Lecture 10: Sequence-to-Sequence and Attention Sep 25
Required Readings
- Jurafsky and Martin, Chap 9.3.2-9.3.3; 9.7-9.8
Lecture 11: Transformer Building Blocks Sep 27
Required Readings
- Jurafsky and Martin, Chap 10.1
Lecture 12: Invited Lecture - Language Grounding by Jesse Thomason Oct 2
Lecture 13: PyTorch for Transformers Oct 4
Additional Readings
- Iyyer, CS685 Spring 2023: Tokenization
Lecture 14: Transformer Building Blocks II Oct 16
Required Readings
- Jurafsky and Martin, Chap 10.2
Large Language Models
Lecture 15: Pre-training Transformers Oct 18
Required Readings
- Jurafsky and Martin, Chap 11.1-11.2
Lecture 16: Pre-training Transformers II Oct 23
Required Readings
- Jurafsky and Martin, Chap 11.3
Lecture 17: Generating from Language Models Oct 25
Required Readings
- Jurafsky and Martin, Chap 10.4
Lecture 18: Generating from Language Models II Oct 30
Required Readings
- Jurafsky and Martin, Chap 13.5.2
Additional Readings
- Holtzman et al., ICLR 2020. The Curious Case of Neural Text Degeneration
- WordPiece Modeling
Lecture 19: Generating from Language Models III Nov 1
Additional Readings
Lecture 20: LLMs: Limitations and Harms Nov 6
Additional Readings
Lecture 21: RLHF Nov 8
Additional Readings
- Chip Huyen’s blog post on RLHF: a good balance of humor and technical detail, with many references for further reading.
- HuggingFace Blog Post: Illustrating RLHF by Nathan Lambert et al.: focuses mainly on the RLHF algorithm itself, providing a brief history of RL, the seminal work that led to RLHF, and practical tools for applying it.
- Argilla Blog Post: Finetuning an LLM: RLHF and alternatives
- Yoav Goldberg’s post: Hypotheses on why RLHF works.
- Proximal Policy Optimization (PPO): The Key to LLM Alignment: more detail on the PPO algorithm and how it improves on earlier RL algorithms.