CSCI499 Spring 2024: Language Models in NLP
🌸 Spring 2024 ⏰ Mon / Wed 4:00 - 5:50 pm 📍 DMC 260


Teaching Assistant: Sayan Ghosh
Office Hours: Wed 2-3 pm; Location: RTH 4th Floor Lobby

Course Producer: Xinyue Cui
Announcements
Apr 15: Week 15
- Quiz 6 conducted in class.
Apr 10: Week 14
- Quiz 5 graded sheet distributed in class.
Apr 1: Week 13
- Quiz 5 conducted in class.
Mar 18: Week 11
- Graded Progress Reports and Quiz 4 sheets now available.
- No instructor office hours on 3/25; extra OH on request via email.
Mar 4: Week 9
- Instructor Office Hours are now on Mondays from 2-3 pm.
Feb 26: Week 8
- Quiz 3 was conducted in class.
Feb 21: Week 7
- Project Proposals and Quiz 2 are graded.
Feb 12: Week 6
- Quiz 2 was conducted in class.
Feb 5: Week 5
- Graded Quiz 1 sheets were distributed in class; the remaining sheets can be collected in later classes or at TA office hours.
Jan 29: Week 4
- Quiz 1 was completed in class.
Jan 24: Week 3
- Project pitches were done in class and teammate selection details are now on Piazza.
Jan 17: Week 2
- HW1 Released; due 1/31
- Project Pitches on 1/24
Jan 10: Week 1
- Canceling class due to a personal emergency
Summary
Language models have been the talk of the town ever since OpenAI’s ChatGPT became available to all! Yet language models have been studied in Natural Language Processing for decades, even though NLP has only recently been revolutionized by large-scale language models that achieve state-of-the-art performance across a wide variety of tasks. But what is truly behind this seemingly fantastical technology? This course will cover the fundamentals of modern language modeling and how language models have grown into the behemoths they are today. Students will gain familiarity with how LMs are constructed and with the model architectures underlying them, and will get hands-on experience building and evaluating small-scale LMs. The class will also explore the real-world consequences of deploying large-scale LMs, such as the ethical concerns and harms associated with them.
Calendar + Syllabus
Also see the Detailed Calendar for required and additional readings for all lectures.
Introduction to Language Models
Early Neural Language Models
Modern Neural Language Models
Feb 19 - No Class: President’s Day
- Feb 21
- Sequence-To-Sequence and Attention Slides
- Feb 26
- Transformers - Building Blocks I Slides
- HW2 Due
- Feb 28
- Project Discussions
- Mar 4
- Transformers - Building Blocks II Slides
- HW3 Released; Progress Report Due
- Mar 6
- TA Lecture: PyTorch for Transformers Colab
Mar 11 - No Class: Spring Break
Mar 13 - No Class: Spring Break
Large Language Models (LLMs)
- Mar 18
- Pre-training Transformers I Slides
- Mar 20
- Pre-training Transformers II Slides
- HW3 Due; HW4 Released
- Mar 25
- Guest Lecture: Limitations and Harms of LLMs Slides
- Mar 27
- Generating from Language Models I Slides
- Apr 1
- Generating from Language Models II Slides
- Apr 3
- Guest Lecture: Prompting Slides
- HW4 Due
- Apr 8
- Project Discussions
- Apr 10
- Guest Lecture: Aligning LLMs Slides
- Apr 15
- Advanced Topics + Putting it all together Slides
Project Presentations
- Apr 17
- Project Presentations
- Apr 22
- Project Presentations
- Apr 24
- Project Presentations
Apr 29 - No Class: Study Week
- May 1
- Project Final Report
- Due no later than 6:30 pm PT
The calendar and syllabus are subject to change. More details, e.g., reading materials and additional resources, will be added as the semester continues. All work (except the project final report) is due on the specified date by 11:59 PM PT.
Assignments
Course grades have three components; see more details.
- Homeworks (40%).
- Quizzes + Class Participation (20%).
- Class Project (40%).
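As a rough illustration of how the three weights above combine, here is a minimal sketch in Python. It assumes each component has already been reduced to a single score on a 0-100 scale; the function name and the dictionary keys are hypothetical, and the way individual homeworks, quizzes, and participation are aggregated within a component is not specified here.

```python
# Weights taken from the list above; the key names are illustrative only.
WEIGHTS = {
    "homeworks": 0.40,
    "quizzes_participation": 0.20,
    "project": 0.40,
}


def weighted_course_score(component_scores):
    """Combine per-component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    # Weighted sum over the three components.
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {"homeworks": 90, "quizzes_participation": 80, "project": 85}
    print(weighted_course_score(example))
```

With these example scores the components contribute 36, 16, and 34 points respectively.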
Students are allowed a maximum of 6 late days in total across all assignments, with a maximum of 3 late days per deliverable. No late days are allowed for quizzes.
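To check a plan against the late-day policy, the two limits can be written as a small Python sketch. The function name and the convention of identifying quizzes by a name starting with "quiz" are assumptions made for illustration; what happens when a limit is exceeded is not stated here, so the sketch only reports which limits a plan would break.

```python
MAX_TOTAL_LATE_DAYS = 6      # total across all assignments, per the policy
MAX_LATE_DAYS_PER_ITEM = 3   # per deliverable, per the policy


def check_late_days(late_days_used):
    """Return a list of policy problems for a {deliverable: late days} mapping."""
    problems = []
    for name, days in late_days_used.items():
        if days < 0:
            raise ValueError(f"{name}: late days cannot be negative")
        # Assumption: quizzes are recognized by their name.
        if name.lower().startswith("quiz"):
            if days > 0:
                problems.append(f"{name}: quizzes cannot use late days")
        elif days > MAX_LATE_DAYS_PER_ITEM:
            problems.append(
                f"{name}: {days} late days exceeds the per-deliverable limit "
                f"of {MAX_LATE_DAYS_PER_ITEM}"
            )
    total = sum(late_days_used.values())
    if total > MAX_TOTAL_LATE_DAYS:
        problems.append(
            f"total of {total} late days exceeds the limit of {MAX_TOTAL_LATE_DAYS}"
        )
    return problems


if __name__ == "__main__":
    print(check_late_days({"HW1": 2, "HW2": 3}))            # within both limits
    print(check_late_days({"HW1": 3, "HW2": 3, "HW3": 1}))  # over the total limit
```

An empty list means the plan stays within both limits.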
Note: Please familiarize yourself with the academic policies and read the note about student well-being.
Pre-Requisites
Students are required to have taken CSCI-270 Introduction to Algorithms and Theory of Computing (4.0 units), as well as either CSCI-360 Introduction to AI or CSCI-467 Introduction to Machine Learning, or to have equivalent experience. Fluency with Python programming is recommended. Please email the instructor about special circumstances or for specific clarifications.