CSCI499 Spring 2024: Language Models in NLP

🌸 Spring 2024     ⏰ Mon / Wed 4:00 - 5:50p     📍 DMC 260

Instructor: Swabha Swayamdipta

Office Hours: Mondays 2-3pm; SAL 238

Teaching Assistant: Sayan Ghosh

Office Hours: Wed 2-3pm; Location RTH 4th Floor Lobby

Course Producer: Xinyue Cui


Apr 10: Week 14

  • Quiz 5 graded sheet distributed in class.

Apr 1: Week 13

  • Quiz 5 conducted in class.

Mar 18: Week 11

  • Graded Progress Reports and Quiz 4 sheets now available.
  • No instructor office hours on 3/25; extra OH on request via email.

Mar 4: Week 9

  • Instructor Office Hours are now on Mondays from 2-3 pm.

Feb 26: Week 8

  • Quiz 3 was conducted in class.

Feb 21: Week 7

  • Project Proposals and Quiz 2 are graded.

Feb 12: Week 6

  • Quiz 2 was conducted in class.

Feb 5: Week 5

  • Graded Quiz 1 sheets were distributed in class; the rest can be collected in later classes or at TA office hours.

Jan 29: Week 4

  • Quiz 1 was completed in class.

Jan 24: Week 3

  • Project pitches were done in class and teammate selection details are now on Piazza.

Jan 17: Week 2

  • HW1 Released; due 1/31
  • Project Pitches on 1/24

Jan 10: Week 1

  • Canceling class due to a personal emergency


Language models have been the talk of the town ever since OpenAI’s ChatGPT became available to all! However, language models have been studied in Natural Language Processing for decades, and only recently has NLP been revolutionized by large-scale language models achieving state-of-the-art performance across a wide variety of tasks. But what is truly behind this seemingly fantastical technology? This course will cover the fundamentals of modern language modeling and how language models have grown into the behemoths they are today. Students will gain familiarity with how LMs are constructed and the model architectures underlying them, and will get hands-on experience building and evaluating small-scale LMs. The class will also explore the real-world consequences of deploying large-scale LMs, such as the ethics and harms associated with them.

Calendar + Syllabus

Also see the Detailed Calendar for required and additional readings for all lectures.

Introduction to Language Models

Jan 08
Introduction and Course Overview   Slides
No Additional Readings
Jan 10
Class Cancelled
Jan 15
No Class   MLK Day
Jan 17
n-gram Language Models I   Slides
HW1 Released
Jan 22
n-gram Language Models II   Slides
Jan 24
Project Pitches

Early Neural Language Models

Jan 29
Logistic Regression   Slides
Jan 31
Logistic Regression II   Slides
HW1 Due
Feb 5
Word Embeddings I   Slides
Feb 7
Word Embeddings II   Slides
HW2 Released; Proposal Due
Feb 12
Feedforward Neural Nets and Backprop   Slides
Feb 14
Recurrent Neural Network LMs   Slides

Modern Neural Language Models

Feb 19
No Class   Presidents’ Day
Feb 21
Sequence-To-Sequence and Attention   Slides
Feb 26
Transformers - Building Blocks I   Slides
HW2 Due
Feb 28
Project Discussions
Mar 4
Transformers - Building Blocks II  Slides
HW3 Released; Progress Report Due
Mar 6
TA Lecture: PyTorch for Transformers   Colab
Mar 11
No Class   Spring Break
Mar 13
No Class   Spring Break

Large Language Models (LLMs)

Mar 18
Pre-training Transformers I   Slides
Mar 20
Pre-training Transformers II   Slides
HW3 Due; HW4 Released
Mar 25
Guest Lecture: Limitations and Harms of LLMs   Slides
Mar 27
Generating from Language Models I   Slides
Apr 1
Generating from Language Models II   Slides
Apr 3
Guest Lecture: Prompting   Slides
HW4 Due
Apr 8
Project Discussions
Apr 10
Guest Lecture: Aligning LLMs   Slides

Outro and Project Presentations

Apr 15
Putting it all together  
No Additional Readings
Apr 17
Project Presentations
Apr 22
Project Presentations
Apr 24
Project Presentations
Apr 29
No Class   Study Week
May 1
Project Final Report
Due by 6:30pm PT at the latest

The calendar and prespecified syllabus are subject to change. More details, e.g. reading materials and additional resources, will be added as the semester continues. All work (except the project final report) is due on the specified date by 11:59 PM PT.


There will be three components to course grades; see more details.

Students are allowed a maximum of 6 late days in total across all assignments, with a maximum of 3 late days per deliverable. No late days are allowed for quizzes.

Note: Please familiarize yourself with the academic policies and read the note about student well-being.


Students are required to have taken CSCI-270 Introduction to Algorithms and Theory of Computing (4.0 units), as well as one of CSCI-360 Introduction to AI or CSCI-467 Introduction to Machine Learning (or to have equivalent experience). Fluency with Python programming is recommended. Please email the instructor about special circumstances or for specific clarifications.

Previous Iterations