CSCI499 Spring 2024: Language Models in NLP

Name: Language Models in NLP
Author: Swabha Swayamdipta

🌸 Spring 2024 ⏰ Mon / Wed 4:00 - 5:50 pm 📍 DMC 260

Instructor: Swabha Swayamdipta

swabhas@usc.edu

Office Hours: Mondays 2-3pm; SAL 238

Teaching Assistant: Sayan Ghosh

ghoshsay@usc.edu

Office Hours: Wed 2-3pm; Location RTH 4th Floor Lobby

Course Producer: Xinyue Cui

xinyuecu@usc.edu

Announcements

Apr 15: Week 15

Quiz 6 conducted in class.

Apr 10: Week 14

Quiz 5 graded sheet distributed in class.

Apr 1: Week 13

Quiz 5 conducted in class.

Mar 18: Week 11

Graded Progress Reports and Quiz 4 sheets now available.
No instructor office hours on 3/25; extra OH on request via email.

Mar 4: Week 9

Instructor Office Hours are now on Mondays from 2-3 pm.

Feb 26: Week 8

Quiz 3 was conducted in class.

Feb 21: Week 7

Project Proposals and Quiz 2 are graded.

Feb 12: Week 6

Quiz 2 was conducted in class.

Feb 5: Week 5

Graded Quiz 1 sheets were distributed in class; remaining sheets can be collected in later classes or at TA office hours.

Jan 29: Week 4

Quiz 1 was completed in class.

Jan 24: Week 3

Project pitches were done in class and teammate selection details are now on Piazza.

Jan 17: Week 2

HW1 Released; due 1/31
Project Pitches on 1/24

Jan 10: Week 1

Class canceled due to a personal emergency.

Summary

Language models have been the talk of the town ever since OpenAI’s ChatGPT became available to all! However, language models have been studied in Natural Language Processing (NLP) for decades, even though the field has only recently been revolutionized by large-scale language models (LMs) that achieve state-of-the-art performance across a wide variety of tasks. But what is truly behind this seemingly fantastical technology? This course will cover the fundamentals of modern language modeling and how language models have grown into the behemoths they are today. Students will gain familiarity with how LMs are constructed and the model architectures underlying them, and will get hands-on experience building and evaluating small-scale LMs. The class will also explore in detail the real-world consequences of deploying large-scale LMs, such as the ethics and harms associated with them.

Calendar + Syllabus

Also see the Detailed Calendar for required and additional readings for all lectures.

Introduction to Language Models

Jan 08

Introduction and Course Overview Slides: No Additional Readings

~~Jan 10~~

Class Cancelled

~~Jan 15~~

No Class MLK Day

Jan 17

n-gram Language Models I Slides: HW1 Released

Jan 22

n-gram Language Models II Slides

Jan 24

Project Pitches

Early Neural Language Models

Jan 29

Logistic Regression Slides

Jan 31

Logistic Regression II Slides: HW1 Due

Feb 5

Word Embeddings I Slides

Feb 7

Word Embeddings II Slides: HW2 Released; Proposal Due

Feb 12

Feedforward Neural Nets and Backprop Slides

Feb 14

Recurrent Neural Network LMs Slides

Modern Neural Language Models

~~Feb 19~~

No Class President’s Day

Feb 21

Sequence-To-Sequence and Attention Slides

Feb 26

Transformers - Building Blocks I Slides: HW2 Due

Feb 28

Project Discussions

Mar 4

Transformers - Building Blocks II Slides: HW3 Released; Progress Report Due

Mar 6

TA Lecture: PyTorch for Transformers Colab

~~Mar 11~~

No Class Spring Break

~~Mar 13~~

No Class Spring Break

Large Language Models (LLMs)

Mar 18

Pre-training Transformers I Slides

Mar 20

Pre-training Transformers II Slides: HW3 Due; HW4 Released

Mar 25

Guest Lecture: Limitations and Harms of LLMs Slides

Mar 27

Generating from Language Models I Slides

Apr 1

Generating from Language Models II Slides

Apr 3

Guest Lecture: Prompting Slides: HW4 Due

Apr 8

Project Discussions

Apr 10

Guest Lecture: Aligning LLMs Slides

Apr 15

Advanced Topics + Putting it all together Slides

Project Presentations

Apr 17

Project Presentations

Apr 22

Project Presentations

Apr 24

Project Presentations

~~Apr 29~~

No Class Study Week

May 1

Project Final Report: Due by 6:30 pm PT at the latest

The calendar and syllabus are subject to change. More details, e.g., reading materials and additional resources, will be added as the semester continues. All work (except the project final report) is due on the specified date by 11:59 PM PT.

Assignments

There will be three components to course grades; see more details.

Homeworks (40%).
Quizzes + Class Participation (20%).
Class Project (40%).

Students are allowed a maximum of 6 late days in total across all assignments, with a maximum of 3 late days per deliverable. No late days are allowed for quizzes.
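As a rough illustration of how the two late-day caps interact, here is a minimal Python sketch. It is not an official course tool: the function name, the dictionary layout, and the convention of recognizing quizzes by a name starting with "quiz" are all hypothetical, and the instructor's actual accounting may differ.

```python
# Illustrative only: names and the quiz-naming convention are assumptions.
MAX_TOTAL_LATE_DAYS = 6            # cap across all assignments
MAX_LATE_DAYS_PER_DELIVERABLE = 3  # cap for any single deliverable


def late_days_ok(late_days_by_deliverable):
    """Return True if the given late-day usage respects both caps.

    `late_days_by_deliverable` maps a deliverable name (e.g. "HW1")
    to the number of late days used on it.
    """
    for name, days in late_days_by_deliverable.items():
        if days < 0:
            raise ValueError(f"negative late days for {name!r}")
        # Quizzes get no late days at all (assumed identified by name).
        if name.lower().startswith("quiz") and days > 0:
            return False
        # No single deliverable may use more than 3 late days.
        if days > MAX_LATE_DAYS_PER_DELIVERABLE:
            return False
    # The total across all deliverables may not exceed 6 late days.
    return sum(late_days_by_deliverable.values()) <= MAX_TOTAL_LATE_DAYS
```

For example, using 3 late days on each of two homeworks stays within both caps, whereas 4 late days on a single homework, or 7 late days in total, would not.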

Note: Please familiarize yourself with the academic policies and read the note about student well-being.

Pre-Requisites

Students are required to have taken CSCI-270 Introduction to Algorithms and Theory of Computing (4.0 units), as well as one of CSCI-360 Introduction to AI or CSCI-467 Introduction to Machine Learning (or have equivalent experience). Fluency with Python programming is recommended. Please email the instructor about special circumstances or for specific clarifications.

Previous Iterations

Fall 2023
- See previous class projects here.