
CSCI 544 Fall 2024: Applied NLP

🍂 Fall 2024     ⏰ Tue / Thu 4:00 - 5:50p     📍 SAL 101

Instructor: Swabha Swayamdipta

swabhas@usc.edu

Office Hours: Wednesdays 8-9 AM @ Zoom (link on Brightspace)

Teaching Assistant: Abel Salinas Jr.

abelsali@usc.edu

Office Hours: Thursdays 3-4 PM @ SAL 213

Teaching Assistant: Brihi Joshi

brihijos@usc.edu

Office Hours: Tuesdays 10-11 AM @ RTH 314

Teaching Assistant: Ting-Yun (Charlotte) Chang

tingyun@usc.edu

Office Hours: Thursdays 10-11 AM @ PHE 102

Teaching Assistant: Matt Finlayson

mfinlays@usc.edu

Office Hours: Mondays 2-3 PM @ SAL 322

Teaching Assistant: Sayan Ghosh

ghoshsay@usc.edu

Office Hours: Fridays 11-12 PM @ RTH 420

Teaching Assistant: Ziyi Liu

zliu2803@usc.edu

Office Hours: Wednesdays 3-4 PM @ PHE 101

Announcements

See Brightspace.

Summary

This course covers both fundamental and cutting-edge topics in Natural Language Processing (NLP), with a focus on language models. NLP has been revolutionized by large-scale language models, which achieve state-of-the-art performance across a wide variety of tasks. This course will cover the fundamentals of language modeling and related topics in natural language processing, deep learning and machine learning. Students will gain familiarity with the capabilities of large language models as well as hands-on experience building and evaluating small-scale language models. The class will also explore the real-world consequences of deploying language models, including the ethical concerns and harms associated with them.

Calendar + Syllabus

Week | Date | Class Topics | Readings | Work Due
1 | Aug 27 | Introduction to LMs and Course Overview | |
  | Aug 29 | n-gram Models | J&M, Chap 3 |
2 | Sep 3 | n-gram Language Models (Smoothing) + Logistic Regression | J&M, Chap 3 | HW1 Released
  | Sep 5 | Logistic Regression (contd.) | J&M, Chap 5 | Quiz 1
3 | Sep 10 | Word Embeddings | J&M, Chap 6 | Group Formation Deadline
  | Sep 12 | Word Embeddings (contd.) | J&M, Chap 6; Additional: word2vec Explained |
4 | Sep 17 | Feedforward Neural Nets | J&M, Chap 7 | Quiz 2
  | Sep 19 | Backpropagation | J&M, Chap 7 | HW1 Due; HW2 Released
5 | Sep 24 | Recurrent Neural Networks | J&M, Chap 8 |
  | Sep 26 | Seq2Seq and Attention | J&M, Chap 8 | Project Proposal Due
6 | Oct 1 | Transformers - Building Blocks | J&M, Chap 9 |
  | Oct 3 | Transformers (contd.) | J&M, Chap 9 | Quiz 3; Mid-Semester Evaluation
7 | Oct 8 | Guest Lecture - PyTorch for Transformers | | HW2 Due; HW3 Released
  | Oct 10 | Fall Break | |
8 | Oct 15 | Midterm Exam | |
  | Oct 17 | Pre-training and Finetuning Transformers | J&M, Chaps 10, 11 |
9 | Oct 22 | Tokenization and Generating from LMs | J&M, Chaps 2.5, 13 | HW3 Due; HW4 Released
  | Oct 24 | Language Generation | J&M, Chap 13 |
10 | Oct 29 | Large Language Models - Pre-Training | J&M, Chaps 10, 12 | Project Status Report Due
  | Oct 31 | Large Language Models - Post-Training + Paper Presentations I | J&M, Chap 12 |
11 | Nov 5 | LLMs - Preference Tuning + Harms + Paper Presentations II | J&M, Chap 12 |
  | Nov 7 | Paper Presentations III | | Quiz 4
12 | Nov 12 | Paper Presentations IV | | Quiz 5
  | Nov 14 | Guest Lecture on Pretraining LLMs by Prof. Neiswanger | | HW4 Due
13 | Nov 19 | Project Presentations I | |
  | Nov 21 | Project Presentations II | |
14 | Nov 26 | Project Presentations III | |
  | Nov 28 | Thanksgiving | |
15 | Dec 3 | Project Presentations IV | |
  | Dec 5 | Final Exam | |
16 | Dec 10 | Study Week | |
  | Dec 12 | Study Week | |
17 | Dec 17 | | | Project Final Report due by 6:30pm

The calendar and syllabus are subject to change. More details, e.g. reading materials and additional resources, will be added as the semester continues. All work (except the project final report) is due on the specified date by 11:59 PM PT.

Assignments and Grading

There will be five components to course grades:

  • Homeworks (20%).
    • 5% X 4: There will be four coding homework assignments based on the topics of the class.
  • Quizzes (10%).
    • 2% X 5: Multiple-Choice Questions and Short Answers. Missed quizzes will receive a zero grade, and there will be no make-up quizzes.
  • Class Projects (40%).
    • Each student will do a group class project based on the topics covered in the class. Students will propose their own project, do the research and build a proof-of-concept, create a video demonstration of the proof-of-concept, and present the project in their report.
    • Proposal: 4%
    • Status Reports: 8%
    • Project Presentation: 10%
    • Final Write-up: 18%
  • Paper Presentations (5%).
    • The project teams will present a scientific publication related to their project to the class.
    • All members of the team are expected to identify the central points of the research, and present that research to the class, as well as answer questions from the instructor, TAs and fellow students.
    • One member of the team, randomly picked by the instructors a couple of hours before the presentation, will be the presenter, so please prepare accordingly!
    • The presenter is responsible for the entire team’s grade, so please ensure both you and your teammates are prepared!
    • The total time of each team’s presentation is 5 minutes (3 min presentation + 2 min Q&A) - we will be very strict about this.
    • If you are NOT presenting, you can participate in the Q&A - bonus points will be awarded to folks who ask insightful questions (announce your name before you ask a question).
    • Each team will prepare 3 slides (via Google Slides) to be shared with their assigned TAs by 11:59 PM the day before the presentation. Failure to share the slides on time will result in a grade penalty.
    • Content of the slides:
      • Slide 1: Main Research Question in the paper,
      • Slide 2: Main Results Summarized,
      • Slide 3: How this influences your project.
  • Exams (25%).
    • Midterm (10%): The midterm exam will contain a mixture of multiple-choice and long-form questions, covering roughly the first half of the course material.
    • Final (15%): The final exam, held at the end of the semester, will contain a mixture of multiple-choice and long-form questions and will cover all of the course material.
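
As a rough illustration of how the component weights listed above combine, here is a minimal Python sketch. The names `WEIGHTS` and `course_grade` are hypothetical, and the sketch assumes each component is scored as a fraction between 0 and 1 and that components are combined by a plain weighted sum; it ignores bonus points and late penalties, and it is not the official grade calculation.

```python
# Component weights in percent, copied from the list above.
# Assumption: the final grade is a plain weighted sum of these components.
WEIGHTS = {
    "homeworks": 20,           # 4 homeworks x 5%
    "quizzes": 10,             # 5 quizzes x 2%
    "project": 40,             # proposal 4 + status reports 8 + presentation 10 + write-up 18
    "paper_presentation": 5,
    "exams": 25,               # midterm 10 + final 15
}

# Sanity check: the listed components account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def course_grade(scores):
    """Return the course grade in percent.

    `scores` maps each component name in WEIGHTS to the fraction of
    points earned on that component (0.0 to 1.0).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

For example, calling `course_grade` with a score of 0.5 on every component would give half of the available 100 points.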

Questions about the grading of homeworks and quizzes should be directed to the TAs within two weeks of the date the grades are released. Grades will be available within 2-2.5 weeks after submission.

All written assignments related to the final project should use the standard *ACL paper submission template.

Late Days

Students are allowed a maximum of 6 late days in total across all assignments (but NOT the quiz sheets). You may use up to 3 late days per assignment. Using one late day on a project assignment costs each teammate one late day. Partial late days are not permitted. For every late day beyond the allowed late days, the student / team will lose 20% of the grade for the assignment.
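
One possible reading of the late-day policy above can be sketched in Python as follows. The function name `apply_late_policy` is hypothetical, and the sketch rests on assumptions the policy text does not spell out: "allowed late days" is taken to mean both the 3-day per-assignment cap and whatever remains of the 6-day budget, and the 20% penalty is treated as 20 percentage points per extra day, capped at 100%. Check with the course staff for the authoritative interpretation.

```python
MAX_TOTAL_LATE_DAYS = 6      # total budget across all assignments
MAX_PER_ASSIGNMENT = 3       # late days usable on a single assignment
PENALTY_PER_EXTRA_DAY = 20   # percent lost per late day beyond the allowance


def apply_late_policy(days_late, budget_left):
    """Return (new_budget_left, penalty_percent) for one assignment.

    days_late:   whole days the submission is late (no partial days).
    budget_left: late days the student still has out of the total budget.
    Assumption: penalty-free days are limited by both the per-assignment
    cap and the remaining budget; every further day costs 20 points.
    """
    if days_late < 0 or budget_left < 0:
        raise ValueError("days_late and budget_left must be non-negative")
    free_days = min(days_late, MAX_PER_ASSIGNMENT, budget_left)
    extra_days = days_late - free_days
    penalty = min(100, PENALTY_PER_EXTRA_DAY * extra_days)
    return budget_left - free_days, penalty
```

Under these assumptions, a submission that is 4 days late from a student with the full 6-day budget would use 3 late days and lose 20% of that assignment's grade.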

Note: Please familiarize yourself with the academic policies and read the note about student well-being.

Similar Classes

  • Undergraduate-level Special Topics: Language Models in NLP Spring 2024
  • Undergraduate-level Special Topics: Language Models in NLP Fall 2023