Language Models in Natural Language Processing
To be offered as CSCI 499 - Special Topics in Fall 2023
Class Description
Language models (LMs) have been the talk of the town ever since OpenAI’s ChatGPT became available to all! However, language models have been around in Natural Language Processing (NLP) for decades; what has changed recently is that large-scale LMs have revolutionized the field, achieving state-of-the-art performance across a wide variety of tasks. But what is truly behind this seemingly fantastical technology? This course will cover the fundamentals of language modeling and how LMs have grown into the behemoths they are today. Students will gain familiarity with how LMs are constructed and the model architectures underlying them, and will get hands-on experience building and evaluating small-scale LMs. The class will also explore the real-world consequences of deploying LMs, including the ethical concerns and harms associated with them.
Who can take this class?
Students are recommended to have taken CSCI-270 (Introduction to Algorithms and Theory of Computing, 4.0 units) as well as one of CSCI-360 (Introduction to AI) or CSCI-467 (Introduction to Machine Learning), or to have equivalent experience. Fluency with Python programming is highly recommended. Please email me about special circumstances or for specific clarifications.
Syllabus
Subject to change before the beginning of the Fall 2023 semester.
- Week 1: Introduction; Language Models of Today; Fundamentals of Probability - Recap
- Week 2: n-Gram Models
- Week 3: Perplexity and Evaluation
- Week 4: Tokenization and Word Embeddings
- Week 5: Recurrent Neural Net Language Models
- Week 6: Transformers and Transformer LMs
- Week 7: Masked Language Models
- Week 8: Encoder-Decoder Language Models
- Week 9: Prompting
- Week 10: Pretrained Language Models and Finetuned Language Models
- Week 11: Probing and Analyzing
- Week 12: Generating Text from a Language Model
- Week 13: Datasets for Training Large Language Models
- Week 14: Ethics and Harms
- Week 15: Latest Progress / LMs in Industry / Wrapping it all up
More details coming soon!