Swabha Swayamdipta

Postdoctoral Investigator • MOSAIC, Allen Institute for AI

My research focuses on studying biases in datasets and models. Good biases, such as structural inductive biases, help language understanding; check out my PhD thesis on these. But biases can also be undesirable, e.g. spurious correlations commonly found in crowd-sourced, large-scale datasets due to annotation artifacts, or social prejudices of human annotators and task designers (coming soon!).


I obtained my PhD from Carnegie Mellon University in May 2019, where I was advised by Noah Smith and Chris Dyer. During most of my PhD I was a visiting student at the University of Washington in Seattle.

Update: I am looking for academic positions in Winter / Spring 2021!


Dec 3, 2020 Guest lecture in Eunsol Choi’s Topics in NLP class at UT Austin on Biases and Interpretability.
Nov 2, 2020 Was delighted to be an invited speaker for Responsible AI at the Microsoft E+D Product Leaders Conference.
Sep 22, 2020 Preprint of the EMNLP-accepted paper Dataset Cartography is now available on arXiv. Camera-ready version and code coming soon!
Sep 15, 2020 The paper Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics has been accepted to the Proceedings of EMNLP, and GDaug has been accepted to Findings of EMNLP.
Aug 13, 2020 Completed one year as a postdoctoral investigator at AI2!