dataset cartography

Visualize datasets with respect to models.

SNLI data map with respect to a RoBERTa model.

Dataset cartography is an automatic approach to studying different regions of a dataset, obtained as a side effect of training a model. Data maps provide point-wise estimates for individual data instances, based on how easily the model learns them.
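
As a rough illustration of what such point-wise estimates can look like, the sketch below follows the paper's definitions of confidence (the mean probability the model assigns to the gold label across training epochs) and variability (the standard deviation of that probability). It assumes the per-epoch gold-label probabilities have already been logged. The function name `data_map_coordinates` and the example values are hypothetical, and this is not the repository's implementation.

```python
from statistics import fmean, pstdev


def data_map_coordinates(gold_probs_per_epoch):
    """Return (confidence, variability) for one training instance.

    gold_probs_per_epoch: probabilities the model assigned to the gold
    label for this instance, one value per training epoch.
    """
    confidence = fmean(gold_probs_per_epoch)    # mean across epochs
    variability = pstdev(gold_probs_per_epoch)  # population std across epochs
    return confidence, variability


# Hypothetical logged training dynamics (made-up values for illustration only).
training_dynamics = {
    "easy_example": [0.90, 0.95, 0.98, 0.99],
    "ambiguous_example": [0.20, 0.80, 0.30, 0.70],
    "hard_example": [0.05, 0.10, 0.08, 0.05],
}

data_map = {
    instance_id: data_map_coordinates(probs)
    for instance_id, probs in training_dynamics.items()
}

for instance_id, (confidence, variability) in data_map.items():
    print(f"{instance_id}: confidence={confidence:.3f}, variability={variability:.3f}")
```

Plotting variability on one axis and confidence on the other for every instance would then give a scatter plot in the spirit of the SNLI data map above.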

Check out Elior Cohen’s wonderful blog post about our paper.

@inproceedings{swayamdipta2020dataset,
      title={Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics},
      author={Swabha Swayamdipta and Roy Schwartz and Nicholas Lourie and Yizhong Wang and Hannaneh Hajishirzi and Noah A. Smith and Yejin Choi},
      year={2020},
      url={https://arxiv.org/abs/2009.10795},
      booktitle={EMNLP}
}