aflite

Filter dataset instances with spurious biases.

Four sample biased datasets as input to AFLite (top). Blue and orange indicate two different classes. Only the original two dimensions are shown, not the bias features. For the dataset on the left, with the highest separation, we flip some labels at random, so even an RBF kernel cannot achieve perfect performance. AFLite makes the data more challenging for the models (bottom).
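As a rough illustration of the filtering idea, here is a minimal sketch, not this repository's implementation. It assumes the loop works roughly as follows: repeatedly train simple models on random splits, score each instance by how often it is predicted correctly when held out, and drop the most predictable instances. The function names, parameter names, and default values are hypothetical. A nearest-centroid classifier stands in for the linear models so that the example needs only the standard library.

```python
import random


def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in model (an assumption): predict the label of the nearest class centroid."""
    sums, counts = {}, {}
    for x, label in zip(train_X, train_y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in test_X]


def aflite_sketch(X, y, target_size, n_partitions=20, train_frac=0.8,
                  cutoff=10, tau=0.75, seed=0):
    """Return indices of the instances kept after filtering (hypothetical API)."""
    rng = random.Random(seed)
    keep = list(range(len(X)))
    while len(keep) > target_size:
        correct = {i: 0 for i in keep}
        seen = {i: 0 for i in keep}
        for _ in range(n_partitions):
            # Random train / held-out split of the instances still in play.
            shuffled = keep[:]
            rng.shuffle(shuffled)
            n_train = max(1, int(train_frac * len(shuffled)))
            train, held = shuffled[:n_train], shuffled[n_train:]
            preds = nearest_centroid_predict(
                [X[i] for i in train], [y[i] for i in train], [X[i] for i in held])
            for i, p in zip(held, preds):
                seen[i] += 1
                correct[i] += int(p == y[i])
        # Predictability score: fraction of held-out appearances predicted correctly.
        scores = {i: correct[i] / seen[i] for i in keep if seen[i] > 0}
        candidates = sorted((i for i in scores if scores[i] >= tau),
                            key=lambda i: -scores[i])
        # Remove at most `cutoff` instances per round, never going below target_size.
        budget = min(cutoff, len(keep) - target_size)
        removed = set(candidates[:budget])
        keep = [i for i in keep if i not in removed]
        if len(removed) < cutoff:
            break  # too few predictable instances left; stop early
    return keep
```

Calling `aflite_sketch(X, y, target_size=60)` on a list of feature vectors `X` and labels `y` returns the indices of the retained instances, which tend to be the ones the simple model finds hardest to predict.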

Cite our paper:

@inproceedings{Bras2020AdversarialFO,
  title={Adversarial Filters of Dataset Biases},
  author={Ronan Le Bras and Swabha Swayamdipta and Chandra Bhagavatula and Rowan Zellers and Matthew E. Peters and Ashish Sabharwal and Yejin Choi},
  booktitle={ICML},
  year={2020},
  url={https://arxiv.org/abs/2002.04108}
}