NLI models might not be making the right decisions for the right reasons.

Annotation artifacts for different NLI labels.

We show that, in a significant portion of Natural Language Inference data, the annotation protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. Specifically, we show that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI (Bowman et al., 2015) and 53% of MultiNLI (Williams et al., 2017).
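To make the hypothesis-only setup concrete, here is a minimal sketch in Python. It is illustrative only: the text above specifies just "a simple text categorization model", so the bag-of-words Naive Bayes classifier used here is a stand-in, and the class name `HypothesisOnlyNB`, the helper `accuracy`, and the six toy hypotheses are all hypothetical. The toy examples are invented, are not drawn from SNLI or MultiNLI, and say nothing about the accuracies reported above. The point is only that the model never receives a premise.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption; any tokenizer would do)."""
    return text.lower().split()


class HypothesisOnlyNB:
    """Multinomial Naive Bayes over hypothesis tokens; premises are never seen."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.label_counts = Counter()           # number of examples per label
        self.word_counts = defaultdict(Counter)  # label -> token -> count
        self.vocab = set()

    def fit(self, hypotheses, labels):
        for hyp, label in zip(hypotheses, labels):
            self.label_counts[label] += 1
            for tok in tokenize(hyp):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, hypothesis):
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        # Sorted iteration keeps tie-breaking deterministic.
        for label in sorted(self.label_counts):
            score = math.log(self.label_counts[label] / total)
            n_label = sum(self.word_counts[label].values())
            for tok in tokenize(hypothesis):
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (n_label + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, hypotheses, labels):
    """Fraction of hypotheses whose predicted label matches the gold label."""
    correct = sum(model.predict(h) == y for h, y in zip(hypotheses, labels))
    return correct / len(labels)


# Invented toy data (NOT from SNLI or MultiNLI): hypotheses and labels only.
TOY_TRAIN = [
    ("a person is outside", "entailment"),
    ("a person is moving", "entailment"),
    ("nobody is outside", "contradiction"),
    ("nobody is moving", "contradiction"),
    ("a tall person is outside", "neutral"),
    ("a tall person is moving", "neutral"),
]

train_hyps = [h for h, _ in TOY_TRAIN]
train_labels = [y for _, y in TOY_TRAIN]
model = HypothesisOnlyNB().fit(train_hyps, train_labels)

print(model.predict("nobody is sleeping"))
print(accuracy(model, train_hyps, train_labels))
```

On real data, the same interface would be trained on the hypothesis column of the training split and scored on a held-out split. Comparing that accuracy against a majority-class baseline would indicate how much label information the hypotheses leak on their own.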

