
Overview

The Cross-lingual Natural Language Inference (XNLI) corpus is a crowd-sourced collection of 5,000 test and 2,500 dev pairs for the MultiNLI corpus. The pairs are annotated with textual entailment labels and translated from English into 14 languages: French, Spanish, German, Greek, Bulgarian, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, Hindi, Swahili and Urdu. Together with the English originals, this gives 15 languages and 112.5k annotated pairs (7,500 pairs in each of the 15 languages). Each premise can be paired with the corresponding hypothesis in any of the 15 languages, giving more than 1.5M combinations.
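
The counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the XNLI release. The variable names are illustrative, and it assumes that the "combinations" figure counts every premise language paired with every hypothesis language for each of the 7,500 pairs.

```python
# Figures taken from the overview: 5,000 test pairs and 2,500 dev pairs.
TEST_PAIRS = 5_000
DEV_PAIRS = 2_500

# English (the MultiNLI source language) plus the 14 translation languages.
LANGUAGES = [
    "English", "French", "Spanish", "German", "Greek", "Bulgarian",
    "Russian", "Turkish", "Arabic", "Vietnamese", "Thai", "Chinese",
    "Hindi", "Swahili", "Urdu",
]

pairs_per_language = TEST_PAIRS + DEV_PAIRS

# Each pair exists once in every language.
annotated_pairs = pairs_per_language * len(LANGUAGES)

# Assumption: a premise in any language can be matched with the
# corresponding hypothesis in any language (15 x 15 choices per pair).
cross_lingual_combinations = pairs_per_language * len(LANGUAGES) ** 2

print(f"pairs per language:         {pairs_per_language:,}")
print(f"annotated pairs:            {annotated_pairs:,}")
print(f"cross-lingual combinations: {cross_lingual_combinations:,}")
```

Under that assumption the totals come to 112,500 annotated pairs and 1,687,500 combinations, which matches the "112.5k" and "more than 1.5M" figures in the description.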

Examples

Citation

If you use the corpus in an academic paper, please consider citing:

@InProceedings{conneau2018xnli,
  author = "Conneau, Alexis
        and Rinott, Ruty
        and Lample, Guillaume
        and Williams, Adina
        and Bowman, Samuel R.
        and Schwenk, Holger
        and Stoyanov, Veselin",
  title = "XNLI: Evaluating Cross-lingual Sentence Representations",
  booktitle = "Proceedings of the 2018 Conference on Empirical Methods
               in Natural Language Processing",
  year = "2018",
  publisher = "Association for Computational Linguistics",
  location = "Brussels, Belgium",
}
🎉Many thanks to Graviti Open Datasets for contributing the dataset

Basic Information

Application Scenarios: Not Available
Annotations: Not Available
Tasks: Not Available
License: CC BY-NC 4.0
Updated on: 2021-01-20 05:58:00

Metadata

Data Type: Not Available
Data Volume: 10K
Annotation Amount: 0
File Size: 0B
Copyright Owner: Facebook Research
Annotator: Unknown