{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,23]],"date-time":"2026-07-23T16:06:16Z","timestamp":1784822776104,"version":"3.55.0"},"reference-count":51,"publisher":"MIT Press - Journals","license":[{"start":{"date-parts":[[2022,5,5]],"date-time":"2022-05-05T00:00:00Z","timestamp":1651708800000},"content-version":"vor","delay-in-days":124,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,5,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the Flores-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated in 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are fully aligned. 
By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond.<\/jats:p>","DOI":"10.1162\/tacl_a_00474","type":"journal-article","created":{"date-parts":[[2022,5,5]],"date-time":"2022-05-05T19:11:46Z","timestamp":1651777906000},"page":"522-538","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":135,"title":["The <scp>Flores-101<\/scp> Evaluation Benchmark for Low-Resource and Multilingual Machine Translation"],"prefix":"10.1162","volume":"10","author":[{"given":"Naman","family":"Goyal","sequence":"first","affiliation":[{"name":"Facebook AI Research, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Cynthia","family":"Gao","sequence":"additional","affiliation":[{"name":"Facebook AI Research, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Vishrav","family":"Chaudhary","sequence":"additional","affiliation":[{"name":"Facebook AI Research, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Peng-Jen","family":"Chen","sequence":"additional","affiliation":[{"name":"Facebook AI Research, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Guillaume","family":"Wenzek","sequence":"additional","affiliation":[{"name":"Facebook AI Research, France"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Da","family":"Ju","sequence":"additional","affiliation":[{"name":"Facebook AI Research, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sanjana","family":"Krishnan","sequence":"additional","affiliation":[{"name":"Facebook AI Research, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Marc\u2019Aurelio","family":"Ranzato","sequence":"additional","affiliation":[{"name":"Facebook AI Research, 
USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Francisco","family":"Guzm\u00e1n","sequence":"additional","affiliation":[{"name":"Facebook AI Research, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Angela","family":"Fan","sequence":"additional","affiliation":[{"name":"Facebook AI Research, France"},{"name":"LORIA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"281","published-online":{"date-parts":[[2022,5,4]]},"reference":[{"key":"2022050519113833100_bib1","first-page":"98","article-title":"Benchmarking neural machine translation for Southern African languages","volume-title":"Proceedings of the 2019 Workshop on Widening NLP","author":"Abbott","year":"2019"},{"key":"2022050519113833100_bib2","article-title":"Menyo-20k: A multi-domain english-yor\u00f9b\u00e1 corpus for machine translation and domain adaptation","author":"Adelani","year":"2021","journal-title":"arXiv preprint arXiv:2103.08647"},{"key":"2022050519113833100_bib3","doi-asserted-by":"publisher","first-page":"3204","DOI":"10.18653\/v1\/P19-1310","article-title":"JW300: A wide-coverage parallel corpus for low-resource languages","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Agi\u0107","year":"2019"},{"key":"2022050519113833100_bib4","first-page":"3874","article-title":"Massively multilingual neural machine translation","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Aharoni","year":"2019"},{"key":"2022050519113833100_bib5","article-title":"Towards a parallel corpus of Portuguese and the Bantu language Emakhuwa of Mozambique","author":"Ali","year":"2021","journal-title":"arXiv preprint 
arXiv:2104.05753"},{"key":"2022050519113833100_bib6","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.nlpcovid19-2.5","article-title":"Tico-19: The translation initiative for covid-19","volume-title":"EMNLP Workshop on NLP-COVID","author":"Anastasopoulos","year":"2020"},{"key":"2022050519113833100_bib7","article-title":"Massively multilingual neural machine translation in the wild: Findings and challenges","author":"Arivazhagan","year":"2019","journal-title":"arXiv preprint arXiv:1907.05019"},{"key":"2022050519113833100_bib8","article-title":"Domain-specific MT for low-resource languages: The case of Bambara-French","author":"Tapo","year":"2021","journal-title":"arXiv e-prints"},{"key":"2022050519113833100_bib9","first-page":"351","article-title":"Boosting neural machine translation from finnish to northern S\u00e1mi with rule-based backtranslation","volume-title":"NoDaLiDa 2021","author":"Aulamo","year":"2021"},{"key":"2022050519113833100_bib10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18653\/v1\/W19-5301","article-title":"Findings of the 2020 conference on machine translation (wmt20)","volume-title":"Proceedings of the Fifth Conference on Machine Translation","author":"Barrault","year":"2020"},{"key":"2022050519113833100_bib11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18653\/v1\/W19-5301","article-title":"Findings of the 2020 conference on machine translation (wmt20)","volume-title":"Proceedings of the Fifth Conference on Machine Translation","author":"Barrault","year":"2020"},{"key":"2022050519113833100_bib12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18653\/v1\/W19-5301","article-title":"Findings of the 2019 conference on machine translation (wmt19)","volume-title":"Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 
1)","author":"Barrault","year":"2019"},{"key":"2022050519113833100_bib13","doi-asserted-by":"publisher","first-page":"272","DOI":"10.18653\/v1\/W18-6401","article-title":"Findings of the 2018 conference on machine translation (wmt18)","volume-title":"Proceedings of the Third Conference on Machine Translation","author":"Bojar","year":"2018"},{"key":"2022050519113833100_bib14","doi-asserted-by":"publisher","first-page":"169","DOI":"10.18653\/v1\/W17-4717","article-title":"Findings of the 2017 conference on machine translation (wmt17)","volume-title":"Second Conference on Machine Translation","author":"Bojar","year":"2017"},{"key":"2022050519113833100_bib15","doi-asserted-by":"publisher","first-page":"207","DOI":"10.3115\/v1\/D14-1026","article-title":"A human judgement corpus and a metric for Arabic MT evaluation","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Bouamor","year":"2014"},{"key":"2022050519113833100_bib16","article-title":"Monolingual and parallel corpora for Kangri low resource language","author":"Chauhan","year":"2021","journal-title":"arXiv preprint arXiv:2103.11596"},{"issue":"2","key":"2022050519113833100_bib17","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1007\/s10579-014-9287-y","article-title":"A massively parallel corpus: The bible in 100 languages","volume":"49","author":"Christodouloupoulos","year":"2015","journal-title":"Language Resources and Evaluation"},{"key":"2022050519113833100_bib18","doi-asserted-by":"publisher","first-page":"8440","DOI":"10.18653\/v1\/2020.acl-main.747","article-title":"Unsupervised cross-lingual representation learning at scale","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Conneau","year":"2020"},{"key":"2022050519113833100_bib19","first-page":"149","article-title":"Similar southeast Asian languages: Corpus-based case study on Thai- Laotian and 
Malay-Indonesian","volume-title":"Proceedings of the 3rd Workshop on Asian Translation (WAT2016)","author":"Ding","year":"2016"},{"key":"2022050519113833100_bib20","article-title":"Ffr v1. 1: FonFrench neural machine translation","author":"Dossou","year":"2020","journal-title":"arXiv preprint arXiv:2006.09217"},{"key":"2022050519113833100_bib21","article-title":"Crowdsourced phrase-based tokenization for low-resourced neural machine translation: The case of Fon language","author":"Dossou","year":"2021","journal-title":"arXiv preprint arXiv:2103.08052"},{"key":"2022050519113833100_bib22","article-title":"AmericasNLI: Evaluating zero-shot natural language understanding of pretrained multilingual models in truly low-resource languages","author":"Ebrahimi","year":"2021","journal-title":"arXiv preprint arXiv:2104.08726"},{"key":"2022050519113833100_bib23","article-title":"Igbo-English machine translation: An evaluation benchmark","author":"Ezeani","year":"2020","journal-title":"arXiv preprint arXiv:2004.00648"},{"key":"2022050519113833100_bib24","article-title":"Beyond English-centric multilingual machine translation","author":"Fan","year":"2020","journal-title":"arXiv preprint arXiv:2010.11125"},{"key":"2022050519113833100_bib25","article-title":"Participatory research for low-resourced machine translation: A case study in african languages","author":"Nekoto","year":"2020","journal-title":"arXiv preprint arXiv:2010.02353"},{"key":"2022050519113833100_bib26","first-page":"550","article-title":"Complete multilingual neural machine translation","volume-title":"Proceedings of the Fifth Conference on Machine Translation","author":"Freitag","year":"2020"},{"key":"2022050519113833100_bib27","article-title":"Extended parallel corpus for Amharic-English machine translation","author":"Gezmu","year":"2021","journal-title":"arXiv preprint arXiv: 
2104.03543"},{"key":"2022050519113833100_bib28","doi-asserted-by":"publisher","first-page":"6098","DOI":"10.18653\/v1\/D19-1632","article-title":"The FLORES evaluation datasets for low-resource machine translation: Nepali\u2013 English and Sinhala\u2013English","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Guzm\u00e1n","year":"2019"},{"key":"2022050519113833100_bib29","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00065","article-title":"Google\u2019s multilingual neural machine translation system: Enabling zero-shot translation","volume-title":"Transactions of the Association for Computational Linguistics","author":"Johnson","year":"2016"},{"key":"2022050519113833100_bib30","article-title":"Europarl: A parallel corpus for statistical machine translation","author":"Koehn","year":"2005"},{"key":"2022050519113833100_bib31","article-title":"Unsupervised machine translation on Dravidian languages","author":"Koneru","year":"2021","journal-title":"arXiv preprint arXiv:2103.15877"},{"key":"2022050519113833100_bib32","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-2012","article-title":"Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing","author":"Kudo","year":"2018","journal-title":"arXiv preprint arXiv:1808.06226"},{"key":"2022050519113833100_bib33","article-title":"Low-resource machine translation for low-resource languages: Leveraging comparable data, code-switching and compute resources","author":"Kuwanto","year":"2021","journal-title":"arXiv preprint arXiv:2103.13272"},{"issue":"2","key":"2022050519113833100_bib34","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3448216","article-title":"Finding better subwords for Tibetan neural machine translation","volume":"20","author":"Li","year":"2021","journal-title":"Transactions on 
Asian and Low-Resource Language Information Processing"},{"key":"2022050519113833100_bib35","doi-asserted-by":"publisher","first-page":"2529","DOI":"10.18653\/v1\/D17-1268","article-title":"Learning language representations for typology prediction","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Malaviya","year":"2017"},{"key":"2022050519113833100_bib36","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.566","article-title":"Scientific credibility of machine translation research: A meta-evaluation of 769 papers","author":"Marie","year":"2021","journal-title":"CoRR"},{"key":"2022050519113833100_bib37","first-page":"688","article-title":"Results of the wmt20 metrics shared task","volume-title":"Proceedings of the Fifth Conference on Machine Translation","author":"Mathur","year":"2020"},{"key":"2022050519113833100_bib38","article-title":"Indt5: A text-to-text transformer for 10 indigenous languages","author":"El","year":"2021","journal-title":"arXiv preprint arXiv:2104.07483"},{"key":"2022050519113833100_bib39","article-title":"Low-resource neural machine translation for southern African languages","author":"Nyoni","year":"2021","journal-title":"arXiv preprint arXiv:2104.00366"},{"key":"2022050519113833100_bib40","doi-asserted-by":"publisher","first-page":"612","DOI":"10.18653\/v1\/W17-4770","article-title":"chrf++: words helping character n-grams","volume-title":"Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers","author":"Popovi\u0107","year":"2017"},{"key":"2022050519113833100_bib41","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-6319","article-title":"A call for clarity in reporting BLEU scores","author":"Post","year":"2018","journal-title":"arXiv preprint arXiv:1804.08771"},{"key":"2022050519113833100_bib42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/ICSDA.2016.7918974","article-title":"Introduction of the Asian 
language treebank","volume-title":"2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA)","author":"Riza","year":"2016"},{"key":"2022050519113833100_bib43","doi-asserted-by":"publisher","first-page":"1351","DOI":"10.18653\/v1\/2021.eacl-main.115","article-title":"WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Schwenk","year":"2021"},{"key":"2022050519113833100_bib44","article-title":"CCMatrix: Mining billions of high-quality parallel sentences on the web","author":"Schwenk","year":"2019","journal-title":"arXiv preprint arXiv:1911.04944"},{"key":"2022050519113833100_bib45","first-page":"3273","article-title":"LORELEI language packs: Data, tools, and resources for technology development in low resource languages","volume-title":"Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC\u201916)","author":"Strassel","year":"2016"},{"key":"2022050519113833100_bib46","first-page":"1574","article-title":"Introducing the Asian language treebank (ALT)","volume-title":"Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC\u201916)","author":"Ye","year":"2016"},{"key":"2022050519113833100_bib47","first-page":"188","article-title":"Emerging language spaces learned from massively multilingual corpora","volume-title":"Digital Humanities in the Nordic Countries DHN2018","author":"Tiedemann","year":"2018"},{"key":"2022050519113833100_bib48","first-page":"1174","article-title":"The Tatoeba Translation Challenge \u2013 Realistic data sets for low resource and multilingual MT","volume-title":"Proceedings of the Fifth Conference on Machine 
Translation","author":"Tiedemann","year":"2020"},{"key":"2022050519113833100_bib49","article-title":"Ccnet: Extracting high quality monolingual datasets from web crawl data","author":"Wenzek","year":"2019","journal-title":"arXiv preprint arXiv:1911.00359"},{"key":"2022050519113833100_bib50","doi-asserted-by":"publisher","first-page":"1628","DOI":"10.18653\/v1\/2020.acl-main.148","article-title":"Improving massively multilingual neural machine translation and zero-shot translation","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Zhang","year":"2020"},{"key":"2022050519113833100_bib51","doi-asserted-by":"publisher","first-page":"73","DOI":"10.18653\/v1\/W19-5208","article-title":"The effect of translationese in machine translation test sets","volume-title":"Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)","author":"Zhang","year":"2019"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00474\/2020699\/tacl_a_00474.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00474\/2020699\/tacl_a_00474.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,5]],"date-time":"2022-05-05T19:12:13Z","timestamp":1651777933000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00474\/110993\/The-Flores-101-Evaluation-Benchmark-for-Low"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"references-count":51,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00474","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022]]},"published":{"date-parts":[[2022]]}}}