{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,14]],"date-time":"2026-05-14T19:02:43Z","timestamp":1778785363458,"version":"3.51.4"},"reference-count":50,"publisher":"MIT Press - Journals","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Transactions of the Association for Computational Linguistics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:p> Innovations in annotation methodology have been a catalyst for Reading Comprehension (RC) datasets and models. One recent trend to challenge current RC models is to involve a model in the annotation process: Humans create questions adversarially, such that the model fails to answer them correctly. In this work we investigate this annotation methodology and apply it in three different settings, collecting a total of 36,000 samples with progressively stronger models in the annotation loop. This allows us to explore questions such as the reproducibility of the adversarial effect, transfer from data collected with varying model-in-the-loop strengths, and generalization to data collected without a model. We find that training on adversarially collected samples leads to strong generalization to non-adversarially collected datasets, yet with progressive performance deterioration with increasingly stronger models-in-the-loop. Furthermore, we find that stronger models can still learn from datasets collected with substantially weaker models-in-the-loop. When trained on data collected with a BiDAF model in the loop, RoBERTa achieves 39.9F<jats:sub>1<\/jats:sub> on questions that it cannot answer when trained on SQuAD\u2014only marginally lower than when trained on data collected using RoBERTa itself (41.0F<jats:sub>1<\/jats:sub>). 
<\/jats:p>","DOI":"10.1162\/tacl_a_00338","type":"journal-article","created":{"date-parts":[[2020,11,12]],"date-time":"2020-11-12T20:07:06Z","timestamp":1605211626000},"page":"662-678","source":"Crossref","is-referenced-by-count":41,"title":["Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension"],"prefix":"10.1162","volume":"8","author":[{"given":"Max","family":"Bartolo","sequence":"first","affiliation":[{"name":"Department of Computer Science University College London."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alastair","family":"Roberts","sequence":"additional","affiliation":[{"name":"Department of Computer Science University College London."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Johannes","family":"Welbl","sequence":"additional","affiliation":[{"name":"Department of Computer Science University College London."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sebastian","family":"Riedel","sequence":"additional","affiliation":[{"name":"Department of Computer Science University College London."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pontus","family":"Stenetorp","sequence":"additional","affiliation":[{"name":"Department of Computer Science University College London."}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","reference":[{"key":"bib1","doi-asserted-by":"crossref","first-page":"632","DOI":"10.18653\/v1\/D15-1075","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing","author":"Bowman Samuel R.","year":"2015"},{"key":"bib2","doi-asserted-by":"crossref","first-page":"2358","DOI":"10.18653\/v1\/P16-1223","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Chen 
Danqi","year":"2016"},{"key":"bib3","doi-asserted-by":"crossref","first-page":"63","DOI":"10.18653\/v1\/W19-2008","volume-title":"Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP","author":"Chen Michael","year":"2019"},{"key":"bib4","doi-asserted-by":"crossref","first-page":"2174","DOI":"10.18653\/v1\/D18-1241","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Choi Eunsol","year":"2018"},{"key":"bib5","volume":"1803","author":"Clark Peter","year":"2018","journal-title":"CoRR"},{"key":"bib6","first-page":"5925","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Dasigi Pradeep","year":"2019"},{"key":"bib7","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1109\/CVPR.2009.5206848","volume-title":"2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Deng Jia","year":"2009"},{"key":"bib8","first-page":"4171","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin Jacob","year":"2019"},{"key":"bib9","first-page":"4537","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Dinan Emily","year":"2019"},{"key":"bib10","first-page":"2368","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Dua Dheeru","year":"2019"},{"key":"bib11","volume":"1711","author":"Ettinger 
Allyson","year":"2017","journal-title":"CoRR"},{"key":"bib12","first-page":"1","volume-title":"Proceedings of Workshop for NLP Open Source Software (NLP-OSS)","author":"Gardner Matt","year":"2018"},{"key":"bib13","volume":"1811","author":"Grefenstette Edward","year":"2018","journal-title":"CoRR"},{"key":"bib14","first-page":"107","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)","author":"Gururangan Suchin","year":"2018"},{"key":"bib15","first-page":"1693","volume-title":"Advances in Neural Information Processing Systems 28","author":"Hermann Karl Moritz","year":"2015"},{"key":"bib16","first-page":"2021","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Jia Robin","year":"2017"},{"key":"bib17","doi-asserted-by":"crossref","first-page":"1601","DOI":"10.18653\/v1\/P17-1147","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Joshi Mandar","year":"2017"},{"key":"bib18","volume-title":"International Conference on Learning Representations","author":"Kaushik Divyansh","year":"2020"},{"key":"bib19","doi-asserted-by":"crossref","first-page":"5010","DOI":"10.18653\/v1\/D18-1546","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Kaushik Divyansh","year":"2018"},{"key":"bib20","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00023"},{"key":"bib21","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00276"},{"key":"bib22","first-page":"3","volume-title":"SIGIR","author":"Lewis David D.","year":"1994"},{"key":"bib23","first-page":"2171","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short 
Papers)","author":"Liu Nelson F.","year":"2019"},{"key":"bib24","volume":"1907","author":"Liu Yinhan","year":"2019","journal-title":"CoRR"},{"issue":"2","key":"bib25","first-page":"313","volume":"19","author":"Marcus Mitchell P.","year":"1993","journal-title":"Computational Linguistics"},{"key":"bib26","first-page":"1725","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Min Sewon","year":"2018"},{"key":"bib27","first-page":"1003","volume-title":"Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP","author":"Mintz Mike","year":"2009"},{"key":"bib28","author":"Nguyen Tri","year":"2016","journal-title":"arXiv preprint arXiv:1611.09268"},{"key":"bib29","author":"Nie Yixin","year":"2019","journal-title":"arXiv preprint arXiv:1910.14599"},{"key":"bib30","doi-asserted-by":"crossref","first-page":"784","DOI":"10.18653\/v1\/P18-2124","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"Rajpurkar Pranav","year":"2018"},{"key":"bib31","doi-asserted-by":"crossref","first-page":"2383","DOI":"10.18653\/v1\/D16-1264","volume-title":"Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing","author":"Rajpurkar Pranav","year":"2016"},{"key":"bib32","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00266"},{"key":"bib33","first-page":"193","volume-title":"Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing","author":"Richardson Matthew","year":"2013"},{"key":"bib34","doi-asserted-by":"crossref","first-page":"2087","DOI":"10.18653\/v1\/D18-1233","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Saeidi 
Marzieh","year":"2018"},{"key":"bib35","doi-asserted-by":"crossref","first-page":"15","DOI":"10.18653\/v1\/K17-1004","volume-title":"Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)","author":"Schwartz Roy","year":"2017"},{"key":"bib36","volume-title":"The International Conference on Learning Representations (ICLR)","author":"Seo Minjoon","year":"2017"},{"key":"bib37","first-page":"254","volume-title":"Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing","author":"Snow Rion","year":"2008"},{"key":"bib38","doi-asserted-by":"crossref","first-page":"4208","DOI":"10.18653\/v1\/D18-1453","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Sugawara Saku","year":"2018"},{"key":"bib39","volume":"1911","author":"Sugawara Saku","year":"2019","journal-title":"CoRR"},{"key":"bib40","first-page":"1","volume-title":"Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)","author":"Thorne James","year":"2019"},{"key":"bib41","doi-asserted-by":"crossref","first-page":"191","DOI":"10.18653\/v1\/W17-2623","volume-title":"Proceedings of the 2nd Workshop on Representation Learning for NLP","author":"Trischler Adam","year":"2017"},{"key":"bib42","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00279"},{"key":"bib43","doi-asserted-by":"crossref","first-page":"271","DOI":"10.18653\/v1\/K17-1028","volume-title":"Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)","author":"Weissenborn Dirk","year":"2017"},{"key":"bib44","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00021"},{"key":"bib45","doi-asserted-by":"crossref","unstructured":"Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, R\u00e9mi Louf, Morgan Funtowicz, and Jamie Brew. 2019. 
HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"bib46","doi-asserted-by":"crossref","first-page":"2369","DOI":"10.18653\/v1\/D18-1259","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Yang Zhilin","year":"2018"},{"key":"bib47","volume-title":"International Conference on Learning Representations","author":"Yang Zhilin","year":"2018"},{"key":"bib48","doi-asserted-by":"crossref","first-page":"93","DOI":"10.18653\/v1\/D18-1009","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Zellers Rowan","year":"2018"},{"key":"bib49","doi-asserted-by":"crossref","first-page":"4791","DOI":"10.18653\/v1\/P19-1472","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Zellers Rowan","year":"2019"},{"key":"bib50","author":"Zhang Sheng","year":"2018","journal-title":"arXiv preprint arXiv:1810.12885"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/tacl_a_00338","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:39:45Z","timestamp":1615585185000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/96474"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12]]},"references-count":50,"alternative-id":["10.1162\/tacl_a_00338"],"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00338","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12]]}}}