{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T15:53:28Z","timestamp":1780674808589,"version":"3.54.1"},"reference-count":37,"publisher":"MIT Press - Journals","license":[{"start":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T00:00:00Z","timestamp":1649721600000},"content-version":"vor","delay-in-days":101,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,4,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>In a conversational question answering scenario, a questioner seeks to extract information about a topic through a series of interdependent questions and answers. As the conversation progresses, they may switch to related topics, a phenomenon commonly observed in information-seeking search sessions. However, current datasets for conversational question answering are limiting in two ways: 1) they do not contain topic switches; and 2) they assume the reference text for the conversation is given, that is, the setting is not open-domain. We introduce TopiOCQA (pronounced Tapioca), an open-domain conversational dataset with topic switches based on Wikipedia. TopiOCQA contains 3,920 conversations with information-seeking questions and free-form answers. On average, a conversation in our dataset spans 13 question-answer turns and involves four topics (documents). TopiOCQA poses a challenging test-bed for models, where efficient retrieval is required on multiple turns of the same conversation, in conjunction with constructing valid responses using conversational history. We evaluate several baselines, by combining state-of-the-art document retrieval methods with neural reader models. 
Our best model achieves F1 of 55.8, falling short of human performance by 14.2 points, indicating the difficulty of our dataset. Our dataset and code are available at https:\/\/mcgill-nlp.github.io\/topiocqa.<\/jats:p>","DOI":"10.1162\/tacl_a_00471","type":"journal-article","created":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T13:31:23Z","timestamp":1649770283000},"page":"468-483","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":52,"title":["TopiOCQA: Open-domain Conversational Question Answering with Topic Switching"],"prefix":"10.1162","volume":"10","author":[{"given":"Vaibhav","family":"Adlakha","sequence":"first","affiliation":[{"name":"Mila, McGill University, Canada"},{"name":"ServiceNow Research, Canada. vaibhav.adlakha@mila.quebec"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Shehzaad","family":"Dhuliawala","sequence":"additional","affiliation":[{"name":"ETH Z\u00fcrich, Switzerland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kaheer","family":"Suleman","sequence":"additional","affiliation":[{"name":"Microsoft Montr\u00e9al, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Harm","family":"de Vries","sequence":"additional","affiliation":[{"name":"ServiceNow Research, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Siva","family":"Reddy","sequence":"additional","affiliation":[{"name":"Mila, McGill University, Canada"},{"name":"Facebook CIFAR AI Chair, Canada. 
siva.reddy@mila.quebec"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"281","published-online":{"date-parts":[[2022,4,13]]},"reference":[{"key":"2022041213305130900_bib1","doi-asserted-by":"publisher","first-page":"520","DOI":"10.18653\/v1\/2021.naacl-main.44","article-title":"Open-domain question answering goes conversational via question rewriting","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Anantha","year":"2021"},{"key":"2022041213305130900_bib2","first-page":"1877","article-title":"Language models are few-shot learners","volume-title":"Advances in Neural Information Processing Systems","author":"Brown","year":"2020"},{"key":"2022041213305130900_bib3","doi-asserted-by":"crossref","first-page":"1870","DOI":"10.18653\/v1\/P17-1171","article-title":"Reading Wikipedia to answer open-domain questions","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Chen","year":"2017"},{"key":"2022041213305130900_bib4","doi-asserted-by":"crossref","first-page":"2174","DOI":"10.18653\/v1\/D18-1241","article-title":"QuAC: Question answering in context","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Choi","year":"2018"},{"key":"2022041213305130900_bib5","doi-asserted-by":"publisher","first-page":"1985","DOI":"10.1145\/3397271.3401206","article-title":"CAsT-19: A dataset for conversational information seeking","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Dalton","year":"2020"},{"key":"2022041213305130900_bib6","first-page":"4171","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American 
Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2022041213305130900_bib7","doi-asserted-by":"publisher","first-page":"5918","DOI":"10.18653\/v1\/D19-1605","article-title":"Can you unpack that? Learning to rewrite questions-in-context","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Elgohary","year":"2019"},{"key":"2022041213305130900_bib8","doi-asserted-by":"publisher","first-page":"874","DOI":"10.18653\/v1\/2021.eacl-main.74","article-title":"Leveraging passage retrieval with generative models for open domain question answering","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Izacard","year":"2021"},{"key":"2022041213305130900_bib9","doi-asserted-by":"publisher","first-page":"2021","DOI":"10.18653\/v1\/D17-1215","article-title":"Adversarial examples for evaluating reading comprehension systems","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Jia","year":"2017"},{"key":"2022041213305130900_bib10","doi-asserted-by":"publisher","first-page":"6769","DOI":"10.18653\/v1\/2020.emnlp-main.550","article-title":"Dense passage retrieval for open-domain question answering","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Karpukhin","year":"2020"},{"key":"2022041213305130900_bib11","doi-asserted-by":"publisher","first-page":"317","DOI":"10.1162\/tacl_a_00023","article-title":"The NarrativeQA reading comprehension challenge","volume":"6","author":"Ko\u010disk\u00fd","year":"2018","journal-title":"Transactions of the Association for Computational 
Linguistics"},{"key":"2022041213305130900_bib12","doi-asserted-by":"publisher","first-page":"452","DOI":"10.1162\/tacl_a_00276","article-title":"Natural questions: A benchmark for question answering research","volume":"7","author":"Kwiatkowski","year":"2019","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2022041213305130900_bib13","doi-asserted-by":"crossref","first-page":"6086","DOI":"10.18653\/v1\/P19-1612","article-title":"Latent retrieval for weakly supervised open domain question answering","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Lee","year":"2019"},{"key":"2022041213305130900_bib14","article-title":"MS MARCO: A Human Generated MAchine Reading COmprehension Dataset","volume-title":"CoCo@NIPS","author":"Nguyen","year":"2016"},{"key":"2022041213305130900_bib15","doi-asserted-by":"publisher","first-page":"2463","DOI":"10.18653\/v1\/D19-1250","article-title":"Language models as knowledge bases?","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Petroni","year":"2019"},{"key":"2022041213305130900_bib16","first-page":"539","article-title":"Open-Retrieval Conversational Question Answering","author":"Chen","year":"2020"},{"key":"2022041213305130900_bib17","first-page":"1133","article-title":"BERT with history answer embedding for conversational question answering","volume-title":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR\u201919","author":"Chen","year":"2019"},{"key":"2022041213305130900_bib18","first-page":"1391","article-title":"Attentive history selection for conversational question answering","volume-title":"Proceedings of the 28th ACM International Conference on Information and Knowledge 
Management","author":"Chen","year":"2019"},{"issue":"140","key":"2022041213305130900_bib19","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"Journal of Machine Learning Research"},{"key":"2022041213305130900_bib20","doi-asserted-by":"publisher","first-page":"784","DOI":"10.18653\/v1\/P18-2124","article-title":"Know what you don\u2019t know: Unanswerable questions for SQuAD","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"Rajpurkar","year":"2018"},{"key":"2022041213305130900_bib21","doi-asserted-by":"publisher","first-page":"2383","DOI":"10.18653\/v1\/D16-1264","article-title":"SQuAD: 100,000+ questions for machine comprehension of text","volume-title":"Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing","author":"Rajpurkar","year":"2016"},{"issue":"0","key":"2022041213305130900_bib22","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1162\/tacl_a_00266","article-title":"CoQA: A conversational question answering challenge","volume":"7","author":"Reddy","year":"2019","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2022041213305130900_bib23","doi-asserted-by":"publisher","first-page":"5418","DOI":"10.18653\/v1\/2020.emnlp-main.437","article-title":"How much knowledge can you pack into the parameters of a language model?","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Roberts","year":"2020"},{"key":"2022041213305130900_bib24","first-page":"109","article-title":"Okapi at TREC-3","volume-title":"Overview of the Third Text Retrieval Conference (TREC-3)","author":"Robertson","year":"1995"},{"key":"2022041213305130900_bib25","article-title":"QA dataset explosion: A taxonomy of NLP resources for 
question answering and reading comprehension","author":"Rogers","year":"2021","journal-title":"arXiv preprint arXiv:2107.12708"},{"key":"2022041213305130900_bib26","first-page":"289","article-title":"Lectures on conversation","author":"Sacks","year":"1995"},{"key":"2022041213305130900_bib27","first-page":"2321","article-title":"Asymmetric LSH (ALSH) for sublinear time maximum inner product search (MIPS)","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems","author":"Shrivastava","year":"2014"},{"key":"2022041213305130900_bib28","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.320","article-title":"Retrieval augmentation reduces hallucination in conversation","author":"Shuster","year":"2021","journal-title":"arXiv preprint arXiv:2104.07567"},{"issue":"8","key":"2022041213305130900_bib29","doi-asserted-by":"publisher","first-page":"639","DOI":"10.1002\/asi.10124","article-title":"Multitasking information seeking and searching processes","volume":"53","author":"Spink","year":"2002","journal-title":"Journal of the American Society for Information Science and Technology"},{"key":"2022041213305130900_bib30","article-title":"Information-seeking chat: Dialogue management by topic structure","author":"Stede","year":"2004"},{"key":"2022041213305130900_bib31","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1145\/3437963.3441748","article-title":"Question rewriting for conversational question answering","volume-title":"Proceedings of the 14th ACM International Conference on Web Search and Data Mining","author":"Vakulenko","year":"2021"},{"key":"2022041213305130900_bib32","volume-title":"The New Dialectic: Conversational Contexts of Argument","author":"Walton","year":"2019"},{"key":"2022041213305130900_bib33","first-page":"5981","article-title":"R3: Reinforced ranker-reader for open-domain question 
answering","volume-title":"AAAI","author":"Wang","year":"2018"},{"key":"2022041213305130900_bib34","doi-asserted-by":"publisher","first-page":"5878","DOI":"10.18653\/v1\/D19-1599","article-title":"Multi-passage BERT: A globally normalized BERT model for open-domain question answering","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Wang","year":"2019"},{"key":"2022041213305130900_bib35","doi-asserted-by":"publisher","first-page":"72","DOI":"10.18653\/v1\/N19-4013","article-title":"End-to-end open-domain question answering with BERTserini","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)","author":"Yang","year":"2019"},{"key":"2022041213305130900_bib36","doi-asserted-by":"publisher","first-page":"2013","DOI":"10.18653\/v1\/D15-1237","article-title":"WikiQA: A challenge dataset for open-domain question answering","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing","author":"Yi","year":"2015"},{"key":"2022041213305130900_bib37","article-title":"SDNet: Contextualized attention-based deep network for conversational question answering","author":"Zhu","year":"2019","journal-title":"arXiv preprint arXiv:1812.03593"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00471\/2008126\/tacl_a_00471.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00471\/2008126\/tacl_a_00471.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T13:31:46Z","timestamp":1649770306000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00471\/110550\/TopiOCQA-Open-domain-Conversational-Question"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"references-count":37,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00471","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022]]},"published":{"date-parts":[[2022]]}}}