{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T04:27:43Z","timestamp":1780547263750,"version":"3.54.1"},"reference-count":48,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T00:00:00Z","timestamp":1712880000000},"content-version":"vor","delay-in-days":102,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,4,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>An open-domain question answering (QA) system usually follows a retrieve-then-read paradigm, in which a retriever is used to retrieve relevant passages from a large corpus, and then a reader generates answers based on the retrieved passages and the original question. In this paper, we propose a simple and novel mutual learning framework to improve the performance of retrieve-then-read-style models via an intermediate module named the knowledge selector, which we train with reinforcement learning. 
The key benefits of our proposed intermediate module are: 1) no requirement for additional annotated question-passage pairs; 2) improvements in both retrieval and QA performance, as well as computational efficiency, compared to prior competitive retrieve-then-read models; 3) with no finetuning, improvement in the zero-shot performance of large-scale pre-trained language models, e.g., ChatGPT, by encapsulating the input with relevant knowledge without violating the input length constraint.<\/jats:p>","DOI":"10.1162\/tacl_a_00646","type":"journal-article","created":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T19:02:43Z","timestamp":1712948563000},"page":"247-263","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":9,"title":["Retrieve What You Need: A Mutual Learning Framework for Open-domain Question Answering"],"prefix":"10.1162","volume":"12","author":[{"given":"Dingmin","family":"Wang","sequence":"first","affiliation":[{"name":"University of Oxford, UK. dingmin.wang@cs.ox.ac.uk"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Qiuyuan","family":"Huang","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA. qihua@microsoft.com"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Matthew","family":"Jackson","sequence":"additional","affiliation":[{"name":"University of Oxford, UK. jackson@robots.ox.ac.uk"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jianfeng","family":"Gao","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, USA. 
jfgao@microsoft.com"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"281","published-online":{"date-parts":[[2024,4,9]]},"reference":[{"key":"2024041219023210700_bib1","doi-asserted-by":"publisher","first-page":"93","DOI":"10.18653\/v1\/2022.acl-demo.9","article-title":"Promptsource: An integrated development environment and repository for natural language prompts","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations","author":"Bach","year":"2022"},{"key":"2024041219023210700_bib2","first-page":"1533","article-title":"Semantic parsing on freebase from question-answer pairs","volume-title":"Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing","author":"Berant","year":"2013"},{"key":"2024041219023210700_bib3","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2024041219023210700_bib4","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1171","article-title":"Reading Wikipedia to answer open-domain questions","volume-title":"Proceedings of ACL","author":"Chen","year":"2017"},{"key":"2024041219023210700_bib5","doi-asserted-by":"publisher","first-page":"3080","DOI":"10.18653\/v1\/2021.acl-long.240","article-title":"Unitedqa: A hybrid approach for open domain question answering","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Cheng","year":"2021"},{"key":"2024041219023210700_bib6","doi-asserted-by":"crossref","first-page":"845","DOI":"10.18653\/v1\/P18-1078","article-title":"Simple and effective multi-paragraph reading comprehension","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics 
(Volume 1: Long Papers)","author":"Clark","year":"2018"},{"key":"2024041219023210700_bib7","doi-asserted-by":"publisher","first-page":"4171","DOI":"10.18653\/v1\/N19-1423","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2\u20137, 2019, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2024041219023210700_bib8","article-title":"Realm: Retrieval-augmented language model pre-training","author":"Guu","year":"2020","journal-title":"arXiv preprint arXiv:2002.08909"},{"key":"2024041219023210700_bib9","article-title":"Towards unsupervised dense information retrieval with contrastive learning","author":"Izacard","year":"2021","journal-title":"CoRR"},{"key":"2024041219023210700_bib10","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.eacl-main.74","article-title":"Leveraging passage retrieval with generative models for open domain question answering","author":"Izacard","year":"2020","journal-title":"arXiv preprint arXiv:2007.01282"},{"key":"2024041219023210700_bib11","article-title":"Distilling knowledge from reader to retriever for question answering","volume-title":"9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3\u20137, 2021","author":"Izacard","year":"2021"},{"key":"2024041219023210700_bib12","article-title":"Few-shot learning with retrieval augmented language models","author":"Izacard","year":"2022","journal-title":"arXiv preprint arXiv:2208.03299"},{"key":"2024041219023210700_bib13","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1147","article-title":"Triviaqa: A large scale distantly supervised challenge dataset for reading comprehension","author":"Joshi","year":"2017","journal-title":"arXiv preprint 
arXiv:1705.03551"},{"key":"2024041219023210700_bib14","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.550","article-title":"Dense passage retrieval for open-domain question answering","author":"Karpukhin","year":"2020","journal-title":"arXiv preprint arXiv:2004.04906"},{"key":"2024041219023210700_bib15","doi-asserted-by":"publisher","first-page":"6769","DOI":"10.18653\/v1\/2020.emnlp-main.550","article-title":"Dense passage retrieval for open-domain question answering","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16\u201320, 2020","author":"Karpukhin","year":"2020"},{"key":"2024041219023210700_bib16","doi-asserted-by":"publisher","first-page":"929","DOI":"10.1162\/tacl_a_00405","article-title":"Relevance-guided supervision for openqa with colbert","volume":"9","author":"Khattab","year":"2021","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2024041219023210700_bib17","article-title":"Adam: A method for stochastic optimization","author":"Kingma","year":"2014","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"2024041219023210700_bib18","doi-asserted-by":"publisher","first-page":"453","DOI":"10.1162\/tacl_a_00276","article-title":"Natural questions: A benchmark for question answering research","volume":"7","author":"Kwiatkowski","year":"2019","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2024041219023210700_bib19","first-page":"6086","article-title":"Latent retrieval for weakly supervised open domain question answering","volume-title":"Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers","author":"Lee","year":"2019"},{"key":"2024041219023210700_bib20","doi-asserted-by":"publisher","first-page":"7871","DOI":"10.18653\/v1\/2020.acl-main.703","article-title":"Bart: 
Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Lewis","year":"2020"},{"key":"2024041219023210700_bib21","article-title":"Retrieval-augmented generation for knowledge-intensive nlp tasks","author":"Lewis","year":"2020","journal-title":"arXiv preprint arXiv:2005.11401"},{"key":"2024041219023210700_bib22","first-page":"5360","article-title":"Open-domain question answering via chain of reasoning over heterogeneous knowledge","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7\u201311, 2022","author":"Ma","year":"2022"},{"key":"2024041219023210700_bib23","doi-asserted-by":"publisher","first-page":"1605","DOI":"10.18653\/v1\/2022.acl-long.113","article-title":"Open domain question answering with a unified knowledge interface","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Ma","year":"2022"},{"key":"2024041219023210700_bib24","doi-asserted-by":"publisher","first-page":"109265","DOI":"10.1016\/j.knosys.2022.109265","article-title":"Facter-check: Semi-automated fact-checking through semantic similarity and natural language inference","volume":"251","author":"Mart\u00edn","year":"2022","journal-title":"Knowledge-Based Systems"},{"key":"2024041219023210700_bib25","article-title":"Knowledge guided text retrieval and reading for open domain question answering","author":"Min","year":"2019","journal-title":"arXiv preprint arXiv:1911.03868"},{"key":"2024041219023210700_bib26","doi-asserted-by":"publisher","first-page":"1535","DOI":"10.18653\/v1\/2022.findings-naacl.115","article-title":"Unik-qa: Unified representations of structured and unstructured knowledge for open-domain question answering","volume-title":"Findings of the Association 
for Computational Linguistics: NAACL 2022","author":"Oguz","year":"2022"},{"key":"2024041219023210700_bib27","doi-asserted-by":"publisher","first-page":"2523","DOI":"10.18653\/v1\/2021.naacl-main.200","article-title":"Kilt: A benchmark for knowledge intensive language tasks","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Petroni","year":"2021"},{"key":"2024041219023210700_bib28","doi-asserted-by":"publisher","first-page":"2463","DOI":"10.18653\/v1\/D19-1250","article-title":"Language models as knowledge bases?","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Petroni","year":"2019"},{"key":"2024041219023210700_bib29","first-page":"5835","article-title":"Rocketqa: An optimized training approach to dense passage retrieval for open-domain question answering","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Yingqi","year":"2021"},{"issue":"1","key":"2024041219023210700_bib30","first-page":"5485","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"The Journal of Machine Learning Research"},{"key":"2024041219023210700_bib31","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1264","article-title":"SQuAD: 100,000+ questions for machine comprehension of text","volume-title":"Proceedings of EMNLP","author":"Rajpurkar","year":"2016"},{"key":"2024041219023210700_bib32","doi-asserted-by":"publisher","first-page":"2825","DOI":"10.18653\/v1\/2021.emnlp-main.224","article-title":"Rocketqav2: A joint training method for dense passage retrieval and passage 
re-ranking","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Ren","year":"2021"},{"issue":"5","key":"2024041219023210700_bib33","doi-asserted-by":"publisher","first-page":"527","DOI":"10.1090\/S0002-9904-1952-09620-8","article-title":"Some aspects of the sequential design of experiments","volume":"58","author":"Robbins","year":"1952","journal-title":"Bulletin of the American Mathematical Society"},{"key":"2024041219023210700_bib34","doi-asserted-by":"publisher","first-page":"5418","DOI":"10.18653\/v1\/2020.emnlp-main.437","article-title":"How much knowledge can you pack into the parameters of a language model?","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Roberts","year":"2020"},{"issue":"4","key":"2024041219023210700_bib35","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1561\/1500000019","article-title":"The probabilistic relevance framework: Bm25 and beyond","volume":"3","author":"Robertson","year":"2009","journal-title":"Foundations and Trends in Information Retrieval"},{"key":"2024041219023210700_bib36","article-title":"Questions are all you need to train a dense passage retriever","author":"Sachan","year":"2022","journal-title":"arXiv preprint arXiv:2206.10658"},{"key":"2024041219023210700_bib37","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.11164","article-title":"Conceptnet 5.5: An open multilingual graph of general knowledge","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Speer","year":"2017"},{"key":"2024041219023210700_bib38","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton","year":"2018"},{"key":"2024041219023210700_bib39","article-title":"Can open-domain qa reader utilize external knowledge efficiently like humans?","author":"Varshney","year":"2022","journal-title":"arXiv preprint 
arXiv:2211.12707"},{"key":"2024041219023210700_bib40","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.12053","article-title":"R3: Reinforced ranker-reader for open-domain question answering","volume-title":"Proceedings of AAAI","author":"Wang","year":"2018"},{"key":"2024041219023210700_bib41","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1599","article-title":"Multi-passage BERT: A globally normalized BERT model for open-domain question answering","volume-title":"Proceedings of EMNLP-IJCNLP","author":"Wang","year":"2019"},{"issue":"3","key":"2024041219023210700_bib42","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1007\/BF00992696","article-title":"Simple statistical gradient-following algorithms for connectionist reinforcement learning","volume":"8","author":"Williams","year":"1992","journal-title":"Machine Learning"},{"key":"2024041219023210700_bib43","doi-asserted-by":"publisher","first-page":"72","DOI":"10.18653\/v1\/N19-4013","article-title":"End-to-end open-domain question answering with bertserini","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)","author":"Yang","year":"2019"},{"key":"2024041219023210700_bib44","doi-asserted-by":"publisher","first-page":"4961","DOI":"10.18653\/v1\/2022.acl-long.340","article-title":"Kg-fid: Infusing knowledge graph in fusion-in-decoder for open-domain question answering","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Donghan","year":"2022"},{"key":"2024041219023210700_bib45","article-title":"Generate rather than retrieve: Large language models are strong context generators","author":"Wenhao","year":"2022","journal-title":"arXiv preprint arXiv:2209.10063"},{"key":"2024041219023210700_bib46","doi-asserted-by":"publisher","first-page":"1092","DOI":"10.18653\/v1\/2021.findings-emnlp.94","article-title":"Kers: A 
knowledge-enhanced framework for recommendation dialog systems with multiple subgoals","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Zhang","year":"2021"},{"key":"2024041219023210700_bib47","doi-asserted-by":"publisher","first-page":"7371","DOI":"10.18653\/v1\/2021.emnlp-main.586","article-title":"Situatedqa: Incorporating extra-linguistic contexts into qa","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Zhang","year":"2021"},{"key":"2024041219023210700_bib48","article-title":"Retrieving and reading: A comprehensive survey on open-domain question answering","author":"Zhu","year":"2021","journal-title":"arXiv preprint arXiv:2101.00774"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00646\/2362196\/tacl_a_00646.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00646\/2362196\/tacl_a_00646.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T19:02:56Z","timestamp":1712948576000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00646\/120574\/Retrieve-What-You-Need-A-Mutual-Learning-Framework"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"references-count":48,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00646","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024]]},"published":{"date-parts":[[2024]]}}}