{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,27]],"date-time":"2026-07-27T22:34:06Z","timestamp":1785191646170,"version":"3.55.0"},"reference-count":337,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T00:00:00Z","timestamp":1707436800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62222215, and U2001212"],"award-info":[{"award-number":["62222215, and U2001212"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Beijing Natural Science Foundation","award":["4222027"],"award-info":[{"award-number":["4222027"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2024,7,31]]},"abstract":"<jats:p>Text retrieval is a long-standing research topic on information seeking, where a system is required to return relevant information resources to user\u2019s queries in natural language. From heuristic-based retrieval methods to learning-based ranking functions, the underlying retrieval models have been continually evolved with the ever-lasting technical innovation. To design effective retrieval models, a key point lies in how to learn text representations and model the relevance matching. The recent success of pretrained language models\u00a0(PLM) sheds light on developing more capable text-retrieval approaches by leveraging the excellent modeling capacity of PLMs. 
With powerful PLMs, we can effectively learn the semantic representations of queries and texts in the latent representation space, and further construct the semantic matching function between the dense vectors for relevance modeling. Such a retrieval approach is called <jats:italic>dense retrieval<\/jats:italic>, since it employs dense vectors to represent the texts. Considering the rapid progress on dense retrieval, this survey systematically reviews the recent progress on PLM-based dense retrieval. Different from previous surveys on dense retrieval, we take a new perspective to organize the related studies by four major aspects, including architecture, training, indexing and integration, and thoroughly summarize the mainstream techniques for each aspect. We extensively collect the recent advances on this topic, and include 300+ reference papers. To support our survey, we create a website for providing useful resources, and release a code repository for dense retrieval. This survey aims to provide a comprehensive, practical reference focused on the major progress for dense text retrieval.<\/jats:p>","DOI":"10.1145\/3637870","type":"journal-article","created":{"date-parts":[[2023,12,18]],"date-time":"2023-12-18T11:55:39Z","timestamp":1702900539000},"page":"1-60","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":155,"title":["Dense Text Retrieval Based on Pretrained Language Models: A Survey"],"prefix":"10.1145","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8333-6196","authenticated-orcid":false,"given":"Wayne Xin","family":"Zhao","sequence":"first","affiliation":[{"name":"Renmin University of China, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1727-6321","authenticated-orcid":false,"given":"Jing","family":"Liu","sequence":"additional","affiliation":[{"name":"Baidu Inc., 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0562-9911","authenticated-orcid":false,"given":"Ruiyang","family":"Ren","sequence":"additional","affiliation":[{"name":"Renmin University of China, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9777-9676","authenticated-orcid":false,"given":"Ji-Rong","family":"Wen","sequence":"additional","affiliation":[{"name":"Renmin University of China, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,2,9]]},"reference":[{"issue":"1","key":"e_1_3_3_2_2","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1016\/S0306-4573(02)00021-3","article-title":"An information-theoretic perspective of TF\u2013IDF measures","volume":"39","author":"Aizawa Akiko","year":"2003","unstructured":"Akiko Aizawa. 2003. An information-theoretic perspective of TF\u2013IDF measures. Info. Process. Manage. 39, 1 (2003), 45\u201365.","journal-title":"Info. Process. Manage."},{"key":"e_1_3_3_3_2","doi-asserted-by":"crossref","first-page":"6168","DOI":"10.18653\/v1\/P19-1620","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Alberti Chris","year":"2019","unstructured":"Chris Alberti, Daniel Andor, Emily Pitler, Jacob Devlin, and Michael Collins. 2019. Synthetic QA corpora generation with roundtrip consistency. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 6168\u20136173."},{"key":"e_1_3_3_4_2","first-page":"4426","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Arabzadeh Negar","year":"2021","unstructured":"Negar Arabzadeh, Bhaskar Mitra, and Ebrahim Bagheri. 2021. MS MARCO chameleons: Challenging the MS MARCO leaderboard with extremely obstinate queries. 
In Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 4426\u20134435."},{"key":"e_1_3_3_5_2","doi-asserted-by":"crossref","unstructured":"Negar Arabzadeh Alexandra Vtyurina Xinyi Yan and Charles L. A. Clarke. 2021. Shallow pooling for sparse labels. Retrieved from https:\/\/arXiv:2109.00062","DOI":"10.1007\/s10791-022-09411-0"},{"key":"e_1_3_3_6_2","doi-asserted-by":"crossref","unstructured":"Negar Arabzadeh Xinyi Yan and Charles L. A. Clarke. 2021. Predicting efficiency\/effectiveness trade-offs for dense vs. sparse retrieval strategy selection. Retrieved from https:\/\/arXiv:2109.10739","DOI":"10.1145\/3459637.3482159"},{"key":"e_1_3_3_7_2","first-page":"547","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Asai Akari","year":"2021","unstructured":"Akari Asai, Jungo Kasai, Jonathan Clark, Kenton Lee, Eunsol Choi, and Hannaneh Hajishirzi. 2021. XOR QA: Cross-lingual open-retrieval question answering. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 547\u2013564."},{"key":"e_1_3_3_8_2","article-title":"One question answering model for many languages with cross-lingual dense passage retrieval","volume":"34","author":"Asai Akari","year":"2021","unstructured":"Akari Asai, Xinyan Yu, Jungo Kasai, and Hanna Hajishirzi. 2021. One question answering model for many languages with cross-lingual dense passage retrieval. Adv. Neural Info. Process. Syst. 34 (2021).","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_3_9_2","volume-title":"Proceedings of the International Conference on Similarity Search and Applications (SISAP\u201917)","author":"Aum\u00fcller Martin","year":"2017","unstructured":"Martin Aum\u00fcller, Erik Bernhardsson, and Alexander Faithfull. 2017. 
ANN-benchmarks: A benchmarking tool for approximate nearest-neighbor algorithms. In Proceedings of the International Conference on Similarity Search and Applications (SISAP\u201917)."},{"key":"e_1_3_3_10_2","volume-title":"Modern Information Retrieval\u2014The Concepts and Technology Behind Search, 2nd Ed","author":"Baeza-Yates Ricardo","year":"2011","unstructured":"Ricardo Baeza-Yates and Berthier A. Ribeiro-Neto. 2011. Modern Information Retrieval\u2014The Concepts and Technology Behind Search, 2nd Ed."},{"key":"e_1_3_3_11_2","doi-asserted-by":"crossref","unstructured":"Vidhisha Balachandran Ashish Vaswani Yulia Tsvetkov and Niki Parmar. 2021. Simple and efficient ways to improve REALM. Retrieved from https:\/\/arXiv:2104.08710","DOI":"10.18653\/v1\/2021.mrqa-1.16"},{"key":"e_1_3_3_12_2","first-page":"222","volume-title":"Proceedings of the International Conference of the Cross-language Evaluation Forum for European Languages","author":"Baudi\u0161 Petr","year":"2015","unstructured":"Petr Baudi\u0161 and Jan \u0160ediv\u1ef3. 2015. Modeling of the question answering task in the YodaQA system. In Proceedings of the International Conference of the Cross-language Evaluation Forum for European Languages. Springer, 222\u2013228."},{"key":"e_1_3_3_13_2","first-page":"932","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 13, Papers from Neural Information Processing Systems (NIPS\u201900)","author":"Bengio Yoshua","year":"2000","unstructured":"Yoshua Bengio, R\u00e9jean Ducharme, and Pascal Vincent. 2000. A neural probabilistic language model. In Proceedings of the Advances in Neural Information Processing Systems 13, Papers from Neural Information Processing Systems (NIPS\u201900). 
932\u2013938."},{"key":"e_1_3_3_14_2","first-page":"1533","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Berant Jonathan","year":"2013","unstructured":"Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on Freebase from question-answer pairs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1533\u20131544."},{"key":"e_1_3_3_15_2","article-title":"Autoregressive search engines: Generating substrings as document identifiers","author":"Bevilacqua Michele","year":"2022","unstructured":"Michele Bevilacqua, Giuseppe Ottaviano, Patrick Lewis, Wen-tau Yih, Sebastian Riedel, and Fabio Petroni. 2022. Autoregressive search engines: Generating substrings as document identifiers. Retrieved from https:\/\/abs\/2204.10628","journal-title":"Retrieved from"},{"key":"e_1_3_3_16_2","unstructured":"Nitin Bhatia and Vandana. 2010. Survey of nearest-neighbor techniques. Retrieved from https:\/\/arXiv:1007.0085"},{"key":"e_1_3_3_17_2","volume-title":"Proceedings of the Conference and Labs of the Evaluation Forum (CLEF\u201920)","author":"Bondarenko Alexander","year":"2020","unstructured":"Alexander Bondarenko, Maik Fr\u00f6be, Meriem Beloucif, Lukas Gienapp, Yamen Ajjour, Alexander Panchenko, Chris Biemann, Benno Stein, Henning Wachsmuth, Martin Potthast, and Matthias Hagen. 2020. Overview of touch\u00e9 2020: Argument retrieval. In Proceedings of the Conference and Labs of the Evaluation Forum (CLEF\u201920)."},{"key":"e_1_3_3_18_2","doi-asserted-by":"crossref","unstructured":"Luiz Bonifacio Hugo Abonizio Marzieh Fadaee and Rodrigo Nogueira. 2022. Inpars: Data augmentation for information retrieval using large language models. Retrieved from https:\/\/arXiv:2202.05144","DOI":"10.1145\/3477495.3531863"},{"key":"e_1_3_3_19_2","unstructured":"Luiz Henrique Bonifacio Israel Campiotti Roberto Lotufo and Rodrigo Nogueira. 2021. 
mMARCO: A multilingual version of MS MARCO passage ranking dataset. Retrieved from https:\/\/arXiv:2108.13897"},{"key":"e_1_3_3_20_2","article-title":"Improving language models by retrieving from trillions of tokens","author":"Borgeaud Sebastian","year":"2021","unstructured":"Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, T. W. Hennigan, Saffron Huang, Lorenzo Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, and L. Sifre. 2021. Improving language models by retrieving from trillions of tokens. Retrieved from https:\/\/abs\/2112.04426","journal-title":"Retrieved from"},{"key":"e_1_3_3_21_2","first-page":"716","volume-title":"Proceedings of the European Conference on Information Retrieval","author":"Boteva Vera","year":"2016","unstructured":"Vera Boteva, Demian Gholipour, Artem Sokolov, and Stefan Riezler. 2016. A full-text learning to rank dataset for medical information retrieval. In Proceedings of the European Conference on Information Retrieval. Springer, 716\u2013722."},{"key":"e_1_3_3_22_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS\u201920)","author":"Brown Tom B.","year":"2020","unstructured":"Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. 
Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS\u201920)."},{"key":"e_1_3_3_23_2","unstructured":"Yinqiong Cai Yixing Fan Jiafeng Guo Fei Sun Ruqing Zhang and Xueqi Cheng. 2021. Semantic models for the first-stage retrieval: A comprehensive review. Retrieved from https:\/\/arXiv:2103.04831"},{"key":"e_1_3_3_24_2","unstructured":"Yinqiong Cai Jiafeng Guo Yixing Fan Qingyao Ai Ruqing Zhang and Xueqi Cheng. 2022. Hard negatives or false negatives: Correcting pooling bias in training neural ranking models. Retrieved from https:\/\/arXiv:2209.05072"},{"key":"e_1_3_3_25_2","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1007\/978-3-030-45439-5_40","volume-title":"Proceedings of the Advances in Information Retrieval 42nd European Conference on IR Research (ECIR\u201920)","volume":"12035","author":"C\u00e2mara Arthur","year":"2020","unstructured":"Arthur C\u00e2mara and Claudia Hauff. 2020. Diagnosing BERT with retrieval heuristics. In Proceedings of the Advances in Information Retrieval 42nd European Conference on IR Research (ECIR\u201920), Vol. 12035. 605\u2013618."},{"key":"e_1_3_3_26_2","article-title":"Autoregressive entity retrieval","author":"Cao Nicola De","year":"2021","unstructured":"Nicola De Cao, Gautier Izacard, Sebastian Riedel, and Fabio Petroni. 2021. Autoregressive entity retrieval. 
Retrieved from https:\/\/abs\/2010.00904","journal-title":"Retrieved from"},{"key":"e_1_3_3_27_2","first-page":"129","volume-title":"Proceedings of the 24th International Conference on Machine Learning (ICML\u201907)","author":"Cao Zhe","year":"2007","unstructured":"Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. 2007. Learning to rank: From pairwise approach to listwise approach. In Proceedings of the 24th International Conference on Machine Learning (ICML\u201907). 129\u2013136."},{"issue":"1","key":"e_1_3_3_28_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2071389.2071390","article-title":"A survey of automatic query expansion in information retrieval","volume":"44","author":"Carpineto Claudio","year":"2012","unstructured":"Claudio Carpineto and Giovanni Romano. 2012. A survey of automatic query expansion in information retrieval. ACM Comput. Surveys 44, 1 (2012), 1\u201350.","journal-title":"ACM Comput. Surveys"},{"key":"e_1_3_3_29_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)","author":"Chang Wei-Cheng","year":"2020","unstructured":"Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, and Sanjiv Kumar. 2020. Pre-training tasks for embedding-based large-scale retrieval. In Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)."},{"key":"e_1_3_3_30_2","first-page":"1870","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics","author":"Chen Danqi","year":"2017","unstructured":"Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. 2017. Reading Wikipedia to answer open-domain questions. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 
1870\u20131879."},{"issue":"2","key":"e_1_3_3_31_2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1145\/3166054.3166058","article-title":"A survey on dialogue systems: Recent advances and new frontiers","volume":"19","author":"Chen Hongshen","year":"2017","unstructured":"Hongshen Chen, Xiaorui Liu, Dawei Yin, and Jiliang Tang. 2017. A survey on dialogue systems: Recent advances and new frontiers. SIGKDD Explor. 19, 2 (2017), 25\u201335.","journal-title":"SIGKDD Explor."},{"key":"e_1_3_3_32_2","first-page":"2184","volume-title":"Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201922)","author":"Chen Jiangui","year":"2022","unstructured":"Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Yixing Fan, and Xueqi Cheng. 2022. GERE: Generative evidence retrieval for fact verification. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201922). 2184\u20132189."},{"key":"e_1_3_3_33_2","doi-asserted-by":"crossref","unstructured":"Jiangui Chen Ruqing Zhang Jiafeng Guo Yiqun Liu Yixing Fan and Xueqi Cheng. 2022. CorpusBrain: Pre-train a generative retrieval model for knowledge-intensive language tasks. Retrieved from https:\/\/abs\/2208.07652","DOI":"10.1145\/3511808.3557271"},{"key":"e_1_3_3_34_2","volume-title":"SPTAG: A Library for Fast Approximate Nearest Neighbor Search","author":"Chen Qi","year":"2018","unstructured":"Qi Chen, Haidong Wang, Mingqin Li, Gang Ren, Scarlett Li, Jeffery Zhu, Jason Li, Chuanjie Liu, Lintao Zhang, and Jingdong Wang. 2018. SPTAG: A Library for Fast Approximate Nearest Neighbor Search."},{"key":"e_1_3_3_35_2","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS\u201921)","author":"Chen Qi","year":"2021","unstructured":"Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. 2021. 
SPANN: Highly efficient billion-scale approximate nearest-neighbor search. In Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS\u201921)."},{"key":"e_1_3_3_36_2","first-page":"1597","volume-title":"Proceedings of the 37th International Conference on Machine Learning (ICML\u201920)","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey E. Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of the 37th International Conference on Machine Learning (ICML\u201920). 1597\u20131607."},{"key":"e_1_3_3_37_2","volume-title":"Proceedings of the Advances in Information Retrieval 44th European Conference on IR Research (ECIR\u201922)","author":"Chen Tao","year":"2022","unstructured":"Tao Chen, Mingyang Zhang, Jing Lu, Michael Bendersky, and Marc-Alexander Najork. 2022. Out-of-domain semantics to the rescue! zero-shot hybrid retrieval models. In Proceedings of the Advances in Information Retrieval 44th European Conference on IR Research (ECIR\u201922)."},{"key":"e_1_3_3_38_2","doi-asserted-by":"crossref","unstructured":"Xilun Chen Kushal Lakhotia Barlas O\u011fuz Anchit Gupta Patrick Lewis Stan Peshterliev Yashar Mehdad Sonal Gupta and Wen-tau Yih. 2021. Salient phrase aware dense retrieval: Can a dense retriever imitate a sparse one? Retrieved from https:\/\/arXiv:2110.06918","DOI":"10.18653\/v1\/2022.findings-emnlp.19"},{"key":"e_1_3_3_39_2","first-page":"1980","volume-title":"Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI\u201922)","author":"Chen Xuanang","year":"2022","unstructured":"Xuanang Chen, Jian Luo, Ben He, Le Sun, and Yingfei Sun. 2022. Towards robust dense retrieval via local ranking alignment. In Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI\u201922). 
1980\u20131986."},{"key":"e_1_3_3_40_2","doi-asserted-by":"crossref","unstructured":"Hao Cheng Hao Fang Xiaodong Liu and Jianfeng Gao. 2022. Task-aware specialization for efficient and robust dense retrieval for open-domain question answering. Retrieved from https:\/\/arXiv:2210.05156","DOI":"10.18653\/v1\/2023.acl-short.159"},{"key":"e_1_3_3_41_2","first-page":"103","volume-title":"Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST\u201914)","author":"Cho Kyunghyun","year":"2014","unstructured":"Kyunghyun Cho, Bart van Merri\u00ebnboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: encoder\u2013decoder approaches. In Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST\u201914). 103\u2013111."},{"key":"e_1_3_3_42_2","doi-asserted-by":"crossref","first-page":"2270","DOI":"10.18653\/v1\/2020.acl-main.207","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Cohan Arman","year":"2020","unstructured":"Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, and Daniel Weld. 2020. SPECTER: Document-level representation learning using citation-informed transformers. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2270\u20132282."},{"key":"e_1_3_3_43_2","volume-title":"Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC\u201918)","author":"Conneau Alexis","year":"2018","unstructured":"Alexis Conneau and Douwe Kiela. 2018. SentEval: An evaluation toolkit for universal sentence representations. 
In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC\u201918)."},{"key":"e_1_3_3_44_2","first-page":"2983","volume-title":"Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM\u201920)","author":"Craswell Nick","year":"2020","unstructured":"Nick Craswell, Daniel Campos, Bhaskar Mitra, Emine Yilmaz, and Bodo Billerbeck. 2020. ORCAS: 20 million clicked query-document pairs for analyzing search. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM\u201920). 2983\u20132989."},{"key":"e_1_3_3_45_2","doi-asserted-by":"crossref","unstructured":"Nick Craswell Bhaskar Mitra Emine Yilmaz and Daniel Campos. 2021. Overview of the TREC 2020 deep learning track. Retrieved from https:\/\/arXiv:2102.07662","DOI":"10.6028\/NIST.SP.1266.deep-overview"},{"key":"e_1_3_3_46_2","doi-asserted-by":"crossref","unstructured":"Nick Craswell Bhaskar Mitra Emine Yilmaz Daniel Campos and Jimmy Lin. [n.d.]. TREC 2021 Deep Learning Track Guidelines. Retrieved from https:\/\/microsoft.github.io\/msmarco\/TREC-Deep-Learning.html","DOI":"10.6028\/NIST.SP.500-335.deep-overview"},{"key":"e_1_3_3_47_2","volume-title":"Proceedings of the Text Retrieval Conference (TREC\u201922)","author":"Craswell Nick","year":"2022","unstructured":"Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin. 2022. Overview of the TREC 2021 deep learning track. In Proceedings of the Text Retrieval Conference (TREC\u201922)."},{"key":"e_1_3_3_48_2","doi-asserted-by":"crossref","unstructured":"Nick Craswell Bhaskar Mitra Emine Yilmaz Daniel Campos and Ellen M. Voorhees. 2020. Overview of the TREC 2019 deep learning track. 
Retrieved from https:\/\/arXiv:2003.07820","DOI":"10.6028\/NIST.SP.1266.deep-overview"},{"key":"e_1_3_3_49_2","article-title":"MS MARCO: Benchmarking ranking models in the large-data regime","author":"Craswell Nick","year":"2021","unstructured":"Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Fernando Campos, and Jimmy J. Lin. 2021. MS MARCO: Benchmarking ranking models in the large-data regime. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.","journal-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval"},{"key":"e_1_3_3_50_2","first-page":"3079","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems (NeurIPS\u201915)","author":"Dai Andrew M.","year":"2015","unstructured":"Andrew M. Dai and Quoc V. Le. 2015. Semi-supervised sequence learning. In Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems (NeurIPS\u201915). 3079\u20133087."},{"key":"e_1_3_3_51_2","first-page":"1897","volume-title":"Proceedings of the Web Conference (WWW\u201920)","author":"Dai Zhuyun","year":"2020","unstructured":"Zhuyun Dai and Jamie Callan. 2020. Context-aware document term weighting for ad hoc search. In Proceedings of the Web Conference (WWW\u201920). 1897\u20131907."},{"key":"e_1_3_3_52_2","first-page":"1533","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201920)","author":"Dai Zhuyun","year":"2020","unstructured":"Zhuyun Dai and Jamie Callan. 2020. Context-aware term weighting for first stage passage retrieval. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201920). 
1533\u20131536."},{"key":"e_1_3_3_53_2","article-title":"Promptagator: Few-shot dense retrieval from 8 examples","author":"Dai Zhuyun","year":"2022","unstructured":"Zhuyun Dai, Vincent Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, and Ming-Wei Chang. 2022. Promptagator: Few-shot dense retrieval from 8 examples. Retrieved from https:\/\/abs\/2209.11755","journal-title":"Retrieved from"},{"key":"e_1_3_3_54_2","first-page":"4171","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171\u20134186."},{"key":"e_1_3_3_55_2","unstructured":"Thomas Diggelmann Jordan Boyd-Graber Jannis Bulian Massimiliano Ciaramita and Markus Leippold. 2020. CLIMATE-FEVER: A dataset for verification of real-world climate claims. Retrieved from https:\/\/arXiv:2012.00614"},{"key":"e_1_3_3_56_2","unstructured":"Bhargav Dodla Akash Kumar Mohankumar and Amit Singh. 2022. HEARTS: Multi-task fusion of dense retrieval and non-autoregressive generation for sponsored search. Retrieved from https:\/\/abs\/2209.05861"},{"key":"e_1_3_3_57_2","first-page":"2509","volume-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD\u201919)","author":"Fan Miao","year":"2019","unstructured":"Miao Fan, Jiacheng Guo, Shuai Zhu, Shuo Miao, Mingming Sun, and Ping Li. 2019. MOBIUS: Towards the next generation of query-ad matching in baidu\u2019s sponsored search. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD\u201919). 
2509\u20132517."},{"key":"e_1_3_3_58_2","unstructured":"Yixing Fan Xiaohui Xie Yinqiong Cai Jia Chen Xinyu Ma Xiangsheng Li Ruqing Zhang Jiafeng Guo and Yiqun Liu. 2021. Pre-training methods in information retrieval. Retrieved from https:\/\/arXiv:2111.13853"},{"key":"e_1_3_3_59_2","volume-title":"Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201904)","author":"Fang Hui","year":"2004","unstructured":"Hui Fang, Tao Tao, and ChengXiang Zhai. 2004. A formal study of information retrieval heuristics. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201904)."},{"key":"e_1_3_3_60_2","volume-title":"Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201905)","author":"Fang Hui","year":"2005","unstructured":"Hui Fang and ChengXiang Zhai. 2005. An exploration of axiomatic approaches to information retrieval. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201905)."},{"key":"e_1_3_3_61_2","article-title":"Semantic term matching in axiomatic approaches to information retrieval","author":"Fang Hui","year":"2006","unstructured":"Hui Fang and ChengXiang Zhai. 2006. Semantic term matching in axiomatic approaches to information retrieval. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.","journal-title":"Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval"},{"issue":"1","key":"e_1_3_3_62_2","first-page":"5232","article-title":"Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity","volume":"23","author":"Fedus William","year":"2022","unstructured":"William Fedus, Barret Zoph, and Noam Shazeer. 2022. 
Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res. 23, 1 (2022), 5232\u20135270.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_3_63_2","doi-asserted-by":"crossref","unstructured":"Thibault Formal Carlos Lassance Benjamin Piwowarski and St\u00e9phane Clinchant. 2021. SPLADE v2: Sparse lexical and expansion model for information retrieval. Retrieved from https:\/\/arXiv:2109.10086","DOI":"10.1145\/3404835.3463098"},{"key":"e_1_3_3_64_2","doi-asserted-by":"crossref","first-page":"2288","DOI":"10.1145\/3404835.3463098","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Formal Thibault","year":"2021","unstructured":"Thibault Formal, Benjamin Piwowarski, and St\u00e9phane Clinchant. 2021. SPLADE: Sparse lexical and expansion model for first stage ranking. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2288\u20132292."},{"key":"e_1_3_3_65_2","volume-title":"Proceedings of the European Conference on Information Retrieval (ECIR\u201921)","author":"Formal Thibault","year":"2021","unstructured":"Thibault Formal, Benjamin Piwowarski, and St\u00e9phane Clinchant. 2021. A white box analysis of ColBERT. In Proceedings of the European Conference on Information Retrieval (ECIR\u201921)."},{"key":"e_1_3_3_66_2","first-page":"981","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Gao Luyu","year":"2021","unstructured":"Luyu Gao and Jamie Callan. 2021. Condenser: A pre-training architecture for dense retrieval. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 981\u2013993."},{"key":"e_1_3_3_67_2","doi-asserted-by":"crossref","unstructured":"Luyu Gao and Jamie Callan. 2021. Unsupervised corpus aware language model pre-training for dense passage retrieval. 
Retrieved from https:\/\/arXiv:2108.05540","DOI":"10.18653\/v1\/2022.acl-long.203"},{"key":"e_1_3_3_68_2","first-page":"3030","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Gao Luyu","year":"2021","unstructured":"Luyu Gao, Zhuyun Dai, and Jamie Callan. 2021. COIL: Revisit exact lexical match in information retrieval with contextualized inverted list. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 3030\u20133042."},{"key":"e_1_3_3_69_2","first-page":"280","volume-title":"Proceedings of the Advances in Information Retrieval 43rd European Conference on IR Research (ECIR\u201921)","volume":"12657","author":"Gao Luyu","year":"2021","unstructured":"Luyu Gao, Zhuyun Dai, and Jamie Callan. 2021. Rethink training of BERT rerankers in multi-stage retrieval pipeline. In Proceedings of the Advances in Information Retrieval 43rd European Conference on IR Research (ECIR\u201921), Vol. 12657. 280\u2013286."},{"key":"e_1_3_3_70_2","article-title":"Tevatron: An efficient and flexible toolkit for dense retrieval","author":"Gao Luyu","year":"2022","unstructured":"Luyu Gao, Xueguang Ma, Jimmy J. Lin, and Jamie Callan. 2022. Tevatron: An efficient and flexible toolkit for dense retrieval. Retrieved from https:\/\/abs\/2203.05765","journal-title":"Retrieved from"},{"key":"e_1_3_3_71_2","first-page":"316","volume-title":"Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP\u201921)","author":"Gao Luyu","year":"2021","unstructured":"Luyu Gao, Yunyi Zhang, Jiawei Han, and Jamie Callan. 2021. Scaling deep contrastive learning batch size under memory limited setup. In Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP\u201921). 
316\u2013321."},{"key":"e_1_3_3_72_2","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)","author":"Gao Tianyu","year":"2021","unstructured":"Tianyu Gao, Xingcheng Yao, and Danqi Chen. 2021. SimCSE: Simple contrastive learning of sentence embeddings. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)."},{"key":"e_1_3_3_73_2","first-page":"2946","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201913)","author":"Ge Tiezheng","year":"2013","unstructured":"Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. 2013. Optimized product quantization for approximate nearest-neighbor search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201913). 2946\u20132953."},{"key":"e_1_3_3_74_2","doi-asserted-by":"crossref","first-page":"528","DOI":"10.18653\/v1\/K19-1049","volume-title":"Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL\u201919)","author":"Gillick Daniel","year":"2019","unstructured":"Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, and Diego Garcia-Olano. 2019. Learning dense representations for entity retrieval. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL\u201919). 528\u2013537."},{"key":"e_1_3_3_75_2","first-page":"518","volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB\u201999)","volume":"99","author":"Gionis Aristides","year":"1999","unstructured":"Aristides Gionis, Piotr Indyk, Rajeev Motwani et\u00a0al. 1999. Similarity search in high dimensions via hashing. In Proceedings of the International Conference on Very Large Data Bases (VLDB\u201999), Vol. 99. 
518\u2013529."},{"key":"e_1_3_3_76_2","first-page":"55","volume-title":"Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM\u201916)","author":"Guo Jiafeng","year":"2016","unstructured":"Jiafeng Guo, Yixing Fan, Qingyao Ai, and W. Bruce Croft. 2016. A deep relevance matching model for ad hoc retrieval. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM\u201916). 55\u201364."},{"key":"e_1_3_3_77_2","first-page":"1297","volume-title":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201919)","author":"Guo Jiafeng","year":"2019","unstructured":"Jiafeng Guo, Yixing Fan, Xiang Ji, and Xueqi Cheng. 2019. MatchZoo: A learning, practicing, and developing system for neural text matching. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201919). 1297\u20131300."},{"issue":"6","key":"e_1_3_3_78_2","doi-asserted-by":"crossref","first-page":"102067","DOI":"10.1016\/j.ipm.2019.102067","article-title":"A deep look into neural ranking models for information retrieval","volume":"57","author":"Guo Jiafeng","year":"2020","unstructured":"Jiafeng Guo, Yixing Fan, Liang Pang, Liu Yang, Qingyao Ai, Hamed Zamani, Chen Wu, W. Bruce Croft, and Xueqi Cheng. 2020. A deep look into neural ranking models for information retrieval. Info. Process. Manage. 57, 6 (2020), 102067.","journal-title":"Info. Process. Manage."},{"key":"e_1_3_3_79_2","series-title":"Proceedings of the 37th International Conference on Machine Learning (ICML\u201920)","first-page":"3887","volume":"119","author":"Guo Ruiqi","year":"2020","unstructured":"Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating large-scale inference with anisotropic vector quantization. 
In Proceedings of the 37th International Conference on Machine Learning (ICML\u201920) (Proceedings of Machine Learning Research, Vol. 119). 3887\u20133896."},{"key":"e_1_3_3_80_2","unstructured":"Kelvin Guu Kenton Lee Zora Tung Panupong Pasupat and Ming-Wei Chang. 2020. REALM: Retrieval-augmented language model pre-training. Retrieved from https:\/\/abs\/2002.08909"},{"key":"e_1_3_3_81_2","first-page":"1312","volume-title":"Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI\u201911)","author":"Hajebi Kiana","year":"2011","unstructured":"Kiana Hajebi, Yasin Abbasi-Yadkori, Hossein Shahbazi, and Hong Zhang. 2011. Fast approximate nearest-neighbor search with k-nearest-neighbor graph. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI\u201911). 1312\u20131317."},{"issue":"2","key":"e_1_3_3_82_2","first-page":"1","article-title":"Information retrieval evaluation","volume":"3","author":"Harman Donna","year":"2011","unstructured":"Donna Harman. 2011. Information retrieval evaluation. Synth. Lect. Info. Concepts, Retriev. Serv. 3, 2 (2011), 1\u2013119.","journal-title":"Synth. Lect. Info. Concepts, Retriev. Serv."},{"key":"e_1_3_3_83_2","first-page":"1265","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Hasibi Faegheh","year":"2017","unstructured":"Faegheh Hasibi, Fedor Nikolaev, Chenyan Xiong, Krisztian Balog, Svein Erik Bratsberg, Alexander Kotov, and Jamie Callan. 2017. DBpedia-entity v2: A Test collection for entity search. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1265\u20131268."},{"key":"e_1_3_3_84_2","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)","author":"He Junxian","year":"2021","unstructured":"Junxian He, Graham Neubig, and Taylor Berg-Kirkpatrick. 2021. 
Efficient nearest-neighbor language models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)."},{"key":"e_1_3_3_85_2","first-page":"9726","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201920)","author":"He Kaiming","year":"2020","unstructured":"Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross B. Girshick. 2020. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201920). 9726\u20139735."},{"key":"e_1_3_3_86_2","unstructured":"Matthew Henderson Rami Al-Rfou Brian Strope Yun-Hsuan Sung L\u00e1szl\u00f3 Luk\u00e1cs Ruiqi Guo Sanjiv Kumar Balint Miklos and Ray Kurzweil. 2017. Efficient natural language response suggestion for smart reply. Retrieved from https:\/\/arXiv:1705.00652"},{"key":"e_1_3_3_87_2","first-page":"512","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Herzig Jonathan","year":"2021","unstructured":"Jonathan Herzig, Thomas M\u00fcller, Syrine Krichene, and Julian Eisenschlos. 2021. Open domain question answering over tables via dense retrieval. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 512\u2013519."},{"key":"e_1_3_3_88_2","unstructured":"Geoffrey E. Hinton Oriol Vinyals and Jeffrey Dean. 2015. Distilling the knowledge in a neural network. Retrieved from https:\/\/abs\/1503.02531"},{"issue":"8","key":"e_1_3_3_89_2","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural Comput. 
9, 8 (1997), 1735\u20131780.","journal-title":"Neural Comput."},{"key":"e_1_3_3_90_2","unstructured":"Sebastian Hofst\u00e4tter Sophia Althammer Michael Schr\u00f6der Mete Sertkan and Allan Hanbury. 2020. Improving efficient neural ranking models with cross-architecture knowledge distillation. Retrieved from https:\/\/arXiv:2010.02666"},{"key":"e_1_3_3_91_2","article-title":"Are we there yet? A decision framework for replacing term based retrieval with dense retrieval systems","author":"Hofst\u00e4tter Sebastian","year":"2022","unstructured":"Sebastian Hofst\u00e4tter, Nick Craswell, Bhaskar Mitra, Hamed Zamani, and Allan Hanbury. 2022. Are we there yet? A decision framework for replacing term based retrieval with dense retrieval systems. Retrieved from https:\/\/abs\/2206.12993","journal-title":"Retrieved from"},{"key":"e_1_3_3_92_2","article-title":"Introducing neural bag of whole-words with ColBERTer: Contextualized late interactions using enhanced reduction","author":"Hofst\u00e4tter Sebastian","year":"2022","unstructured":"Sebastian Hofst\u00e4tter, O. Khattab, Sophia Althammer, Mete Sertkan, and Allan Hanbury. 2022. Introducing neural bag of whole-words with ColBERTer: Contextualized late interactions using enhanced reduction. Retrieved from https:\/\/abs\/2203.13088","journal-title":"Retrieved from"},{"key":"e_1_3_3_93_2","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1145\/3404835.3462891","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Hofst\u00e4tter Sebastian","year":"2021","unstructured":"Sebastian Hofst\u00e4tter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, and Allan Hanbury. 2021. Efficiently teaching an effective dense retriever with balanced topic aware sampling. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 
113\u2013122."},{"key":"e_1_3_3_94_2","first-page":"1062","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics","author":"Hong Wu","year":"2022","unstructured":"Wu Hong, Zhuosheng Zhang, Jinyuan Wang, and Hai Zhao. 2022. Sentence-aware contrastive learning for open-domain passage retrieval. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 1062\u20131074."},{"key":"e_1_3_3_95_2","first-page":"1","volume-title":"Proceedings of the 20th Australasian Document Computing Symposium","author":"Hoogeveen Doris","year":"2015","unstructured":"Doris Hoogeveen, Karin M. Verspoor, and Timothy Baldwin. 2015. CQADupStack: A benchmark data set for community question-answering research. In Proceedings of the 20th Australasian Document Computing Symposium. 1\u20138."},{"key":"e_1_3_3_96_2","first-page":"2553","volume-title":"Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD\u201920)","author":"Huang Jui-Ting","year":"2020","unstructured":"Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, and Linjun Yang. 2020. Embedding-based retrieval in Facebook search. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD\u201920). 2553\u20132561."},{"key":"e_1_3_3_97_2","first-page":"2333","volume-title":"Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM\u201913)","author":"Huang Po-Sen","year":"2013","unstructured":"Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry P. Heck. 2013. Learning deep structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM\u201913). 
2333\u20132338."},{"key":"e_1_3_3_98_2","doi-asserted-by":"crossref","unstructured":"Patrick Huber Armen Aghajanyan Barlas O\u011fuz Dmytro Okhonko Wen-tau Yih Sonal Gupta and Xilun Chen. 2021. CCQA: A new web-scale question answering dataset for model pre-training. Retrieved from https:\/\/arXiv:2110.07731","DOI":"10.18653\/v1\/2022.findings-naacl.184"},{"key":"e_1_3_3_99_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)","author":"Humeau Samuel","year":"2020","unstructured":"Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, and Jason Weston. 2020. Poly-encoders: Architectures and pre-training strategies for fast and accurate multi-sentence scoring. In Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)."},{"key":"e_1_3_3_100_2","volume-title":"Proceedings of the 30th Annual ACM Symposium on Theory of Computing (STOC\u201998)","author":"Indyk Piotr","year":"1998","unstructured":"Piotr Indyk and Rajeev Motwani. 1998. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing (STOC\u201998)."},{"key":"e_1_3_3_101_2","unstructured":"Shankar Iyer Nikhil Dandekar and Korn\u00e9l Csernai. [n. d.]. First Quora Dataset Release: Question Pairs. https:\/\/quoradata.quora.com\/First-Quora-Dataset-Release-Question-Pairs"},{"key":"e_1_3_3_102_2","article-title":"Towards unsupervised dense information retrieval with contrastive learning","author":"Izacard Gautier","year":"2021","unstructured":"Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, and Edouard Grave. 2021. Towards unsupervised dense information retrieval with contrastive learning. 
Retrieved from https:\/\/abs\/2112.09118","journal-title":"Retrieved from"},{"key":"e_1_3_3_103_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Izacard Gautier","year":"2021","unstructured":"Gautier Izacard and Edouard Grave. 2021. Distilling knowledge from reader to retriever for question answering. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_3_104_2","first-page":"874","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Izacard Gautier","year":"2021","unstructured":"Gautier Izacard and Edouard Grave. 2021. Leveraging passage retrieval with generative models for open domain question answering. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. 874\u2013880."},{"issue":"4","key":"e_1_3_3_105_2","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1145\/582415.582418","article-title":"Cumulated gain-based evaluation of IR techniques","volume":"20","author":"J\u00e4rvelin Kalervo","year":"2002","unstructured":"Kalervo J\u00e4rvelin and Jaana Kek\u00e4l\u00e4inen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20, 4 (2002), 422\u2013446.","journal-title":"ACM Trans. Inf. Syst."},{"issue":"1","key":"e_1_3_3_106_2","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1109\/TPAMI.2010.57","article-title":"Product quantization for nearest-neighbor search","volume":"33","author":"J\u00e9gou Herv\u00e9","year":"2011","unstructured":"Herv\u00e9 J\u00e9gou, Matthijs Douze, and Cordelia Schmid. 2011. Product quantization for nearest-neighbor search. IEEE Trans. Pattern Anal. Mach. Intell. 33, 1 (2011), 117\u2013128.","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"e_1_3_3_107_2","first-page":"442","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics","author":"Jeong Soyeong","year":"2022","unstructured":"Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park. 2022. Augmenting document representations for dense retrieval with interpolation and perturbation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 442\u2013452."},{"key":"e_1_3_3_108_2","unstructured":"Zongcheng Ji Zhengdong Lu and Hang Li. 2014. An information retrieval approach to short text conversation. Retrieved from https:\/\/abs\/1408.6988"},{"key":"e_1_3_3_109_2","article-title":"Billion-scale similarity search with GPUs","author":"Johnson J.","year":"2019","unstructured":"J. Johnson, M. Douze, and H. Jegou. 2019. Billion-scale similarity search with GPUs. IEEE Trans. Big Data 7, 3 (2019), 535\u2013547.","journal-title":"IEEE Trans. Big Data"},{"key":"e_1_3_3_110_2","first-page":"1601","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics","author":"Joshi Mandar","year":"2017","unstructured":"Mandar Joshi, Eunsol Choi, Daniel Weld, and Luke Zettlemoyer. 2017. TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 1601\u20131611."},{"issue":"3","key":"e_1_3_3_111_2","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1002\/asi.5090090305","article-title":"The thesaurus approach to information retrieval","volume":"9","author":"Joyce T.","year":"1958","unstructured":"T. Joyce and R. M. Needham. 1958. The thesaurus approach to information retrieval. American Document. 
9, 3 (1958), 192.","journal-title":"American Document."},{"key":"e_1_3_3_112_2","first-page":"6769","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920)","author":"Karpukhin Vladimir","year":"2020","unstructured":"Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920). 6769\u20136781."},{"key":"e_1_3_3_113_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)","author":"Khandelwal Urvashi","year":"2020","unstructured":"Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, and Mike Lewis. 2020. Generalization through memorization: Nearest neighbor language models. In Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)."},{"key":"e_1_3_3_114_2","first-page":"39","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201920)","author":"Khattab Omar","year":"2020","unstructured":"Omar Khattab and Matei Zaharia. 2020. ColBERT: Efficient and effective passage search via contextualized late interaction over BERT. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201920). 39\u201348."},{"key":"e_1_3_3_115_2","first-page":"163","volume-title":"Proceedings of the 32nd Annual ACM Symposium on Theory of Computing","author":"Kleinberg Jon M.","year":"2000","unstructured":"Jon M. Kleinberg. 2000. The small-world phenomenon: An algorithmic perspective. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing. 
163\u2013170."},{"key":"e_1_3_3_116_2","doi-asserted-by":"crossref","first-page":"82","DOI":"10.18653\/v1\/2021.mrqa-1.8","volume-title":"Proceedings of the 3rd Workshop on Machine Reading for Question Answering","author":"Kosti\u0107 Bogdan","year":"2021","unstructured":"Bogdan Kosti\u0107, Julian Risch, and Timo M\u00f6ller. 2021. Multi-modal retrieval of tables and texts using tri-encoder models. In Proceedings of the 3rd Workshop on Machine Reading for Question Answering. 82\u201391."},{"key":"e_1_3_3_117_2","first-page":"452","article-title":"Natural questions: A benchmark for question answering research","volume":"7","author":"Kwiatkowski Tom","year":"2019","unstructured":"Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, Kristina Toutanova, Llion Jones, Matthew Kelcey, Ming-Wei Chang, Andrew M. Dai, Jakob Uszkoreit, Quoc Le, and Slav Petrov. 2019. Natural questions: A benchmark for question answering research. Trans. Assoc. Comput. Linguist. 7 (2019), 452\u2013466.","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"e_1_3_3_118_2","unstructured":"Hyunji Lee Sohee Yang Hanseok Oh and Minjoon Seo. 2022. Generative retrieval for long sequences. Retrieved from https:\/\/abs\/2204.13596"},{"key":"e_1_3_3_119_2","first-page":"6634","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Lee Jinhyuk","year":"2021","unstructured":"Jinhyuk Lee, Mujeen Sung, Jaewoo Kang, and Danqi Chen. 2021. Learning dense representations of phrases at scale. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 6634\u20136647."},{"key":"e_1_3_3_120_2","unstructured":"Jinhyuk Lee Alexander Wettig and Danqi Chen. 2021. 
Phrase retrieval learns passage retrieval too. Retrieved from https:\/\/arXiv:2109.08133"},{"key":"e_1_3_3_121_2","first-page":"6086","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Lee Kenton","year":"2019","unstructured":"Kenton Lee, Ming-Wei Chang, and Kristina Toutanova. 2019. Latent retrieval for weakly supervised open domain question answering. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 6086\u20136096."},{"key":"e_1_3_3_122_2","doi-asserted-by":"crossref","first-page":"7871","DOI":"10.18653\/v1\/2020.acl-main.703","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Lewis Mike","year":"2020","unstructured":"Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2020. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 7871\u20137880."},{"key":"e_1_3_3_123_2","article-title":"Boosted dense retriever","author":"Lewis Patrick","year":"2021","unstructured":"Patrick Lewis, Barlas O\u011fuz, Wenhan Xiong, Fabio Petroni, Wen-tau Yih, and Sebastian Riedel. 2021. Boosted dense retriever. Retrieved from https:\/\/abs\/2112.07771","journal-title":"Retrieved from"},{"key":"e_1_3_3_124_2","doi-asserted-by":"crossref","unstructured":"Patrick Lewis Yuxiang Wu Linqing Liu Pasquale Minervini Heinrich K\u00fcttler Aleksandra Piktus Pontus Stenetorp and Sebastian Riedel. 2021. PAQ: 65 million probably-asked questions and what you can do with them. 
Retrieved from https:\/\/arXiv:2102.07033","DOI":"10.1162\/tacl_a_00415"},{"key":"e_1_3_3_125_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS\u201920)","author":"Lewis Patrick S. H.","year":"2020","unstructured":"Patrick S. H. Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen-tau Yih, Tim Rockt\u00e4schel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS\u201920)."},{"key":"e_1_3_3_126_2","first-page":"1","article-title":"Learning to rank for information retrieval and natural language processing","volume":"4","author":"Li Hang","year":"2011","unstructured":"Hang Li. 2011. Learning to rank for information retrieval and natural language processing. Synth. Lect. Hum. Lang. Technol. 4 (2011), 1\u2013113.","journal-title":"Synth. Lect. Hum. Lang. Technol."},{"key":"e_1_3_3_127_2","unstructured":"Hang Li Ahmed Mourad Shengyao Zhuang Bevan Koopman and Guido Zuccon. 2021. Pseudo relevance feedback with deep language models and dense retrievers: Successes and pitfalls. Retrieved from https:\/\/arXiv:2108.11044"},{"key":"e_1_3_3_128_2","article-title":"Improving query representations for dense retrieval with pseudo relevance feedback: A reproducibility study","author":"Li Hang","year":"2021","unstructured":"Hang Li, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy J. Lin, and G. Zuccon. 2021. Improving query representations for dense retrieval with pseudo relevance feedback: A reproducibility study. 
Retrieved from https:\/\/abs\/2112.06400","journal-title":"Retrieved from"},{"key":"e_1_3_3_129_2","first-page":"1106","volume-title":"Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing","author":"Li Jiwei","year":"2015","unstructured":"Jiwei Li, Thang Luong, and Dan Jurafsky. 2015. A hierarchical neural autoencoder for paragraphs and documents. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. 1106\u20131115."},{"key":"e_1_3_3_130_2","first-page":"274","volume-title":"Findings of the Association for Computational Linguistics (EMNLP\u201921)","author":"Li Minghan","year":"2021","unstructured":"Minghan Li, Ming Li, Kun Xiong, and Jimmy Lin. 2021. Multi-task dense retrieval via model uncertainty fusion for open-domain question answering. In Findings of the Association for Computational Linguistics (EMNLP\u201921). 274\u2013287."},{"key":"e_1_3_3_131_2","first-page":"3181","volume-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","author":"Li Sen","year":"2021","unstructured":"Sen Li, Fuyu Lv, Taiwei Jin, Guli Lin, Keping Yang, Xiaoyi Zeng, Xiao-Ming Wu, and Qianli Ma. 2021. Embedding-based product retrieval in Taobao search. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 3181\u20133189."},{"issue":"8","key":"e_1_3_3_132_2","first-page":"1475","article-title":"Approximate nearest-neighbor search on high dimensional data\u2014Experiments, analyses, and improvement","volume":"32","author":"Li Wen","year":"2019","unstructured":"Wen Li, Ying Zhang, Yifang Sun, Wei Wang, Mingjie Li, Wenjie Zhang, and Xuemin Lin. 2019. Approximate nearest-neighbor search on high dimensional data\u2014Experiments, analyses, and improvement. IEEE Trans. Knowl. Data Eng. 
32, 8 (2019), 1475\u20131488.","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"e_1_3_3_133_2","first-page":"287","volume-title":"Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval","author":"Li Yizhi","year":"2021","unstructured":"Yizhi Li, Zhenghao Liu, Chenyan Xiong, and Zhiyuan Liu. 2021. More robust dense retrieval with contrastive dual learning. In Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval. 287\u2013296."},{"key":"e_1_3_3_134_2","first-page":"6636","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics","author":"Li Yongqi","year":"2023","unstructured":"Yongqi Li, Nan Yang, Liang Wang, Furu Wei, and Wenjie Li. 2023. Multiview identifiers enhanced generative retrieval. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 6636\u20136648."},{"key":"e_1_3_3_135_2","unstructured":"Zehan Li Nan Yang Liang Wang and Furu Wei. 2022. Learning diverse document representations with deep query interactions for dense retrieval. Retrieved from https:\/\/arXiv:2208.04232"},{"key":"e_1_3_3_136_2","unstructured":"Davis Liang Peng Xu Siamak Shakeri Cicero Nogueira dos Santos Ramesh Nallapati Zhiheng Huang and Bing Xiang. 2020. Embedding-based zero-shot retrieval through query generation. Retrieved from https:\/\/arXiv:2009.10270"},{"key":"e_1_3_3_137_2","first-page":"2356","volume-title":"Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921)","author":"Lin Jimmy","year":"2021","unstructured":"Jimmy Lin, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, and Rodrigo Nogueira. 2021. Pyserini: A Python toolkit for reproducible information retrieval research with sparse and dense representations. 
In Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921). 2356\u20132362."},{"issue":"4","key":"e_1_3_3_138_2","first-page":"1","article-title":"Pretrained transformers for text ranking: BERT and beyond","volume":"14","author":"Lin Jimmy","year":"2021","unstructured":"Jimmy Lin, Rodrigo Nogueira, and Andrew Yates. 2021. Pretrained transformers for text ranking: BERT and beyond. Synth. Lect. Hum. Lang. Technol. 14, 4 (2021), 1\u2013325.","journal-title":"Synth. Lect. Hum. Lang. Technol."},{"key":"e_1_3_3_139_2","article-title":"A few brief notes on DeepImpact, COIL, and a conceptual framework for information retrieval techniques","author":"Lin Jimmy J.","year":"2021","unstructured":"Jimmy J. Lin and Xueguang Ma. 2021. A few brief notes on DeepImpact, COIL, and a conceptual framework for information retrieval techniques. Retrieved from https:\/\/abs\/2106.14807","journal-title":"Retrieved from"},{"key":"e_1_3_3_140_2","article-title":"Aggretriever: A simple approach to aggregate textual representation for robust dense passage retrieval","author":"Lin Sheng-Chieh","year":"2022","unstructured":"Sheng-Chieh Lin, Minghan Li, and Jimmy Lin. 2022. Aggretriever: A simple approach to aggregate textual representation for robust dense passage retrieval. Retrieved from https:\/\/abs\/2208.00511","journal-title":"Retrieved from"},{"key":"e_1_3_3_141_2","article-title":"Densifying sparse representations for passage retrieval by representational slicing","author":"Lin Sheng-Chieh","year":"2021","unstructured":"Sheng-Chieh Lin and Jimmy J. Lin. 2021. Densifying sparse representations for passage retrieval by representational slicing. Retrieved from https:\/\/abs\/2112.04666","journal-title":"Retrieved from"},{"key":"e_1_3_3_142_2","unstructured":"Sheng-Chieh Lin Jheng-Hong Yang and Jimmy Lin. 2020. Distilling dense representations for ranking using tightly-coupled teachers. 
Retrieved from https:\/\/arXiv:2010.11386"},{"key":"e_1_3_3_143_2","doi-asserted-by":"crossref","unstructured":"Zhenghao Lin Yeyun Gong Xiao Liu Hang Zhang Chen Lin Anlei Dong Jian Jiao Jingwen Lu Daxin Jiang Rangan Majumder et\u00a0al. 2023. PROD: Progressive distillation for dense retrieval. In Proceedings of the ACM Web Conference 2023. 3299\u20133308.","DOI":"10.1145\/3543507.3583421"},{"key":"e_1_3_3_144_2","unstructured":"Alexander Liu and Samuel Yang. 2022. Masked autoencoders as the unified learners for pre-trained sentence representation. Retrieved from https:\/\/abs\/2208.00231"},{"key":"e_1_3_3_145_2","unstructured":"Fangyu Liu Serhii Havrylov Yunlong Jiao Jordan Massiah and Emine Yilmaz. 2021. Trans-encoder: Unsupervised sentence-pair modelling through self-and mutual-distillations. Retrieved from https:\/\/arXiv:2109.13059"},{"key":"e_1_3_3_146_2","unstructured":"Jiawei Liu Yangyang Kang Di Tang Kaisong Song Changlong Sun Xiaofeng Wang Wei Lu and Xiaozhong Liu. 2022. Order-disorder: Imitation adversarial attacks for black-box neural ranking models. Retrieved from https:\/\/abs\/2209.06506"},{"key":"e_1_3_3_147_2","unstructured":"Jiduan Liu Jiahao Liu Yang Yang Jingang Wang Wei Wu Dongyan Zhao and Rui Yan. 2022. GNN-encoder: Learning a dual-encoder architecture via graph neural networks for passage retrieval. Retrieved from https:\/\/arXiv:2204.08241"},{"key":"e_1_3_3_148_2","article-title":"Challenges in generalization in open domain question answering","author":"Liu Linqing","year":"2021","unstructured":"Linqing Liu, Patrick Lewis, Sebastian Riedel, and Pontus Stenetorp. 2021. Challenges in generalization in open domain question answering. 
Retrieved from https:\/\/abs\/2109.01156","journal-title":"Retrieved from"},{"key":"e_1_3_3_149_2","article-title":"Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing","author":"Liu Pengfei","year":"2023","unstructured":"Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2023. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Comput. Surveys 55, 9 (2023), 1\u201335.","journal-title":"ACM Comput. Surveys"},{"key":"e_1_3_3_150_2","first-page":"904","volume-title":"Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201910)","author":"Liu Tie-Yan","year":"2010","unstructured":"Tie-Yan Liu. 2010. Learning to rank for information retrieval. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201910). 904."},{"key":"e_1_3_3_151_2","first-page":"188","volume-title":"Proceedings of the Association for Computational Linguistics (EMNLP\u201921)","author":"Liu Ye","year":"2021","unstructured":"Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, and Philip S. Yu. 2021. Dense hierarchical retrieval for open-domain question answering. In Proceedings of the Association for Computational Linguistics (EMNLP\u201921). 188\u2013200."},{"key":"e_1_3_3_152_2","article-title":"Pre-trained language model for web-scale retrieval in Baidu search","author":"Liu Yiding","year":"2021","unstructured":"Yiding Liu, Guan Huang, Jiaxiang Liu, Weixue Lu, Suqi Cheng, Yukun Li, Daiting Shi, Shuaiqiang Wang, Zhicong Cheng, and Dawei Yin. 2021. Pre-trained language model for web-scale retrieval in Baidu search. 
Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.","journal-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining"},{"key":"e_1_3_3_153_2","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. Retrieved from https:\/\/arXiv:1907.11692"},{"key":"e_1_3_3_154_2","article-title":"Que2Search: Fast and accurate query and document understanding for search at Facebook","author":"Liu Yiqun","year":"2021","unstructured":"Yiqun Liu, Kaushik Rangadurai, Yunzhong He, Siddarth Malreddy, Xunlong Gui, Xiaoyi Liu, and Fedor Borisyuk. 2021. Que2Search: Fast and accurate query and document understanding for search at Facebook. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.","journal-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining"},{"key":"e_1_3_3_155_2","unstructured":"Zheng Liu and Yingxia Shao. 2022. RetroMAE: Pre-training retrieval-oriented transformers via masked auto-encoder. Retrieved from https:\/\/abs\/2205.12035"},{"key":"e_1_3_3_156_2","first-page":"2531","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921)","author":"Liu Zhenghao","year":"2021","unstructured":"Zhenghao Liu, Kaitao Zhang, Chenyan Xiong, Zhiyuan Liu, and Maosong Sun. 2021. OpenMatch: An open source library for Neu-IR research. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921). 2531\u20132535."},{"key":"e_1_3_3_157_2","unstructured":"Jing Lu Gustavo Hern\u00e1ndez \u00c1brego Ji Ma Jianmo Ni and Yinfei Yang. 2020. Neural passage retrieval with improved negative contrast. 
Retrieved from https:\/\/abs\/2010.12523"},{"key":"e_1_3_3_158_2","first-page":"6091","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Lu Jing","year":"2021","unstructured":"Jing Lu, Gustavo Hern\u00e1ndez \u00c1brego, Ji Ma, Jianmo Ni, and Yinfei Yang. 2021. Multi-stage training with improved negative contrast for neural passage retrieval. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 6091\u20136103."},{"key":"e_1_3_3_159_2","first-page":"6227","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL\u201922)","author":"Lu Shuai","year":"2022","unstructured":"Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, and Alexey Svyatkovskiy. 2022. ReACC: A retrieval-augmented code completion framework. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL\u201922). 6227\u20136240."},{"key":"e_1_3_3_160_2","first-page":"2780","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)","author":"Lu Shuqi","year":"2021","unstructured":"Shuqi Lu, Di He, Chenyan Xiong, Guolin Ke, Waleed Malik, Zhicheng Dou, Paul Bennett, Tie-Yan Liu, and Arnold Overwijk. 2021. Less is more: Pretrain a strong siamese encoder for dense text retrieval using a weak decoder. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921). 2780\u20132791."},{"key":"e_1_3_3_161_2","unstructured":"Yuxiang Lu Yiding Liu Jiaxiang Liu Yunsheng Shi Zhengjie Huang Shikun Feng Yu Sun Hao Tian Hua Wu Shuaiqiang Wang Dawei Yin and Haifeng Wang. 2022. ERNIE-search: Bridging cross-encoder with dual-encoder via self on-the-fly distillation for dense passage retrieval. 
Retrieved from https:\/\/abs\/2205.09153"},{"key":"e_1_3_3_162_2","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1162\/tacl_a_00369","article-title":"Sparse, dense, and attentional representations for text retrieval","volume":"9","author":"Luan Yi","year":"2021","unstructured":"Yi Luan, Jacob Eisenstein, Kristina Toutanova, and Michael Collins. 2021. Sparse, dense, and attentional representations for text retrieval. Trans. Assoc. Comput. Linguist. 9 (2021), 329\u2013345.","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"e_1_3_3_163_2","first-page":"11038","volume-title":"Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI\u201922), 34th Conference on Innovative Applications of Artificial Intelligence (IAAI\u201922), and the 12th Symposium on Educational Advances in Artificial Intelligence (EAAI\u201922)","author":"Luo Man","year":"2022","unstructured":"Man Luo, Arindam Mitra, Tejas Gokhale, and Chitta Baral. 2022. Improving biomedical information retrieval with neural retrievers. In Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI\u201922), 34th Conference on Innovative Applications of Artificial Intelligence (IAAI\u201922), and the 12th Symposium on Educational Advances in Artificial Intelligence (EAAI\u201922). 11038\u201311046."},{"key":"e_1_3_3_164_2","first-page":"1075","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics","author":"Ma Ji","year":"2021","unstructured":"Ji Ma, Ivan Korotkov, Yinfei Yang, Keith Hall, and Ryan McDonald. 2021. Zero-shot neural passage retrieval via domain-targeted synthetic question generation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. 
1075\u20131088."},{"key":"e_1_3_3_165_2","first-page":"848","volume-title":"Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Ma Xinyu","year":"2022","unstructured":"Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, and Xueqi Cheng. 2022. Pre-train a discriminative text encoder for dense retrieval via contrastive span prediction. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 848\u2013858."},{"key":"e_1_3_3_166_2","first-page":"283","volume-title":"Proceedings of the 14th ACM International Conference on Web Search and Data Mining","author":"Ma Xinyu","year":"2021","unstructured":"Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xiang Ji, and Xueqi Cheng. 2021. PROP: Pre-training with representative words prediction for ad hoc retrieval. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining. 283\u2013291."},{"key":"e_1_3_3_167_2","first-page":"1318","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921)","author":"Ma Xinyu","year":"2021","unstructured":"Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Yingyan Li, and Xueqi Cheng. 2021. B-PROP: Bootstrapped pre-training with representative words prediction for ad hoc retrieval. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921). 1318\u20131327."},{"key":"e_1_3_3_168_2","unstructured":"Xinyu Ma Ruqing Zhang Jiafeng Guo Yixing Fan and Xueqi Cheng. 2022. A contrastive pre-training approach to learn discriminative autoencoder for dense retrieval. 
Retrieved from https:\/\/arXiv:2208.09846"},{"key":"e_1_3_3_169_2","first-page":"1212","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Ma Zhengyi","year":"2021","unstructured":"Zhengyi Ma, Zhicheng Dou, Wei Xu, Xinyu Zhang, Hao Jiang, Zhao Cao, and Ji-Rong Wen. 2021. Pre-training for ad hoc retrieval: Hyperlink is also you need. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 1212\u20131221."},{"key":"e_1_3_3_170_2","unstructured":"Sean MacAvaney Sergey Feldman Nazli Goharian Doug Downey and Arman Cohan. 2020. ABNIRML: Analyzing the behavior of neural IR models. Retrieved from https:\/\/arXiv:2011.00696"},{"key":"e_1_3_3_171_2","first-page":"1101","volume-title":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201919)","author":"MacAvaney Sean","year":"2019","unstructured":"Sean MacAvaney, Andrew Yates, Arman Cohan, and Nazli Goharian. 2019. CEDR: Contextualized embeddings for document ranking. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201919). 1101\u20131104."},{"key":"e_1_3_3_172_2","doi-asserted-by":"crossref","first-page":"3495","DOI":"10.1145\/3534678.3539164","volume-title":"Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD\u201922)","author":"Magnani Alessandro","year":"2022","unstructured":"Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, and Ciya Liao. 2022. Semantic retrieval at walmart. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD\u201922). 
3495\u20133503."},{"key":"e_1_3_3_173_2","first-page":"1941","volume-title":"Proceedings of the Web Conference (WWW\u201918)","author":"Maia Macedo","year":"2018","unstructured":"Macedo Maia, Siegfried Handschuh, Andr\u00e9 Freitas, Brian Davis, Ross McDermott, Manel Zarrouk, and Alexandra Balahur. 2018. WWW\u201918 open challenge: Financial opinion mining and question answering. In Proceedings of the Web Conference (WWW\u201918). 1941\u20131942."},{"key":"e_1_3_3_174_2","doi-asserted-by":"crossref","unstructured":"Jean Maillard Vladimir Karpukhin Fabio Petroni Wen-tau Yih Barlas O\u011fuz Veselin Stoyanov and Gargi Ghosh. 2021. Multi-task retrieval for knowledge-intensive tasks. Retrieved from https:\/\/arXiv:2101.00117","DOI":"10.18653\/v1\/2021.acl-long.89"},{"key":"e_1_3_3_175_2","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.is.2013.10.006","article-title":"Approximate nearest-neighbor algorithm based on navigable small world graphs","volume":"45","author":"Malkov Yury","year":"2014","unstructured":"Yury Malkov, Alexander Ponomarenko, Andrey Logvinov, and Vladimir Krylov. 2014. Approximate nearest-neighbor algorithm based on navigable small world graphs. Inf. Syst. 45 (2014), 61\u201368.","journal-title":"Inf. Syst."},{"issue":"4","key":"e_1_3_3_176_2","doi-asserted-by":"crossref","first-page":"824","DOI":"10.1109\/TPAMI.2018.2889473","article-title":"Efficient and robust approximate nearest-neighbor search using hierarchical navigable small world graphs","volume":"42","author":"Malkov Yu A.","year":"2018","unstructured":"Yu A. Malkov and Dmitry A. Yashunin. 2018. Efficient and robust approximate nearest-neighbor search using hierarchical navigable small world graphs. IEEE Trans. Pattern Anal. Mach. Intell. 42, 4 (2018), 824\u2013836.","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"e_1_3_3_177_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511809071","volume-title":"Introduction to Information Retrieval","author":"Manning Christopher D.","year":"2008","unstructured":"Christopher D. Manning, Prabhakar Raghavan, and Hinrich Sch\u00fctze. 2008. Introduction to Information Retrieval. Cambridge University Press, Cambridge, UK."},{"key":"e_1_3_3_178_2","unstructured":"Kelong Mao Zhicheng Dou Haonan Chen Fengran Mo and Hongjin Qian. 2023. Large language models know your contextual search intent: A prompting framework for conversational search. Retrieved from https:\/\/arXiv:2303.06573"},{"issue":"3","key":"e_1_3_3_179_2","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1145\/321033.321035","article-title":"On relevance, probabilistic indexing and information retrieval","volume":"7","author":"Maron Melvin Earl","year":"1960","unstructured":"Melvin Earl Maron and John Larry Kuhns. 1960. On relevance, probabilistic indexing and information retrieval. J. ACM 7, 3 (1960), 216\u2013244.","journal-title":"J. ACM"},{"issue":"2","key":"e_1_3_3_180_2","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1093\/comjnl\/20.2.116","article-title":"An inverted index implementation","volume":"20","author":"McDonell Ken J.","year":"1977","unstructured":"Ken J. McDonell. 1977. An inverted index implementation. Comput. J. 20, 2 (1977), 116\u2013123.","journal-title":"Comput. J."},{"key":"e_1_3_3_181_2","doi-asserted-by":"crossref","unstructured":"Donald Metzler Yi Tay Dara Bahri and Marc Najork. 2021. Rethinking search: Making experts out of dilettantes. 
Retrieved from https:\/\/abs\/2105.02274","DOI":"10.1145\/3476415.3476428"},{"key":"e_1_3_3_182_2","first-page":"3111","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems (NeurIPS\u201913)","author":"Mikolov Tom\u00e1s","year":"2013","unstructured":"Tom\u00e1s Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems (NeurIPS\u201913). 3111\u20133119."},{"key":"e_1_3_3_183_2","unstructured":"Bhaskar Mitra and Nick Craswell. 2017. Neural models for information retrieval. Retrieved from https:\/\/arXiv:1705.01509"},{"key":"e_1_3_3_184_2","doi-asserted-by":"crossref","first-page":"1291","DOI":"10.1145\/3038912.3052579","volume-title":"Proceedings of the 26th International Conference on World Wide Web (WWW\u201917)","author":"Mitra Bhaskar","year":"2017","unstructured":"Bhaskar Mitra, Fernando Diaz, and Nick Craswell. 2017. Learning to match using local and distributed representations of text for web search. In Proceedings of the 26th International Conference on World Wide Web (WWW\u201917). 1291\u20131299."},{"key":"e_1_3_3_185_2","article-title":"Text and code embeddings by contrastive pre-training","author":"Neelakantan Arvind","year":"2022","unstructured":"Arvind Neelakantan, Tao Xu, Raul Puri, Alec Radford, Jesse Michael Han, Jerry Tworek, Qiming Yuan, Nikolas A. Tezak, Jong Wook Kim, Chris Hallacy, Johannes Heidecke, Pranav Shyam, Boris Power, Tyna Eloundou Nekoul, Girish Sastry, Gretchen Krueger, David P. Schnurr, Felipe Petroski Such, Kenny Sai-Kin Hsu, Madeleine Thompson, Tabarak Khan, Toki Sherbakov, Joanne Jang, Peter Welinder, and Lilian Weng. 2022. Text and code embeddings by contrastive pre-training. 
Retrieved from https:\/\/abs\/2201.10005","journal-title":"Retrieved from"},{"key":"e_1_3_3_186_2","volume-title":"Proceedings of the Workshop on Cognitive Computation: Integrating Neural and Symbolic Approaches Co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS\u201916)","author":"Nguyen Tri","year":"2016","unstructured":"Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. MS MARCO: A human generated machine reading comprehension dataset. In Proceedings of the Workshop on Cognitive Computation: Integrating Neural and Symbolic Approaches Co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS\u201916)."},{"key":"e_1_3_3_187_2","article-title":"Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models","author":"Ni Jianmo","year":"2021","unstructured":"Jianmo Ni, Gustavo Hern\u00e1ndez \u00c1brego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Matthew Cer, and Yinfei Yang. 2021. Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models. Retrieved from https:\/\/abs\/2108.08877","journal-title":"Retrieved from"},{"key":"e_1_3_3_188_2","article-title":"Large dual encoders are generalizable retrievers","author":"Ni Jianmo","year":"2021","unstructured":"Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hern\u00e1ndez \u00c1brego, Ji Ma, Vincent Zhao, Yi Luan, Keith Hall, Ming-Wei Chang, and Yinfei Yang. 2021. Large dual encoders are generalizable retrievers. Retrieved from https:\/\/abs\/2112.07899","journal-title":"Retrieved from"},{"key":"e_1_3_3_189_2","unstructured":"Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage re-ranking with BERT. 
Retrieved from https:\/\/abs\/1901.04085"},{"key":"e_1_3_3_190_2","first-page":"708","volume-title":"Proceedings of the Association for Computational Linguistics (EMNLP\u201920)","author":"Nogueira Rodrigo","year":"2020","unstructured":"Rodrigo Nogueira, Zhiying Jiang, Ronak Pradeep, and Jimmy Lin. 2020. Document ranking with a pretrained sequence-to-sequence model. In Proceedings of the Association for Computational Linguistics (EMNLP\u201920). 708\u2013718."},{"key":"e_1_3_3_191_2","unstructured":"Rodrigo Nogueira Jimmy Lin and AI Epistemic. 2019. From doc2query to docTTTTTquery. Online Preprint 6 (2019) 2."},{"key":"e_1_3_3_192_2","unstructured":"Rodrigo Nogueira Wei Yang Kyunghyun Cho and Jimmy Lin. 2019. Multi-stage document ranking with BERT. Retrieved from https:\/\/abs\/1910.14424"},{"key":"e_1_3_3_193_2","unstructured":"Barlas Oguz Xilun Chen Vladimir Karpukhin Stan Peshterliev Dmytro Okhonko Michael Schlichtkrull Sonal Gupta Yashar Mehdad and Scott Yih. 2020. Unik-qa: Unified representations of structured and unstructured knowledge for open-domain question answering. Retrieved from https:\/\/arXiv:2012.14610"},{"key":"e_1_3_3_194_2","doi-asserted-by":"crossref","unstructured":"Barlas O\u011fuz Kushal Lakhotia Anchit Gupta Patrick Lewis Vladimir Karpukhin Aleksandra Piktus Xilun Chen Sebastian Riedel Wen-tau Yih Sonal Gupta et\u00a0al. 2021. Domain-matched pre-training tasks for dense retrieval. Retrieved from https:\/\/arXiv:2107.13602","DOI":"10.18653\/v1\/2022.findings-naacl.114"},{"key":"e_1_3_3_195_2","article-title":"SetRank: Learning a permutation-invariant ranking model for information retrieval","author":"Pang Liang","year":"2020","unstructured":"Liang Pang, Jun Xu, Qingyao Ai, Yanyan Lan, Xueqi Cheng, and Jirong Wen. 2020. SetRank: Learning a permutation-invariant ranking model for information retrieval. 
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval.","journal-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval"},{"key":"e_1_3_3_196_2","unstructured":"Baolin Peng Michel Galley Pengcheng He Hao Cheng Yujia Xie Yu Hu Qiuyuan Huang Lars Liden Zhou Yu Weizhu Chen et\u00a0al. 2023. Check your facts and try again: Improving large language models with external knowledge and automated feedback. Retrieved from https:\/\/arXiv:2302.12813"},{"key":"e_1_3_3_197_2","doi-asserted-by":"crossref","first-page":"397","DOI":"10.1007\/978-3-030-99736-6_27","volume-title":"Advances in Information Retrieval 44th European Conference on IR Research (ECIR\u201922)","author":"Penha Gustavo","year":"2022","unstructured":"Gustavo Penha, Arthur C\u00e2mara, and Claudia Hauff. 2022. Evaluating the robustness of retrieval pipelines with query variation generators. In Advances in Information Retrieval 44th European Conference on IR Research (ECIR\u201922). 397\u2013412."},{"key":"e_1_3_3_198_2","first-page":"2227","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Peters Matthew E.","year":"2018","unstructured":"Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 
2227\u20132237."},{"key":"e_1_3_3_199_2","first-page":"2523","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Petroni Fabio","year":"2021","unstructured":"Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rockt\u00e4schel, and Sebastian Riedel. 2021. KILT: A benchmark for knowledge intensive language tasks. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2523\u20132544."},{"key":"e_1_3_3_200_2","first-page":"2463","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919)","author":"Petroni Fabio","year":"2019","unstructured":"Fabio Petroni, Tim Rockt\u00e4schel, Sebastian Riedel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, and Alexander Miller. 2019. Language models as knowledge bases? In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919). 2463\u20132473."},{"key":"e_1_3_3_201_2","unstructured":"Aleksandra Piktus Fabio Petroni Vladimir Karpukhin Dmytro Okhonko Samuel Broscheit Gautier Izacard Patrick Lewis Barlas Oguz Edouard Grave Wen-tau Yih and Sebastian Riedel. 2021. The web is your oyster\u2014Knowledge-intensive NLP against a very large web corpus. 
Retrieved from https:\/\/abs\/2112.09924"},{"key":"e_1_3_3_202_2","doi-asserted-by":"crossref","first-page":"4996","DOI":"10.18653\/v1\/P19-1493","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Pires Telmo","year":"2019","unstructured":"Telmo Pires, Eva Schlinger, and Dan Garrette. 2019. How multilingual is multilingual BERT? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 4996\u20135001."},{"key":"e_1_3_3_203_2","doi-asserted-by":"crossref","first-page":"1728","DOI":"10.1145\/3404835.3463106","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Prakash Prafull","year":"2021","unstructured":"Prafull Prakash, Julian Killingback, and Hamed Zamani. 2021. Learning robust dense retrieval models from incomplete relevance labels. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1728\u20131732."},{"key":"e_1_3_3_204_2","article-title":"Understanding the behaviors of BERT in ranking","author":"Qiao Yifan","year":"2019","unstructured":"Yifan Qiao, Chenyan Xiong, Zhenghao Liu, and Zhiyuan Liu. 2019. Understanding the behaviors of BERT in ranking. Retrieved from https:\/\/abs\/1904.07531","journal-title":"Retrieved from"},{"key":"e_1_3_3_205_2","unstructured":"Yifan Qiao Chenyan Xiong Zhenghao Liu and Zhiyuan Liu. 2019. Understanding the behaviors of BERT in ranking. Retrieved from https:\/\/abs\/1904.07531"},{"issue":"4","key":"e_1_3_3_206_2","first-page":"60:1\u201360:34","article-title":"Sponsored search auctions: Recent advances and future directions","volume":"5","author":"Qin Tao","year":"2014","unstructured":"Tao Qin, Wei Chen, and Tie-Yan Liu. 2014. Sponsored search auctions: Recent advances and future directions. ACM Trans. Intell. Syst. Technol. 5, 4 (2014), 60:1\u201360:34.","journal-title":"ACM Trans. 
Intell. Syst. Technol."},{"key":"e_1_3_3_207_2","doi-asserted-by":"crossref","unstructured":"Yifu Qiu Hongyu Li Yingqi Qu Ying Chen Qiaoqiao She Jing Liu Hua Wu and Haifeng Wang. 2022. DuReader_retrieval: A large-scale Chinese benchmark for passage retrieval from web search engine. Retrieved from https:\/\/arXiv:2203.10232","DOI":"10.18653\/v1\/2022.emnlp-main.357"},{"key":"e_1_3_3_208_2","first-page":"5835","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Qu Yingqi","year":"2021","unstructured":"Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, Daxiang Dong, Hua Wu, and Haifeng Wang. 2021. RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 5835\u20135847."},{"key":"e_1_3_3_209_2","first-page":"8748","volume-title":"Proceedings of the 38th International Conference on Machine Learning (ICML\u201921)","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the 38th International Conference on Machine Learning (ICML\u201921). 8748\u20138763."},{"key":"e_1_3_3_210_2","unstructured":"Alec Radford and Karthik Narasimhan. 2018. Improving language understanding by generative pre-training. Online preprint (2018)."},{"key":"e_1_3_3_211_2","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam M. 
Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. Retrieved from https:\/\/abs\/1910.10683","journal-title":"Retrieved from"},{"key":"e_1_3_3_212_2","first-page":"2383","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Rajpurkar Pranav","year":"2016","unstructured":"Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2383\u20132392."},{"key":"e_1_3_3_213_2","doi-asserted-by":"crossref","unstructured":"Ori Ram Yoav Levine Itay Dalmedigos Dor Muhlgay Amnon Shashua Kevin Leyton-Brown and Yoav Shoham. 2023. In-context retrieval-augmented language models. Retrieved from https:\/\/arXiv:2302.00083","DOI":"10.1162\/tacl_a_00605"},{"key":"e_1_3_3_214_2","article-title":"Learning to retrieve passages without supervision","author":"Ram Ori","year":"2021","unstructured":"Ori Ram, Gal Shachaf, Omer Levy, Jonathan Berant, and Amir Globerson. 2021. Learning to retrieve passages without supervision. Retrieved from https:\/\/abs\/2112.07708","journal-title":"Retrieved from"},{"key":"e_1_3_3_215_2","unstructured":"Revanth Gangi Reddy Vikas Yadav Md Arafat Sultan Martin Franz Vittorio Castelli Heng Ji and Avirup Sil. 2021. Towards robust neural retrieval models with synthetic pre-training. Retrieved from https:\/\/arXiv:2104.07800"},{"key":"e_1_3_3_216_2","first-page":"3982","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919)","author":"Reimers Nils","year":"2019","unstructured":"Nils Reimers and Iryna Gurevych. 2019. 
Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919). 3982\u20133992."},{"key":"e_1_3_3_217_2","first-page":"3982","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919)","author":"Reimers Nils","year":"2019","unstructured":"Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-networks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919). 3982\u20133992."},{"key":"e_1_3_3_218_2","first-page":"605","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Reimers Nils","year":"2021","unstructured":"Nils Reimers and Iryna Gurevych. 2021. The curse of dense low-dimensional information retrieval for large index sizes. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 605\u2013611."},{"key":"e_1_3_3_219_2","first-page":"2173","volume-title":"Proceedings of the Association for Computational Linguistics (ACL-IJCNLP\u201921)","author":"Ren Ruiyang","year":"2021","unstructured":"Ruiyang Ren, Shangwen Lv, Yingqi Qu, Jing Liu, Wayne Xin Zhao, QiaoQiao She, Hua Wu, Haifeng Wang, and Ji-Rong Wen. 2021. PAIR: Leveraging passage-centric similarity relation for improving dense passage retrieval. In Proceedings of the Association for Computational Linguistics (ACL-IJCNLP\u201921). 
2173\u20132183."},{"key":"e_1_3_3_220_2","first-page":"2825","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Ren Ruiyang","year":"2021","unstructured":"Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang, and Ji-Rong Wen. 2021. RocketQAv2: A joint training method for dense passage retrieval and passage re-ranking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2825\u20132835."},{"key":"e_1_3_3_221_2","volume-title":"Proceedings of the Association for Computational Linguistics (EMNLP\u201923)","author":"Ren Ruiyang","year":"2023","unstructured":"Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qifei Wu, Yuchen Ding, Hua Wu, Haifeng Wang, and Ji-Rong Wen. 2023. A thorough examination on zero-shot dense retrieval. In Proceedings of the Association for Computational Linguistics (EMNLP\u201923)."},{"key":"e_1_3_3_222_2","unstructured":"Ruiyang Ren Yuhao Wang Yingqi Qu Wayne Xin Zhao Jing Liu Hao Tian Hua Wu Ji-Rong Wen and Haifeng Wang. 2023. Investigating the factual knowledge boundary of large language models with retrieval augmentation. Retrieved from https:\/\/arXiv:2307.11019"},{"key":"e_1_3_3_223_2","first-page":"6102","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics","author":"Ren Ruiyang","year":"2023","unstructured":"Ruiyang Ren, Wayne Xin Zhao, Jing Liu, Hua Wu, Ji-Rong Wen, and Haifeng Wang. 2023. TOME: A two-stage approach for model-based retrieval. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 6102\u20136114."},{"key":"e_1_3_3_224_2","first-page":"4714","volume-title":"Proceedings of the 29th International Conference on Computational Linguistics (COLING\u201922)","author":"Rezagholizadeh Mehdi","year":"2022","unstructured":"Mehdi Rezagholizadeh, Aref Jafari, Puneeth S. M. 
Saladi, Pranav Sharma, Ali Saheb Pasand, and Ali Ghodsi. 2022. Pro-KD: Progressive distillation by following the footsteps of the teacher. In Proceedings of the 29th International Conference on Computational Linguistics (COLING\u201922). 4714\u20134727."},{"issue":"9","key":"e_1_3_3_225_2","doi-asserted-by":"crossref","first-page":"1431","DOI":"10.1093\/jamia\/ocaa091","article-title":"TREC-COVID: Rationale and structure of an information retrieval shared task for COVID-19","volume":"27","author":"Roberts Kirk","year":"2020","unstructured":"Kirk Roberts, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, Kyle Lo, Ian Soboroff, Ellen Voorhees, Lucy Lu Wang, and William R. Hersh. 2020. TREC-COVID: Rationale and structure of an information retrieval shared task for COVID-19. J. Amer. Med. Info. Assoc. 27, 9, 1431\u20131436.","journal-title":"J. Amer. Med. Info. Assoc."},{"key":"e_1_3_3_226_2","doi-asserted-by":"crossref","unstructured":"Stephen Robertson. 2004. Understanding inverse document frequency: on theoretical arguments for IDF. Journal of documentation 60 5 (2004) 503\u2013520.","DOI":"10.1108\/00220410410560582"},{"key":"e_1_3_3_227_2","volume-title":"The Probabilistic Relevance Framework: BM25 and Beyond","author":"Robertson Stephen","year":"2009","unstructured":"Stephen Robertson and Hugo Zaragoza. 2009. The Probabilistic Relevance Framework: BM25 and Beyond. Now Publishers Inc., Hanover, MD."},{"key":"e_1_3_3_228_2","first-page":"109","article-title":"Okapi at TREC-3","volume":"109","author":"Robertson Stephen E.","year":"1995","unstructured":"Stephen E. Robertson, Steve Walker, Susan Jones, Micheline M. Hancock-Beaulieu, Mike Gatford et\u00a0al. 1995. Okapi at TREC-3. Nist Special Publ. 
109 (1995), 109.","journal-title":"Nist Special Publ."},{"key":"e_1_3_3_229_2","first-page":"981","volume-title":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201919)","author":"Rosset Corby","year":"2019","unstructured":"Corby Rosset, Bhaskar Mitra, Chenyan Xiong, Nick Craswell, Xia Song, and Saurabh Tiwary. 2019. An axiomatic approach to regularizing neural ranking models. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201919), Benjamin Piwowarski, Max Chevalier, \u00c9ric Gaussier, Yoelle Maarek, Jian-Yun Nie, and Falk Scholer (Eds.). ACM, 981\u2013984."},{"key":"e_1_3_3_230_2","first-page":"6648","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Sachan Devendra","year":"2021","unstructured":"Devendra Sachan, Mostofa Patwary, Mohammad Shoeybi, Neel Kant, Wei Ping, William L. Hamilton, and Bryan Catanzaro. 2021. End-to-end training of neural retrievers for open-domain question answering. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 6648\u20136662."},{"key":"e_1_3_3_231_2","unstructured":"Devendra Singh Sachan Mike Lewis Dani Yogatama Luke Zettlemoyer Joelle Pineau and Manzil Zaheer. 2022. Questions are all you need to train a dense passage retriever. Retrieved from https:\/\/arXiv:2206.10658"},{"key":"e_1_3_3_232_2","first-page":"234","volume-title":"Proceedings of the Computer Conference of the American Federation of Information Processing Societies (AFIPS\u201962)","author":"Salton Gerard","year":"1962","unstructured":"Gerard Salton. 1962. Some experiments in the generation of word and document associations. 
In Proceedings of the Computer Conference of the American Federation of Information Processing Societies (AFIPS\u201962). 234\u2013250."},{"key":"e_1_3_3_233_2","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1016\/0306-4573(88)90021-0","article-title":"Term-weighting approaches in automatic text retrieval","volume":"24","author":"Salton Gerard","year":"1988","unstructured":"Gerard Salton and Chris Buckley. 1988. Term-weighting approaches in automatic text retrieval. Inf. Process. Manag. 24 (1988), 513\u2013523.","journal-title":"Inf. Process. Manag."},{"issue":"11","key":"e_1_3_3_234_2","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1145\/361219.361220","article-title":"A vector space model for automatic indexing","volume":"18","author":"Salton Gerard","year":"1975","unstructured":"Gerard Salton, Anita Wong, and Chung-Shu Yang. 1975. A vector space model for automatic indexing. Commun. ACM 18, 11 (1975), 613\u2013620.","journal-title":"Commun. ACM"},{"issue":"4","key":"e_1_3_3_235_2","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1561\/1500000009","article-title":"Test collection based evaluation of information retrieval systems","volume":"4","author":"Sanderson Mark","year":"2010","unstructured":"Mark Sanderson et\u00a0al. 2010. Test collection based evaluation of information retrieval systems. Foundations and Trends in Information Retrieval 4, 4 (2010), 247\u2013375.","journal-title":"Foundations and Trends in Information Retrieval"},{"key":"e_1_3_3_236_2","doi-asserted-by":"crossref","unstructured":"Keshav Santhanam Omar Khattab Jon Saad-Falcon Christopher Potts and Matei Zaharia. 2021. ColBERTv2: Effective and efficient retrieval via lightweight late interaction. 
Retrieved from https:\/\/arXiv:2112.01488","DOI":"10.18653\/v1\/2022.naacl-main.272"},{"key":"e_1_3_3_237_2","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)","author":"Sciavolino Christopher","year":"2021","unstructured":"Christopher Sciavolino, Zexuan Zhong, Jinhyuk Lee, and Danqi Chen. 2021. Simple entity-centric questions challenge dense retrievers. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)."},{"key":"e_1_3_3_238_2","first-page":"559","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Seo Minjoon","year":"2018","unstructured":"Minjoon Seo, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, and Hannaneh Hajishirzi. 2018. Phrase-indexed question answering: A new challenge for scalable document comprehension. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 559\u2013564."},{"key":"e_1_3_3_239_2","first-page":"4430","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Seo Minjoon","year":"2019","unstructured":"Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, and Hannaneh Hajishirzi. 2019. Real-time open-domain question answering with dense-sparse phrase index. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 4430\u20134441."},{"key":"e_1_3_3_240_2","first-page":"5445","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920)","author":"Shakeri Siamak","year":"2020","unstructured":"Siamak Shakeri, Cicero Nogueira dos Santos, Henghui Zhu, Patrick Ng, Feng Nan, Zhiguo Wang, Ramesh Nallapati, and Bing Xiang. 2020. End-to-end synthetic data generation for domain adaptation of question answering systems. 
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920). 5445\u20135460."},{"key":"e_1_3_3_241_2","unstructured":"Tao Shen Xiubo Geng Chongyang Tao Can Xu Xiaolong Huang Binxing Jiao Linjun Yang and Daxin Jiang. 2022. LexMAE: Lexicon-bottlenecked pretraining for large-scale retrieval. Retrieved from https:\/\/abs\/2208.14754"},{"key":"e_1_3_3_242_2","unstructured":"Xiaoyu Shen Svitlana Vakulenko Marco Del Tredici Gianni Barlacchi Bill Byrne and Adri\u00e0 de Gispert. 2022. Low-resource dense retrieval for open-domain question answering: A comprehensive survey. Retrieved from https:\/\/abs\/2208.03197"},{"key":"e_1_3_3_243_2","first-page":"2321","volume-title":"Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems","author":"Shrivastava Anshumali","year":"2014","unstructured":"Anshumali Shrivastava and Ping Li. 2014. Asymmetric LSH (ALSH) for sublinear time maximum inner product search (MIPS). In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems. 2321\u20132329."},{"key":"e_1_3_3_244_2","first-page":"2132","volume-title":"Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201922)","author":"Sidiropoulos Georgios","year":"2022","unstructured":"Georgios Sidiropoulos and Evangelos Kanoulas. 2022. Analysing the robustness of dual encoders for dense retrieval against misspellings. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201922). 2132\u20132136."},{"key":"e_1_3_3_245_2","article-title":"Results of the NeurIPS\u201921 challenge on billion-scale approximate nearest-neighbor search","author":"Simhadri Harsha Vardhan","year":"2022","unstructured":"Harsha Vardhan Simhadri, G. R. 
Williams, Martin Aum\u00fcller, Matthijs Douze, Artem Babenko, Dmitry Baranchuk, Qi Chen, Lucas Hosseini, Ravishankar Krishnaswamy, Gopal Srinivasa, Suhas Jayaram Subramanya, and Jingdong Wang. 2022. Results of the NeurIPS\u201921 challenge on billion-scale approximate nearest-neighbor search. Retrieved from https:\/\/abs\/2205.03763","journal-title":"Retrieved from"},{"key":"e_1_3_3_246_2","first-page":"1470","article-title":"Video Google: A text-retrieval approach to object matching in videos","author":"Sivic Josef","year":"2003","unstructured":"Josef Sivic and Andrew Zisserman. 2003. Video Google: A text-retrieval approach to object matching in videos. Proceedings of the 9th IEEE International Conference on Computer Vision. 1470\u20131477.","journal-title":"Proceedings of the 9th IEEE International Conference on Computer Vision"},{"key":"e_1_3_3_247_2","volume-title":"Proceedings of the Text Retrieval Conference (TREC\u201919)","author":"Soboroff Ian","year":"2019","unstructured":"Ian Soboroff, Shudong Huang, and Donna Harman. 2019. TREC 2019 news track overview. In Proceedings of the Text Retrieval Conference (TREC\u201919)."},{"key":"e_1_3_3_248_2","first-page":"801","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems","author":"Socher Richard","year":"2011","unstructured":"Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. 2011. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems. 801\u2013809."},{"key":"e_1_3_3_249_2","first-page":"151","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Socher Richard","year":"2011","unstructured":"Richard Socher, Jeffrey Pennington, Eric H. Huang, Andrew Y. 
Ng, and Christopher D. Manning. 2011. Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 151\u2013161."},{"key":"e_1_3_3_250_2","first-page":"780","volume-title":"Proceedings of the European Conference on Information Retrieval","author":"Suarez Axel","year":"2018","unstructured":"Axel Suarez, Dyaa Albakour, David Corney, Miguel Martinez, and Jos\u00e9 Esquivel. 2018. A data collection for evaluating the retrieval of related tweets to news articles. In Proceedings of the European Conference on Information Retrieval. Springer, 780\u2013786."},{"key":"e_1_3_3_251_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems (NeurIPS\u201919)","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnaswamy, and Rohan Kadekodi. 2019. DiskANN: Fast accurate billion-point nearest-neighbor search on a single node. In Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems (NeurIPS\u201919)."},{"key":"e_1_3_3_252_2","doi-asserted-by":"crossref","unstructured":"Weiwei Sun Lingyong Yan Xinyu Ma Pengjie Ren Dawei Yin and Zhaochun Ren. 2023. Is ChatGPT good at search? Investigating large language models as re-ranking agent. Retrieved from https:\/\/arXiv:2304.09542","DOI":"10.18653\/v1\/2023.emnlp-main.923"},{"key":"e_1_3_3_253_2","doi-asserted-by":"crossref","unstructured":"Yutao Sun Li Dong Barun Patra Shuming Ma Shaohan Huang Alon Benhaim Vishrav Chaudhary Xia Song and Furu Wei. 2022. A length-extrapolatable transformer. 
Retrieved from https:\/\/arXiv:2212.10554","DOI":"10.18653\/v1\/2023.acl-long.816"},{"key":"e_1_3_3_254_2","first-page":"8968","volume-title":"Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI\u201920), the 32nd Innovative Applications of Artificial Intelligence Conference (IAAI\u201920), the 10th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI\u201920)","author":"Sun Yu","year":"2020","unstructured":"Yu Sun, Shuohuan Wang, Yu-Kun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang. 2020. ERNIE 2.0: A continual pre-training framework for language understanding. In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI\u201920), the 32nd Innovative Applications of Artificial Intelligence Conference (IAAI\u201920), the 10th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI\u201920). 8968\u20138975."},{"key":"e_1_3_3_255_2","article-title":"Parameter-efficient prompt tuning makes generalized and calibrated neural text retrievers","author":"Tam Weng Lam","year":"2022","unstructured":"Weng Lam Tam, Xiao Liu, Kaixuan Ji, Lilong Xue, Xing Zhang, Yuxiao Dong, Jiahua Liu, Maodi Hu, and Jie Tang. 2022. Parameter-efficient prompt tuning makes generalized and calibrated neural text retrievers. Retrieved from https:\/\/abs\/2207.07087","journal-title":"Retrieved from"},{"key":"e_1_3_3_256_2","unstructured":"Alexandre Tamborrino. [n. d.]. Introducing Natural Language Search for Podcast Episodes. https:\/\/engineering.atspotify.com\/2022\/03\/introducing-natural-language-search-for-podcast-episodes\/."},{"key":"e_1_3_3_257_2","first-page":"5054","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Tang Hongyin","year":"2021","unstructured":"Hongyin Tang, Xingwu Sun, Beihong Jin, Jingang Wang, Fuzheng Zhang, and Wei Wu. 2021. 
Improving document representations by generating pseudo query embeddings for dense retrieval. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 5054\u20135064."},{"key":"e_1_3_3_258_2","unstructured":"Zhengyang Tang Benyou Wang and Ting Yao. 2022. DPTDR: Deep prompt tuning for dense passage retrieval. Retrieved from https:\/\/arXiv:2208.11503"},{"key":"e_1_3_3_259_2","article-title":"Transformer memory as a differentiable search index","author":"Tay Yi","year":"2022","unstructured":"Yi Tay, Vinh Quang Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, and Donald Metzler. 2022. Transformer memory as a differentiable search index. Retrieved from https:\/\/abs\/2202.06991","journal-title":"Retrieved from"},{"key":"e_1_3_3_260_2","first-page":"296","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Thakur Nandan","year":"2021","unstructured":"Nandan Thakur, Nils Reimers, Johannes Daxenberger, and Iryna Gurevych. 2021. Augmented SBERT: Data augmentation method for improving bi-encoders for pairwise sentence scoring tasks. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 296\u2013310."},{"key":"e_1_3_3_261_2","unstructured":"Nandan Thakur Nils Reimers Andreas R\u00fcckl\u00e9 Abhishek Srivastava and Iryna Gurevych. 2021. BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models. 
Retrieved from https:\/\/arXiv:2104.08663"},{"key":"e_1_3_3_262_2","first-page":"809","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Thorne James","year":"2018","unstructured":"James Thorne, Andreas Vlachos, Christos Christodoulopoulos, and Arpit Mittal. 2018. FEVER: A large-scale dataset for fact extraction and VERification. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 809\u2013819."},{"key":"e_1_3_3_263_2","first-page":"1","volume-title":"Proceedings of the IEEE Information Theory Workshop (ITW\u201915)","author":"Tishby Naftali","year":"2015","unstructured":"Naftali Tishby and Noga Zaslavsky. 2015. Deep learning and the information bottleneck principle. In Proceedings of the IEEE Information Theory Workshop (ITW\u201915). 1\u20135."},{"key":"e_1_3_3_264_2","unstructured":"Nicola Tonellotto. 2022. Lecture notes on neural information retrieval. Retrieved from https:\/\/abs\/2207.13443"},{"key":"e_1_3_3_265_2","first-page":"3453","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Tonellotto Nicola","year":"2021","unstructured":"Nicola Tonellotto and Craig Macdonald. 2021. Query embedding pruning for dense retrieval. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 3453\u20133457."},{"issue":"1","key":"e_1_3_3_266_2","first-page":"1","article-title":"An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition","volume":"16","author":"Tsatsaronis George","year":"2015","unstructured":"George Tsatsaronis, Georgios Balikas, Prodromos Malakasiotis, Ioannis Partalas, Matthias Zschunke, Michael R. Alvers, Dirk Weissenborn, Anastasia Krithara, Sergios Petridis, Dimitris Polychronopoulos et\u00a0al. 2015. 
An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition. BMC Bioinform. 16, 1 (2015), 1\u201328.","journal-title":"BMC Bioinform."},{"key":"e_1_3_3_267_2","unstructured":"A\u00e4ron van den Oord Yazhe Li and Oriol Vinyals. 2018. Representation learning with contrastive predictive coding. Retrieved from https:\/\/abs\/1807.03748 (2018)."},{"key":"e_1_3_3_268_2","first-page":"1281","volume-title":"Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201914)","author":"Vargas Sa\u00fal","year":"2014","unstructured":"Sa\u00fal Vargas. 2014. Novelty and diversity enhancement and evaluation in recommender systems and information retrieval. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201914). 1281."},{"key":"e_1_3_3_269_2","first-page":"5998","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems. 5998\u20136008."},{"key":"e_1_3_3_270_2","first-page":"13","volume-title":"Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval","author":"V\u00f6lske Michael","year":"2021","unstructured":"Michael V\u00f6lske, Alexander Bondarenko, Maik Fr\u00f6be, Benno Stein, Jaspreet Singh, Matthias Hagen, and Avishek Anand. 2021. Towards axiomatic explanations for neural ranking models. In Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval. 
13\u201322."},{"key":"e_1_3_3_271_2","first-page":"77","volume-title":"Proceedings of the Text Retrieval Conference (TREC\u201999)","volume":"99","author":"Voorhees Ellen M.","year":"1999","unstructured":"Ellen M. Voorhees et\u00a0al. 1999. The trec-8 question answering track report. In Proceedings of the Text Retrieval Conference (TREC\u201999), Vol. 99. 77\u201382."},{"key":"e_1_3_3_272_2","first-page":"241","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics","author":"Wachsmuth Henning","year":"2018","unstructured":"Henning Wachsmuth, Shahbaz Syed, and Benno Stein. 2018. Retrieval of the best counterargument without prior topic knowledge. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 241\u2013251."},{"key":"e_1_3_3_273_2","first-page":"7534","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920)","author":"Wadden David","year":"2020","unstructured":"David Wadden, Shanchuan Lin, Kyle Lo, Lucy Lu Wang, Madeleine van Zuylen, Arman Cohan, and Hannaneh Hajishirzi. 2020. Fact or fiction: Verifying scientific claims. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920). 7534\u20137550."},{"key":"e_1_3_3_274_2","doi-asserted-by":"crossref","unstructured":"Jiexin Wang Adam Jatowt and Masatoshi Yoshikawa. 2021. ArchivalQA: A large-scale benchmark dataset for open domain question answering over archival news collections. Retrieved from https:\/\/arXiv:2109.03438","DOI":"10.1145\/3477495.3531734"},{"key":"e_1_3_3_275_2","doi-asserted-by":"crossref","DOI":"10.1145\/2393347.2393378","article-title":"Query-driven iterated neighborhood graph search for large scale indexing","author":"Wang Jingdong","year":"2012","unstructured":"Jingdong Wang and Shipeng Li. 2012. Query-driven iterated neighborhood graph search for large scale indexing. 
Proceedings of the 20th ACM International Conference on Multimedia.","journal-title":"Proceedings of the 20th ACM International Conference on Multimedia"},{"key":"e_1_3_3_276_2","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1109\/TPAMI.2017.2699960","article-title":"A survey on learning to hash","volume":"40","author":"Wang Jingdong","year":"2018","unstructured":"Jingdong Wang, Ting Zhang, Jingkuan Song, N. Sebe, and Heng Tao Shen. 2018. A survey on learning to hash. IEEE Trans. Pattern Anal. Mach. Intell. 40 (2018), 769\u2013790.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_3_3_277_2","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)","author":"Wang Kexin","year":"2021","unstructured":"Kexin Wang, Nils Reimers, and Iryna Gurevych. 2021. TSDAE: Using transformer-based sequential denoising auto-encoder for unsupervised sentence embedding learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)."},{"key":"e_1_3_3_278_2","doi-asserted-by":"crossref","unstructured":"Kexin Wang Nandan Thakur Nils Reimers and Iryna Gurevych. 2021. GPL: Generative pseudo labeling for unsupervised domain adaptation of dense retrieval. Retrieved from https:\/\/arXiv:2112.07577","DOI":"10.18653\/v1\/2022.naacl-main.168"},{"key":"e_1_3_3_279_2","unstructured":"Kaiye Wang Qiyue Yin Wei Wang Shu Wu and Liang Wang. 2016. A comprehensive survey on cross-modal retrieval. Retrieved from https:\/\/abs\/1607.06215 (2016)."},{"key":"e_1_3_3_280_2","doi-asserted-by":"crossref","unstructured":"Liang Wang Nan Yang Xiaolong Huang Binxing Jiao Linjun Yang Daxin Jiang Rangan Majumder and Furu Wei. 2022. SimLM: Pre-training with representation bottleneck for dense passage retrieval. 
Retrieved from https:\/\/abs\/2207.02578.","DOI":"10.18653\/v1\/2023.acl-long.125"},{"key":"e_1_3_3_281_2","doi-asserted-by":"crossref","unstructured":"Xiao Wang Craig Macdonald Nicola Tonellotto and Iadh Ounis. 2021. Pseudo-relevance feedback for multiple representation dense retrieval. Retrieved from https:\/\/arXiv:2106.11251","DOI":"10.1145\/3471158.3472250"},{"key":"e_1_3_3_282_2","first-page":"297","volume-title":"Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR\u201921)","author":"Wang Xiao","year":"2021","unstructured":"Xiao Wang, Craig Macdonald, Nicola Tonellotto, and Iadh Ounis. 2021. Pseudo-relevance feedback for multiple representation dense retrieval. In Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR\u201921). 297\u2013306."},{"key":"e_1_3_3_283_2","unstructured":"Yujing Wang Yingyan Hou Haonan Wang Ziming Miao Shibin Wu Hao Sun Qi Chen Yuqing Xia Chengmin Chi Guoshuai Zhao Zheng Liu Xing Xie Hao Allen Sun Weiwei Deng Qi Zhang and Mao Yang. 2022. A neural corpus indexer for document retrieval. Retrieved from https:\/\/abs\/2206.02743"},{"key":"e_1_3_3_284_2","first-page":"115","volume-title":"Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR\u201922)","author":"Wang Yumeng","year":"2022","unstructured":"Yumeng Wang, Lijun Lyu, and Avishek Anand. 2022. BERT rankers are brittle: A study using adversarial document perturbations. In Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR\u201922). 
115\u2013120."},{"key":"e_1_3_3_285_2","first-page":"5878","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919)","author":"Wang Zhiguo","year":"2019","unstructured":"Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, and Bing Xiang. 2019. Multi-passage BERT: A globally normalized BERT model for open-domain question answering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919). 5878\u20135882."},{"key":"e_1_3_3_286_2","article-title":"Finetuned language models are zero-shot learners","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Maarten Bosma, Vincent Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, and Quoc V. Le. 2022. Finetuned language models are zero-shot learners. Retrieved from https:\/\/abs\/2109.01652","journal-title":"Retrieved from"},{"key":"e_1_3_3_287_2","first-page":"6397","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920)","author":"Wu Ledell","year":"2020","unstructured":"Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Scalable zero-shot entity linking with dense entity retrieval. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201920). 6397\u20136407."},{"key":"e_1_3_3_288_2","unstructured":"Xing Wu Guangyuan Ma Meng Lin Zijia Lin Zhongyuan Wang and Songlin Hu. 2022. ConTextual mask auto-encoder for dense passage retrieval. 
Retrieved from https:\/\/arXiv:2208.07670"},{"key":"e_1_3_3_289_2","first-page":"55","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Xiong Chenyan","year":"2017","unstructured":"Chenyan Xiong, Zhuyun Dai, Jamie Callan, Zhiyuan Liu, and Russell Power. 2017. End-to-end neural ad hoc ranking with kernel pooling. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 55\u201364."},{"key":"e_1_3_3_290_2","volume-title":"Proceedings of the 9th International Conference on Learning Representations (ICLR\u201921)","author":"Xiong Lee","year":"2021","unstructured":"Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, and Arnold Overwijk. 2021. Approximate nearest-neighbor negative contrastive learning for dense text retrieval. In Proceedings of the 9th International Conference on Learning Representations (ICLR\u201921)."},{"key":"e_1_3_3_291_2","unstructured":"Canwen Xu Daya Guo Nan Duan and Julian McAuley. 2022. LaPraDoR: Unsupervised pretrained dense retriever for zero-shot text retrieval. Retrieved from https:\/\/arXiv:2203.06169"},{"key":"e_1_3_3_292_2","unstructured":"Shicheng Xu Liang Pang Huawei Shen Xueqi Cheng and Tat-seng Chua. 2023. Search-in-the-chain: Towards the accurate credible and traceable content generation for complex knowledge-intensive tasks. Retrieved from https:\/\/arXiv:2304.14732"},{"key":"e_1_3_3_293_2","first-page":"979","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Yamada Ikuya","year":"2021","unstructured":"Ikuya Yamada, Akari Asai, and Hannaneh Hajishirzi. 2021. Efficient passage retrieval with hashing for open-domain question answering. 
In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 979\u2013986."},{"key":"e_1_3_3_294_2","doi-asserted-by":"crossref","unstructured":"Ming Yan Chenliang Li Chen Wu Bin Bi Wei Wang Jiangnan Xia and Luo Si. 2019. IDST at TREC 2019 deep learning track: deep cascade ranking with generation-based document expansion and pre-trained language modeling. In Proceedings of the 28th Text Retrieval Conference (TREC\u201919)(NIST Special Publication Vol. 1250).","DOI":"10.6028\/NIST.SP.1250.deep-IDST"},{"key":"e_1_3_3_295_2","first-page":"5065","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Yan Yuanmeng","year":"2021","unstructured":"Yuanmeng Yan, Rumei Li, Sirui Wang, Fuzheng Zhang, Wei Wu, and Weiran Xu. 2021. ConSERT: A contrastive framework for self-supervised sentence representation transfer. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 5065\u20135075."},{"key":"e_1_3_3_296_2","first-page":"6120","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Yang Nan","year":"2021","unstructured":"Nan Yang, Furu Wei, Binxing Jiao, Daxing Jiang, and Linjun Yang. 2021. xMoCo: Cross momentum contrastive learning for open-domain question answering. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 
6120\u20136129."},{"key":"e_1_3_3_297_2","first-page":"1253","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Yang Peilin","year":"2017","unstructured":"Peilin Yang, Hui Fang, and Jimmy Lin. 2017. Anserini: Enabling the use of lucene for information retrieval research. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1253\u20131256."},{"key":"e_1_3_3_298_2","first-page":"263","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Yang Yinfei","year":"2021","unstructured":"Yinfei Yang, Ning Jin, Kuo Lin, Mandy Guo, and Daniel Cer. 2021. Neural retrieval for question answering with cross-attention supervised data augmentation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 263\u2013268."},{"key":"e_1_3_3_299_2","first-page":"2369","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Yang Zhilin","year":"2018","unstructured":"Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William Cohen, Ruslan Salakhutdinov, and Christopher D. Manning. 2018. HotpotQA: A dataset for diverse, explainable multi-hop question answering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2369\u20132380."},{"key":"e_1_3_3_300_2","unstructured":"Shunyu Yao Jeffrey Zhao Dian Yu Nan Du Izhak Shafran Karthik Narasimhan and Yuan Cao. 2022. React: Synergizing reasoning and acting in language models. 
Retrieved from https:\/\/arXiv:2210.03629"},{"key":"e_1_3_3_301_2","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1162\/tacl_a_00371","article-title":"Adaptive semiparametric language models","volume":"9","author":"Yogatama Dani","year":"2021","unstructured":"Dani Yogatama, Cyprien de Masson d\u2019Autume, and Lingpeng Kong. 2021. Adaptive semiparametric language models. Trans. Assoc. Comput. Linguist. 9 (2021), 362\u2013373.","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"e_1_3_3_302_2","first-page":"3592","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Yu HongChien","year":"2021","unstructured":"HongChien Yu, Chenyan Xiong, and Jamie Callan. 2021. Improving query representations for dense retrieval with pseudo relevance feedback. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 3592\u20133596."},{"key":"e_1_3_3_303_2","first-page":"829","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921)","author":"Yu Shi","year":"2021","unstructured":"Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, and Zhiyuan Liu. 2021. Few-shot conversational dense retrieval. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201921). 829\u2013838."},{"key":"e_1_3_3_304_2","first-page":"1979","volume-title":"Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201922)","author":"Zeng Hansi","year":"2022","unstructured":"Hansi Zeng, Hamed Zamani, and Vishwa Vinay. 2022. Curriculum learning for dense retrieval distillation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201922). 
1979\u20131983."},{"key":"e_1_3_3_305_2","volume-title":"Statistical Language Models for Information Retrieval","author":"Zhai ChengXiang","year":"2008","unstructured":"ChengXiang Zhai. 2008. Statistical Language Models for Information Retrieval. Morgan & Claypool Publishers."},{"key":"e_1_3_3_306_2","unstructured":"Jingtao Zhan Jiaxin Mao Yiqun Liu Jiafeng Guo Min Zhang and Shaoping Ma. 2021. Interpreting dense retrieval as mixture of topics. Retrieved from https:\/\/arXiv:2111.13957"},{"key":"e_1_3_3_307_2","first-page":"2487","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Zhan Jingtao","year":"2021","unstructured":"Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, and Shaoping Ma. 2021. Jointly optimizing query encoder and product quantization to improve retrieval performance. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management. 2487\u20132496."},{"key":"e_1_3_3_308_2","doi-asserted-by":"crossref","unstructured":"Jingtao Zhan Jiaxin Mao Yiqun Liu Jiafeng Guo Min Zhang and Shaoping Ma. 2021. Learning discrete representations via constrained clustering for effective and efficient dense retrieval. Retrieved from https:\/\/arXiv:2110.05789","DOI":"10.24963\/ijcai.2022\/754"},{"key":"e_1_3_3_309_2","doi-asserted-by":"crossref","first-page":"1503","DOI":"10.1145\/3404835.3462880","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhan Jingtao","year":"2021","unstructured":"Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, and Shaoping Ma. 2021. Optimizing dense retrieval model training with hard negatives. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1503\u20131512."},{"key":"e_1_3_3_310_2","unstructured":"Jingtao Zhan Jiaxin Mao Yiqun Liu Min Zhang and Shaoping Ma. 2020. 
RepBERT: Contextualized text embeddings for first-stage retrieval. Retrieved from https:\/\/abs\/2006.15498"},{"key":"e_1_3_3_311_2","unstructured":"Jingtao Zhan Xiaohui Xie Jiaxin Mao Yiqun Liu Min Zhang and Shaoping Ma. 2022. Evaluating extrapolation performance of dense retrieval. Retrieved from https:\/\/abs\/2204.11447"},{"key":"e_1_3_3_312_2","unstructured":"Hang Zhang Yeyun Gong Yelong Shen Jiancheng Lv Nan Duan and Weizhu Chen. 2021. Adversarial retriever-ranker for dense text retrieval. Retrieved from https:\/\/arXiv:2110.03611"},{"key":"e_1_3_3_313_2","doi-asserted-by":"crossref","unstructured":"Han Zhang Hongwei Shen Yiming Qiu Yunjiang Jiang Songlin Wang Sulong Xu Yun Xiao Bo Long and Wen-Yun Yang. 2021. Joint learning of deep retrieval model and product quantization based embedding index. Retrieved from https:\/\/arXiv:2105.03933","DOI":"10.1145\/3404835.3462988"},{"key":"e_1_3_3_314_2","article-title":"Uni-retriever: Towards learning the unified embedding based retriever in bing sponsored search","author":"Zhang Jianjin","year":"2022","unstructured":"Jianjin Zhang, Zheng Liu, Weihao Han, Shitao Xiao, Rui Zheng, Yingxia Shao, Hao Sun, Hanqing Zhu, Premkumar Srinivasan, Denvy Deng, Qi Zhang, and Xing Xie. 2022. Uni-retriever: Towards learning the unified embedding based retriever in bing sponsored search. Retrieved from https:\/\/abs\/2202.06212","journal-title":"Retrieved from"},{"key":"e_1_3_3_315_2","article-title":"LED: Lexicon-enlightened dense retriever for large-scale retrieval","author":"Zhang Kai","year":"2022","unstructured":"Kai Zhang, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, and Daxin Jiang. 2022. LED: Lexicon-enlightened dense retriever for large-scale retrieval. Retrieved from https:\/\/abs\/2208.13661","journal-title":"Retrieved from"},{"key":"e_1_3_3_316_2","unstructured":"Michael J. Q. Zhang and Eunsol Choi. 2021. SituatedQA: Incorporating extra-linguistic contexts into QA. 
Retrieved from https:\/\/arXiv:2109.06157"},{"key":"e_1_3_3_317_2","first-page":"5990","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics","author":"Zhang Shunyu","year":"2022","unstructured":"Shunyu Zhang, Yaobo Liang, Ming Gong, Daxin Jiang, and Nan Duan. 2022. Multi-view document representation learning for open-domain dense retrieval. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 5990\u20136000."},{"key":"e_1_3_3_318_2","doi-asserted-by":"crossref","unstructured":"Xinyu Zhang Xueguang Ma Peng Shi and Jimmy Lin. 2021. Mr. TyDi: A multi-lingual benchmark for dense retrieval. Retrieved from https:\/\/abs\/2108.08787","DOI":"10.18653\/v1\/2021.mrl-1.12"},{"key":"e_1_3_3_319_2","doi-asserted-by":"publisher","unstructured":"Yanzhao Zhang Dingkun Long Guangwei Xu and Pengjun Xie. 2022. HLATR: Enhance multi-stage text retrieval with hybrid list aware transformer reranking. Retrieved from https:\/\/abs\/2205.10569. 10.48550\/arXiv.2205.10569","DOI":"10.48550\/arXiv.2205.10569"},{"key":"e_1_3_3_320_2","unstructured":"Wayne Xin Zhao Kun Zhou Junyi Li Tianyi Tang Xiaolei Wang Yupeng Hou Yingqian Min Beichen Zhang Junjie Zhang Zican Dong et\u00a0al. 2023. A survey of large language models. Retrieved from https:\/\/arXiv:2303.18223"},{"key":"e_1_3_3_321_2","doi-asserted-by":"crossref","unstructured":"Wei Zhong Jheng-Hong Yang and Jimmy Lin. 2022. Evaluating token-level and passage-level dense retrieval models for math information retrieval. Retrieved from https:\/\/abs\/2203.11163","DOI":"10.18653\/v1\/2022.findings-emnlp.78"},{"key":"e_1_3_3_322_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)","author":"Zhou Chunting","year":"2020","unstructured":"Chunting Zhou, Jiatao Gu, and Graham Neubig. 2020. Understanding knowledge distillation in non-autoregressive machine translation. 
In Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)."},{"key":"e_1_3_3_323_2","first-page":"7135","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics","author":"Zhou Jiawei","year":"2022","unstructured":"Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Lan Luo, Ke Zhan, Enrui Hu, Xinyu Zhang, Hao Jiang, Zhao Cao, Fan Yu et\u00a0al. 2022. Hyperlink-induced pre-training for passage retrieval in open-domain question answering. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 7135\u20137146."},{"key":"e_1_3_3_324_2","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201922)","author":"Zhou Kun","year":"2022","unstructured":"Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, and Weizhu Chen. 2022. SimANS: Simple ambiguous negatives sampling for dense text retrieval. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201922)."},{"key":"e_1_3_3_325_2","first-page":"6120","volume-title":"Proceedings of the Association for Computational Linguistics (ACL\u201922)","author":"Zhou Kun","year":"2022","unstructured":"Kun Zhou, Beichen Zhang, Xin Zhao, and Ji-Rong Wen. 2022. Debiased contrastive learning of unsupervised sentence representations. In Proceedings of the Association for Computational Linguistics (ACL\u201922). 6120\u20136130."},{"key":"e_1_3_3_326_2","article-title":"Towards robust ranker for text retrieval","author":"Zhou Yucheng","year":"2022","unstructured":"Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Binxing Jiao, and Daxin Jiang. 2022. Towards robust ranker for text retrieval. 
Retrieved from https:\/\/abs\/2206.08063","journal-title":"Retrieved from"},{"key":"e_1_3_3_327_2","unstructured":"Yujia Zhou Jing Yao Zhicheng Dou Ledell Wu Peitian Zhang and Ji-Rong Wen. 2022. Ultron: An ultimate retriever on corpus with a model-based indexer. Retrieved from https:\/\/abs\/2208.09257"},{"key":"e_1_3_3_328_2","article-title":"DynamicRetriever: A pre-training model-based IR system with neither sparse nor dense index","author":"Zhou Yujia","year":"2022","unstructured":"Yujia Zhou, Jing Yao, Zhicheng Dou, Ledell Yu Wu, and Ji-Rong Wen. 2022. DynamicRetriever: A pre-training model-based IR system with neither sparse nor dense index. Retrieved from https:\/\/abs\/2203.00537","journal-title":"Retrieved from"},{"issue":"30","key":"e_1_3_3_329_2","first-page":"6","article-title":"Recall, precision and average precision","volume":"2","author":"Zhu Mu","year":"2004","unstructured":"Mu Zhu. 2004. Recall, precision and average precision. Dept. Stat. Actuar. Sci. (University of Waterloo, Waterloo) 2, 30 (2004), 6.","journal-title":"Dept. Stat. Actuar. Sci. (University of Waterloo, Waterloo)"},{"key":"e_1_3_3_330_2","unstructured":"Yutao Zhu Huaying Yuan Shuting Wang Jiongnan Liu Wenhan Liu Chenlong Deng Zhicheng Dou and Ji-Rong Wen. 2023. Large language models for information retrieval: A survey. Retrieved from https:\/\/arXiv:2308.07107"},{"key":"e_1_3_3_331_2","doi-asserted-by":"publisher","unstructured":"Honglei Zhuang Zhen Qin Rolf Jagerman Kai Hui Ji Ma Jing Lu Jianmo Ni Xuanhui Wang and Michael Bendersky. 2022. RankT5: Fine-tuning T5 for text ranking with ranking losses. Retrieved from https:\/\/abs\/2210.10634. 10.48550\/arXiv.2210.10634","DOI":"10.48550\/arXiv.2210.10634"},{"key":"e_1_3_3_332_2","article-title":"Implicit feedback for dense passage retrieval: A counterfactual approach","author":"Zhuang Shengyao","year":"2022","unstructured":"Shengyao Zhuang, Hang Li, and G. Zuccon. 2022. 
Implicit feedback for dense passage retrieval: A counterfactual approach. Retrieved from https:\/\/abs\/2204.00718","journal-title":"Retrieved from"},{"key":"e_1_3_3_333_2","unstructured":"Shengyao Zhuang Houxing Ren Linjun Shou Jian Pei Ming Gong Guido Zuccon and Daxin Jiang. 2022. Bridging the gap between indexing and retrieval for differentiable search index with query generation. Retrieved from https:\/\/abs\/2206.10128"},{"key":"e_1_3_3_334_2","doi-asserted-by":"crossref","first-page":"2836","DOI":"10.18653\/v1\/2021.emnlp-main.225","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921)","author":"Zhuang Shengyao","year":"2021","unstructured":"Shengyao Zhuang and Guido Zuccon. 2021. Dealing with typos for BERT-based passage retrieval and ranking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP\u201921). 2836\u20132842."},{"key":"e_1_3_3_335_2","article-title":"Asyncval: A toolkit for asynchronously validating dense retriever checkpoints during training","author":"Zhuang Shengyao","year":"2022","unstructured":"Shengyao Zhuang and G. Zuccon. 2022. Asyncval: A toolkit for asynchronously validating dense retriever checkpoints during training. Retrieved from https:\/\/abs\/2202.12510","journal-title":"Retrieved from"},{"key":"e_1_3_3_336_2","first-page":"1444","volume-title":"Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhuang Shengyao","year":"2022","unstructured":"Shengyao Zhuang and Guido Zuccon. 2022. CharacterBERT and self-teaching for improving the robustness of dense retrievers on queries with typos. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 
1444\u20131454."},{"key":"e_1_3_3_337_2","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1145\/1132956.1132959","article-title":"Inverted files for text search engines","volume":"38","author":"Zobel Justin","year":"2006","unstructured":"Justin Zobel and Alistair Moffat. 2006. Inverted files for text search engines. ACM Comput. Surv. 38 (2006), 6.","journal-title":"ACM Comput. Surv."},{"issue":"4","key":"e_1_3_3_338_2","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1145\/296854.277632","article-title":"Inverted files versus signature files for text indexing","volume":"23","author":"Zobel Justin","year":"1998","unstructured":"Justin Zobel, Alistair Moffat, and Kotagiri Ramamohanarao. 1998. Inverted files versus signature files for text indexing. ACM Trans. Database Syst. 23, 4 (1998), 453\u2013490.","journal-title":"ACM Trans. Database Syst."}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637870","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3637870","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:49:18Z","timestamp":1750286958000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637870"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,9]]},"references-count":337,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,7,31]]}},"alternative-id":["10.1145\/3637870"],"URL":"https:\/\/doi.org\/10.1145\/3637870","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"value":"1046-8188","type":"print"},{"value":"1558-2868","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,9]]},"assertion":[{"value":"2023-01-19","order":0,"name":"received","label":"Received","gro
up":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-03","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-02-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}