{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T10:00:45Z","timestamp":1775815245717,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":52,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"the National Natural Science Foundation of China","award":["62006218, 61902381"],"award-info":[{"award-number":["62006218, 61902381"]}]},{"name":"the Lenovo-CAS Joint Lab Youth Scientist Project"},{"name":"the Youth Innovation Promotion Association CAS","award":["20144310, 2021100"],"award-info":[{"award-number":["20144310, 2021100"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,17]]},"DOI":"10.1145\/3511808.3557271","type":"proceedings-article","created":{"date-parts":[[2022,10,16]],"date-time":"2022-10-16T01:29:57Z","timestamp":1665883797000},"page":"191-200","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":45,"title":["CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks"],"prefix":"10.1145","author":[{"given":"Jiangui","family":"Chen","sequence":"first","affiliation":[{"name":"CAS Key Lab of Network Data Science and Technology, ICT, CAS &amp; University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruqing","family":"Zhang","sequence":"additional","affiliation":[{"name":"CAS Key Lab of Network Data Science and Technology, ICT, CAS &amp; University of Chinese Academy of Sciences, Beijing, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiafeng","family":"Guo","sequence":"additional","affiliation":[{"name":"CAS Key Lab of Network Data Science and Technology, ICT, CAS &amp; University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yiqun","family":"Liu","sequence":"additional","affiliation":[{"name":"Dept. CS&amp;T, Beijing National Research Center for Information Science and Technology &amp; Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yixing","family":"Fan","sequence":"additional","affiliation":[{"name":"CAS Key Lab of Network Data Science and Technology, ICT, CAS &amp; University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xueqi","family":"Cheng","sequence":"additional","affiliation":[{"name":"CAS Key Lab of Network Data Science and Technology, ICT, CAS &amp; University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations). 54--59","author":"Akbik Alan","year":"2019","unstructured":"Alan Akbik , Tanja Bergmann , Duncan Blythe , Kashif Rasul , Stefan Schweter , and Roland Vollgraf . 2019 . FLAIR: An easy-to-use framework for state-of-the-art NLP . In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations). 54--59 . Alan Akbik, Tanja Bergmann, Duncan Blythe, Kashif Rasul, Stefan Schweter, and Roland Vollgraf. 2019. FLAIR: An easy-to-use framework for state-of-the-art NLP. 
In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations). 54--59."},{"key":"e_1_3_2_2_2_1","volume-title":"Sebastian Riedel, and Fabio Petroni.","author":"Bevilacqua Michele","year":"2022","unstructured":"Michele Bevilacqua , Giuseppe Ottaviano , Patrick Lewis , Wen tau Yih , Sebastian Riedel, and Fabio Petroni. 2022 . Autoregressive Search Engines: Generating Substrings as Document Identifiers. In arXiv pre-print 2204.10629. https:\/\/arxiv.org\/abs\/2204.10628 Michele Bevilacqua, Giuseppe Ottaviano, Patrick Lewis, Wen tau Yih, Sebastian Riedel, and Fabio Petroni. 2022. Autoregressive Search Engines: Generating Substrings as Document Identifiers. In arXiv pre-print 2204.10629. https:\/\/arxiv.org\/abs\/2204.10628"},{"key":"e_1_3_2_2_3_1","volume-title":"NIPS","volume":"19","author":"Burges Christopher","year":"2006","unstructured":"Christopher Burges , Robert Ragno , and Quoc Le . 2006 . Learning to rank with nonsmooth cost functions . NIPS , Vol. 19 (2006). Christopher Burges, Robert Ragno, and Quoc Le. 2006. Learning to rank with nonsmooth cost functions. NIPS, Vol. 19 (2006)."},{"key":"e_1_3_2_2_4_1","unstructured":"Wei-Cheng Chang X Yu Felix Yin-Wen Chang Yiming Yang and Sanjiv Kumar. 2019. Pre-training Tasks for Embedding-based Large-scale Retrieval. In ICLR.  Wei-Cheng Chang X Yu Felix Yin-Wen Chang Yiming Yang and Sanjiv Kumar. 2019. Pre-training Tasks for Embedding-based Large-scale Retrieval. In ICLR."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1171"},{"key":"e_1_3_2_2_6_1","volume-title":"GERE: Generative Evidence Retrieval for Fact Verification. arXiv preprint arXiv:2204.05511","author":"Chen Jiangui","year":"2022","unstructured":"Jiangui Chen , Ruqing Zhang , Jiafeng Guo , Yixing Fan , and Xueqi Cheng . 2022 . GERE: Generative Evidence Retrieval for Fact Verification. arXiv preprint arXiv:2204.05511 (2022). 
Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Yixing Fan, and Xueqi Cheng. 2022. GERE: Generative Evidence Retrieval for Fact Verification. arXiv preprint arXiv:2204.05511 (2022)."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"crossref","unstructured":"Zhuyun Dai and Jamie Callan. 2020. Context-aware term weighting for first stage passage retrieval. In SIGIR. 1533--1536.  Zhuyun Dai and Jamie Callan. 2020. Context-aware term weighting for first stage passage retrieval. In SIGIR. 1533--1536.","DOI":"10.1145\/3397271.3401204"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"crossref","unstructured":"Zhuyun Dai Chenyan Xiong Jamie Callan and Zhiyuan Liu. 2018. Convolutional neural networks for soft-matching n-grams in ad-hoc search. In WSDM. 126--134.  Zhuyun Dai Chenyan Xiong Jamie Callan and Zhiyuan Liu. 2018. Convolutional neural networks for soft-matching n-grams in ad-hoc search. In WSDM. 126--134.","DOI":"10.1145\/3159652.3159659"},{"key":"e_1_3_2_2_9_1","unstructured":"Nicola De Cao Gautier Izacard Sebastian Riedel and Fabio Petroni. 2020. Autoregressive Entity Retrieval. In ICLR.  Nicola De Cao Gautier Izacard Sebastian Riedel and Fabio Petroni. 2020. Autoregressive Entity Retrieval. In ICLR."},{"key":"e_1_3_2_2_10_1","volume-title":"Wizard of Wikipedia: Knowledge-Powered Conversational Agents. In International Conference on Learning Representations.","author":"Dinan Emily","year":"2018","unstructured":"Emily Dinan , Stephen Roller , Kurt Shuster , Angela Fan , Michael Auli , and Jason Weston . 2018 . Wizard of Wikipedia: Knowledge-Powered Conversational Agents. In International Conference on Learning Representations. Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, and Jason Weston. 2018. Wizard of Wikipedia: Knowledge-Powered Conversational Agents. In International Conference on Learning Representations."},{"key":"e_1_3_2_2_11_1","volume-title":"T-rex: A large scale alignment of natural language with knowledge base triples. 
In LREC.","author":"Elsahar Hady","year":"2018","unstructured":"Hady Elsahar , Pavlos Vougiouklis , Arslen Remaci , Christophe Gravier , Jonathon Hare , Frederique Laforest , and Elena Simperl . 2018 . T-rex: A large scale alignment of natural language with knowledge base triples. In LREC. Hady Elsahar, Pavlos Vougiouklis, Arslen Remaci, Christophe Gravier, Jonathon Hare, Frederique Laforest, and Elena Simperl. 2018. T-rex: A large scale alignment of natural language with knowledge base triples. In LREC."},{"key":"e_1_3_2_2_12_1","volume-title":"ELI5: Long Form Question Answering","author":"Fan Angela","year":"2019","unstructured":"Angela Fan , Yacine Jernite , Ethan Perez , David Grangier , Jason Weston , and Michael Auli . 2019. ELI5: Long Form Question Answering . In ACL. Association for Computational Linguistics , Florence, Italy , 3558--3567. https:\/\/doi.org\/10.18653\/v1\/P19--1346 Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, and Michael Auli. 2019. ELI5: Long Form Question Answering. In ACL. Association for Computational Linguistics, Florence, Italy, 3558--3567. https:\/\/doi.org\/10.18653\/v1\/P19--1346"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"crossref","unstructured":"Jibril Frej Philippe Mulhem Didier Schwab and Jean-Pierre Chevallet. 2020. Learning term discrimination. In SIGIR. 1993--1996.  Jibril Frej Philippe Mulhem Didier Schwab and Jean-Pierre Chevallet. 2020. Learning term discrimination. In SIGIR. 1993--1996.","DOI":"10.1145\/3397271.3401211"},{"key":"e_1_3_2_2_14_1","unstructured":"Tianyu Gao Xingcheng Yao and Danqi Chen. 2021. SimCSE: Simple Contrastive Learning of Sentence Embeddings. In EMNLP. 6894--6910.  Tianyu Gao Xingcheng Yao and Danqi Chen. 2021. SimCSE: Simple Contrastive Learning of Sentence Embeddings. In EMNLP. 
6894--6910."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3486250"},{"key":"e_1_3_2_2_16_1","unstructured":"Jiafeng Guo Yixing Fan Qingyao Ai and W Bruce Croft. 2016. A deep relevance matching model for ad-hoc retrieval. In CIKM. 55--64.  Jiafeng Guo Yixing Fan Qingyao Ai and W Bruce Croft. 2016. A deep relevance matching model for ad-hoc retrieval. In CIKM. 55--64."},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.3233\/SW-170273"},{"key":"e_1_3_2_2_18_1","volume-title":"Ilaria Bordino, Hagen F\u00fcrstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum.","author":"Hoffart Johannes","year":"2011","unstructured":"Johannes Hoffart , Mohamed Amir Yosef , Ilaria Bordino, Hagen F\u00fcrstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011 . Robust disambiguation of named entities in text. In EMNLP. 782--792. Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen F\u00fcrstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust disambiguation of named entities in text. In EMNLP. 782--792."},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"crossref","unstructured":"Sebastian Hofst\u00e4tter Sheng-Chieh Lin Jheng-Hong Yang Jimmy Lin and Allan Hanbury. 2021. Efficiently teaching an effective dense retriever with balanced topic aware sampling. In SIGIR. 113--122.  Sebastian Hofst\u00e4tter Sheng-Chieh Lin Jheng-Hong Yang Jimmy Lin and Allan Hanbury. 2021. Efficiently teaching an effective dense retriever with balanced topic aware sampling. In SIGIR. 113--122.","DOI":"10.1145\/3404835.3462891"},{"key":"e_1_3_2_2_20_1","volume-title":"TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension","author":"Joshi Mandar","unstructured":"Mandar Joshi , Eunsol Choi , Daniel Weld , and Luke Zettlemoyer . 2017. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension . 
In ACL. Association for Computational Linguistics , Vancouver, Canada , 1601--1611. Mandar Joshi, Eunsol Choi, Daniel Weld, and Luke Zettlemoyer. 2017. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. In ACL. Association for Computational Linguistics, Vancouver, Canada, 1601--1611."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"crossref","unstructured":"Vladimir Karpukhin Barlas Oguz Sewon Min Patrick Lewis Ledell Wu Sergey Edunov Danqi Chen and Wen-tau Yih. 2020. Dense Passage Retrieval for Open-Domain Question Answering. In EMNLP. 6769--6781.  Vladimir Karpukhin Barlas Oguz Sewon Min Patrick Lewis Ledell Wu Sergey Edunov Danqi Chen and Wen-tau Yih. 2020. Dense Passage Retrieval for Open-Domain Question Answering. In EMNLP. 6769--6781.","DOI":"10.18653\/v1\/2020.emnlp-main.550"},{"key":"e_1_3_2_2_22_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT. 4171--4186.","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2019 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT. 4171--4186. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT. 4171--4186."},{"key":"e_1_3_2_2_23_1","volume-title":"Colbert: Efficient and effective passage search via contextualized late interaction over bert. In SIGIR. 39--48.","author":"Khattab Omar","year":"2020","unstructured":"Omar Khattab and Matei Zaharia . 2020 . Colbert: Efficient and effective passage search via contextualized late interaction over bert. In SIGIR. 39--48. Omar Khattab and Matei Zaharia. 2020. Colbert: Efficient and effective passage search via contextualized late interaction over bert. In SIGIR. 
39--48."},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00276"},{"key":"e_1_3_2_2_25_1","unstructured":"Kenton Lee Ming-Wei Chang and Kristina Toutanova. 2019. Latent Retrieval for Weakly Supervised Open Domain Question Answering. In ACL. 6086--6096.  Kenton Lee Ming-Wei Chang and Kristina Toutanova. 2019. Latent Retrieval for Weakly Supervised Open Domain Question Answering. In ACL. 6086--6096."},{"key":"e_1_3_2_2_26_1","volume-title":"TABi: Type-Aware Bi-Encoders for Open-Domain Entity Retrieval. arXiv preprint arXiv:2204.08173","author":"Leszczynski Megan","year":"2022","unstructured":"Megan Leszczynski , Daniel Y Fu , Mayee F Chen , and Christopher R\u00e9. 2022. TABi: Type-Aware Bi-Encoders for Open-Domain Entity Retrieval. arXiv preprint arXiv:2204.08173 ( 2022 ). Megan Leszczynski, Daniel Y Fu, Mayee F Chen, and Christopher R\u00e9. 2022. TABi: Type-Aware Bi-Encoders for Open-Domain Entity Retrieval. arXiv preprint arXiv:2204.08173 (2022)."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"crossref","unstructured":"Omer Levy Minjoon Seo Eunsol Choi and Luke Zettlemoyer. 2017. Zero-Shot Relation Extraction via Reading Comprehension. In CoNLL. 333--342.  Omer Levy Minjoon Seo Eunsol Choi and Luke Zettlemoyer. 2017. Zero-Shot Relation Extraction via Reading Comprehension. In CoNLL. 333--342.","DOI":"10.18653\/v1\/K17-1034"},{"key":"e_1_3_2_2_28_1","volume-title":"BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In ACL. 7871--7880.","author":"Lewis Mike","year":"2020","unstructured":"Mike Lewis , Yinhan Liu , Naman Goyal , Marjan Ghazvininejad , Abdelrahman Mohamed , Omer Levy , Veselin Stoyanov , and Luke Zettlemoyer . 2020 a. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In ACL. 7871--7880. 
Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2020a. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In ACL. 7871--7880."},{"key":"e_1_3_2_2_29_1","first-page":"9459","article-title":"Retrieval-augmented generation for knowledge-intensive nlp tasks","volume":"33","author":"Lewis Patrick","year":"2020","unstructured":"Patrick Lewis , Ethan Perez , Aleksandra Piktus , Fabio Petroni , Vladimir Karpukhin , Naman Goyal , Heinrich K\u00fcttler , Mike Lewis , Wen-tau Yih, Tim Rockt\u00e4schel, et al. 2020 b. Retrieval-augmented generation for knowledge-intensive nlp tasks . NIPS , Vol. 33 (2020), 9459 -- 9474 . Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen-tau Yih, Tim Rockt\u00e4schel, et al. 2020b. Retrieval-augmented generation for knowledge-intensive nlp tasks. NIPS, Vol. 33 (2020), 9459--9474.","journal-title":"NIPS"},{"key":"e_1_3_2_2_30_1","volume-title":"Learning to rank for information retrieval and natural language processing. Synthesis lectures on human language technologies","author":"Hang Li.","year":"2014","unstructured":"Hang Li. 2014. Learning to rank for information retrieval and natural language processing. Synthesis lectures on human language technologies , Vol. 7 , 3 ( 2014 ), 1--121. Hang Li. 2014. Learning to rank for information retrieval and natural language processing. Synthesis lectures on human language technologies, Vol. 7, 3 (2014), 1--121."},{"key":"e_1_3_2_2_31_1","volume-title":"Foundations and Trends\u00ae in Information Retrieval","volume":"3","author":"Tie-Yan","year":"2009","unstructured":"Tie-Yan Liu et al. 2009. Learning to rank for information retrieval . Foundations and Trends\u00ae in Information Retrieval , Vol. 3 , 3 ( 2009 ), 225--331. Tie-Yan Liu et al. 2009. Learning to rank for information retrieval. 
Foundations and Trends\u00ae in Information Retrieval, Vol. 3, 3 (2009), 225--331."},{"key":"e_1_3_2_2_32_1","volume-title":"Pre-train a Discriminative Text Encoder for Dense Retrieval via Contrastive Span Prediction. arXiv preprint arXiv:2204.10641","author":"Ma Xinyu","year":"2022","unstructured":"Xinyu Ma , Jiafeng Guo , Ruqing Zhang , Yixing Fan , and Xueqi Cheng . 2022. Pre-train a Discriminative Text Encoder for Dense Retrieval via Contrastive Span Prediction. arXiv preprint arXiv:2204.10641 ( 2022 ). Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, and Xueqi Cheng. 2022. Pre-train a Discriminative Text Encoder for Dense Retrieval via Contrastive Span Prediction. arXiv preprint arXiv:2204.10641 (2022)."},{"key":"e_1_3_2_2_33_1","volume-title":"Prop: Pre-training with representative words prediction for ad-hoc retrieval. In WSDM. 283--291.","author":"Ma Xinyu","year":"2021","unstructured":"Xinyu Ma , Jiafeng Guo , Ruqing Zhang , Yixing Fan , Xiang Ji , and Xueqi Cheng . 2021 b. Prop: Pre-training with representative words prediction for ad-hoc retrieval. In WSDM. 283--291. Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xiang Ji, and Xueqi Cheng. 2021b. Prop: Pre-training with representative words prediction for ad-hoc retrieval. In WSDM. 283--291."},{"key":"e_1_3_2_2_34_1","unstructured":"Xinyu Ma Jiafeng Guo Ruqing Zhang Yixing Fan Yingyan Li and Xueqi Cheng. 2021c. B-PROP: bootstrapped pre-training with representative words prediction for ad-hoc retrieval. In SIGIR. 1513--1522.  Xinyu Ma Jiafeng Guo Ruqing Zhang Yixing Fan Yingyan Li and Xueqi Cheng. 2021c. B-PROP: bootstrapped pre-training with representative words prediction for ad-hoc retrieval. In SIGIR. 1513--1522."},{"key":"e_1_3_2_2_35_1","unstructured":"Zhengyi Ma Zhicheng Dou Wei Xu Xinyu Zhang Hao Jiang Zhao Cao and Ji-Rong Wen. 2021a. Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need. In CIKM. 1212--1221.  Zhengyi Ma Zhicheng Dou Wei Xu Xinyu Zhang Hao Jiang Zhao Cao and Ji-Rong Wen. 
2021a. Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need. In CIKM. 1212--1221."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"crossref","unstructured":"Jean Maillard Vladimir Karpukhin Fabio Petroni Wen-tau Yih Barlas Oguz Veselin Stoyanov and Gargi Ghosh. 2021. Multi-Task Retrieval for Knowledge-Intensive Tasks. In ACL. 1098--1111.  Jean Maillard Vladimir Karpukhin Fabio Petroni Wen-tau Yih Barlas Oguz Veselin Stoyanov and Gargi Ghosh. 2021. Multi-Task Retrieval for Knowledge-Intensive Tasks. In ACL. 1098--1111.","DOI":"10.18653\/v1\/2021.acl-long.89"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3476415.3476428"},{"key":"e_1_3_2_2_38_1","volume-title":"James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rockt\u00e4schel, and Sebastian Riedel.","author":"Petroni Fabio","year":"2021","unstructured":"Fabio Petroni , Aleksandra Piktus , Angela Fan , Patrick Lewis , Majid Yazdani , Nicola De Cao , James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rockt\u00e4schel, and Sebastian Riedel. 2021 . KILT: a Benchmark for Knowledge Intensive Language Tasks. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics , Online, 2523--2544. https:\/\/doi.org\/10.18653\/v1\/2021.naacl-main.200 Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rockt\u00e4schel, and Sebastian Riedel. 2021. KILT: a Benchmark for Knowledge Intensive Language Tasks. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Online, 2523--2544. 
https:\/\/doi.org\/10.18653\/v1\/2021.naacl-main.200"},{"key":"e_1_3_2_2_39_1","first-page":"1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel , Noam Shazeer , Adam Roberts , Katherine Lee , Sharan Narang , Michael Matena , Yanqi Zhou , Wei Li , and Peter J Liu . 2020 . Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer . Journal of Machine Learning Research , Vol. 21 (2020), 1 -- 67 . Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, Vol. 21 (2020), 1--67.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_2_40_1","volume-title":"The probabilistic relevance framework: BM25 and beyond","author":"Robertson Stephen","unstructured":"Stephen Robertson and Hugo Zaragoza . 2009. The probabilistic relevance framework: BM25 and beyond . Now Publishers Inc . Stephen Robertson and Hugo Zaragoza. 2009. The probabilistic relevance framework: BM25 and beyond. Now Publishers Inc."},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630270302"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/361219.361220"},{"key":"e_1_3_2_2_43_1","unstructured":"Ilya Sutskever James Martens and Geoffrey E Hinton. 2011. Generating text with recurrent neural networks. In ICML.  Ilya Sutskever James Martens and Geoffrey E Hinton. 2011. Generating text with recurrent neural networks. In ICML."},{"key":"e_1_3_2_2_44_1","unstructured":"Yi Tay Vinh Q Tran Mostafa Dehghani Jianmo Ni Dara Bahri Harsh Mehta Zhen Qin Kai Hui Zhe Zhao Jai Gupta etal 2022. Transformer memory as a differentiable search index. arXiv preprint arXiv:2202.06991 (2022).  
Yi Tay Vinh Q Tran Mostafa Dehghani Jianmo Ni Dara Bahri Harsh Mehta Zhen Qin Kai Hui Zhe Zhao Jai Gupta et al. 2022. Transformer memory as a differentiable search index. arXiv preprint arXiv:2202.06991 (2022)."},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1074"},{"key":"e_1_3_2_2_46_1","unstructured":"Ledell Wu Fabio Petroni Martin Josifoski Sebastian Riedel and Luke Zettlemoyer. 2020. Scalable Zero-shot Entity Linking with Dense Entity Retrieval. In EMNLP. 6397--6407.  Ledell Wu Fabio Petroni Martin Josifoski Sebastian Riedel and Luke Zettlemoyer. 2020. Scalable Zero-shot Entity Linking with Dense Entity Retrieval. In EMNLP. 6397--6407."},{"key":"e_1_3_2_2_47_1","unstructured":"Lee Xiong Chenyan Xiong Ye Li Kwok-Fung Tang Jialin Liu Paul N Bennett Junaid Ahmed and Arnold Overwijk. 2020. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In ICLR.  Lee Xiong Chenyan Xiong Ye Li Kwok-Fung Tang Jialin Liu Paul N Bennett Junaid Ahmed and Arnold Overwijk. 2020. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In ICLR."},{"key":"e_1_3_2_2_48_1","volume-title":"Manning","author":"Yang Zhilin","year":"2018","unstructured":"Zhilin Yang , Peng Qi , Saizheng Zhang , Yoshua Bengio , William Cohen , Ruslan Salakhutdinov , and Christopher D . Manning . 2018 . HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. In EMNLP. Association for Computational Linguistics , Brussels, Belgium, 2369--2380. https:\/\/doi.org\/10.18653\/v1\/D18--1259 Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William Cohen, Ruslan Salakhutdinov, and Christopher D. Manning. 2018. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. In EMNLP. Association for Computational Linguistics, Brussels, Belgium, 2369--2380. 
https:\/\/doi.org\/10.18653\/v1\/D18--1259"},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"crossref","unstructured":"Jingtao Zhan Jiaxin Mao Yiqun Liu Jiafeng Guo Min Zhang and Shaoping Ma. 2021. Optimizing dense retrieval model training with hard negatives. In SIGIR. 1503--1512.  Jingtao Zhan Jiaxin Mao Yiqun Liu Jiafeng Guo Min Zhang and Shaoping Ma. 2021. Optimizing dense retrieval model training with hard negatives. In SIGIR. 1503--1512.","DOI":"10.1145\/3404835.3462880"},{"key":"e_1_3_2_2_50_1","volume-title":"International Conference on Machine Learning. PMLR, 11328--11339","author":"Zhang Jingqing","year":"2020","unstructured":"Jingqing Zhang , Yao Zhao , Mohammad Saleh , and Peter Liu . 2020 . Pegasus: Pre-training with extracted gap-sentences for abstractive summarization . In International Conference on Machine Learning. PMLR, 11328--11339 . Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter Liu. 2020. Pegasus: Pre-training with extracted gap-sentences for abstractive summarization. In International Conference on Machine Learning. PMLR, 11328--11339."},{"key":"e_1_3_2_2_51_1","doi-asserted-by":"crossref","unstructured":"Guoqing Zheng and Jamie Callan. 2015. Learning to reweight terms with distributed representations. In SIGIR. 575--584.  Guoqing Zheng and Jamie Callan. 2015. Learning to reweight terms with distributed representations. In SIGIR. 575--584.","DOI":"10.1145\/2766462.2767700"},{"key":"e_1_3_2_2_52_1","volume-title":"DynamicRetriever: A Pre-training Model-based IR System with Neither Sparse nor Dense Index. arXiv preprint arXiv:2203.00537","author":"Zhou Yujia","year":"2022","unstructured":"Yujia Zhou , Jing Yao , Zhicheng Dou , Ledell Wu , and Ji-Rong Wen . 2022. DynamicRetriever: A Pre-training Model-based IR System with Neither Sparse nor Dense Index. arXiv preprint arXiv:2203.00537 ( 2022 ). Yujia Zhou, Jing Yao, Zhicheng Dou, Ledell Wu, and Ji-Rong Wen. 2022. 
DynamicRetriever: A Pre-training Model-based IR System with Neither Sparse nor Dense Index. arXiv preprint arXiv:2203.00537 (2022)."}],"event":{"name":"CIKM '22: The 31st ACM International Conference on Information and Knowledge Management","location":"Atlanta GA USA","acronym":"CIKM '22","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 31st ACM International Conference on Information &amp; Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557271","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3511808.3557271","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:28Z","timestamp":1750182568000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557271"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,17]]},"references-count":52,"alternative-id":["10.1145\/3511808.3557271","10.1145\/3511808"],"URL":"https:\/\/doi.org\/10.1145\/3511808.3557271","relation":{},"subject":[],"published":{"date-parts":[[2022,10,17]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}