{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,16]],"date-time":"2026-07-16T05:15:07Z","timestamp":1784178907690,"version":"3.55.0"},"publisher-location":"New York, NY, USA","reference-count":71,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,5,13]],"date-time":"2024-05-13T00:00:00Z","timestamp":1715558400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,5,13]]},"DOI":"10.1145\/3589334.3645477","type":"proceedings-article","created":{"date-parts":[[2024,5,8]],"date-time":"2024-05-08T07:08:13Z","timestamp":1715152093000},"page":"1441-1452","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["Scalable and Effective Generative Information Retrieval"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-2699-8460","authenticated-orcid":false,"given":"Hansi","family":"Zeng","sequence":"first","affiliation":[{"name":"University of Massachusetts Amherst, Amherst, MA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5339-5817","authenticated-orcid":false,"given":"Chen","family":"Luo","sequence":"additional","affiliation":[{"name":"Amazon, Palo Alto, CA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1295-2829","authenticated-orcid":false,"given":"Bowen","family":"Jin","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, IL, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4820-9201","authenticated-orcid":false,"given":"Sheikh Muhammad","family":"Sarwar","sequence":"additional","affiliation":[{"name":"Amazon, Palo Alto, CA, 
USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4450-2005","authenticated-orcid":false,"given":"Tianxin","family":"Wei","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, IL, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0800-3340","authenticated-orcid":false,"given":"Hamed","family":"Zamani","sequence":"additional","affiliation":[{"name":"University of Massachusetts Amherst, Amherst, MA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,5,13]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Additive Quantization for Extreme Vector Compression. 2014 IEEE Conference on Computer Vision and Pattern Recognition, 931--938","author":"Babenko Artem","unstructured":"Artem Babenko and Victor S. Lempitsky. 2014. Additive Quantization for Extreme Vector Compression. 2014 IEEE Conference on Computer Vision and Pattern Recognition, 931--938. https:\/\/api.semanticscholar.org\/CorpusID:125463275"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553380"},{"key":"e_1_3_2_2_3_1","volume-title":"Sebastian Riedel, and Fabio Petroni.","author":"Bevilacqua Michele","year":"2022","unstructured":"Michele Bevilacqua, Giuseppe Ottaviano, Patrick Lewis, Wen tau Yih, Sebastian Riedel, and Fabio Petroni. 2022. Autoregressive Search Engines: Generating Substrings as Document Identifiers. ArXiv, Vol. abs\/2204.10628. https:\/\/api.semanticscholar.org\/CorpusID:248366293"},{"key":"e_1_3_2_2_4_1","volume-title":"MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. ArXiv","author":"Campos Daniel Fernando","year":"2016","unstructured":"Daniel Fernando Campos, Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, Li Deng, and Bhaskar Mitra. 2016. 
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. ArXiv, Vol. abs\/1611.09268. https:\/\/api.semanticscholar.org\/CorpusID:1289517"},{"key":"e_1_3_2_2_5_1","volume-title":"Autoregressive Entity Retrieval. ArXiv","author":"Cao Nicola De","unstructured":"Nicola De Cao, Gautier Izacard, Sebastian Riedel, and Fabio Petroni. 2020. Autoregressive Entity Retrieval. ArXiv, Vol. abs\/2010.00904. https:\/\/api.semanticscholar.org\/CorpusID:222125277"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3583780.3614821"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3539618.3591631"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531827"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3511808.3557271"},{"key":"e_1_3_2_2_10_1","volume-title":"Switzerland)","author":"Chen Yongjian","unstructured":"Yongjian Chen, Tao Guan, and Cheng Wang. 2010. Approximate Nearest Neighbor Search by Residual Vector Quantization. Sensors (Basel, Switzerland), Vol. 10, 11259 -- 11273. https:\/\/api.semanticscholar.org\/CorpusID:33774240"},{"key":"e_1_3_2_2_11_1","unstructured":"David R. Cheriton. 2019. From doc2query to docTTTTTquery. https:\/\/api.semanticscholar.org\/CorpusID:208612557"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3511808.3557456"},{"key":"e_1_3_2_2_13_1","unstructured":"Hyung Won Chung Le Hou S. Longpre Barret Zoph Yi Tay William Fedus Eric Li Xuezhi Wang Mostafa Dehghani Siddhartha Brahma Albert Webson Shixiang Shane Gu Zhuyun Dai Mirac Suzgun Xinyun Chen Aakanksha Chowdhery Dasha Valter Sharan Narang Gaurav Mishra Adams Wei Yu Vincent Zhao Yanping Huang Andrew M. Dai Hongkun Yu Slav Petrov Ed Huai hsin Chi Jeff Dean Jacob Devlin Adam Roberts Denny Zhou Quoc V. Le and Jason Wei. 2022. Scaling Instruction-Finetuned Language Models. ArXiv Vol. abs\/2210.11416. 
https:\/\/api.semanticscholar.org\/CorpusID:253018554"},{"key":"e_1_3_2_2_14_1","volume-title":"Overview of the TREC 2019 Deep Learning Track. In TREC.","author":"Craswell Nick","year":"2019","unstructured":"Nick Craswell, Bhaskar Mitra, Emine Yilmaz, and Daniel Campos. 2019. Overview of the TREC 2019 Deep Learning Track. In TREC."},{"key":"e_1_3_2_2_15_1","volume-title":"Overview of the TREC 2020 Deep Learning Track. ArXiv","volume":"2102","author":"Craswell Nick","unstructured":"Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Fernando Campos, and Ellen M. Voorhees. 2021. Overview of the TREC 2020 Deep Learning Track. ArXiv, Vol. abs\/2102.07662. https:\/\/api.semanticscholar.org\/CorpusID:212737158"},{"key":"e_1_3_2_2_16_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In North American","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In North American Chapter of the Association for Computational Linguistics. https:\/\/api.semanticscholar.org\/CorpusID:52967399"},{"key":"e_1_3_2_2_17_1","volume-title":"SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval. ArXiv","author":"Formal Thibault","unstructured":"Thibault Formal, C. Lassance, Benjamin Piwowarski, and St\u00e9phane Clinchant. 2021a. SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval. ArXiv, Vol. abs\/2109.10086. https:\/\/api.semanticscholar.org\/CorpusID:237581550"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3463098"},{"key":"e_1_3_2_2_19_1","volume-title":"Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval. ArXiv","author":"Gao Luyu","unstructured":"Luyu Gao and Jamie Callan. 2021. Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval. ArXiv, Vol. 
abs\/2108.05540. https:\/\/api.semanticscholar.org\/CorpusID:236987190"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.240"},{"key":"e_1_3_2_2_21_1","volume-title":"International conference on machine learning. PMLR, 3929--3938","author":"Guu Kelvin","year":"2020","unstructured":"Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Mingwei Chang. 2020. Retrieval augmented language model pre-training. In International conference on machine learning. PMLR, 3929--3938."},{"key":"e_1_3_2_2_22_1","volume-title":"Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation. ArXiv","author":"Sebastian","unstructured":"Sebastian Hofst\u00e4tter, Sophia Althammer, Michael Schr\u00f6der, Mete Sertkan, and Allan Hanbury. 2020. Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation. ArXiv, Vol. abs\/2010.02666. https:\/\/api.semanticscholar.org\/CorpusID:222141041"},{"key":"e_1_3_2_2_23_1","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. https:\/\/api.semanticscholar.org\/CorpusID:233231706","author":"Sebastian","year":"2021","unstructured":"Sebastian Hofst\u00e4tter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy J. Lin, and Allan Hanbury. 2021. Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. https:\/\/api.semanticscholar.org\/CorpusID:233231706"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3539618.3591769"},{"key":"e_1_3_2_2_25_1","volume-title":"Language Models As Semantic Indexers. ArXiv","author":"Jin Bowen","unstructured":"Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, and Xianfeng Tang. 2023. Language Models As Semantic Indexers. 
ArXiv, Vol. abs\/2310.07815. https:\/\/api.semanticscholar.org\/CorpusID:263909224"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_3_2_2_27_1","volume-title":"Dense Passage Retrieval for Open-Domain Question Answering. In Conference on Empirical Methods in Natural Language Processing. https:\/\/api.semanticscholar.org\/CorpusID:215737187","author":"Karpukhin Vladimir","year":"2020","unstructured":"Vladimir Karpukhin, Barlas O\u011fuz, Sewon Min, Patrick Lewis, Ledell Yu Wu, Sergey Edunov, Danqi Chen, and Wen tau Yih. 2020. Dense Passage Retrieval for Open-Domain Question Answering. In Conference on Empirical Methods in Natural Language Processing. https:\/\/api.semanticscholar.org\/CorpusID:215737187"},{"key":"e_1_3_2_2_28_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. CoRR, Vol. abs\/1412.6980. https:\/\/api.semanticscholar.org\/CorpusID:6628106"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00276"},{"key":"e_1_3_2_2_30_1","volume-title":"Nonparametric Decoding for Generative Retrieval. In Annual Meeting of the Association for Computational Linguistics. https:\/\/api.semanticscholar.org\/CorpusID:258959550","author":"Lee Hyunji","year":"2022","unstructured":"Hyunji Lee, Jaeyoung Kim, Hoyeon Chang, Hanseok Oh, Sohee Yang, Vladimir Karpukhin, Yi Lu, and Minjoon Seo. 2022. Nonparametric Decoding for Generative Retrieval. In Annual Meeting of the Association for Computational Linguistics. https:\/\/api.semanticscholar.org\/CorpusID:258959550"},{"key":"e_1_3_2_2_31_1","volume-title":"2023 a. Learning to Rank in Generative Retrieval. ArXiv","author":"Li Yongqing","unstructured":"Yongqing Li, Nan Yang, Liang Wang, Furu Wei, and Wenjie Li. 2023 a. Learning to Rank in Generative Retrieval. ArXiv, Vol. abs\/2306.15222. 
https:\/\/api.semanticscholar.org\/CorpusID:259262395"},{"key":"e_1_3_2_2_32_1","volume-title":"Multiview Identifiers Enhanced Generative Retrieval. In Annual Meeting of the Association for Computational Linguistics. https:\/\/api.semanticscholar.org\/CorpusID:258947148","author":"Li Yongqing","year":"2023","unstructured":"Yongqing Li, Nan Yang, Liang Wang, Furu Wei, and Wenjie Li. 2023 b. Multiview Identifiers Enhanced Generative Retrieval. In Annual Meeting of the Association for Computational Linguistics. https:\/\/api.semanticscholar.org\/CorpusID:258947148"},{"key":"e_1_3_2_2_33_1","volume-title":"Lin","author":"Lin Sheng-Chieh","year":"2020","unstructured":"Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy J. Lin. 2020. Distilling Dense Representations for Ranking using Tightly-Coupled Teachers. ArXiv, Vol. abs\/2010.11386. https:\/\/api.semanticscholar.org\/CorpusID:225041183"},{"key":"e_1_3_2_2_34_1","volume-title":"RoBERTa: A Robustly Optimized BERT Pretraining Approach. ArXiv","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. ArXiv, Vol. abs\/1907.11692. https:\/\/api.semanticscholar.org\/CorpusID:198953378"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462869"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401094"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2889473"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2019.2934906"},{"key":"e_1_3_2_2_39_1","volume-title":"DSI: Updating Transformer Memory with New Documents. ArXiv","author":"Mehta Sanket Vaibhav","year":"2022","unstructured":"Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, and Donald Metzler. 2022. 
DSI: Updating Transformer Memory with New Documents. ArXiv, Vol. abs\/2212.09744. https:\/\/api.semanticscholar.org\/CorpusID:254854290"},{"key":"e_1_3_2_2_40_1","volume-title":"Noah Constant, Ji Ma, Keith B. Hall, Daniel Matthew Cer, and Yinfei Yang.","author":"Ni Jianmo","year":"2021","unstructured":"Jianmo Ni, Gustavo Hernandez Abrego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Matthew Cer, and Yinfei Yang. 2021. Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models. ArXiv, Vol. abs\/2108.08877. https:\/\/api.semanticscholar.org\/CorpusID:237260023"},{"key":"e_1_3_2_2_41_1","volume-title":"Passage Re-ranking with BERT. ArXiv","author":"Nogueira Rodrigo","unstructured":"Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT. ArXiv, Vol. abs\/1901.04085. https:\/\/api.semanticscholar.org\/CorpusID:58004692"},{"key":"e_1_3_2_2_42_1","volume-title":"Lin","author":"Nogueira Rodrigo","year":"2020","unstructured":"Rodrigo Nogueira, Zhiying Jiang, Ronak Pradeep, and Jimmy J. Lin. 2020. Document Ranking with a Pretrained Sequence-to-Sequence Model. In Findings. https:\/\/api.semanticscholar.org\/CorpusID:212725651"},{"key":"e_1_3_2_2_43_1","volume-title":"Jan Leike, and Ryan J. Lowe.","author":"Ouyang Long","year":"2022","unstructured":"Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke E. Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Francis Christiano, Jan Leike, and Ryan J. Lowe. 2022. Training language models to follow instructions with human feedback. ArXiv, Vol. abs\/2203.02155. https:\/\/api.semanticscholar.org\/CorpusID:246426909"},{"key":"e_1_3_2_2_44_1","volume-title":"Minimizing FLOPs to Learn Efficient Sparse Representations. 
ArXiv","author":"Paria Biswajit","unstructured":"Biswajit Paria, Chih-Kuan Yeh, Ning Xu, Barnab\u00e1s P\u00f3czos, Pradeep Ravikumar, and Ian En-Hsu Yen. 2020. Minimizing FLOPs to Learn Efficient Sparse Representations. ArXiv, Vol. abs\/2004.05665. https:\/\/api.semanticscholar.org\/CorpusID:211107043"},{"key":"e_1_3_2_2_45_1","volume-title":"Tran","author":"Pradeep Ronak","year":"2023","unstructured":"Ronak Pradeep, Kai Hui, Jai Gupta, \u00c1d\u00e1m D\u00e1niel Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, and Vinh Q. Tran. 2023. How Does Generative Retrieval Scale to Millions of Passages? ArXiv, Vol. abs\/2305.11841. https:\/\/api.semanticscholar.org\/CorpusID:258822999"},{"key":"e_1_3_2_2_46_1","volume-title":"North American","author":"Qu Yingqi","year":"2020","unstructured":"Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Xin Zhao, Daxiang Dong, Hua Wu, and Haifeng Wang. 2020. RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering. In North American Chapter of the Association for Computational Linguistics. https:\/\/api.semanticscholar.org\/CorpusID:231815627"},{"key":"e_1_3_2_2_47_1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2019","unstructured":"Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2019. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res., Vol. 21, 140:1--140:67. https:\/\/api.semanticscholar.org\/CorpusID:204838007","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_2_48_1","volume-title":"Lukasz Heldt, Lichan Hong, Yi Tay, Vinh Q","author":"Rajput Shashank","year":"2023","unstructured":"Shashank Rajput, Nikhil Mehta, Anima Singh, Raghunandan H. Keshavan, Trung Hieu Vu, Lukasz Heldt, Lichan Hong, Yi Tay, Vinh Q. 
Tran, Jonah Samost, Maciej Kula, Ed H. Chi, and Maheswaran Sathiamoorthy. 2023. Recommender Systems with Generative Retrieval. ArXiv, Vol. abs\/2305.05065. https:\/\/api.semanticscholar.org\/CorpusID:258564854"},{"key":"e_1_3_2_2_49_1","volume-title":"J. Liu, Huaqin Wu, Ji rong Wen, and Haifeng Wang.","author":"Ren Ruiyang","year":"2023","unstructured":"Ruiyang Ren, Wayne Xin Zhao, J. Liu, Huaqin Wu, Ji rong Wen, and Haifeng Wang. 2023. TOME: A Two-stage Approach for Model-based Retrieval. ArXiv, Vol. abs\/2305.11161. https:\/\/api.semanticscholar.org\/CorpusID:258762633"},{"key":"e_1_3_2_2_50_1","volume-title":"Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. https:\/\/api.semanticscholar.org\/CorpusID:16829071","author":"Stephen","unstructured":"Stephen E. Robertson and Steve Walker. 1997. On relevance weights with little relevance information. In Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. https:\/\/api.semanticscholar.org\/CorpusID:16829071"},{"key":"e_1_3_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000019"},{"key":"e_1_3_2_2_52_1","volume-title":"a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv","author":"Sanh Victor","year":"2019","unstructured":"Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv, Vol. abs\/1910.01108. https:\/\/api.semanticscholar.org\/CorpusID:203626972"},{"key":"e_1_3_2_2_53_1","volume-title":"Learning to Tokenize for Generative Retrieval. ArXiv","author":"Sun Weiwei","unstructured":"Weiwei Sun, Lingyong Yan, Zheng Chen, Shuaiqiang Wang, Haichao Zhu, Pengjie Ren, Zhumin Chen, Dawei Yin, M. de Rijke, and Zhaochun Ren. 2023. Learning to Tokenize for Generative Retrieval. ArXiv, Vol. abs\/2304.04171. 
https:\/\/api.semanticscholar.org\/CorpusID:258048596"},{"key":"e_1_3_2_2_54_1","volume-title":"Le","author":"Sutskever Ilya","year":"2014","unstructured":"Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. ArXiv, Vol. abs\/1409.3215. https:\/\/api.semanticscholar.org\/CorpusID:7961699"},{"key":"e_1_3_2_2_55_1","volume-title":"Transformer Memory as a Differentiable Search Index. ArXiv","author":"Tay Yi","unstructured":"Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, and Donald Metzler. 2022. Transformer Memory as a Differentiable Search Index. ArXiv, Vol. abs\/2202.06991. https:\/\/api.semanticscholar.org\/CorpusID:246863488"},{"key":"e_1_3_2_2_56_1","volume-title":"Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, et al.","author":"Thoppilan Romal","year":"2022","unstructured":"Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, et al. 2022. Lamda: Language models for dialog applications. arXiv preprint arXiv:2201.08239."},{"key":"e_1_3_2_2_57_1","first-page":"2579","article-title":"Visualizing Data using t-SNE","volume":"9","author":"van der Maaten Laurens","year":"2008","unstructured":"Laurens van der Maaten and Geoffrey E. Hinton. 2008. Visualizing Data using t-SNE. Journal of Machine Learning Research, Vol. 9, 2579--2605. https:\/\/api.semanticscholar.org\/CorpusID:5855042","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2014.2324592"},{"key":"e_1_3_2_2_59_1","volume-title":"A Neural Corpus Indexer for Document Retrieval. 
ArXiv","author":"Wang Yujing","unstructured":"Yujing Wang, Ying Hou, Hong Wang, Ziming Miao, Shibin Wu, Hao Sun, Qi Chen, Yuqing Xia, Chengmin Chi, Guoshuai Zhao, Zheng Liu, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, and Mao Yang. 2022. A Neural Corpus Indexer for Document Retrieval. ArXiv, Vol. abs\/2206.02743. https:\/\/api.semanticscholar.org\/CorpusID:249395549"},{"key":"e_1_3_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3583780.3614993"},{"key":"e_1_3_2_2_61_1","volume-title":"RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder. In Conference on Empirical Methods in Natural Language Processing. https:\/\/api.semanticscholar.org\/CorpusID:252917569","author":"Xiao Shitao","year":"2022","unstructured":"Shitao Xiao, Zheng Liu, Yingxia Shao, and Zhao Cao. 2022. RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder. In Conference on Empirical Methods in Natural Language Processing. https:\/\/api.semanticscholar.org\/CorpusID:252917569"},{"key":"e_1_3_2_2_62_1","volume-title":"Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. ArXiv","author":"Xiong Lee","year":"2020","unstructured":"Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul Bennett, Junaid Ahmed, and Arnold Overwijk. 2020. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. ArXiv, Vol. abs\/2007.00808. https:\/\/api.semanticscholar.org\/CorpusID:220302524"},{"key":"e_1_3_2_2_63_1","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (Shinjuku","author":"Zamani Hamed","unstructured":"Hamed Zamani and W. Bruce Croft. 2017. Relevance-Based Word Embedding. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (Shinjuku, Tokyo, Japan) (SIGIR '17). 
505--514."},{"key":"e_1_3_2_2_64_1","volume-title":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management. https:\/\/api.semanticscholar.org\/CorpusID:52229883","author":"Zamani Hamed","unstructured":"Hamed Zamani, Mostafa Dehghani, W. Bruce Croft, Erik G. Learned-Miller, and J. Kamps. 2018. From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing. Proceedings of the 27th ACM International Conference on Information and Knowledge Management. https:\/\/api.semanticscholar.org\/CorpusID:52229883"},{"key":"e_1_3_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3539618.3591626"},{"key":"e_1_3_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531791"},{"key":"e_1_3_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462880"},{"key":"e_1_3_2_2_68_1","volume-title":"Term-Sets Can Be Strong Document Identifiers For Auto-Regressive Search Engines. ArXiv","author":"Zhang Peitian","unstructured":"Peitian Zhang, Zheng Liu, Yujia Zhou, Zhicheng Dou, and Zhao Cao. 2023. Term-Sets Can Be Strong Document Identifiers For Auto-Regressive Search Engines. ArXiv, Vol. abs\/2305.13859. https:\/\/api.semanticscholar.org\/CorpusID:258841428"},{"key":"e_1_3_2_2_69_1","volume-title":"Peitian Zhang, and Ji rong Wen.","author":"Zhou Yujia","year":"2022","unstructured":"Yujia Zhou, Jing Yao, Zhicheng Dou, Ledell Yu Wu, Peitian Zhang, and Ji rong Wen. 2022. Ultron: An Ultimate Retriever on Corpus with a Model-based Indexer. ArXiv, Vol. abs\/2208.09257. https:\/\/api.semanticscholar.org\/CorpusID:251710261"},{"key":"e_1_3_2_2_70_1","volume-title":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. https:\/\/api.semanticscholar.org\/CorpusID:252993059","author":"Zhuang Honglei","year":"2022","unstructured":"Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, Jing Lu, Jianmo Ni, Xuanhui Wang, and Michael Bendersky. 
2022a. RankT5: Fine-Tuning T5 for Text Ranking with Ranking Losses. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. https:\/\/api.semanticscholar.org\/CorpusID:252993059"},{"key":"e_1_3_2_2_71_1","volume-title":"Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation. ArXiv","author":"Zhuang Shengyao","unstructured":"Shengyao Zhuang, Houxing Ren, Linjun Shou, Jian Pei, Ming Gong, G. Zuccon, and Daxin Jiang. 2022b. Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation. ArXiv, Vol. abs\/2206.10128. https:\/\/api.semanticscholar.org\/CorpusID:249890267"}],"event":{"name":"WWW '24: The ACM Web Conference 2024","location":"Singapore Singapore","acronym":"WWW '24","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"]},"container-title":["Proceedings of the ACM Web Conference 2024"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589334.3645477","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3589334.3645477","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T00:22:43Z","timestamp":1755822163000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589334.3645477"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,13]]},"references-count":71,"alternative-id":["10.1145\/3589334.3645477","10.1145\/3589334"],"URL":"https:\/\/doi.org\/10.1145\/3589334.3645477","relation":{},"subject":[],"published":{"date-parts":[[2024,5,13]]},"assertion":[{"value":"2024-05-13","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}