{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T22:54:53Z","timestamp":1777762493886,"version":"3.51.4"},"reference-count":323,"publisher":"Emerald","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,8,18]]},"abstract":"<jats:p>The core of information retrieval (IR) is to identify relevant information from large-scale resources and return it as a ranked list to respond to user\u2019s information need. In recent years, the resurgence of deep learning has greatly advanced this field and leads to a hot topic named NeuIR (i.e., neural information retrieval), especially the paradigm of pre-training methods (PTMs). Owing to sophisticated pre-training objectives and huge model size, pre-trained models can learn universal language representations from massive textual data, which are beneficial to the ranking task of IR. Recently, a large number of works, which are dedicated to the application of PTMs in IR, have been introduced to promote the retrieval performance. Considering the rapid progress of this direction, this survey aims to provide a systematic review of pre-training methods in IR. To be specific, we present an overview of PTMs applied in different components of an IR system, including the retrieval component, the re-ranking component, and other components. In addition, we also introduce PTMs specifically designed for IR, and summarize available datasets as well as benchmark leaderboards. 
Moreover, we discuss some open challenges and highlight several promising directions, with the hope of inspiring and facilitating more works on these topics for future research.<\/jats:p>","DOI":"10.1561\/1500000100","type":"journal-article","created":{"date-parts":[[2022,8,18]],"date-time":"2022-08-18T04:19:11Z","timestamp":1660796351000},"page":"178-317","source":"Crossref","is-referenced-by-count":45,"title":["Pre-training Methods in Information Retrieval"],"prefix":"10.1108","volume":"16","author":[{"given":"Yixing","family":"Fan","sequence":"first","affiliation":[{"name":"ICT, CAS ,","place":["China"]}]},{"given":"Xiaohui","family":"Xie","sequence":"additional","affiliation":[{"name":"Tsinghua University ,","place":["China"]}]},{"given":"Yinqiong","family":"Cai","sequence":"additional","affiliation":[{"name":"ICT, CAS ,","place":["China"]}]},{"given":"Jia","family":"Chen","sequence":"additional","affiliation":[{"name":"Tsinghua University ,","place":["China"]}]},{"given":"Xinyu","family":"Ma","sequence":"additional","affiliation":[{"name":"ICT, CAS ,","place":["China"]}]},{"given":"Xiangsheng","family":"Li","sequence":"additional","affiliation":[{"name":"Tsinghua University ,","place":["China"]}]},{"given":"Ruqing","family":"Zhang","sequence":"additional","affiliation":[{"name":"ICT, CAS ,","place":["China"]}]},{"given":"Jiafeng","family":"Guo","sequence":"additional","affiliation":[{"name":"ICT, CAS ,","place":["China"]}]}],"member":"140","published-online":{"date-parts":[[2022,8,18]]},"reference":[{"key":"2026040314422548300_ref001","first-page":"9","volume-title":"COLING 2010, 23rd International Conference on Computational Linguistics, Posters Volume, 23-27 August 2010, Beijing, China","author":"Agirre","year":"2010"},{"issue":"4","key":"2026040314422548300_ref002","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3417996","article-title":"Learning Unsupervised Knowledge-Enhanced Representations to Reduce the Semantic Gap in Information 
Retrieval","volume":"38","author":"Agosti","year":"2020","journal-title":"ACM Transactions on Information Systems"},{"key":"2026040314422548300_ref003","volume-title":"Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval","author":"Ai","year":"2016"},{"key":"2026040314422548300_ref004","volume-title":"Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval","author":"Ai","year":"2016"},{"key":"2026040314422548300_ref005","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Alberti","year":"2019"},{"key":"2026040314422548300_ref006","volume-title":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Aliannejadi","year":"2019"},{"issue":"4","key":"2026040314422548300_ref007","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1145\/582415.582416","article-title":"Probabilistic models of information retrieval based on measuring the divergence from randomness","volume":"20","author":"Amati","year":"2002","journal-title":"ACM Transactions on Information Systems"},{"key":"2026040314422548300_ref008","article-title":"Toward Word Embedding for Personalized Information Retrieval","volume-title":"CoRR","author":"Amer","year":"2016"},{"key":"2026040314422548300_ref009","volume-title":"2015 IEEE International Conference on Computer Vision (ICCV)","author":"Antol","year":"2015"},{"issue":"101374","key":"2026040314422548300_ref010","article-title":"ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms","volume":"87","author":"Aum\u00fcller","year":"2020","journal-title":"Information Systems"},{"key":"2026040314422548300_ref011","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/D19-5402","article-title":"Summary Level Training of Sentence 
Rewriting for Abstractive Summarization","volume-title":"CoRR","author":"Bae","year":"2019"},{"key":"2026040314422548300_ref012","article-title":"SparTerm: Learning Term-based Sparse Representation for Fast Text Retrieval","volume-title":"CoRR","author":"Bai","year":"2020"},{"key":"2026040314422548300_ref013","first-page":"642","volume-title":"Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event","author":"Bao","year":"2020"},{"key":"2026040314422548300_ref014","article-title":"Query Focused Abstractive Summarization: Incorporating Query Relevance, Multi-Document Coverage, and Summary Length Constraints into seq2seq Models","volume-title":"CoRR","author":"Baumel","year":"2018"},{"key":"2026040314422548300_ref015","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Beltagy","year":"2019"},{"key":"2026040314422548300_ref016","article-title":"Longformer: The Long-Document Transformer","volume-title":"CoRR","author":"Beltagy","year":"2020"},{"issue":"8","key":"2026040314422548300_ref017","doi-asserted-by":"crossref","first-page":"1798","DOI":"10.1109\/TPAMI.2013.50","article-title":"Representation Learning: A Review and New Perspectives","volume":"35","author":"Bengio","year":"2013","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"2026040314422548300_ref018","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Bi","year":"2020"},{"key":"2026040314422548300_ref019","volume-title":"Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval","author":"Bi","year":"2021"},{"key":"2026040314422548300_ref020","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development 
in Information Retrieval","author":"Bi","year":"2021"},{"key":"2026040314422548300_ref021","volume-title":"Proceedings of the Web Conference 2021","author":"Bi","year":"2021"},{"key":"2026040314422548300_ref022","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1162\/tacl_a_00051","article-title":"Enriching Word Vectors with Subword Information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Trans. Assoc. Comput. Linguistics"},{"key":"2026040314422548300_ref023","first-page":"737","volume-title":"Advances in Neural Information Processing Systems 6, [7th NIPS Conference, Denver, Colorado, USA, 1993]","author":"Bromley","year":"1993"},{"key":"2026040314422548300_ref024","article-title":"Language Models are Few-Shot Learners","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual","author":"Brown","year":"2020"},{"key":"2026040314422548300_ref025","volume-title":"Proceedings of the 22nd international conference on Machine learning - ICML \u201905","author":"Burges","year":"2005"},{"key":"2026040314422548300_ref026","first-page":"193","volume-title":"Advances in Neural Information Processing Systems 19, Proceedings of the Twentieth Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 4-7, 2006","author":"Burges","year":"2006"},{"key":"2026040314422548300_ref027","doi-asserted-by":"crossref","first-page":"565","DOI":"10.1007\/978-3-030-58539-6_34","volume-title":"Computer Vision - ECCV 2020","author":"Cao","year":"2020"},{"issue":"2","key":"2026040314422548300_ref028","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1145\/3130348.3130369","article-title":"The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries","volume":"51","author":"Carbonell","year":"2017","journal-title":"ACM SIGIR 
Forum"},{"key":"2026040314422548300_ref029","article-title":"Pretraining Tasks for Embedding-based Large-scale Retrieval","volume-title":"8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020","author":"Chang","year":"2020"},{"key":"2026040314422548300_ref030","doi-asserted-by":"crossref","DOI":"10.1145\/3477495.3531943","article-title":"Axiomatically Regularized Pre-training for Ad hoc Search","volume-title":"Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Chen","year":"2022"},{"issue":"3","key":"2026040314422548300_ref031","first-page":"1","article-title":"A Hybrid Framework for Session Context Modeling","volume":"39","author":"Chen","year":"2021","journal-title":"ACM Transactions on Information Systems"},{"key":"2026040314422548300_ref032","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Chen","year":"2020"},{"key":"2026040314422548300_ref033","volume-title":"Proceedings of The Web Conference 2020","author":"Chen","year":"2020"},{"key":"2026040314422548300_ref034","article-title":"KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction","volume-title":"CoRR","author":"Chen","year":"2021"},{"key":"2026040314422548300_ref035","first-page":"241","volume-title":"Lecture Notes in Computer Science","author":"Chen","year":"2021"},{"key":"2026040314422548300_ref036","volume-title":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Chen","year":"2020"},{"key":"2026040314422548300_ref037","volume-title":"Special interest tracks and posters of the fourteenth international conference on World Wide Web - WWW \u201905","author":"Chiang","year":"2005"},{"key":"2026040314422548300_ref038","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and 
Development in Information Retrieval","author":"Choi","year":"2021"},{"key":"2026040314422548300_ref039","volume-title":"Interspeech 2020","author":"Chuang","year":"2020"},{"key":"2026040314422548300_ref040","first-page":"100","volume-title":"Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, CVSM@ACL 2013, Sofia, Bulgaria, August 9, 2013","author":"Clinchant","year":"2013"},{"key":"2026040314422548300_ref041","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation, LREC 2018, Miyazaki, Japan, May 7-12, 2018","author":"Conneau","year":"2018"},{"key":"2026040314422548300_ref042","first-page":"7057","article-title":"Cross-lingual Language Model Pretraining","volume-title":"Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada","author":"Conneau","year":"2019"},{"key":"2026040314422548300_ref043","first-page":"641","volume-title":"Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, December 3-8, 2001, Vancouver, British Columbia, Canada]","author":"Crammer","year":"2001"},{"issue":"2","key":"2026040314422548300_ref044","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1145\/3053408.3053425","article-title":"Report on the SIGIR 2016 Workshop on Neural Information Retrieval (Neu-IR)","volume":"50","author":"Craswell","year":"2017","journal-title":"ACM SIGIR Forum"},{"key":"2026040314422548300_ref045","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Craswell","year":"2021"},{"key":"2026040314422548300_ref046","article-title":"Language Modeling for Information Retrieval","volume-title":"The Springer International Series on Information 
Retrieval","author":"Croft","year":"2003"},{"key":"2026040314422548300_ref047","article-title":"Context-Aware Sentence\/Passage Term Importance Estimation For First Stage Retrieval","volume-title":"CoRR","author":"Dai","year":"2019"},{"key":"2026040314422548300_ref048","volume-title":"Proceedings of 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Dai","year":"2019"},{"key":"2026040314422548300_ref049","volume-title":"Proceedings of The Web Conference 2020","author":"Dai","year":"2020"},{"key":"2026040314422548300_ref050","volume-title":"Proceedings of 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Dai","year":"2020"},{"key":"2026040314422548300_ref051","first-page":"1","article-title":"Overview of DUC 2005","volume":"2005","author":"Dang","year":"2005","journal-title":"Proceedings of document understanding conference"},{"key":"2026040314422548300_ref052","volume-title":"Proceedings of 2017 ACM on Conference on Information and Knowledge Management","author":"Dehghani","year":"2017"},{"key":"2026040314422548300_ref053","volume-title":"Proceedings of 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Dehghani","year":"2017"},{"key":"2026040314422548300_ref054","volume-title":"Proceedings of 2019 Conference of the North","author":"Devlin","year":"2019"},{"key":"2026040314422548300_ref055","volume-title":"Proceedings of 54th Annual Meeting of Association for Computational Linguistics (Volume 1: Long Papers)","author":"Diaz","year":"2016"},{"key":"2026040314422548300_ref056","volume-title":"Proceedings of The Twenty-Sixth Text REtrieval Conference, TREC 2017, Gaithersburg, Maryland, USA, November 15-17, 2017","author":"Dietz","year":"2017"},{"key":"2026040314422548300_ref057","volume-title":"Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human 
Language Technologies","author":"Dou","year":"2021"},{"key":"2026040314422548300_ref058","article-title":"A Decade Survey of Content Based Image Retrieval using Deep Learning","volume-title":"CoRR","author":"Dubey","year":"2020"},{"issue":"3","key":"2026040314422548300_ref059","doi-asserted-by":"crossref","first-page":"403","DOI":"10.14778\/3368289.3368303","article-title":"Return of the Lernaean Hydra","volume":"13","author":"Echihabi","year":"2019","journal-title":"Proceedings of the VLDB Endowment"},{"key":"2026040314422548300_ref060","volume-title":"Proceedings of 35th international ACM SIGIR conference on Research and development in information retrieval - SIGIR \u201912","author":"Efron","year":"2012"},{"key":"2026040314422548300_ref061","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1613\/jair.1523","article-title":"LexRank: Graph-based Lexical Centrality as Salience in Text Summarization","volume":"22","author":"Erkan","year":"2004","journal-title":"J. Artif. Intell. Res."},{"key":"2026040314422548300_ref062","volume-title":"Proceedings of Web Conference 2021","author":"Fan","year":"2021"},{"key":"2026040314422548300_ref063","article-title":"Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity","volume-title":"CoRR","author":"Fedus","year":"2021"},{"key":"2026040314422548300_ref064","volume-title":"Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Fei","year":"2021"},{"key":"2026040314422548300_ref065","volume-title":"Proceedings of 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Feigenblat","year":"2017"},{"key":"2026040314422548300_ref066","volume-title":"Proceedings of Document Understanding Conference, DUC-2006, New York, USA","author":"Fisher","year":"2006"},{"key":"2026040314422548300_ref067","volume-title":"Proceedings of 44th International ACM 
SIGIR Conference on Research and Development in Information Retrieval","author":"Formal","year":"2021"},{"key":"2026040314422548300_ref068","volume-title":"Proceedings of 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Frej","year":"2020"},{"issue":"5","key":"2026040314422548300_ref069","article-title":"Efficiency Considerations for Scalable Information Retrieval Servers","volume":"1","author":"Frieder","year":"2000","journal-title":"J. Digit. Inf."},{"key":"2026040314422548300_ref070","article-title":"Compressing Large-Scale Transformer-Based Models: A Case Study on BERT","volume-title":"CoRR","author":"Ganesh","year":"2020"},{"key":"2026040314422548300_ref071","volume-title":"Proceedings of 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Ganguly","year":"2015"},{"key":"2026040314422548300_ref072","volume-title":"Proceedings of 2021 Conference on Empirical Methods in Natural Language Processing","author":"Gao","year":"2021"},{"key":"2026040314422548300_ref073","article-title":"Is Your Language Model Ready for Dense Representation Fine-tuning?","volume-title":"CoRR","author":"Gao","year":"2021"},{"key":"2026040314422548300_ref074","volume-title":"Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Gao","year":"2020"},{"key":"2026040314422548300_ref075","volume-title":"Proceedings of 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","author":"Gao","year":"2020"},{"key":"2026040314422548300_ref076","volume-title":"Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Gao","year":"2021"},{"key":"2026040314422548300_ref077","first-page":"280","volume-title":"Lecture Notes in Computer Science","author":"Gao","year":"2021"},{"key":"2026040314422548300_ref078","article-title":"Complementing 
Lexical Retrieval with Semantic Residual Embedding","volume-title":"CoRR","author":"Gao","year":"2020"},{"key":"2026040314422548300_ref079","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Gao","year":"2021"},{"key":"2026040314422548300_ref080","article-title":"End-to-End Retrieval in Continuous Space","volume-title":"CoRR","author":"Gillick","year":"2018"},{"key":"2026040314422548300_ref081","volume-title":"Proceedings of 39th International ACM SIGIR conference on Research and Development in Information Retrieval","author":"Grbovic","year":"2016"},{"key":"2026040314422548300_ref082","volume-title":"Proceedings of 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Grbovic","year":"2015"},{"key":"2026040314422548300_ref083","volume-title":"Proceedings of 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Grover","year":"2016"},{"issue":"4","key":"2026040314422548300_ref084","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3486250","article-title":"Semantic Models for First-Stage Retrieval: A Comprehensive Review","volume":"40","author":"Guo","year":"2022","journal-title":"ACM Transactions on Information Systems"},{"key":"2026040314422548300_ref085","volume-title":"Proceedings of 25th ACM International on Conference on Information and Knowledge Management","author":"Guo","year":"2016"},{"issue":"6","key":"2026040314422548300_ref086","doi-asserted-by":"crossref","first-page":"102067","DOI":"10.1016\/j.ipm.2019.102067","article-title":"A Deep Look into neural ranking models for information retrieval","volume":"57","author":"Guo","year":"2020","journal-title":"Information Processing and Management"},{"key":"2026040314422548300_ref087","article-title":"REALM: Retrieval-Augmented Language Model 
Pre-Training","volume-title":"CoRR","author":"Guu","year":"2020"},{"issue":"4","key":"2026040314422548300_ref088","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3196826","article-title":"Neural Vector Spaces for Unsupervised Information Retrieval","volume":"36","author":"Gysel","year":"2018","journal-title":"ACM Transactions on Information Systems"},{"key":"2026040314422548300_ref089","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.datak.2016.06.003","article-title":"Question answering in conversations: Query refinement using contextual and semantic information","volume":"106","author":"Habibi","year":"2016","journal-title":"Data and Knowledge Engineering"},{"key":"2026040314422548300_ref090","article-title":"PTR: Prompt Tuning with Rules for Text Classification","volume-title":"CoRR","author":"Han","year":"2021"},{"key":"2026040314422548300_ref091","article-title":"Dynamic Neural Networks: A Survey","volume-title":"CoRR","author":"Han","year":"2021"},{"key":"2026040314422548300_ref092","volume-title":"Proceedings of 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Hashemi","year":"2020"},{"key":"2026040314422548300_ref093","volume-title":"9th International Conference on Artificial Neural Networks: ICANN \u201999","author":"Herbrich","year":"1999"},{"key":"2026040314422548300_ref094","article-title":"Distilling the Knowledge in a Neural Network","volume-title":"CoRR","author":"Hinton","year":"2015"},{"key":"2026040314422548300_ref095","article-title":"Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation","volume-title":"CoRR","author":"Hofst\u00e4tter","year":"2020"},{"key":"2026040314422548300_ref096","volume-title":"Proceedings of 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Hofst\u00e4tter","year":"2021"},{"key":"2026040314422548300_ref097","volume-title":"Proceedings of 44th 
International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Hofst\u00e4tter","year":"2021"},{"key":"2026040314422548300_ref098","first-page":"2790","volume-title":"Proceedings of 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA","author":"Houlsby","year":"2019"},{"key":"2026040314422548300_ref099","volume-title":"Proceedings of 56th Annual Meeting of Association for Computational Linguistics (Volume 1: Long Papers)","author":"Howard","year":"2018"},{"key":"2026040314422548300_ref100","first-page":"2042","article-title":"Convolutional Neural Network Architectures for Matching Natural Language Sentences","volume-title":"Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada","author":"Hu","year":"2014"},{"key":"2026040314422548300_ref101","article-title":"LoRA: Low-Rank Adaptation of Large Language Models","volume-title":"CoRR","author":"Hu","year":"2021"},{"key":"2026040314422548300_ref102","volume-title":"Proceedings of 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Huang","year":"2019"},{"key":"2026040314422548300_ref103","volume-title":"Proceedings of 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Huang","year":"2020"},{"key":"2026040314422548300_ref104","volume-title":"Proceedings of 22nd ACM international conference on Conference on information and knowledge management - CIKM \u201913","author":"Huang","year":"2013"},{"key":"2026040314422548300_ref105","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1016\/j.neucom.2021.07.013","article-title":"Element graph-augmented abstractive summarization for legal public opinion news with graph 
transformer","volume":"460","author":"Huang","year":"2021","journal-title":"Neurocomputing"},{"key":"2026040314422548300_ref106","article-title":"WenLan: Bridging vision and language by large-scale multi-modal pre-training","volume-title":"arXiv preprint arXiv:2103.06561","author":"Huo","year":"2021"},{"key":"2026040314422548300_ref107","volume-title":"Proceedings of Thirteenth Text REtrieval Conference, TREC 2004, Gaithersburg, Maryland, USA, November 16-19, 2004","author":"Jaleel","year":"2004"},{"key":"2026040314422548300_ref108","article-title":"UHD-BERT: Bucketed Ultra-High Dimensional Sparse Representations for Full Ranking","volume-title":"CoRR","author":"Jang","year":"2021"},{"issue":"1","key":"2026040314422548300_ref109","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1109\/TPAMI.2010.57","article-title":"Product Quantization for Nearest Neighbor Search","volume":"33","author":"J\u00e9gou","year":"2011","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"2026040314422548300_ref110","volume-title":"Proceedings of 27th ACM International Conference on Information and Knowledge Management","author":"Jiang","year":"2018"},{"key":"2026040314422548300_ref111","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1162\/tacl_a_00324","article-title":"How Can We Know What Language Models Know","volume":"8","author":"Jiang","year":"2020","journal-title":"Trans. Assoc. Comput. 
Linguistics"},{"key":"2026040314422548300_ref112","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Jiao","year":"2020"},{"key":"2026040314422548300_ref113","volume-title":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Johnson","year":"2016"},{"key":"2026040314422548300_ref114","article-title":"Semi-Siamese Bi-encoder Neural Ranking Model Using Lightweight Fine-Tuning","volume-title":"CoRR","author":"Jung","year":"2021"},{"key":"2026040314422548300_ref115","volume-title":"Proceedings of 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC)","author":"K\u00e2geb\u00e4ck","year":"2014"},{"key":"2026040314422548300_ref116","volume-title":"Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Karpukhin","year":"2020"},{"key":"2026040314422548300_ref117","article-title":"SentiLR: Linguistic Knowledge Enhanced Language Representation for Sentiment Analysis","volume-title":"CoRR","author":"Ke","year":"2019"},{"key":"2026040314422548300_ref118","first-page":"1411","volume-title":"Proceedings of 24th ACM International on Conference on Information and Knowledge Management. 
CIKM \u201915","author":"Kenter","year":"2015"},{"key":"2026040314422548300_ref119","volume-title":"Proceedings of 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Khattab","year":"2020"},{"key":"2026040314422548300_ref120","volume-title":"Proceedings of 2015 Conference on Empirical Methods in Natural Language Processing","author":"Kobayashi","year":"2015"},{"key":"2026040314422548300_ref121","article-title":"AQua-MuSe: Automatically Generating Datasets for Query-Based Multi-Document Summarization","volume-title":"CoRR","author":"Kulkarni","year":"2020"},{"key":"2026040314422548300_ref122","volume-title":"Proceedings of 29th ACM International Conference on Information and Knowledge Management","author":"Kumar","year":"2020"},{"key":"2026040314422548300_ref123","volume-title":"Proceedings of 27th annual international conference on Research and development in information retrieval - SIGIR \u201904","author":"Kurland","year":"2004"},{"key":"2026040314422548300_ref124","volume-title":"Proceedings of 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Kuzi","year":"2017"},{"key":"2026040314422548300_ref125","volume-title":"Proceedings of 25th ACM International on Conference on Information and Knowledge Management","author":"Kuzi","year":"2016"},{"key":"2026040314422548300_ref126","article-title":"Leveraging Semantic and Lexical Matching to Improve Recall of Document Retrieval Systems: A Hybrid Approach","volume-title":"CoRR","author":"Kuzi","year":"2020"},{"key":"2026040314422548300_ref127","first-page":"452","article-title":"Natural Questions: a Benchmark for Question Answering Research","volume":"7","author":"Kwiatkowski","year":"2019","journal-title":"Trans. Assoc. Comput. 
Linguistics"},{"key":"2026040314422548300_ref128","first-page":"1","volume-title":"Language modeling for information retrieval","author":"Lafferty","year":"2003"},{"key":"2026040314422548300_ref129","article-title":"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations","volume-title":"8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020","author":"Lan","year":"2020"},{"key":"2026040314422548300_ref130","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1007\/978-3-030-47358-7_35","volume-title":"Advances in Artificial Intelligence","author":"Laskar","year":"2020"},{"key":"2026040314422548300_ref131","volume-title":"Proceedings of 28th International Conference on Computational Linguistics","author":"Laskar","year":"2020"},{"issue":"2","key":"2026040314422548300_ref132","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1145\/3130348.3130376","article-title":"Relevance-Based Language Models","volume":"51","author":"Lavrenko","year":"2017","journal-title":"ACM SIGIR Forum"},{"key":"2026040314422548300_ref133","first-page":"1188","volume-title":"Proceedings of 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014","author":"Le","year":"2014"},{"key":"2026040314422548300_ref134","volume-title":"Proceedings of 58th Annual Meeting of Association for Computational Linguistics","author":"Lee","year":"2020"},{"key":"2026040314422548300_ref135","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume-title":"Bioinformatics","author":"Lee","year":"2019"},{"key":"2026040314422548300_ref136","volume-title":"Proceedings of 57th Annual Meeting of Association for Computational Linguistics","author":"Lee","year":"2019"},{"key":"2026040314422548300_ref137","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1007\/978-3-030-01225-0_13","volume-title":"Computer Vision - ECCV 
2018","author":"Lee","year":"2018"},{"key":"2026040314422548300_ref138","volume-title":"Proceedings of 2021 Conference on Empirical Methods in Natural Language Processing","author":"Lester","year":"2021"},{"key":"2026040314422548300_ref139","volume-title":"Proceedings of 58th Annual Meeting of Association for Computational Linguistics","author":"Lewis","year":"2020"},{"key":"2026040314422548300_ref140","article-title":"Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual","author":"Lewis","year":"2020"},{"key":"2026040314422548300_ref141","article-title":"PARADE: Passage Representation Aggregation for Document Reranking","volume-title":"CoRR","author":"Li","year":"2020"},{"issue":"3","key":"2026040314422548300_ref142","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-3-031-02155-8","article-title":"Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition","volume":"7","author":"Li","year":"2014","journal-title":"Synthesis Lectures on Human Language Technologies"},{"issue":"5","key":"2026040314422548300_ref143","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1561\/1500000035","article-title":"Semantic Matching in Search","volume":"7","author":"Li","year":"2014","journal-title":"Foundations and Trends\u00ae in Information Retrieval"},{"key":"2026040314422548300_ref144","volume-title":"Proceedings of 58th Annual Meeting of Association for Computational Linguistics","author":"Li","year":"2020"},{"key":"2026040314422548300_ref145","first-page":"897","volume-title":"Advances in Neural Information Processing Systems 20, Proceedings of Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-6, 
2007","author":"Li","year":"2007"},{"key":"2026040314422548300_ref146","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Li","year":"2021"},{"issue":"8","key":"2026040314422548300_ref147","doi-asserted-by":"crossref","first-page":"1475","DOI":"10.1109\/TKDE.2019.2909204","article-title":"Approximate Nearest Neighbor Search on High Dimensional Data \u2014 Experiments, Analyses, and Improvement","volume":"32","author":"Li","year":"2020","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"2026040314422548300_ref148","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Li","year":"2021"},{"key":"2026040314422548300_ref149","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1007\/978-3-030-58577-8_8","volume-title":"Computer Vision - ECCV 2020","author":"Li","year":"2020"},{"key":"2026040314422548300_ref150","article-title":"Embedding-based Zero-shot Retrieval through Query Generation","volume-title":"CoRR","author":"Liang","year":"2020"},{"issue":"1","key":"2026040314422548300_ref151","doi-asserted-by":"crossref","first-page":"45","DOI":"10.14801\/JAITC.2020.10.1.45","article-title":"Fine-tuning BERT Models for Keyphrase Extraction in Scientific Articles","volume":"10","author":"Lim","year":"2020","journal-title":"Journal of Advanced Information Technology and Convergence"},{"key":"2026040314422548300_ref152","article-title":"A Few Brief Notes on Deepimpact, COIL, and a Conceptual Framework for Information Retrieval Techniques","volume-title":"CoRR","author":"Lin","year":"2021"},{"issue":"4","key":"2026040314422548300_ref153","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-3-031-02181-7","article-title":"Pretrained 
Transformers for Text Ranking: BERT and Beyond","volume":"14","author":"Lin","year":"2021","journal-title":"Synthesis Lectures on Human Language Technologies"},{"key":"2026040314422548300_ref154","volume-title":"Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)","author":"Lin","year":"2021"},{"key":"2026040314422548300_ref155","article-title":"Query Reformulation using Query History for Passage Retrieval in Conversational Search","volume-title":"CoRR","author":"Lin","year":"2020"},{"key":"2026040314422548300_ref156","volume-title":"Proceedings of the Web Conference 2021","author":"Liu","year":"2021"},{"key":"2026040314422548300_ref157","volume-title":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Liu","year":"2021"},{"key":"2026040314422548300_ref158","article-title":"Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing","volume-title":"CoRR","author":"Liu","year":"2021"},{"key":"2026040314422548300_ref159","doi-asserted-by":"crossref","first-page":"3180","DOI":"10.1109\/TASLP.2021.3120587","article-title":"Addressing Extraction and Generation Separately: Keyphrase Prediction With Pre-Trained Language Models","volume":"29","author":"Liu","year":"2021","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"3","key":"2026040314422548300_ref160","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1561\/1500000016","article-title":"Learning to Rank for Information Retrieval","volume":"3","author":"Liu","year":"2007","journal-title":"Foundations and Trends\u00ae in Information Retrieval"},{"issue":"1","key":"2026040314422548300_ref161","first-page":"1","article-title":"Self-supervised Learning: Generative or Contrastive","volume":"1","author":"Liu","year":"2021","journal-title":"IEEE Transactions on Knowledge and Data 
Engineering"},{"key":"2026040314422548300_ref162","article-title":"GPT Understands, Too","volume-title":"CoRR","author":"Liu","year":"2021"},{"key":"2026040314422548300_ref163","volume-title":"Proceedings of the 27th annual international conference on Research and development in information retrieval - SIGIR \u201904","author":"Liu","year":"2004"},{"issue":"1","key":"2026040314422548300_ref164","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/aris.1440390108","article-title":"Statistical language modeling for information retrieval","volume":"39","author":"Liu","year":"2006","journal-title":"Annual Review of Information Science and Technology"},{"key":"2026040314422548300_ref165","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Liu","year":"2019"},{"key":"2026040314422548300_ref166","volume-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","author":"Liu","year":"2021"},{"key":"2026040314422548300_ref167","article-title":"RoBERTa: A Robustly Optimized BERT Pretraining Approach","volume-title":"CoRR","author":"Liu","year":"2019"},{"key":"2026040314422548300_ref168","article-title":"Ranking Clarifying Questions Based on Predicted User Engagement","volume-title":"CoRR","author":"Lotze","year":"2021"},{"key":"2026040314422548300_ref169","first-page":"13","article-title":"ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks","volume-title":"Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada","author":"Lu","year":"2019"},{"key":"2026040314422548300_ref170","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language 
Processing","author":"Lu","year":"2021"},{"key":"2026040314422548300_ref171","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1162\/tacl_a_00369","article-title":"Sparse, Dense, and Attentional Representations for Text Retrieval","volume":"9","author":"Luan","year":"2021","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2026040314422548300_ref172","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Luo","year":"2017"},{"key":"2026040314422548300_ref173","first-page":"205","volume-title":"Lecture Notes in Computer Science","author":"Luo","year":"2017"},{"key":"2026040314422548300_ref174","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Ma","year":"2021"},{"key":"2026040314422548300_ref175","volume-title":"Proceedings of the 14th ACM International Conference on Web Search and Data Mining","author":"Ma","year":"2021"},{"key":"2026040314422548300_ref176","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Ma","year":"2021"},{"key":"2026040314422548300_ref177","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Ma","year":"2021"},{"key":"2026040314422548300_ref178","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"MacAvaney","year":"2020"},{"key":"2026040314422548300_ref179","first-page":"1101","volume-title":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, July 21-25, 2019","author":"MacAvaney","year":"2019"},{"key":"2026040314422548300_ref180","volume-title":"Proceedings of the 2018 Conference of the North 
American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)","author":"Mahata","year":"2018"},{"key":"2026040314422548300_ref181","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Mallia","year":"2021"},{"key":"2026040314422548300_ref182","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511809071","article-title":"Introduction to Information Retrieval","volume-title":"Cambridge University Press","author":"Manning","year":"2008"},{"key":"2026040314422548300_ref183","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Mass","year":"2020"},{"key":"2026040314422548300_ref184","first-page":"6294","article-title":"Learned in Translation: Contextualized Word Vectors","volume-title":"Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA","author":"McCann","year":"2017"},{"key":"2026040314422548300_ref185","volume-title":"Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"McCreery","year":"2020"},{"key":"2026040314422548300_ref186","volume-title":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD \u201910","author":"Mei","year":"2010"},{"key":"2026040314422548300_ref187","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/K16-1006","article-title":"context2vec: Learning Generic Context Embedding with Bidirectional LSTM","volume-title":"CoNLL","author":"Melamud","year":"2016"},{"issue":"1","key":"2026040314422548300_ref188","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3476415.3476428","article-title":"Rethinking search","volume":"55","author":"Metzler","year":"2021","journal-title":"ACM SIGIR Forum"},{"key":"2026040314422548300_ref189","article-title":"Efficient Estimation of Word 
Representations in Vector Space","volume-title":"1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings","author":"Mikolov","year":"2013"},{"key":"2026040314422548300_ref190","first-page":"3111","article-title":"Distributed Representations of Words and Phrases and their Compositionality","volume-title":"27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States","author":"Mikolov","year":"2013"},{"key":"2026040314422548300_ref191","first-page":"746","volume-title":"Human Language Technologies: Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings, June 9-14, 2013, Westin Peachtree Plaza Hotel, Atlanta, Georgia, USA","author":"Mikolov","year":"2013"},{"issue":"1","key":"2026040314422548300_ref192","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/1500000061","article-title":"An Introduction to Neural Information Retrieval","volume":"13","author":"Mitra","year":"2018","journal-title":"Foundations and Trends\u00ae in Information Retrieval"},{"key":"2026040314422548300_ref193","volume-title":"Proceedings of the 26th International Conference on World Wide Web","author":"Mitra","year":"2017"},{"key":"2026040314422548300_ref194","article-title":"A Dual Embedding Space Model for Document Ranking","volume-title":"CoRR","author":"Mitra","year":"2016"},{"key":"2026040314422548300_ref195","volume-title":"Proceedings of the 29th ACM International Conference on Information and Knowledge Management","author":"Mitra","year":"2020"},{"key":"2026040314422548300_ref196","doi-asserted-by":"crossref","first-page":"112958","DOI":"10.1016\/j.eswa.2019.112958","article-title":"Text document summarization using word embedding","volume":"143","author":"Mohd","year":"2020","journal-title":"Expert Systems with 
Applications"},{"key":"2026040314422548300_ref197","article-title":"Using BERT and BART for Query Suggestion","volume":"2621","author":"Mustar","year":"2020","journal-title":"Proceedings of the First Joint Conference of Information Retrieval Communities in Europe (CIRCLE 2020), Sardagna, France, July 6-9, 2020"},{"key":"2026040314422548300_ref198","first-page":"467","volume-title":"Lecture Notes in Computer Science","author":"Naseri","year":"2021"},{"key":"2026040314422548300_ref199","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Nema","year":"2017"},{"key":"2026040314422548300_ref200","article-title":"Large Dual Encoders Are Generalizable Retrievers","volume-title":"CoRR","author":"Ni","year":"2021"},{"key":"2026040314422548300_ref201","first-page":"3977","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021","author":"Ni","year":"2021"},{"key":"2026040314422548300_ref202","article-title":"Passage Re-ranking with BERT","volume-title":"CoRR","author":"Nogueira","year":"2019"},{"key":"2026040314422548300_ref203","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Nogueira","year":"2020"},{"key":"2026040314422548300_ref204","article-title":"From doc2query to docTTTTTquery","volume":"6","author":"Nogueira","year":"2019","journal-title":"Online preprint"},{"key":"2026040314422548300_ref205","article-title":"Multi-Stage Document Ranking with BERT","volume-title":"CoRR","author":"Nogueira","year":"2019"},{"issue":"2-3","key":"2026040314422548300_ref206","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1007\/s10791-017-9321-y","article-title":"Neural information retrieval: at the end of the early years","volume":"21","author":"Onal","year":"2017","journal-title":"Information Retrieval 
Journal"},{"key":"2026040314422548300_ref207","first-page":"297","volume-title":"Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II","author":"Padaki","year":"2020"},{"issue":"6","key":"2026040314422548300_ref208","doi-asserted-by":"crossref","first-page":"888","DOI":"10.1016\/j.ipm.2018.06.004","article-title":"Local word vectors guiding keyphrase extraction","volume":"54","author":"Papagiannopoulou","year":"2018","journal-title":"Information Processing and Management"},{"key":"2026040314422548300_ref209","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics","author":"Park","year":"2020"},{"key":"2026040314422548300_ref210","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Pennington","year":"2014"},{"key":"2026040314422548300_ref211","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Peters","year":"2018"},{"key":"2026040314422548300_ref212","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Petroni","year":"2019"},{"issue":"2","key":"2026040314422548300_ref213","doi-asserted-by":"crossref","first-page":"202","DOI":"10.1145\/3130348.3130368","article-title":"A Language Modeling Approach to Information Retrieval","volume":"51","author":"Ponte","year":"2017","journal-title":"ACM SIGIR Forum"},{"key":"2026040314422548300_ref214","article-title":"The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models","volume-title":"CoRR","author":"Pradeep","year":"2021"},{"key":"2026040314422548300_ref215","article-title":"Zero-shot Text 
Classification With Generative Language Models","volume-title":"CoRR","author":"Puri","year":"2019"},{"key":"2026040314422548300_ref216","article-title":"Understanding Behaviors of BERT in Ranking","volume-title":"CoRR","author":"Qiao","year":"2019"},{"key":"2026040314422548300_ref217","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1016\/j.eswa.2019.02.001","article-title":"Geoscience keyphrase extraction algorithm using enhanced word embedding","volume":"125","author":"Qiu","year":"2019","journal-title":"Expert Systems with Applications"},{"issue":"10","key":"2026040314422548300_ref218","doi-asserted-by":"crossref","first-page":"1872","DOI":"10.1007\/s11431-020-1647-3","article-title":"Pretrained models for natural language processing: A survey","volume":"63","author":"Qiu","year":"2020","journal-title":"Science China Technological Sciences"},{"key":"2026040314422548300_ref219","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Qu","year":"2021"},{"key":"2026040314422548300_ref220","article-title":"Improving language understanding by generative pre-training","author":"Radford","year":"2018"},{"issue":"8","key":"2026040314422548300_ref221","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI blog"},{"key":"2026040314422548300_ref222","first-page":"140:1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"J. Mach. Learn. 
Res."},{"key":"2026040314422548300_ref223","article-title":"Towards Robust Neural Retrieval Models with Synthetic Pre-Training","volume-title":"arXiv preprint arXiv:2101.07800","author":"Reddy","year":"2021"},{"key":"2026040314422548300_ref224","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Reimers","year":"2019"},{"key":"2026040314422548300_ref225","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Ren","year":"2021"},{"issue":"3","key":"2026040314422548300_ref226","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1002\/asi.4630270302","article-title":"Relevance weighting of search terms","volume":"27","author":"Robertson","year":"1976","journal-title":"Journal of the American Society for Information Science"},{"key":"2026040314422548300_ref227","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1007\/978-1-4471-2099-5_24","volume-title":"SIGIR \u201994","author":"Robertson","year":"1994"},{"issue":"4","key":"2026040314422548300_ref228","first-page":"333","article-title":"The Probabilistic Relevance Framework: BM25 and Beyond","volume":"3","author":"Robertson","year":"2009","journal-title":"Foundations and Trends\u00ae in Information Retrieval"},{"key":"2026040314422548300_ref229","first-page":"109","volume-title":"Proceedings of The Third Text REtrieval Conference, TREC 1994, Gaithersburg, Maryland, USA, November 2-4, 1994","author":"Robertson","year":"1994"},{"key":"2026040314422548300_ref230","volume-title":"Proceedings of The Web Conference 2020","author":"Roitman","year":"2020"},{"key":"2026040314422548300_ref231","volume-title":"Proceedings of The Web Conference 2020","author":"Rosset","year":"2020"},{"key":"2026040314422548300_ref232","article-title":"Representing documents and queries as sets of word embedded vectors for 
information retrieval","volume-title":"Proceedings of Neu-IR: The SIGIR 2016 Workshop on Neural Information Retrieval","author":"Roy","year":"2016"},{"key":"2026040314422548300_ref233","article-title":"Using Word Embeddings for Automatic Query Expansion","volume-title":"CoRR","author":"Roy","year":"2016"},{"key":"2026040314422548300_ref234","article-title":"End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering","volume-title":"CoRR","author":"Sachan","year":"2021"},{"key":"2026040314422548300_ref235","first-page":"328","volume-title":"Lecture Notes in Computer Science","author":"Sahrawat","year":"2020"},{"key":"2026040314422548300_ref236","article-title":"Abstractive Summarization with Combination of Pre-trained Sequence-to-Sequence and Saliency Models","volume-title":"CoRR","author":"Saito","year":"2020"},{"key":"2026040314422548300_ref237","volume-title":"Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Sakata","year":"2019"},{"issue":"11","key":"2026040314422548300_ref238","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1145\/361219.361220","article-title":"A vector space model for automatic indexing","volume":"18","author":"Salton","year":"1975","journal-title":"Communications of the ACM"},{"key":"2026040314422548300_ref239","article-title":"DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter","volume-title":"CoRR","author":"Sanh","year":"2019"},{"key":"2026040314422548300_ref240","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Santosdos","year":"2020"},{"issue":"3","key":"2026040314422548300_ref241","doi-asserted-by":"crossref","first-page":"i","DOI":"10.1007\/978-3-031-02302-6","article-title":"The Notion of Relevance in Information Science: Everybody knows what relevance is. 
But, what is it really?","volume":"8","author":"Saracevic","year":"2016","journal-title":"Synthesis Lectures on Information Concepts, Retrieval, and Services"},{"key":"2026040314422548300_ref242","doi-asserted-by":"crossref","DOI":"10.1038\/s41597-020-00667-z","article-title":"Question-Driven Summarization of Answers to Consumer Health Questions","volume-title":"CoRR","author":"Savery","year":"2020"},{"key":"2026040314422548300_ref243","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Schick","year":"2021"},{"key":"2026040314422548300_ref244","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2021.naacl-main.185","article-title":"It\u2019s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners","volume-title":"ArXiv","author":"Schick","year":"2021"},{"key":"2026040314422548300_ref245","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"See","year":"2017"},{"key":"2026040314422548300_ref246","article-title":"Longformer for MS MARCO Document Re-ranking Task","volume-title":"ArXiv","author":"Sekulic","year":"2020"},{"key":"2026040314422548300_ref247","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Seo","year":"2019"},{"key":"2026040314422548300_ref248","volume-title":"Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management","author":"Shen","year":"2014"},{"key":"2026040314422548300_ref249","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Sherman","year":"2017"},{"key":"2026040314422548300_ref250","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 
2020","author":"Shi","year":"2020"},{"key":"2026040314422548300_ref251","article-title":"Pre-trained Summarization Distillation","volume-title":"CoRR","author":"Shleifer","year":"2020"},{"issue":"2","key":"2026040314422548300_ref252","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1145\/3130348.3130365","article-title":"Pivoted Document Length Normalization","volume":"51","author":"Singhal","year":"2017","journal-title":"ACM SIGIR Forum"},{"key":"2026040314422548300_ref253","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Soldaini","year":"2020"},{"key":"2026040314422548300_ref254","first-page":"5926","volume-title":"Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA","author":"Song","year":"2019"},{"key":"2026040314422548300_ref255","volume-title":"Proceedings of the 24th ACM International on Conference on Information and Knowledge Management","author":"Sordoni","year":"2015"},{"key":"2026040314422548300_ref256","volume-title":"Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020","author":"Su","year":"2020"},{"key":"2026040314422548300_ref257","article-title":"VL-BERT: Pre-training of Generic Visual-Linguistic Representations","volume-title":"8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020","author":"Su","year":"2020"},{"key":"2026040314422548300_ref258","volume-title":"2019 IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Sun","year":"2019"},{"key":"2026040314422548300_ref259","article-title":"Joint Keyphrase Chunking and Salience Ranking with BERT","volume-title":"CoRR","author":"Sun","year":"2020"},{"key":"2026040314422548300_ref260","article-title":"ERNIE: Enhanced Representation through Knowledge 
Integration","volume-title":"CoRR","author":"Sun","year":"2019"},{"key":"2026040314422548300_ref261","doi-asserted-by":"crossref","DOI":"10.1145\/3397271.3401296","article-title":"Distilling Knowledge for Fast Retrieval-based Chat-bots","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Tahami","year":"2020"},{"key":"2026040314422548300_ref262","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Tang","year":"2021"},{"key":"2026040314422548300_ref263","article-title":"Progress Notes Classification and Keyword Extraction using Attention-based Deep Learning Models with BERT","volume-title":"CoRR","author":"Tang","year":"2019"},{"key":"2026040314422548300_ref264","article-title":"Transformer Memory as a Differentiable Search Index","volume-title":"CoRR","author":"Tay","year":"2022"},{"key":"2026040314422548300_ref265","article-title":"BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models","volume-title":"Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)","author":"Thakur","year":"2021"},{"key":"2026040314422548300_ref266","volume-title":"2015 IEEE Information Theory Workshop (ITW)","author":"Tishby","year":"2015"},{"key":"2026040314422548300_ref267","first-page":"5998","article-title":"Attention is All You Need","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA","author":"Vaswani","year":"2017"},{"key":"2026040314422548300_ref268","volume-title":"2015 IEEE Conference on Computer Vision and Pattern Recognition 
(CVPR)","author":"Vinyals","year":"2015"},{"key":"2026040314422548300_ref269","volume-title":"TREC","author":"Voorhees","year":"2004"},{"key":"2026040314422548300_ref270","first-page":"363","volume-title":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR \u201915","author":"Vuli\u0107","year":"2015"},{"key":"2026040314422548300_ref271","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Wang","year":"2019"},{"key":"2026040314422548300_ref272","first-page":"1","article-title":"Corpus-independent generic keyphrase extraction using word embedding vectors","volume":"39","author":"Wang","year":"2014","journal-title":"Software engineering research conference"},{"key":"2026040314422548300_ref273","article-title":"StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding","volume-title":"8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020","author":"Wang","year":"2020"},{"key":"2026040314422548300_ref274","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1162\/tacl_a_00360","article-title":"KEPLER: A Unified Model for Knowledge Embedding and Pretrained Language Representation","volume":"9","author":"Wang","year":"2021","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2026040314422548300_ref275","volume-title":"Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021","author":"Wu","year":"2021"},{"key":"2026040314422548300_ref276","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Wu","year":"2019"},{"key":"2026040314422548300_ref277","doi-asserted-by":"crossref","DOI":"10.1145\/3366423.3380305","article-title":"Leveraging 
Passage-level Cumulative Gain for Document Ranking","volume-title":"Proceedings of The Web Conference 2020","author":"Wu","year":"2020"},{"key":"2026040314422548300_ref278","volume-title":"Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing","author":"Xin","year":"2020"},{"key":"2026040314422548300_ref279","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Xin","year":"2020"},{"key":"2026040314422548300_ref280","article-title":"Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations","volume-title":"CoRR","author":"Xin","year":"2021"},{"key":"2026040314422548300_ref281","doi-asserted-by":"crossref","DOI":"10.1145\/3077136.3080809","article-title":"End-to-End Neural Ad-hoc Ranking with Kernel Pooling","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Xiong","year":"2017"},{"key":"2026040314422548300_ref282","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Xiong","year":"2017"},{"key":"2026040314422548300_ref283","article-title":"Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval","volume-title":"9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021","author":"Xiong","year":"2021"},{"key":"2026040314422548300_ref284","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Xu","year":"2020"},{"key":"2026040314422548300_ref285","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short 
Papers)","author":"Yamada","year":"2021"},{"key":"2026040314422548300_ref286","first-page":"4555","volume-title":"Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI2021, Virtual Event, February 2-9, 2021","author":"Yan","year":"2021"},{"key":"2026040314422548300_ref287","volume-title":"Proceedings of the 29th ACM International Conference on Information and Knowledge Management","author":"Yang","year":"2020"},{"key":"2026040314422548300_ref288","first-page":"5754","article-title":"XLNet: Generalized Autoregressive Pretraining for Language Understanding","volume-title":"Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada","author":"Yang","year":"2019"},{"key":"2026040314422548300_ref289","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Yin","year":"2016"},{"key":"2026040314422548300_ref290","first-page":"1383","volume-title":"Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015","author":"Yin","year":"2015"},{"key":"2026040314422548300_ref291","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Yu","year":"2021"},{"key":"2026040314422548300_ref292","volume-title":"Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval","author":"Zamani","year":"2016"},{"key":"2026040314422548300_ref293","volume-title":"Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information 
Retrieval","author":"Zamani","year":"2017"},{"key":"2026040314422548300_ref294","volume-title":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","author":"Zamani","year":"2018"},{"key":"2026040314422548300_ref295","volume-title":"The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zamani","year":"2018"},{"key":"2026040314422548300_ref296","first-page":"497","volume-title":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM \u201918","author":"Zamani","year":"2018"},{"issue":"3","key":"2026040314422548300_ref297","first-page":"137","article-title":"Statistical Language Models for Information Retrieval A Critical Review","volume":"2","author":"Zhai","year":"2007","journal-title":"Foundations and Trends\u00ae in Information Retrieval"},{"issue":"2","key":"2026040314422548300_ref298","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1145\/984321.984322","article-title":"A study of smoothing methods for language models applied to information retrieval","volume":"22","author":"Zhai","year":"2004","journal-title":"ACM Transactions on Information Systems"},{"key":"2026040314422548300_ref299","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Zhan","year":"2021"},{"key":"2026040314422548300_ref300","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhan","year":"2021"},{"key":"2026040314422548300_ref301","volume-title":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","author":"Zhan","year":"2022"},{"key":"2026040314422548300_ref302","article-title":"Learning To Retrieve: How to Train a Dense Retrieval Model Effectively and 
Efficiently","volume-title":"CoRR","author":"Zhan","year":"2020"},{"key":"2026040314422548300_ref303","article-title":"RepBERT: Contextualized Text Embeddings for First-Stage Retrieval","volume-title":"CoRR","author":"Zhan","year":"2020"},{"key":"2026040314422548300_ref304","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhang","year":"2021"},{"key":"2026040314422548300_ref305","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhang","year":"2021"},{"key":"2026040314422548300_ref306","volume-title":"Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)","author":"Zhang","year":"2019"},{"key":"2026040314422548300_ref307","first-page":"11328","volume-title":"International Conference on Machine Learning","author":"Zhang","year":"2020"},{"key":"2026040314422548300_ref308","volume-title":"Proceedings of The Web Conference 2020","author":"Zhang","year":"2020"},{"key":"2026040314422548300_ref309","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Zhang","year":"2019"},{"key":"2026040314422548300_ref310","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Zhang","year":"2019"},{"key":"2026040314422548300_ref311","first-page":"575","volume-title":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. 
SIGIR \u201915","author":"Zheng","year":"2015"},{"key":"2026040314422548300_ref312","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Zheng","year":"2020"},{"issue":"5","key":"2026040314422548300_ref313","doi-asserted-by":"crossref","first-page":"102672","DOI":"10.1016\/j.ipm.2021.102672","article-title":"Contextualized query expansion via unsupervised chunk selection for text retrieval","volume":"58","author":"Zheng","year":"2021","journal-title":"Information Processing and Management"},{"key":"2026040314422548300_ref314","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Zhong","year":"2020"},{"key":"2026040314422548300_ref315","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Zhong","year":"2019"},{"key":"2026040314422548300_ref316","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhou","year":"2021"},{"key":"2026040314422548300_ref317","volume-title":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhou","year":"2020"},{"key":"2026040314422548300_ref318","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge Management","author":"Zhou","year":"2021"},{"key":"2026040314422548300_ref319","article-title":"DynamicRetriever: A Pre-training Model-based IR System with Neither Sparse nor Dense Index","volume-title":"CoRR","author":"Zhou","year":"2022"},{"key":"2026040314422548300_ref320","article-title":"Transforming Wikipedia into Augmented Data for Query-Focused Summarization","volume-title":"CoRR","author":"Zhu","year":"2019"},{"key":"2026040314422548300_ref321","volume-title":"Proceedings of the 30th ACM International Conference on Information and Knowledge 
Management","author":"Zhu","year":"2021"},{"key":"2026040314422548300_ref322","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Zou","year":"2020"},{"key":"2026040314422548300_ref323","volume-title":"Proceedings of the 20th Australasian Document Computing Symposium","author":"Zuccon","year":"2015"}],"container-title":["Foundations and Trends\u00ae in Information Retrieval"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/ftinr\/article-pdf\/16\/3\/178\/11098587\/1500000100en.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/www.emerald.com\/ftinr\/article-pdf\/16\/3\/178\/11098587\/1500000100en.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T14:29:57Z","timestamp":1777472997000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.emerald.com\/ftinr\/article\/16\/3\/178\/1330394\/Pre-training-Methods-in-Information-Retrieval"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,18]]},"references-count":323,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,8,18]]}},"URL":"https:\/\/doi.org\/10.1561\/1500000100","relation":{},"ISSN":["1554-0669","1554-0677"],"issn-type":[{"value":"1554-0669","type":"print"},{"value":"1554-0677","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,18]]}}}