{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T10:03:09Z","timestamp":1775815389327,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T00:00:00Z","timestamp":1615161600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Lenovo-CAS Joint Lab Youth Scientist Project"},{"name":"Frontier Research Key Program of Chongqing Science and Technology Commission","award":["cstc2017jcyjBX0059"],"award-info":[{"award-number":["cstc2017jcyjBX0059"]}]},{"name":"K.C.Wong Education Foundation, and the Foundation"},{"name":"Beijing Academy of Artificial Intelligence","award":["BAAI2019ZD0306"],"award-info":[{"award-number":["BAAI2019ZD0306"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61722211, 61773362, 61872338, 62006218, 61902381"],"award-info":[{"award-number":["61722211, 61773362, 61872338, 62006218, 61902381"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"the Youth Innovation Promotion Association CAS","award":["20144310, 2016102"],"award-info":[{"award-number":["20144310, 2016102"]}]},{"name":"National Key RD Program of China","award":["2016QY02D0405"],"award-info":[{"award-number":["2016QY02D0405"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,3,8]]},"DOI":"10.1145\/3437963.3441777","type":"proceedings-article","created":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T04:34:28Z","timestamp":1615005268000},"page":"283-291","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":43,"title":["PROP: 
Pre-training with Representative Words Prediction for Ad-hoc Retrieval"],"prefix":"10.1145","author":[{"given":"Xinyu","family":"Ma","sequence":"first","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Jiafeng","family":"Guo","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Ruqing","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Yixing","family":"Fan","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Xiang","family":"Ji","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Xueqi","family":"Cheng","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2021,3,8]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390517"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2010058"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277820"},{"key":"e_1_3_2_2_4_1","unstructured":"Yoshua Bengio R\u00e9jean Ducharme Pascal Vincent and Christian Janvin. 2003. A Neural Probabilistic Language Model. 
JMLR 1137--1155."},{"key":"e_1_3_2_2_5_1","unstructured":"Wei-Cheng Chang Felix X Yu Yin-Wen Chang Yiming Yang and Sanjiv Kumar. 2019. Pre-training Tasks for Embedding-based Large-scale Retrieval. In ICLR."},{"key":"e_1_3_2_2_6_1","volume-title":"Electra: Pre-training Text Encoders as Discriminators rather than Generators. In ICLR.","author":"Clark Kevin","year":"2020","unstructured":"Kevin Clark, Minh-Thang Luong, Quoc V Le, and Christopher D Manning. 2020. Electra: Pre-training Text Encoders as Discriminators rather than Generators. In ICLR."},{"key":"e_1_3_2_2_7_1","volume-title":"Technometrics","author":"Consul Prem C","unstructured":"Prem C Consul and Gaurav C Jain. 1973. A Generalization of the Poisson Distribution. In Technometrics, Vol. 15. Taylor & Francis, 791--799."},{"key":"e_1_3_2_2_8_1","volume-title":"Information retrieval","author":"Cormack Gordon V","unstructured":"Gordon V Cormack, Mark D Smucker, and Charles LA Clarke. 2011. Efficient and Effective Spam Filtering and Re-ranking for Large Web Datasets. In Information retrieval, Vol. 14. 
Springer, 441--465."},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331303"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080832"},{"key":"e_1_3_2_2_11_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. ACL","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. Bert: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. ACL, Stroudsburg, PA, USA, 4171--4186."},{"key":"e_1_3_2_2_12_1","volume-title":"Modeling Diverse Relevance Patterns in Ad-hoc Retrieval. In The 41th international ACM SIGIR conference on research & development in information retrieval. ACM","author":"Fan Yixing","year":"2018","unstructured":"Yixing Fan, Jiafeng Guo, Yanyan Lan, Jun Xu, Chengxiang Zhai, and Xueqi Cheng. 2018. Modeling Diverse Relevance Patterns in Ad-hoc Retrieval. In The 41th international ACM SIGIR conference on research & development in information retrieval. 
ACM, New York, NY, USA, 375--384."},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983769"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983768"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1031"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1110"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"crossref","unstructured":"Samuel Huston and W Bruce Croft. 2014. Parameters Learned in the Comparison of Retrieval Models using Term Dependencies. In IR UMASS. Citeseer.","DOI":"10.1145\/2661829.2661894"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1612"},{"key":"e_1_3_2_2_19_1","unstructured":"Xiaoyong Liu and W Bruce Croft. 2005. Statistical Language Modeling for Information Retrieval. In IR UMASS."},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331317"},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331316"},{"key":"e_1_3_2_2_22_1","unstructured":"Christopher D. Manning Prabhakar Raghavan and Hinrich Sch\u00fctze. 2008. Frontmatter. Cambridge University Press i--iv."},{"key":"e_1_3_2_2_23_1","unstructured":"Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT. arXiv preprint arXiv:1901.04085."},{"key":"e_1_3_2_2_24_1","unstructured":"Rodrigo Nogueira Wei Yang Kyunghyun Cho and Jimmy Lin. 2019. Multi-stage Document Ranking with BERT. 
arXiv preprint arXiv:1910.14424."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132847.3132914"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291008"},{"key":"e_1_3_2_2_28_1","unstructured":"Tao Qin and Tie-Yan Liu. 2013. Introducing LETOR 4.0 datasets. arXiv preprint arXiv:1306.2597."},{"key":"e_1_3_2_2_29_1","unstructured":"Xipeng Qiu Tianxiang Sun Yige Xu Yunfan Shao Ning Dai and Xuanjing Huang. 2020. Pre-trained Models for Natural Language Processing: A Survey. arXiv preprint arXiv:2003.08271."},{"key":"e_1_3_2_2_30_1","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving language understanding by generative pre-training."},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4471-2099-5_24"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119176.1119195"},{"key":"e_1_3_2_2_33_1","volume-title":"Proceedings of the 2013 conference on empirical methods in natural language processing. 1631--1642","author":"Socher Richard","year":"2013","unstructured":"Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D Manning, Andrew Y Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. 
In Proceedings of the 2013 conference on empirical methods in natural language processing. 1631--1642."},{"key":"e_1_3_2_2_34_1","volume-title":"Mass: Masked Sequence to Sequence Pre-training for Language Generation. In ICML. 11328--11339.","author":"Song Kaitao","year":"2019","unstructured":"Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. 2019. Mass: Masked Sequence to Sequence Pre-training for Language Generation. In ICML. 11328--11339."},{"key":"e_1_3_2_2_35_1","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems. MIT Press","author":"Sutskever Ilya","unstructured":"Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems. MIT Press, Cambridge, MA, USA, 3104--3112."},{"key":"e_1_3_2_2_36_1","volume-title":"Journalism quarterly","author":"Taylor Wilson L","unstructured":"Wilson L Taylor. 1953. 'Cloze Procedure': A New Tool for Measuring Readability. In Journalism quarterly, Vol. 30. 
415--433."},{"key":"e_1_3_2_2_37_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is All you Need. In Advances in neural information processing systems. 5998--6008."},{"key":"e_1_3_2_2_38_1","volume-title":"Structbert: Incorporating Language Structures into Pre-training for Deep Language Understanding. In ICLR.","author":"Wang Wei","year":"2019","unstructured":"Wei Wang, Bin Bi, Ming Yan, Chen Wu, Zuyi Bao, Liwei Peng, and Luo Si. 2019. Structbert: Incorporating Language Structures into Pre-training for Deep Language Understanding. In ICLR."},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080809"},{"key":"e_1_3_2_2_40_1","volume-title":"2019 b. Simple Applications of BERT for Ad Hoc Document Retrieval. arXiv preprint arXiv:1903.10972","author":"Yang Wei","year":"2019","unstructured":"Wei Yang, Haotian Zhang, and Jimmy Lin. 2019 b. Simple Applications of BERT for Ad Hoc Document Retrieval. arXiv preprint arXiv:1903.10972 (2019)."},{"key":"e_1_3_2_2_41_1","volume-title":"Advances in Neural Information Processing Systems. Curran Associates","author":"Yang Zhilin","unstructured":"Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R Salakhutdinov, and Quoc V Le. 2019 a. XLNet: Generalized Autoregressive Pretraining for Language Understanding. 
In Advances in Neural Information Processing Systems. Curran Associates, Inc., 5753--5763."},{"key":"e_1_3_2_2_42_1","volume-title":"Statistical Language Models for Information Retrieval. Synthesis lectures on human language technologies","author":"Zhai ChengXiang","year":"2008","unstructured":"ChengXiang Zhai. 2008. Statistical Language Models for Information Retrieval. Synthesis lectures on human language technologies, Vol. 1, 1 (2008), 1--141."},{"key":"e_1_3_2_2_43_1","volume-title":"ACM SIGIR Forum.","author":"Zhai Chengxiang","unstructured":"Chengxiang Zhai and John Lafferty. 2001. A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval. In ACM SIGIR Forum. New York, NY, USA, ACM, 268--276."},{"key":"e_1_3_2_2_44_1","volume-title":"Pegasus: Pre-training with Extracted Gap-sentences for Abstractive Summarization. In ICML. 11328--11339.","author":"Zhang Jingqing","year":"2020","unstructured":"Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter Liu. 2020. Pegasus: Pre-training with Extracted Gap-sentences for Abstractive Summarization. In ICML. 
11328--11339."}],"event":{"name":"WSDM '21: The Fourteenth ACM International Conference on Web Search and Data Mining","location":"Virtual Event Israel","acronym":"WSDM '21","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 14th ACM International Conference on Web Search and Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3437963.3441777","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3437963.3441777","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:47:35Z","timestamp":1750193255000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3437963.3441777"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,8]]},"references-count":44,"alternative-id":["10.1145\/3437963.3441777","10.1145\/3437963"],"URL":"https:\/\/doi.org\/10.1145\/3437963.3441777","relation":{},"subject":[],"published":{"date-parts":[[2021,3,8]]},"assertion":[{"value":"2021-03-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}