{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,16]],"date-time":"2025-12-16T12:37:59Z","timestamp":1765888679172,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":60,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,14]],"date-time":"2021-08-14T00:00:00Z","timestamp":1628899200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,14]]},"DOI":"10.1145\/3447548.3467147","type":"proceedings-article","created":{"date-parts":[[2021,8,12]],"date-time":"2021-08-12T06:12:09Z","timestamp":1628748729000},"page":"4014-4022","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":48,"title":["Pre-trained Language Model based Ranking in Baidu Search"],"prefix":"10.1145","author":[{"given":"Lixin","family":"Zou","sequence":"first","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Shengqiang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Hengyi","family":"Cai","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Dehong","family":"Ma","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Suqi","family":"Cheng","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Shuaiqiang","family":"Wang","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Daiting","family":"Shi","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Zhicong","family":"Cheng","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, 
China"}]},{"given":"Dawei","family":"Yin","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2021,8,14]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Better fine-tuning by reducing representational collapse. arXiv:2008.03156","author":"Aghajanyan Armen","year":"2020","unstructured":"Armen Aghajanyan , Akshat Shrivastava , Anchit Gupta , Naman Goyal , Luke Zettlemoyer , and Sonal Gupta . 2020. Better fine-tuning by reducing representational collapse. arXiv:2008.03156 ( 2020 ). Armen Aghajanyan, Akshat Shrivastava, Anchit Gupta, Naman Goyal, Luke Zettlemoyer, and Sonal Gupta. 2020. Better fine-tuning by reducing representational collapse. arXiv:2008.03156 (2020)."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.394"},{"volume-title":"Cloze-driven Pretraining of Self-attention Networks. In EMNLP'19","author":"Baevski Alexei","key":"e_1_3_2_2_3_1","unstructured":"Alexei Baevski , Sergey Edunov , Yinhan Liu , Luke Zettlemoyer , and M. Auli . 2019 . Cloze-driven Pretraining of Self-attention Networks. In EMNLP'19 . Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, and M. Auli. 2019. Cloze-driven Pretraining of Self-attention Networks. In EMNLP'19."},{"key":"e_1_3_2_2_4_1","volume-title":"Longformer: The longdocument transformer. arXiv:2004.05150","author":"Beltagy Iz","year":"2020","unstructured":"Iz Beltagy , Matthew E Peters , and Arman Cohan . 2020 . Longformer: The longdocument transformer. arXiv:2004.05150 (2020). Iz Beltagy, Matthew E Peters, and Arman Cohan. 2020. Longformer: The longdocument transformer. arXiv:2004.05150 (2020)."},{"key":"e_1_3_2_2_5_1","volume-title":"From ranknet to lambdarank to lambdamart: An overview. Learning","author":"Burges Christopher JC","year":"2010","unstructured":"Christopher JC Burges . 2010. From ranknet to lambdarank to lambdamart: An overview. Learning ( 2010 ). Christopher JC Burges. 2010. 
From ranknet to lambdarank to lambdamart: An overview. Learning (2010)."},{"volume-title":"ICML'07","author":"Cao Zhe","key":"e_1_3_2_2_6_1","unstructured":"Zhe Cao , Tao Qin , T. Liu , Ming-Feng Tsai , and H. Li . 2007. Learning to rank: from pairwise approach to listwise approach . In ICML'07 . Zhe Cao, Tao Qin, T. Liu, Ming-Feng Tsai, and H. Li. 2007. Learning to rank: from pairwise approach to listwise approach. In ICML'07."},{"key":"e_1_3_2_2_7_1","volume-title":"Pre-training Tasks for Embedding-based Large-scale Retrieval. In ICLR'19","author":"Chang Wei-Cheng","year":"2019","unstructured":"Wei-Cheng Chang , X Yu Felix , Yin-Wen Chang , Yiming Yang , and Sanjiv Kumar . 2019 . Pre-training Tasks for Embedding-based Large-scale Retrieval. In ICLR'19 . Wei-Cheng Chang, X Yu Felix, Yin-Wen Chang, Yiming Yang, and Sanjiv Kumar. 2019. Pre-training Tasks for Embedding-based Large-scale Retrieval. In ICLR'19."},{"key":"e_1_3_2_2_8_1","volume-title":"Largescale validation and analysis of interleaved search evaluation. TOIS","author":"Chapelle Olivier","year":"2012","unstructured":"Olivier Chapelle , Thorsten Joachims , Filip Radlinski , and Yisong Yue . 2012. Largescale validation and analysis of interleaved search evaluation. TOIS ( 2012 ). Olivier Chapelle, Thorsten Joachims, Filip Radlinski, and Yisong Yue. 2012. Largescale validation and analysis of interleaved search evaluation. TOIS (2012)."},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"crossref","unstructured":"O. Chapelle and Y. Zhang. 2009. A dynamic bayesian network click model for web search ranking. In WWW'09.  O. Chapelle and Y. Zhang. 2009. A dynamic bayesian network click model for web search ranking. In WWW'09.","DOI":"10.1145\/1526709.1526711"},{"key":"e_1_3_2_2_10_1","volume-title":"Fuli Feng, Ming-Chieh Wang, and X. He.","author":"Chen J.","year":"2020","unstructured":"J. Chen , Hande Dong , Xiao lei Wang , Fuli Feng, Ming-Chieh Wang, and X. He. 2020 . 
Bias and Debias in Recommender System : A Survey and Future Directions . arXiv:2010.03240 (2020). J. Chen, Hande Dong, Xiao lei Wang, Fuli Feng, Ming-Chieh Wang, and X. He. 2020. Bias and Debias in Recommender System: A Survey and Future Directions. arXiv:2010.03240 (2020)."},{"key":"e_1_3_2_2_11_1","unstructured":"Krzysztof Choromanski Valerii Likhosherstov David Dohan Xingyou Song Andreea Gane Tamas Sarlos Peter Hawkins Jared Davis Afroz Mohiuddin Lukasz Kaiser etal 2020. Rethinking attention with performers. arXiv:2009.14794 (2020).  Krzysztof Choromanski Valerii Likhosherstov David Dohan Xingyou Song Andreea Gane Tamas Sarlos Peter Hawkins Jared Davis Afroz Mohiuddin Lukasz Kaiser et al. 2020. Rethinking attention with performers. arXiv:2009.14794 (2020)."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/133160.133199"},{"key":"e_1_3_2_2_13_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT'19.","author":"Devlin J.","year":"2019","unstructured":"J. Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2019 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT'19. J. Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT'19."},{"key":"e_1_3_2_2_14_1","volume-title":"An Efficient Boosting Algorithm for Combining Preferences. JMLR","author":"Freund Yoav","year":"2003","unstructured":"Yoav Freund , Raj Iyer , Robert E Schapire , and Yoram Singer . 2003. An Efficient Boosting Algorithm for Combining Preferences. JMLR ( 2003 ). Yoav Freund, Raj Iyer, Robert E Schapire, and Yoram Singer. 2003. An Efficient Boosting Algorithm for Combining Preferences. JMLR (2003)."},{"key":"e_1_3_2_2_15_1","volume-title":"A decision-theoretic generalization of on-line learning and an application to boosting. 
JCSS","author":"Freund Yoav","year":"1997","unstructured":"Yoav Freund and Robert E Schapire . 1997. A decision-theoretic generalization of on-line learning and an application to boosting. JCSS ( 1997 ). Yoav Freund and Robert E Schapire. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. JCSS (1997)."},{"key":"e_1_3_2_2_16_1","volume":"202","author":"Gao Luyu","unstructured":"Luyu Gao , Zhuyun Dai , and J. Callan. 2020. Modularized Transfomer-based Ranking Framework. In EMNLP'20. Luyu Gao, Zhuyun Dai, and J. Callan. 2020. Modularized Transfomer-based Ranking Framework. In EMNLP'20.","journal-title":"J. Callan."},{"volume-title":"Deep learning","author":"Goodfellow Ian","key":"e_1_3_2_2_17_1","unstructured":"Ian Goodfellow , Yoshua Bengio , Aaron Courville , and Yoshua Bengio . 2016. Deep learning . Vol. 1 . MIT press Cambridge . Ian Goodfellow, Yoshua Bengio, Aaron Courville, and Yoshua Bengio. 2016. Deep learning. Vol. 1. MIT press Cambridge."},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983769"},{"key":"e_1_3_2_2_19_1","volume-title":"Smith","author":"Gururangan Suchin","year":"2020","unstructured":"Suchin Gururangan , Ana Marasovi\u0107, Swabha Swayamdipta , Kyle Lo , Iz Beltagy , Doug Downey , and Noah A . Smith . 2020 . Don't Stop Pretraining : Adapt Language Models to Domains and Tasks . arXiv:2004.10964 (2020). Suchin Gururangan, Ana Marasovi\u0107, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, and Noah A. Smith. 2020. Don't Stop Pretraining: Adapt Language Models to Domains and Tasks. arXiv:2004.10964 (2020)."},{"key":"e_1_3_2_2_20_1","volume-title":"Distilling the knowledge in a neural network. arXiv:1503.02531","author":"Hinton Geoffrey","year":"2015","unstructured":"Geoffrey Hinton , Oriol Vinyals , and Jeff Dean . 2015. Distilling the knowledge in a neural network. arXiv:1503.02531 ( 2015 ). Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. 
Distilling the knowledge in a neural network. arXiv:1503.02531 (2015)."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505665"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00286"},{"key":"e_1_3_2_2_23_1","volume-title":"SIGIR'17","author":"J\u00e4rvelin K.","year":"2017","unstructured":"K. J\u00e4rvelin and Jaana Kek\u00e4l\u00e4inen . 2017 . IR evaluation methods for retrieving highly relevant documents . In SIGIR'17 . K. J\u00e4rvelin and Jaana Kek\u00e4l\u00e4inen. 2017. IR evaluation methods for retrieving highly relevant documents. In SIGIR'17."},{"key":"e_1_3_2_2_24_1","volume-title":"Smart: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. arXiv:1911.03437","author":"Jiang Haoming","year":"2019","unstructured":"Haoming Jiang , Pengcheng He , Weizhu Chen , Xiaodong Liu , Jianfeng Gao , and Tuo Zhao . 2019 . Smart: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. arXiv:1911.03437 (2019). Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, and Tuo Zhao. 2019. Smart: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. arXiv:1911.03437 (2019)."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775067"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401075"},{"key":"e_1_3_2_2_27_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba . 2015 . Adam : A Method for Stochastic Optimization. CoRR abs\/1412.6980 (2015). Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. CoRR abs\/1412.6980 (2015)."},{"key":"e_1_3_2_2_28_1","volume-title":"Reformer: The efficient transformer. 
arXiv:2001.04451","author":"Kitaev Nikita","year":"2020","unstructured":"Nikita Kitaev , \u0141ukasz Kaiser, and Anselm Levskaya . 2020 . Reformer: The efficient transformer. arXiv:2001.04451 (2020). Nikita Kitaev, \u0141ukasz Kaiser, and Anselm Levskaya. 2020. Reformer: The efficient transformer. arXiv:2001.04451 (2020)."},{"key":"e_1_3_2_2_29_1","volume-title":"Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv:1806.08342","author":"Krishnamoorthi Raghuraman","year":"2018","unstructured":"Raghuraman Krishnamoorthi . 2018. Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv:1806.08342 ( 2018 ). Raghuraman Krishnamoorthi. 2018. Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv:1806.08342 (2018)."},{"key":"e_1_3_2_2_30_1","volume-title":"Albert: A lite bert for self-supervised learning of language representations. arXiv:1909.11942","author":"Lan Zhenzhong","year":"2019","unstructured":"Zhenzhong Lan , Mingda Chen , Sebastian Goodman , Kevin Gimpel , Piyush Sharma , and Radu Soricut . 2019 . Albert: A lite bert for self-supervised learning of language representations. arXiv:1909.11942 (2019). Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. Albert: A lite bert for self-supervised learning of language representations. arXiv:1909.11942 (2019)."},{"key":"e_1_3_2_2_31_1","volume-title":"Chan Ho So, and Jaewoo Kang","author":"Lee Jinhyuk","year":"2020","unstructured":"Jinhyuk Lee , Wonjin Yoon , Sungdong Kim , D. Kim , Sunkyu Kim , Chan Ho So, and Jaewoo Kang . 2020 . BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics ( 2020). Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, D. Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2020. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. 
Bioinformatics (2020)."},{"key":"e_1_3_2_2_32_1","volume-title":"NIPS'07","author":"Li Ping","year":"2007","unstructured":"Ping Li , Qiang Wu , and Christopher Burges . 2007 . McRank: Learning to Rank Using Multiple Classification and Gradient Boosting . NIPS'07 (2007). Ping Li, Qiang Wu, and Christopher Burges. 2007. McRank: Learning to Rank Using Multiple Classification and Gradient Boosting. NIPS'07 (2007)."},{"key":"e_1_3_2_2_33_1","unstructured":"Jimmy Lin Rodrigo Nogueira and A. Yates. 2020. Pretrained Transformers for Text Ranking: BERT and Beyond. arXiv:2010.06467 (2020).  Jimmy Lin Rodrigo Nogueira and A. Yates. 2020. Pretrained Transformers for Text Ranking: BERT and Beyond. arXiv:2010.06467 (2020)."},{"key":"e_1_3_2_2_34_1","volume-title":"Learning to Rank for Information Retrieval. Foundations and Trends in Information Retrieval","author":"Liu Tie-Yan","year":"2009","unstructured":"Tie-Yan Liu . 2009. Learning to Rank for Information Retrieval. Foundations and Trends in Information Retrieval ( 2009 ). Tie-Yan Liu. 2009. Learning to Rank for Information Retrieval. Foundations and Trends in Information Retrieval (2009)."},{"key":"e_1_3_2_2_35_1","volume-title":"Pre-trained Language Model for Web-scale Retrieval in Baidu Search. In SIGKDD'21","author":"Liu Yiding","year":"2021","unstructured":"Yiding Liu , Weixue Lu , Suqi Cheng , Daiting Shi , Shuaiqiang Wang , Zhicong Cheng , and Dawei Yin . 2021 . Pre-trained Language Model for Web-scale Retrieval in Baidu Search. In SIGKDD'21 . Yiding Liu, Weixue Lu, Suqi Cheng, Daiting Shi, Shuaiqiang Wang, Zhicong Cheng, and Dawei Yin. 2021. Pre-trained Language Model for Web-scale Retrieval in Baidu Search. In SIGKDD'21."},{"key":"e_1_3_2_2_36_1","volume-title":"PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval. arXiv:2010.10137","author":"Ma Xinyu","year":"2020","unstructured":"Xinyu Ma , Jiafeng Guo , Ruqing Zhang , Yixing Fan , Xiang Ji , and Xueqi Cheng . 2020 . 
PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval. arXiv:2010.10137 (2020). Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xiang Ji, and Xueqi Cheng. 2020. PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval. arXiv:2010.10137 (2020)."},{"key":"e_1_3_2_2_37_1","unstructured":"Oded Z Maimon and Lior Rokach. 2014. Data mining with decision trees: theory and applications. World scientific.  Oded Z Maimon and Lior Rokach. 2014. Data mining with decision trees: theory and applications. World scientific."},{"key":"e_1_3_2_2_38_1","volume-title":"Deep Relevance Ranking Using Enhanced Document-Query Interactions. In EMNLP'18","author":"McDonald Ryan","year":"2018","unstructured":"Ryan McDonald , George Brokos , and Ion Androutsopoulos . 2018 . Deep Relevance Ranking Using Enhanced Document-Query Interactions. In EMNLP'18 . Ryan McDonald, George Brokos, and Ion Androutsopoulos. 2018. Deep Relevance Ranking Using Enhanced Document-Query Interactions. In EMNLP'18."},{"key":"e_1_3_2_2_39_1","volume-title":"Passage Re-ranking with BERT. arXiv:1901.04085","author":"Nogueira Rodrigo","year":"2019","unstructured":"Rodrigo Nogueira and Kyunghyun Cho . 2019. Passage Re-ranking with BERT. arXiv:1901.04085 ( 2019 ). Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT. arXiv:1901.04085 (2019)."},{"key":"e_1_3_2_2_40_1","volume-title":"Multi-Stage Document Ranking with BERT. arXiv:1910.14424","author":"Nogueira Rodrigo","year":"2019","unstructured":"Rodrigo Nogueira , W. Yang , Kyunghyun Cho , and Jimmy Lin . 2019. Multi-Stage Document Ranking with BERT. arXiv:1910.14424 ( 2019 ). Rodrigo Nogueira, W. Yang, Kyunghyun Cho, and Jimmy Lin. 2019. Multi-Stage Document Ranking with BERT. arXiv:1910.14424 (2019)."},{"key":"e_1_3_2_2_41_1","volume-title":"Modeling of Pruning Techniques for Deep Neural Networks Simplification. 
arXiv:2001.04062","author":"Pasandi Morteza Mousa","year":"2020","unstructured":"Morteza Mousa Pasandi , Mohsen Hajabdollahi , Nader Karimi , and Shadrokh Samavi . 2020. Modeling of Pruning Techniques for Deep Neural Networks Simplification. arXiv:2001.04062 ( 2020 ). Morteza Mousa Pasandi, Mohsen Hajabdollahi, Nader Karimi, and Shadrokh Samavi. 2020. Modeling of Pruning Techniques for Deep Neural Networks Simplification. arXiv:2001.04062 (2020)."},{"key":"e_1_3_2_2_42_1","volume-title":"Andrea Caponnetto, Michele Piana, and Alessandro Verri.","author":"Rosasco Lorenzo","year":"2004","unstructured":"Lorenzo Rosasco , Ernesto De Vito , Andrea Caponnetto, Michele Piana, and Alessandro Verri. 2004 . Are loss functions all the same? Neural computation (2004). Lorenzo Rosasco, Ernesto De Vito, Andrea Caponnetto, Michele Piana, and Alessandro Verri. 2004. Are loss functions all the same? Neural computation (2004)."},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661935"},{"key":"e_1_3_2_2_44_1","volume-title":"ERNIE: Enhanced Representation through Knowledge Integration. arXiv preprint arXiv:1904.09223","author":"Sun Yu","year":"2019","unstructured":"Yu Sun , Shuohuan Wang , Yukun Li , Shikun Feng , Xuyi Chen , Han Zhang , Xin Tian , Danxiang Zhu , Hao Tian , and Hua Wu . 2019 . ERNIE: Enhanced Representation through Knowledge Integration. arXiv preprint arXiv:1904.09223 (2019). Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, and Hua Wu. 2019. ERNIE: Enhanced Representation through Knowledge Integration. arXiv preprint arXiv:1904.09223 (2019)."},{"key":"e_1_3_2_2_45_1","volume-title":"Efficient transformers: A survey. arXiv:2009.06732","author":"Tay Yi","year":"2020","unstructured":"Yi Tay , Mostafa Dehghani , Dara Bahri , and Donald Metzler . 2020. Efficient transformers: A survey. arXiv:2009.06732 ( 2020 ). Yi Tay, Mostafa Dehghani, Dara Bahri, and Donald Metzler. 2020. 
Efficient transformers: A survey. arXiv:2009.06732 (2020)."},{"key":"e_1_3_2_2_46_1","volume-title":"NIPS'17","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmar , Jakob Uszkoreit , Llion Jones , Aidan N. Gomez , L. Kaiser , and Illia Polosukhin . 2017 . Attention is All you Need . In NIPS'17 . Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, L. Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In NIPS'17."},{"key":"e_1_3_2_2_47_1","volume-title":"Linformer: Self-attention with linear complexity. arXiv:2006.04768","author":"Li Belinda","year":"2020","unstructured":"Sinong Wang, Belinda Li , Madian Khabsa , Han Fang , and Hao Ma . 2020 . Linformer: Self-attention with linear complexity. arXiv:2006.04768 (2020). Sinong Wang, Belinda Li, Madian Khabsa, Han Fang, and Hao Ma. 2020. Linformer: Self-attention with linear complexity. arXiv:2006.04768 (2020)."},{"key":"e_1_3_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080809"},{"key":"e_1_3_2_2_49_1","volume-title":"Le","author":"Yang Z.","year":"2019","unstructured":"Z. Yang , Zihang Dai , Yiming Yang , J. Carbonell , R. Salakhutdinov , and Quoc V . Le . 2019 . XLNet: Generalized Autoregressive Pretraining for Language Understanding. NeurIPS '19 (2019). Z. Yang, Zihang Dai, Yiming Yang, J. Carbonell, R. Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding. NeurIPS'19 (2019)."},{"key":"e_1_3_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939677"},{"key":"e_1_3_2_2_51_1","volume-title":"ICML'20","author":"Zhang Jingqing","year":"2020","unstructured":"Jingqing Zhang , Yao Zhao , Mohammad Saleh , and Peter Liu . 2020 . Pegasus: Pre-training with extracted gap-sentences for abstractive summarization . In ICML'20 . Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter Liu. 2020. 
Pegasus: Pre-training with extracted gap-sentences for abstractive summarization. In ICML'20."},{"key":"e_1_3_2_2_52_1","volume-title":"Md Arafat Sultan, V. Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, S. Roukos, A. Sil, and T. Ward.","author":"Zhang R.","year":"2020","unstructured":"R. Zhang , Revanth Reddy Gangi Reddy , Md Arafat Sultan, V. Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, S. Roukos, A. Sil, and T. Ward. 2020 . Multi-Stage Pre-training for Low-Resource Domain Adaptation . arXiv:2010.05904 (2020). R. Zhang, Revanth Reddy Gangi Reddy, Md Arafat Sultan, V. Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, S. Roukos, A. Sil, and T. Ward. 2020. Multi-Stage Pre-training for Low-Resource Domain Adaptation. arXiv:2010.05904 (2020)."},{"volume-title":"IJCNLP'11","author":"Zhao Shiqi","key":"e_1_3_2_2_53_1","unstructured":"Shiqi Zhao , H. Wang , Chao Li , T. Liu , and Y. Guan . 2011. Automatically Generating Questions from Queries for Community-based Question Answering . In IJCNLP'11 . Shiqi Zhao, H. Wang, Chao Li, T. Liu, and Y. Guan. 2011. Automatically Generating Questions from Queries for Community-based Question Answering. In IJCNLP'11."},{"key":"e_1_3_2_2_54_1","volume-title":"Whole-Chain Recommendations. In CIKM'20","author":"Zhao Xiangyu","year":"2020","unstructured":"Xiangyu Zhao , Long Xia , Lixin Zou , Hui Liu , Dawei Yin , and Jiliang Tang . 2020 . Whole-Chain Recommendations. In CIKM'20 . Xiangyu Zhao, Long Xia, Lixin Zou, Hui Liu, Dawei Yin, and Jiliang Tang. 2020. Whole-Chain Recommendations. In CIKM'20."},{"key":"e_1_3_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277792"},{"key":"e_1_3_2_2_56_1","volume-title":"Seyeon Lee, Bill Yuchen Lin, and Xiang Ren.","author":"Zhou Wangchunshu","year":"2020","unstructured":"Wangchunshu Zhou , Dong-Ho Lee , Ravi Kiran Selvam , Seyeon Lee, Bill Yuchen Lin, and Xiang Ren. 2020 . 
Pre-training text-to-text transformers for concept centric common sense. arXiv:2011.07956 (2020). Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam, Seyeon Lee, Bill Yuchen Lin, and Xiang Ren. 2020. Pre-training text-to-text transformers for concept centric common sense. arXiv:2011.07956 (2020)."},{"key":"e_1_3_2_2_57_1","volume-title":"Fine-tuning language models from human preferences. arXiv:1909.08593","author":"Ziegler Daniel M","year":"2019","unstructured":"Daniel M Ziegler , Nisan Stiennon , Jeffrey Wu, Tom B Brown , Alec Radford , Dario Amodei , Paul Christiano , and Geoffrey Irving . 2019. Fine-tuning language models from human preferences. arXiv:1909.08593 ( 2019 ). Daniel M Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B Brown, Alec Radford, Dario Amodei, Paul Christiano, and Geoffrey Irving. 2019. Fine-tuning language models from human preferences. arXiv:1909.08593 (2019)."},{"key":"e_1_3_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330668"},{"key":"e_1_3_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371801"},{"key":"e_1_3_2_2_60_1","volume-title":"Neural Interactive Collaborative Filtering. In SIGIR'20","author":"Zou Lixin","year":"2020","unstructured":"Lixin Zou , Long Xia , Yulong Gu , Xiangyu Zhao , Weidong Liu , Jimmy Xiangji Huang , and Dawei Yin . 2020 . Neural Interactive Collaborative Filtering. In SIGIR'20 . Lixin Zou, Long Xia, Yulong Gu, Xiangyu Zhao, Weidong Liu, Jimmy Xiangji Huang, and Dawei Yin. 2020. Neural Interactive Collaborative Filtering. 
In SIGIR'20."}],"event":{"name":"KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"],"location":"Virtual Event Singapore","acronym":"KDD '21"},"container-title":["Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447548.3467147","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3447548.3467147","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:27Z","timestamp":1750191507000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447548.3467147"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,14]]},"references-count":60,"alternative-id":["10.1145\/3447548.3467147","10.1145\/3447548"],"URL":"https:\/\/doi.org\/10.1145\/3447548.3467147","relation":{},"subject":[],"published":{"date-parts":[[2021,8,14]]},"assertion":[{"value":"2021-08-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}