{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T12:13:44Z","timestamp":1776082424569,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,7,11]],"date-time":"2021-07-11T00:00:00Z","timestamp":1625961600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,7,11]]},"DOI":"10.1145\/3471158.3472238","type":"proceedings-article","created":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T20:38:17Z","timestamp":1630442297000},"page":"131-136","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Ensemble Distillation for BERT-Based Ranking Models"],"prefix":"10.1145","author":[{"given":"Honglei","family":"Zhuang","sequence":"first","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Zhen","family":"Qin","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Shuguang","family":"Han","sequence":"additional","affiliation":[{"name":"Alibaba, Hangzhou, China"}]},{"given":"Xuanhui","family":"Wang","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Michael","family":"Bendersky","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]},{"given":"Marc","family":"Najork","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2021,8,31]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. 
arXiv:1611.09268","author":"Bajaj Payal","year":"2016","unstructured":"Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, et al. 2016. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. arXiv:1611.09268"},{"key":"e_1_3_2_1_2_1","unstructured":"Michael Bendersky, Honglei Zhuang, Ji Ma, Shuguang Han, Keith Hall, and Ryan McDonald. 2020. RRF102: Meeting the TREC-COVID challenge with a 100+ runs ensemble. arXiv:2010.00200"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341981.3344221"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1273496.1273513"},{"key":"e_1_3_2_1_5_1","volume-title":"Advances in Information Retrieval - Proceedings of the 43rd European Conference on IR Research, Part II (Lecture Notes in Computer Science)","author":"Chen Xuanang","unstructured":"Xuanang Chen, Ben He, Kai Hui, Le Sun, and Yingfei Sun. 2021. Simplified TinyBERT: Knowledge Distillation for Document Retrieval. In Advances in Information Retrieval - Proceedings of the 43rd European Conference on IR Research, Part II (Lecture Notes in Computer Science), Vol. 12657. 
Springer, 241--248."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1572114"},{"key":"e_1_3_2_1_7_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","volume":"1","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4171--4186."},{"key":"e_1_3_2_1_8_1","volume-title":"Ensemble Methods in Machine Learning. In International workshop on multiple classifier systems. Springer, 1--15","author":"Dietterich Thomas G.","year":"2000","unstructured":"Thomas G. Dietterich. 2000. Ensemble Methods in Machine Learning. In International workshop on multiple classifier systems. Springer, 1--15."},{"key":"e_1_3_2_1_9_1","volume-title":"Proceedings of the 35th International Conference on Machine Learning. 1607--1616","author":"Furlanello Tommaso","year":"2018","unstructured":"
Tommaso Furlanello, Zachary Lipton, Michael Tschannen, Laurent Itti, and Anima Anandkumar. 2018. Born Again Neural Networks. In Proceedings of the 35th International Conference on Machine Learning. 1607--1616."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3409256.3409838"},{"key":"e_1_3_2_1_11_1","volume-title":"Stephen John Maybank, and Dacheng Tao","author":"Gou Jianping","year":"2020","unstructured":"Jianping Gou, Baosheng Yu, Stephen John Maybank, and Dacheng Tao. 2020. Knowledge distillation: A survey. arXiv:2006.05525"},{"key":"e_1_3_2_1_12_1","unstructured":"Shuguang Han, Xuanhui Wang, Mike Bendersky, and Marc Najork. 2020. Learning-to-Rank with BERT in TF-Ranking. arXiv:2004.08476"},{"key":"e_1_3_2_1_13_1","volume-title":"NIPS Deep Learning and Representation Learning Workshop.","author":"Hinton Geoffrey","year":"2015","unstructured":"Geoffrey Hinton, Oriol Vinyals, and Jeffrey Dean. 2015. Distilling the Knowledge in a Neural Network. In NIPS Deep Learning and Representation Learning Workshop."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401075"},{"key":"e_1_3_2_1_15_1","volume-title":"Proceedings of the 8th International Conference on Learning Representations (ICLR '19)","author":"Lan Zhenzhong","year":"2019","unstructured":"Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. 
In Proceedings of the 8th International Conference on Learning Representations (ICLR '19)."},{"key":"e_1_3_2_1_16_1","unstructured":"Jimmy Lin, Rodrigo Nogueira, and Andrew Yates. 2020. Pretrained transformers for text ranking: BERT and beyond. arXiv:2010.06467"},{"key":"e_1_3_2_1_17_1","unstructured":"Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT. arXiv:1901.04085"},{"key":"e_1_3_2_1_18_1","unstructured":"Rodrigo Nogueira, Wei Yang, Kyunghyun Cho, and Jimmy Lin. 2019. Multi-stage document ranking with BERT. arXiv:1910.14424"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330677"},{"key":"e_1_3_2_1_20_1","volume-title":"International Conference on Learning Representations (ICLR '21)","author":"Qin Zhen","year":"2021","unstructured":"Zhen Qin, Le Yan, Honglei Zhuang, Yi Tay, Rama Kumar Pasumarthi, Xuanhui Wang, Michael Bendersky, and Marc Najork. 2021. Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees? 
In International Conference on Learning Representations (ICLR '21)."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467099"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocaa091"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220021"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8682450"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371792"},{"key":"e_1_3_2_1_26_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence (AAAI '21)","author":"Yuan Fei","year":"2021","unstructured":"Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, and Daxin Jiang. 2021. Reinforced Multi-Teacher Selection for Knowledge Distillation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI '21). 14284--14291."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401325"},{"key":"e_1_3_2_1_28_1","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics: Industry Track. 33--43","author":"Zhang Wangshu","year":"2020","unstructured":"Wangshu Zhang, Junhong Liu, Zujie Wen, Yafang Wang, and Gerard de Melo. 2020. Query Distillation: BERT-based Distillation for Ensemble Ranking. In Proceedings of the 28th International Conference on Computational Linguistics: Industry Track. 
33--43."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401333"}],"event":{"name":"ICTIR '21: The 2021 ACM SIGIR International Conference on the Theory of Information Retrieval","location":"Virtual Event Canada","acronym":"ICTIR '21","sponsor":["SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3471158.3472238","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3471158.3472238","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T17:49:29Z","timestamp":1750268969000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3471158.3472238"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,11]]},"references-count":29,"alternative-id":["10.1145\/3471158.3472238","10.1145\/3471158"],"URL":"https:\/\/doi.org\/10.1145\/3471158.3472238","relation":{},"subject":[],"published":{"date-parts":[[2021,7,11]]},"assertion":[{"value":"2021-08-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}