{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T16:01:08Z","timestamp":1772121668332,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":33,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,7,11]],"date-time":"2021-07-11T00:00:00Z","timestamp":1625961600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Natural Science Foundation of China","award":["61732008,61532011"],"award-info":[{"award-number":["61732008,61532011"]}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2018YFC0831700"],"award-info":[{"award-number":["2018YFC0831700"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Tsinghua University Guoqiang Research Institute"},{"name":"Beijing Academy of Artificial Intelligence"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,7,11]]},"DOI":"10.1145\/3404835.3462880","type":"proceedings-article","created":{"date-parts":[[2021,7,12]],"date-time":"2021-07-12T02:41:48Z","timestamp":1626057708000},"page":"1503-1512","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":147,"title":["Optimizing Dense Retrieval Model Training with Hard Negatives"],"prefix":"10.1145","author":[{"given":"Jingtao","family":"Zhan","sequence":"first","affiliation":[{"name":"BNRist, DCST, Tsinghua University, Beijing, China"}]},{"given":"Jiaxin","family":"Mao","sequence":"additional","affiliation":[{"name":"GSAI, Renmin University of China, Beijing, China"}]},{"given":"Yiqun","family":"Liu","sequence":"additional","affiliation":[{"name":"BNRist, DCST, Tsinghua University, Beijing, 
China"}]},{"given":"Jiafeng","family":"Guo","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences & Institute of Computing Technology, CAS, Beijing, China"}]},{"given":"Min","family":"Zhang","sequence":"additional","affiliation":[{"name":"BNRist, DCST, Tsinghua University, Beijing, China"}]},{"given":"Shaoping","family":"Ma","sequence":"additional","affiliation":[{"name":"BNRist, DCST, Tsinghua University, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2021,7,11]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"et al.","author":"Bajaj Payal","year":"2016","unstructured":"Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, et al. 2016. Ms marco: A human generated machine reading comprehension dataset. arXiv preprint arXiv:1611.09268 (2016)."},{"key":"e_1_3_2_2_2_1","volume-title":"Representation learning: A review and new perspectives","author":"Bengio Yoshua","year":"2013","unstructured":"Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence , Vol. 35, 8 (2013), 1798--1828."},{"key":"e_1_3_2_2_3_1","first-page":"23","article-title":"From ranknet to lambdarank to lambdamart: An overview","volume":"11","author":"Burges Christopher JC","year":"2010","unstructured":"Christopher JC Burges. 2010. From ranknet to lambdarank to lambdamart: An overview. Learning , Vol. 11, 23--581 (2010), 81.","journal-title":"Learning"},{"key":"e_1_3_2_2_4_1","volume-title":"Overview of the TREC 2019 deep learning track. In Text REtrieval Conference (TREC) . TREC.","author":"Craswell Nick","unstructured":"Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Ellen M. Voorhees. 2020. Overview of the TREC 2019 deep learning track. In Text REtrieval Conference (TREC) . 
TREC."},{"key":"e_1_3_2_2_5_1","volume-title":"Context-aware sentence\/passage term importance estimation for first stage retrieval. arXiv preprint arXiv:1910.10687","author":"Dai Zhuyun","year":"2019","unstructured":"Zhuyun Dai and Jamie Callan. 2019. Context-aware sentence\/passage term importance estimation for first stage retrieval. arXiv preprint arXiv:1910.10687 (2019)."},{"key":"e_1_3_2_2_6_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171--4186","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171--4186."},{"key":"e_1_3_2_2_7_1","volume-title":"RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering. arXiv preprint arXiv:2010.08191","author":"Yuchen Ding Yingqi Qu","year":"2020","unstructured":"Yingqi Qu Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Xin Zhao, Daxiang Dong, Hua Wu, and Haifeng Wang. 2020. RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering. arXiv preprint arXiv:2010.08191 (2020)."},{"key":"e_1_3_2_2_8_1","volume-title":"Complementing lexical retrieval with semantic residual embedding. arXiv preprint arXiv:2004.13969","author":"Gao Luyu","year":"2020","unstructured":"Luyu Gao, Zhuyun Dai, Zhen Fan, and Jamie Callan. 2020. Complementing lexical retrieval with semantic residual embedding. arXiv preprint arXiv:2004.13969 (2020)."},{"key":"e_1_3_2_2_9_1","volume-title":"Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. 
297--304","author":"Gutmann Michael","year":"2010","unstructured":"Michael Gutmann and Aapo Hyvärinen. 2010. Noise-contrastive estimation: a new estimation principle for unnormalized statistical models. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. 297--304."},{"key":"e_1_3_2_2_10_1","volume-title":"Realm: retrieval-augmented language model pre-training. arXiv preprint arXiv:2002.08909","author":"Guu Kelvin","year":"2020","unstructured":"Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. 2020. Realm: retrieval-augmented language model pre-training. arXiv preprint arXiv:2002.08909 (2020)."},{"key":"e_1_3_2_2_11_1","volume-title":"Learning-to-Rank with BERT in TF-Ranking. arXiv preprint arXiv:2004.08476","author":"Han Shuguang","year":"2020","unstructured":"Shuguang Han, Xuanhui Wang, Mike Bendersky, and Marc Najork. 2020. Learning-to-Rank with BERT in TF-Ranking. arXiv preprint arXiv:2004.08476 (2020)."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403305"},{"key":"e_1_3_2_2_13_1","volume-title":"Product quantization for nearest neighbor search","author":"Jegou Herve","year":"2010","unstructured":"Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. Product quantization for nearest neighbor search. IEEE transactions on pattern analysis and machine intelligence , Vol. 33, 1 (2010), 117--128."},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_3_2_2_15_1","volume-title":"Sewon Min, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.","author":"Karpukhin Vladimir","year":"2020","unstructured":"Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. 
arXiv preprint arXiv:2004.04906 (2020)."},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401075"},{"key":"e_1_3_2_2_17_1","volume-title":"Semantic matching in search. Foundations and Trends in Information retrieval","author":"Li Hang","year":"2014","unstructured":"Hang Li and Jun Xu. 2014. Semantic matching in search. Foundations and Trends in Information retrieval , Vol. 7, 5 (2014), 343--469."},{"key":"e_1_3_2_2_18_1","volume-title":"Distilling Dense Representations for Ranking using Tightly-Coupled Teachers. arXiv preprint arXiv:2010.11386","author":"Lin Sheng-Chieh","year":"2020","unstructured":"Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. 2020. Distilling Dense Representations for Ranking using Tightly-Coupled Teachers. arXiv preprint arXiv:2010.11386 (2020)."},{"key":"e_1_3_2_2_19_1","volume-title":"Learning to rank for information retrieval. Foundations and trends in information retrieval","author":"Liu Tie-Yan","year":"2009","unstructured":"Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and trends in information retrieval , Vol. 3, 3 (2009), 225--331."},{"key":"e_1_3_2_2_20_1","volume-title":"RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)."},{"key":"e_1_3_2_2_21_1","volume-title":"Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101","author":"Loshchilov Ilya","year":"2017","unstructured":"Ilya Loshchilov and Frank Hutter. 2017. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)."},{"key":"e_1_3_2_2_22_1","volume-title":"dense, and attentional representations for text retrieval. 
arXiv preprint arXiv:2005.00181","author":"Luan Yi","year":"2020","unstructured":"Yi Luan, Jacob Eisenstein, Kristina Toutanova, and Michael Collins. 2020. Sparse, dense, and attentional representations for text retrieval. arXiv preprint arXiv:2005.00181 (2020)."},{"key":"e_1_3_2_2_23_1","volume-title":"Passage Re-ranking with BERT . arXiv preprint arXiv:1901.04085","author":"Nogueira Rodrigo","year":"2019","unstructured":"Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT . arXiv preprint arXiv:1901.04085 (2019)."},{"key":"e_1_3_2_2_24_1","volume-title":"Document expansion by query prediction. arXiv preprint arXiv:1904.08375","author":"Nogueira Rodrigo","year":"2019","unstructured":"Rodrigo Nogueira, Wei Yang, Jimmy Lin, and Kyunghyun Cho. 2019. Document expansion by query prediction. arXiv preprint arXiv:1904.08375 (2019)."},{"key":"e_1_3_2_2_25_1","volume-title":"A general approximation framework for direct optimization of information retrieval measures. Information retrieval","author":"Qin Tao","year":"2010","unstructured":"Tao Qin, Tie-Yan Liu, and Hang Li. 2010. A general approximation framework for direct optimization of information retrieval measures. Information retrieval , Vol. 13, 4 (2010), 375--397."},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4471-2099-5_24"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.472"},{"key":"e_1_3_2_2_28_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is All You Need. In Advances in neural information processing systems. 5998--6008."},{"key":"e_1_3_2_2_29_1","volume-title":"Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. 
arXiv preprint arXiv:2007.00808","author":"Xiong Lee","year":"2020","unstructured":"Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul Bennett, Junaid Ahmed, and Arnold Overwijk. 2020. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. arXiv preprint arXiv:2007.00808 (2020)."},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"crossref","unstructured":"Ming Yan Chenliang Li Chen Wu Bin Bi Wei Wang Jiangnan Xia and Luo Si. 2019. IDST at TREC 2019 Deep Learning Track: Deep Cascade Ranking with Generation-based Document Expansion and Pre-trained Language Modeling.. In TREC .","DOI":"10.6028\/NIST.SP.1250.deep-IDST"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3239571"},{"key":"e_1_3_2_2_32_1","volume-title":"Large batch optimization for deep learning: Training bert in 76 minutes. arXiv preprint arXiv:1904.00962","author":"You Yang","year":"2019","unstructured":"Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, and Cho-Jui Hsieh. 2019. Large batch optimization for deep learning: Training bert in 76 minutes. arXiv preprint arXiv:1904.00962 (2019)."},{"key":"e_1_3_2_2_33_1","volume-title":"RepBERT: Contextualized Text Embeddings for First-Stage Retrieval. arXiv preprint arXiv:2006.15498","author":"Zhan Jingtao","year":"2020","unstructured":"Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma. 2020. RepBERT: Contextualized Text Embeddings for First-Stage Retrieval. 
arXiv preprint arXiv:2006.15498 (2020)."}],"event":{"name":"SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","location":"Virtual Event Canada","acronym":"SIGIR '21","sponsor":["SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3404835.3462880","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3404835.3462880","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:47:17Z","timestamp":1750193237000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3404835.3462880"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,11]]},"references-count":33,"alternative-id":["10.1145\/3404835.3462880","10.1145\/3404835"],"URL":"https:\/\/doi.org\/10.1145\/3404835.3462880","relation":{},"subject":[],"published":{"date-parts":[[2021,7,11]]},"assertion":[{"value":"2021-07-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}