{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,12]],"date-time":"2025-12-12T13:08:37Z","timestamp":1765544917289,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,10,21]],"date-time":"2023-10-21T00:00:00Z","timestamp":1697846400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,10,21]]},"DOI":"10.1145\/3583780.3614661","type":"proceedings-article","created":{"date-parts":[[2023,10,21]],"date-time":"2023-10-21T07:45:42Z","timestamp":1697874342000},"page":"4960-4966","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Build Faster with Less: A Journey to Accelerate Sparse Model Building for Semantic Matching in Product Search"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3192-3281","authenticated-orcid":false,"given":"Jiong","family":"Zhang","sequence":"first","affiliation":[{"name":"Amazon, Palo Alto, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-3240-7710","authenticated-orcid":false,"given":"Yau-Shian","family":"Wang","sequence":"additional","affiliation":[{"name":"Amazon, Palo Alto, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5646-9356","authenticated-orcid":false,"given":"Wei-Cheng","family":"Chang","sequence":"additional","affiliation":[{"name":"Amazon, Palo Alto, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-3967-3215","authenticated-orcid":false,"given":"Wei","family":"Li","sequence":"additional","affiliation":[{"name":"Amazon, Palo Alto, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1753-8099","authenticated-orcid":false,"given":"Jyun-Yu","family":"Jiang","sequence":"additional","affiliation":[{"name":"Amazon, Palo Alto, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3520-9627","authenticated-orcid":false,"given":"Cho-Jui","family":"Hsieh","sequence":"additional","affiliation":[{"name":"UCLA, Los Angeles, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5235-2962","authenticated-orcid":false,"given":"Hsiang-Fu","family":"Yu","sequence":"additional","affiliation":[{"name":"Amazon, Palo Alto, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,10,21]]},"reference":[{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/567806.567807"},{"key":"e_1_3_2_1_3_1","volume-title":"ICML","volume":"98","author":"Bradley Paul S","year":"1998","unstructured":"Paul S Bradley and Usama M Fayyad . 1998 . Refining initial points for k-means clustering .. In ICML , Vol. 98 . Citeseer, 91--99. Paul S Bradley and Usama M Fayyad. 1998. Refining initial points for k-means clustering.. In ICML, Vol. 98. Citeseer, 91--99."},{"key":"e_1_3_2_1_4_1","volume-title":"ECML PKDD Workshop: Languages for Data Mining and Machine Learning. 108--122","author":"Buitinck Lars","year":"2013","unstructured":"Lars Buitinck , Gilles Louppe , Mathieu Blondel , Fabian Pedregosa , Andreas Mueller , Olivier Grisel , Vlad Niculae , Peter Prettenhofer , Alexandre Gramfort , Jaques Grobler , Robert Layton , Jake VanderPlas , Arnaud Joly , Brian Holt , and Ga\u00eb l Varoquaux . 2013 . API design for machine learning software: experiences from the scikit-learn project . In ECML PKDD Workshop: Languages for Data Mining and Machine Learning. 108--122 . Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake VanderPlas, Arnaud Joly, Brian Holt, and Ga\u00eb l Varoquaux. 2013. API design for machine learning software: experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning. 108--122."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467092"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Wei-Cheng Chang Daniel Jiang Hsiang-Fu Yu Choon-Hui Teo Jiong Zhang Kai Zhong Kedarnath Kolluri Qie Hu Nikhil Shandilya Vyacheslav Ievgrafov Japinder Singh and Inderjit S Dhillon. 2021b. Extreme Multi-label Learning for Semantic Matching in Product Search. In KDD. ACM.  Wei-Cheng Chang Daniel Jiang Hsiang-Fu Yu Choon-Hui Teo Jiong Zhang Kai Zhong Kedarnath Kolluri Qie Hu Nikhil Shandilya Vyacheslav Ievgrafov Japinder Singh and Inderjit S Dhillon. 2021b. Extreme Multi-label Learning for Semantic Matching in Product Search. In KDD. ACM.","DOI":"10.1145\/3447548.3467092"},{"key":"e_1_3_2_1_7_1","volume-title":"Pre-training Tasks for Embedding-based Large-scale Retrieval. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rkg-mA4FDr","author":"Chang Wei-Cheng","year":"2020","unstructured":"Wei-Cheng Chang , Felix X. Yu , Yin-Wen Chang , Yiming Yang , and Sanjiv Kumar . 2020 a. Pre-training Tasks for Embedding-based Large-scale Retrieval. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rkg-mA4FDr Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, and Sanjiv Kumar. 2020a. Pre-training Tasks for Embedding-based Large-scale Retrieval. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rkg-mA4FDr"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403368"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959190"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/1390681.1442794"},{"key":"e_1_3_2_1_11_1","series-title":"SIAM journal on matrix analysis and applications","volume-title":"Sparse matrices in MATLAB: Design and implementation","author":"Gilbert John R","year":"1992","unstructured":"John R Gilbert , Cleve Moler , and Robert Schreiber . 1992. Sparse matrices in MATLAB: Design and implementation . SIAM journal on matrix analysis and applications , Vol. 13 , 1 ( 1992 ), 333--356. John R Gilbert, Cleve Moler, and Robert Schreiber. 1992. Sparse matrices in MATLAB: Design and implementation. SIAM journal on matrix analysis and applications, Vol. 13, 1 (1992), 333--356."},{"key":"e_1_3_2_1_12_1","volume-title":"International Conference on Machine Learning. PMLR, 3887--3896","author":"Guo Ruiqi","year":"2020","unstructured":"Ruiqi Guo , Philip Sun , Erik Lindgren , Quan Geng , David Simcha , Felix Chern , and Sanjiv Kumar . 2020 . Accelerating large-scale inference with anisotropic vector quantization . In International Conference on Machine Learning. PMLR, 3887--3896 . Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating large-scale inference with anisotropic vector quantization. In International Conference on Machine Learning. PMLR, 3887--3896."},{"key":"e_1_3_2_1_13_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR.  Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR."},{"key":"e_1_3_2_1_14_1","volume-title":"International Conference on Artificial Intelligence and Statistics. PMLR","author":"Jasinska-Kobus Kalina","year":"2021","unstructured":"Kalina Jasinska-Kobus , Marek Wydmuch , Devanathan Thiruvenkatachari , and Krzysztof Dembczynski . 2021 . Online probabilistic label trees . In International Conference on Artificial Intelligence and Statistics. PMLR , 1801--1809. Kalina Jasinska-Kobus, Marek Wydmuch, Devanathan Thiruvenkatachari, and Krzysztof Dembczynski. 2021. Online probabilistic label trees. In International Conference on Artificial Intelligence and Statistics. PMLR, 1801--1809."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531767"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Ting Jiang Deqing Wang Leilei Sun Huayi Yang Zhengyang Zhao and Fuzhen Zhuang. 2021. LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification. In AAAI.  Ting Jiang Deqing Wang Leilei Sun Huayi Yang Zhengyang Zhao and Fuzhen Zhuang. 2021. LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification. In AAAI.","DOI":"10.1609\/aaai.v35i9.16974"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_3_2_1_18_1","volume-title":"BONSAI-Diverse and Shallow Trees for Extreme Multi-label Classification. arXiv preprint arXiv:1904.08249","author":"Khandagale Sujay","year":"2019","unstructured":"Sujay Khandagale , Han Xiao , and Rohit Babbar . 2019. BONSAI-Diverse and Shallow Trees for Extreme Multi-label Classification. arXiv preprint arXiv:1904.08249 ( 2019 ). Sujay Khandagale, Han Xiao, and Rohit Babbar. 2019. BONSAI-Diverse and Shallow Trees for Extreme Multi-label Classification. arXiv preprint arXiv:1904.08249 (2019)."},{"key":"e_1_3_2_1_19_1","volume-title":"CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification. In Conference on Neural Information Processing Systems.","author":"Kharbanda Siddhant","year":"2022","unstructured":"Siddhant Kharbanda , Atmadeep Banerjee , Erik Schultheis , and Rohit Babbar . 2022 . CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification. In Conference on Neural Information Processing Systems. Siddhant Kharbanda, Atmadeep Banerjee, Erik Schultheis, and Rohit Babbar. 2022. CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification. In Conference on Neural Information Processing Systems."},{"key":"#cr-split#-e_1_3_2_1_20_1.1","doi-asserted-by":"crossref","unstructured":"Hanqing Lu Youna Hu Tong Zhao Tony Wu Yiwei Song and Bing Yin. 2021. Graph-based Multilingual Product Retrieval in E-Commerce Search. In NAACL-HLT (Industry Papers). 146--153. https:\/\/doi.org\/10.18653\/v1\/2021.naacl-industry.19 10.18653\/v1","DOI":"10.18653\/v1\/2021.naacl-industry.19"},{"key":"#cr-split#-e_1_3_2_1_20_1.2","doi-asserted-by":"crossref","unstructured":"Hanqing Lu Youna Hu Tong Zhao Tony Wu Yiwei Song and Bing Yin. 2021. Graph-based Multilingual Product Retrieval in E-Commerce Search. In NAACL-HLT (Industry Papers). 146--153. https:\/\/doi.org\/10.18653\/v1\/2021.naacl-industry.19","DOI":"10.18653\/v1\/2021.naacl-industry.19"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2889473"},{"key":"e_1_3_2_1_22_1","volume-title":"MS MARCO: A Human-Generated MAchine Reading COmprehension Dataset.","author":"Nguyen Tri","year":"2016","unstructured":"Tri Nguyen , Mir Rosenberg , Xia Song , Jianfeng Gao , Saurabh Tiwary , Rangan Majumder , and Li Deng . 2016 . MS MARCO: A Human-Generated MAchine Reading COmprehension Dataset. (2016). Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. MS MARCO: A Human-Generated MAchine Reading COmprehension Dataset. (2016)."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330759"},{"key":"e_1_3_2_1_24_1","volume-title":"Fitzek","author":"P\u00e9ter Vingelmann NVIDIA","year":"2020","unstructured":"NVIDIA , P\u00e9ter Vingelmann , and Frank H.P . Fitzek . 2020 . CUDA , release: 10.2.89. https:\/\/developer.nvidia.com\/cuda-toolkit NVIDIA, P\u00e9ter Vingelmann, and Frank H.P. Fitzek. 2020. CUDA, release: 10.2.89. https:\/\/developer.nvidia.com\/cuda-toolkit"},{"key":"e_1_3_2_1_25_1","unstructured":"Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in PyTorch. (2017).  Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in PyTorch. (2017)."},{"key":"e_1_3_2_1_26_1","first-page":"727","article-title":"X-means: Extending k-means with efficient estimation of the number of clusters","volume":"1","author":"Pelleg Dan","year":"2000","unstructured":"Dan Pelleg , Andrew W Moore , 2000 . X-means: Extending k-means with efficient estimation of the number of clusters .. In Icml , Vol. 1. 727 -- 734 . Dan Pelleg, Andrew W Moore, et al. 2000. X-means: Extending k-means with efficient estimation of the number of clusters.. In Icml, Vol. 1. 727--734.","journal-title":"Icml"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3185998"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371768"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000019"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4471-2099-5_24"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403342"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00357-020-09372-3"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-019-0686-2"},{"key":"e_1_3_2_1_34_1","first-page":"577","article-title":"Constrained k-means clustering with background knowledge","volume":"1","author":"Wagstaff Kiri","year":"2001","unstructured":"Kiri Wagstaff , Claire Cardie , Seth Rogers , Stefan Schr\u00f6dl , 2001 . Constrained k-means clustering with background knowledge . In Icml , Vol. 1. 577 -- 584 . Kiri Wagstaff, Claire Cardie, Seth Rogers, Stefan Schr\u00f6dl, et al. 2001. Constrained k-means clustering with background knowledge. In Icml, Vol. 1. 577--584.","journal-title":"Icml"},{"volume-title":"High-Performance Computing on the Intel\u00ae Xeon Phi?","author":"Wang Endong","key":"e_1_3_2_1_35_1","unstructured":"Endong Wang , Qing Zhang , Bo Shen , Guangyong Zhang , Xiaowei Lu , Qing Wu , and Yajuan Wang . 2014. Intel math kernel library . In High-Performance Computing on the Intel\u00ae Xeon Phi? . Springer , 167--188. Endong Wang, Qing Zhang, Bo Shen, Guangyong Zhang, Xiaowei Lu, Qing Wu, and Yajuan Wang. 2014. Intel math kernel library. In High-Performance Computing on the Intel\u00ae Xeon Phi?. Springer, 167--188."},{"key":"e_1_3_2_1_36_1","volume-title":"Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In International Conference on Learning Representations.","author":"Xiong Lee","year":"2020","unstructured":"Lee Xiong , Chenyan Xiong , Ye Li , Kwok-Fung Tang , Jialin Liu , Paul N Bennett , Junaid Ahmed , and Arnold Overwijk . 2020 . Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In International Conference on Learning Representations. Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N Bennett, Junaid Ahmed, and Arnold Overwijk. 2020. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In International Conference on Learning Representations."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3366424.3386195"},{"key":"e_1_3_2_1_38_1","volume-title":"Advances in Neural Information Processing Systems","volume":"32","author":"You Ronghui","year":"2019","unstructured":"Ronghui You , Zihan Zhang , Ziye Wang , Suyang Dai , Hiroshi Mamitsuka , and Shanfeng Zhu . 2019 . Attentionxml: Label tree-based attention-aware deep model for high-performance extreme multi-label text classification . Advances in Neural Information Processing Systems , Vol. 32 (2019). Ronghui You, Zihan Zhang, Ziye Wang, Suyang Dai, Hiroshi Mamitsuka, and Shanfeng Zhu. 2019. Attentionxml: Label tree-based attention-aware deep model for high-performance extreme multi-label text classification. Advances in Neural Information Processing Systems, Vol. 32 (2019)."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3534678.3542629"},{"key":"e_1_3_2_1_40_1","first-page":"7267","article-title":"Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification","volume":"34","author":"Zhang Jiong","year":"2021","unstructured":"Jiong Zhang , Wei-Cheng Chang , Hsiang-Fu Yu , and Inderjit S Dhillon . 2021 . Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification . Advances in Neural Information Processing Systems , Vol. 34 , 7267 -- 7280 . Jiong Zhang, Wei-Cheng Chang, Hsiang-Fu Yu, and Inderjit S Dhillon. 2021. Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification. Advances in Neural Information Processing Systems, Vol. 34, 7267--7280.","journal-title":"Advances in Neural Information Processing Systems"}],"event":{"name":"CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Birmingham United Kingdom","acronym":"CIKM '23"},"container-title":["Proceedings of the 32nd ACM International Conference on Information and Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3583780.3614661","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3583780.3614661","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:46:30Z","timestamp":1750178790000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3583780.3614661"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,21]]},"references-count":40,"alternative-id":["10.1145\/3583780.3614661","10.1145\/3583780"],"URL":"https:\/\/doi.org\/10.1145\/3583780.3614661","relation":{},"subject":[],"published":{"date-parts":[[2023,10,21]]},"assertion":[{"value":"2023-10-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}