{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T01:56:50Z","timestamp":1776131810930,"version":"3.50.1"},"reference-count":121,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2023,12]]},"abstract":"<jats:p>Learnable embedding vector is one of the most important applications in machine learning, and is widely used in various database-related domains. However, the high dimensionality of sparse data in recommendation tasks and the huge volume of corpus in retrieval-related tasks lead to a large memory consumption of the embedding table, which poses a great challenge to the training and deployment of models. Recent research has proposed various methods to compress the embeddings at the cost of a slight decrease in model quality or the introduction of other overheads. Nevertheless, the relative performance of these methods remains unclear. Existing experimental comparisons only cover a subset of these methods and focus on limited metrics. In this paper, we perform a comprehensive comparative analysis and experimental evaluation of embedding compression. We introduce a new taxonomy that categorizes these techniques based on their characteristics and methodologies, and further develop a modular benchmarking framework that integrates 14 representative methods. Under a uniform test environment, our benchmark fairly evaluates each approach, presents their strengths and weaknesses under different memory budgets, and recommends the best method based on the use case. 
In addition to providing useful guidelines, our study also uncovers the limitations of current methods and suggests potential directions for future research.<\/jats:p>","DOI":"10.14778\/3636218.3636234","type":"journal-article","created":{"date-parts":[[2024,3,5]],"date-time":"2024-03-05T17:04:07Z","timestamp":1709658247000},"page":"808-822","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Experimental Analysis of Large-Scale Learnable Vector Storage Compression"],"prefix":"10.14778","volume":"17","author":[{"given":"Hailin","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Computer Science & Key Lab of High Confidence Software Technologies, Peking University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Penghao","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Computer Science & Key Lab of High Confidence Software Technologies, Peking University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xupeng","family":"Miao","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yingxia","family":"Shao","sequence":"additional","affiliation":[{"name":"Beijing University of Posts and Telecommunications"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zirui","family":"Liu","sequence":"additional","affiliation":[{"name":"Peking University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tong","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Computer Science & Key Lab of High Confidence Software Technologies, Peking University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bin","family":"Cui","sequence":"additional","affiliation":[{"name":"School of Computer Science & Key Lab of High Confidence Software Technologies, Peking University and Institute of 
Computational Social Science, Peking University (Qingdao)"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,3,5]]},"reference":[{"key":"e_1_2_1_1_1","article-title":"Structured Pruning of Deep Convolutional Neural Networks","volume":"13","author":"Anwar Sajid","year":"2017","unstructured":"Sajid Anwar, Kyuyeon Hwang, and Wonyong Sung. 2017. Structured Pruning of Deep Convolutional Neural Networks. ACM Journal on Emerging Technologies in Computing Systems 13, 3 (2017), 32:1--32:18.","journal-title":"ACM Journal on Emerging Technologies in Computing Systems"},{"key":"e_1_2_1_2_1","unstructured":"Ron Banner Yury Nahshan and Daniel Soudry. 2019. Post training 4-bit quantization of convolutional networks for rapid-deployment. In Advances in Neural Information Processing Systems 32 (NeurIPS)."},{"key":"e_1_2_1_3_1","unstructured":"Nitin Bansal Xiaohan Chen and Zhangyang Wang. 2018. Can We Gain More from Orthogonality Regularizations in Training Deep Networks?. In Advances in Neural Information Processing Systems 31 (NeurIPS)."},{"key":"e_1_2_1_4_1","unstructured":"Sebastian Borgeaud Arthur Mensch Jordan Hoffmann Trevor Cai Eliza Rutherford Katie Millican George van den Driessche Jean-Baptiste Lespiau Bogdan Damoc Aidan Clark Diego de Las Casas Aurelia Guy Jacob Menick Roman Ring Tom Hennigan Saffron Huang Loren Maggiore Chris Jones Albin Cassirer Andy Brock Michela Paganini Geoffrey Irving Oriol Vinyals Simon Osindero Karen Simonyan Jack W. Rae Erich Elsen and Laurent Sifre. 2022. Improving Language Models by Retrieving from Trillions of Tokens. In Proceedings of the 39th International Conference on Machine Learning (ICML)."},{"key":"e_1_2_1_5_1","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. 
Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems 33 (NeurIPS)."},{"key":"e_1_2_1_6_1","unstructured":"Patrick H. Chen Si Si Yang Li Ciprian Chelba and Cho-Jui Hsieh. 2018. GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking. In Advances in Neural Information Processing Systems 31 (NeurIPS)."},{"key":"e_1_2_1_7_1","volume-title":"SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search. In Advances in Neural Information Processing Systems 34 (NeurIPS).","author":"Chen Qi","year":"2021","unstructured":"Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. 2021. SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search. In Advances in Neural Information Processing Systems 34 (NeurIPS)."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/3587136.3587150"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/3524938.3525089"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 35th International Conference on Machine Learning (ICML).","author":"Chen Ting","year":"2018","unstructured":"Ting Chen, Martin Renqiang Min, and Yizhou Sun. 2018. Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations. In Proceedings of the 35th International Conference on Machine Learning (ICML)."},{"key":"e_1_2_1_11_1","volume-title":"TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI).","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Q. 
Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI)."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467220"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 17th International Workshop on Data Management on New Hardware (DaMoN).","author":"Chen Xubin","year":"2021","unstructured":"Xubin Chen, Ning Zheng, Shukun Xu, Yifan Qiao, Yang Liu, Jiangpeng Li, and Tong Zhang. 2021. KallaxDB: A Table-less Hash-based Key-Value Store on Storage Hardware with Built-in Transparent Compression. In Proceedings of the 17th International Workshop on Data Management on New Hardware (DaMoN)."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the Web Conference (WWW).","author":"Chen Yizhou","year":"2023","unstructured":"Yizhou Chen, Guangda Huzhang, Anxiang Zeng, Qingtao Yu, Hui Sun, Heng-Yi Li, Jingyi Li, Yabo Ni, Han Yu, and Zhiming Zhou. 2023. Clustered Embedding Learning for Recommender Systems. In Proceedings of the Web Conference (WWW)."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988450.2988454"},{"key":"e_1_2_1_16_1","volume-title":"Differentiable Neural Input Search for Recommender Systems. CoRR abs\/2006.04466","author":"Cheng Weiyu","year":"2020","unstructured":"Weiyu Cheng, Yanyan Shen, and Linpeng Huang. 2020. Differentiable Neural Input Search for Recommender Systems. CoRR abs\/2006.04466 (2020)."},{"key":"e_1_2_1_17_1","unstructured":"DeepRec. 2021. Adaptive Embedding. 
https:\/\/github.com\/alibaba\/DeepRec\/blob\/main\/docs\/docs_en\/Adaptive-Embedding.md."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3437963.3441727"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Desai Aditya","year":"2022","unstructured":"Aditya Desai, Li Chou, and Anshumali Shrivastava. 2022. Random Offset Block Embedding (ROBE) for compressed embedding tables in deep learning recommendation systems. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT).","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/3236187.3236198"},{"key":"e_1_2_1_22_1","volume-title":"8th International Conference on Learning Representations (ICLR).","author":"Esser Steven K.","unstructured":"Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, and Dharmendra S. Modha. 2020. Learned Step Size quantization. In 8th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_23_1","volume-title":"The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. In 7th International Conference on Learning Representations (ICLR).","author":"Frankle Jonathan","year":"2019","unstructured":"Jonathan Frankle and Michael Carbin. 2019. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. 
In 7th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP).","author":"Gao Luyu","year":"2021","unstructured":"Luyu Gao and Jamie Callan. 2021. Condenser: a Pre-training Architecture for Dense Retrieval. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)."},{"key":"e_1_2_1_25_1","volume-title":"IEEE International Symposium on Information Theory (ISIT).","author":"Ginart Antonio A.","year":"2021","unstructured":"Antonio A. Ginart, Maxim Naumov, Dheevatsa Mudigere, Jiyan Yang, and James Zou. 2021. Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems. In IEEE International Symposium on Information Theory (ISIT)."},{"key":"e_1_2_1_26_1","volume-title":"Workshop on Systems for ML at NeurIPS.","author":"Guan Hui","year":"2019","unstructured":"Hui Guan, Andrey Malevich, Jiyan Yang, Jongsoo Park, and Hector Yuen. 2019. Post-Training 4-bit Quantization on Embedding Tables. In Workshop on Systems for ML at NeurIPS."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554843"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 32nd International Conference on Machine Learning (ICML).","author":"Gupta Suyog","year":"2015","unstructured":"Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep Learning with Limited Numerical Precision. In Proceedings of the 32nd International Conference on Machine Learning (ICML)."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning (ICML).","author":"Guu Kelvin","year":"2020","unstructured":"Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. 2020. Retrieval Augmented Language Model Pre-Training. 
In Proceedings of the 37th International Conference on Machine Learning (ICML)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41019-023-00219-6"},{"key":"e_1_2_1_31_1","volume-title":"Advances in Neural Information Processing Systems 28 (NeurIPS).","author":"Han Song","year":"2015","unstructured":"Song Han, Jeff Pool, John Tran, and William J. Dally. 2015. Learning both Weights and Connections for Efficient Neural Network. In Advances in Neural Information Processing Systems 28 (NeurIPS)."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-022-1152-7"},{"key":"e_1_2_1_33_1","volume-title":"Findings of the Association for Computational Linguistics (EMNLP).","author":"Hrinchuk Oleksii","year":"2020","unstructured":"Oleksii Hrinchuk, Valentin Khrulkov, Leyla Mirvakhabova, Elena D. Orlova, and Ivan V. Oseledets. 2020. Tensorized Embedding Layers. In Findings of the Association for Computational Linguistics (EMNLP)."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/3380750.3380754"},{"key":"e_1_2_1_35_1","article-title":"Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations","volume":"18","author":"Hubara Itay","year":"2017","unstructured":"Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2017. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations. The Journal of Machine Learning Research 18 (2017), 187:1--187:30.","journal-title":"The Journal of Machine Learning Research"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 30th Annual ACM Symposium on the Theory of Computing (STOC).","author":"Indyk Piotr","year":"1998","unstructured":"Piotr Indyk and Rajeev Motwani. 1998. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. 
In Proceedings of the 30th Annual ACM Symposium on the Theory of Computing (STOC)."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.57"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD).","author":"Joglekar Manas R.","unstructured":"Manas R. Joglekar, Cong Li, Mei Chen, Taibai Xu, Xiaoming Wang, Jay K. Adams, Pranav Khaitan, Jiahui Liu, and Quoc V. Le. 2020. Neural Input Search for Large Scale Recommendation Models. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD)."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_2_1_40_1","volume-title":"Learning Multi-granular Quantized Embeddings for Large-Vocab Categorical Features in Recommender Systems. In Companion Proceedings of The Web Conference.","author":"Kang Wang-Cheng","unstructured":"Wang-Cheng Kang, Derek Zhiyuan Cheng, Ting Chen, Xinyang Yi, Dong Lin, Lichan Hong, and Ed H. Chi. 2020. Learning Multi-granular Quantized Embeddings for Large-Vocab Categorical Features in Recommender Systems. In Companion Proceedings of The Web Conference."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD).","author":"Kang Wang-Cheng","unstructured":"Wang-Cheng Kang, Derek Zhiyuan Cheng, Tiansheng Yao, Xinyang Yi, Ting Chen, Lichan Hong, and Ed H. Chi. 2021. Learning to Embed Categorical Features without Embedding Tables for Recommendation. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD)."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.550"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/3401960.3401970"},{"key":"e_1_2_1_44_1","volume-title":"Adam: A Method for Stochastic Optimization. 
In 3rd International Conference on Learning Representations (ICLR).","author":"Kingma Diederik P.","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.14778\/3494124.3494144"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2022.3186387"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00276"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.14778\/3551793.3551859"},{"key":"e_1_2_1_49_1","unstructured":"Criteo Labs. 2014. Kaggle display advertising challenge dataset. https:\/\/labs.criteo.com\/2014\/02\/kaggle-display-advertising-challenge-dataset."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3386901.3388947"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.703"},{"key":"e_1_2_1_52_1","unstructured":"Patrick S. H. Lewis Ethan Perez Aleksandra Piktus Fabio Petroni Vladimir Karpukhin Naman Goyal Heinrich K\u00fcttler Mike Lewis Wen-tau Yih Tim Rockt\u00e4schel Sebastian Riedel and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Advances in Neural Information Processing Systems 33 (NeurIPS)."},{"key":"e_1_2_1_53_1","unstructured":"Hao Li Soham De Zheng Xu Christoph Studer Hanan Samet and Tom Goldstein. 2017. Training Quantized Nets: A Deeper Understanding. In Advances in Neural Information Processing Systems 30 (NeurIPS)."},{"key":"e_1_2_1_54_1","volume-title":"Pruning Filters for Efficient ConvNets. In 5th International Conference on Learning Representations (ICLR).","author":"Li Hao","year":"2017","unstructured":"Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. 2017. Pruning Filters for Efficient ConvNets. 
In 5th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).","author":"Li Shiwei","year":"2023","unstructured":"Shiwei Li, Huifeng Guo, Lu Hou, Wei Zhang, Xing Tang, Ruiming Tang, Rui Zhang, and Ruixuan Li. 2023. Adaptive Low-Precision Training for Embeddings in Click-Through Rate Prediction. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482448"},{"key":"e_1_2_1_57_1","volume-title":"Proceedings of the Web Conference (WWW).","author":"Lian Defu","year":"2020","unstructured":"Defu Lian, Haoyu Wang, Zheng Liu, Jianxun Lian, Enhong Chen, and Xing Xie. 2020. LightRec: A Memory and Search-Efficient Recommender System. In Proceedings of the Web Conference (WWW)."},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD).","author":"Lian Xiangru","year":"2022","unstructured":"Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, and Ji Liu. 2022. Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD)."},{"key":"e_1_2_1_59_1","volume-title":"Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD).","author":"Lin Weilin","year":"2022","unstructured":"Weilin Lin, Xiangyu Zhao, Yejing Wang, Tong Xu, and Xian Wu. 2022. AdaFS: Adaptive Feature Selection in Deep Recommender System. 
In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD)."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401436"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476254"},{"key":"e_1_2_1_62_1","volume-title":"9th International Conference on Learning Representations (ICLR).","author":"Liu Siyi","year":"2021","unstructured":"Siyi Liu, Chen Gao, Yihong Chen, Depeng Jin, and Yong Li. 2021. Learnable Embedding sizes for Recommender Systems. In 9th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the Web Conference (WWW).","author":"Lyu Fuyuan","year":"2023","unstructured":"Fuyuan Lyu, Xing Tang, Dugang Liu, Liang Chen, Xiuqiang He, and Xue Liu. 2023. Optimizing Feature Set for Click-Through Rate Prediction. In Proceedings of the Web Conference (WWW)."},{"key":"e_1_2_1_64_1","volume-title":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM).","author":"Lyu Fuyuan","year":"2022","unstructured":"Fuyuan Lyu, Xing Tang, Hong Zhu, Huifeng Guo, Yingxue Zhang, Ruiming Tang, and Xue Liu. 2022. OptEmbed: Learning Optimal Embedding Table for Click-through Rate Prediction. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM)."},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2013.10.006"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2889473"},{"key":"e_1_2_1_67_1","volume-title":"Hetu: a highly efficient automatic parallel distributed deep learning system. Science China Information Sciences 66, 1","author":"Miao Xupeng","year":"2023","unstructured":"Xupeng Miao, Xiaonan Nie, Hailin Zhang, Tong Zhao, and Bin Cui. 2023. Hetu: a highly efficient automatic parallel distributed deep learning system. 
Science China Information Sciences 66, 1 (2023)."},{"key":"e_1_2_1_68_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD).","author":"Miao Xupeng","year":"2022","unstructured":"Xupeng Miao, Yining Shi, Hailin Zhang, Xin Zhang, Xiaonan Nie, Zhi Yang, and Bin Cui. 2022. HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training. In Proceedings of the International Conference on Management of Data (SIGMOD)."},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.14778\/3489496.3489511"},{"key":"e_1_2_1_70_1","volume-title":"Mixed Precision Training. In 6th International Conference on Learning Representations (ICLR).","author":"Micikevicius Paulius","year":"2018","unstructured":"Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory F. Diamos, Erich Elsen, David Garc\u00eda, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, and Hao Wu. 2018. Mixed Precision Training. In 6th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3470496.3533727"},{"key":"e_1_2_1_72_1","unstructured":"Maxim Naumov Dheevatsa Mudigere Hao-Jun Michael Shi Jianyu Huang Narayanan Sundaraman Jongsoo Park Xiaodong Wang Udit Gupta CaroleJean Wu Alisson G. Azzolini Dmytro Dzhulgakov Andrey Mallevich Ilia Cherniavskii Yinghai Lu Raghuraman Krishnamoorthi Ansha Yu Volodymyr Kondratenko Stephanie Pereira Xianjie Chen Wenlin Chen Vijay Rao Bill Jia Liang Xiong and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR abs\/1906.00091 (2019)."},{"key":"e_1_2_1_73_1","volume-title":"Proceedings of the Workshop on Cognitive Computation at NeurIPS (CoCo@NeurIPS).","author":"Nguyen Tri","year":"2016","unstructured":"Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. 
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. In Proceedings of the Workshop on Cognitive Computation at NeurIPS (CoCo@NeurIPS)."},{"key":"e_1_2_1_74_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Pansare Niketan","year":"2022","unstructured":"Niketan Pansare, Jay Katukuri, Aditya Arora, Frank Cipollone, Riyaaz Shaik, Noyan Tokgozoglu, and Chandru Venkataraman. 2022. Learning Compressed Embeddings for On-Device Inference. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_2_1_75_1","unstructured":"NVIDIA AI platform. 2020. MLPerf Benchmark. https:\/\/mlperf.org."},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3532060"},{"key":"e_1_2_1_77_1","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT).","author":"Qu Yingqi","year":"2021","unstructured":"Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, Daxiang Dong, Hua Wu, and Haifeng Wang. 2021. RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)."},{"key":"e_1_2_1_78_1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. The Journal of Machine Learning Research 21 (2020), 140:1--140:67.","journal-title":"The Journal of Machine Learning Research"},{"key":"e_1_2_1_79_1","unstructured":"Jie Ren Minjia Zhang and Dong Li. 2020. 
HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory. In Advances in Neural Information Processing Systems 33 (NeurIPS)."},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.14778\/3151106.3151108"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507777"},{"key":"e_1_2_1_82_1","volume-title":"FairCF: fairness-aware collaborative filtering. Science China Information Sciences 65, 12","author":"Shao Pengyang","year":"2022","unstructured":"Pengyang Shao, Le Wu, Lei Chen, Kun Zhang, and Meng Wang. 2022. FairCF: fairness-aware collaborative filtering. Science China Information Sciences 65, 12 (2022)."},{"key":"e_1_2_1_83_1","volume-title":"9th International Conference on Learning Representations (ICLR).","author":"Shen Jiayi","year":"2021","unstructured":"Jiayi Shen, Haotao Wang, Shupeng Gui, Jianchao Tan, Zhangyang Wang, and Ji Liu. 2021. UMEC: Unified model and embedding compression for efficient recommendation systems. In 9th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_84_1","volume-title":"Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD).","author":"Shi Hao-Jun Michael","year":"2020","unstructured":"Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, and Jiyan Yang. 2020. Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD)."},{"key":"e_1_2_1_85_1","volume-title":"6th International Conference on Learning Representations (ICLR).","author":"Shu Raphael","year":"2018","unstructured":"Raphael Shu and Hideki Nakayama. 2018. Compressing Word Embeddings via Deep Compositional Code Learning. 
In 6th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_86_1","volume-title":"Advances in Neural Information Processing Systems 32 (NeurIPS).","author":"Subramanya Suhas Jayaram","year":"2019","unstructured":"Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnaswamy, and Rohan Kadekodi. 2019. Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node. In Advances in Neural Information Processing Systems 32 (NeurIPS)."},{"key":"e_1_2_1_87_1","volume-title":"Advances in Neural Information Processing Systems 30 (NeurIPS).","author":"Svenstrup Dan","year":"2017","unstructured":"Dan Svenstrup, Jonas Meinertz Hansen, and Ole Winther. 2017. Hash Embeddings for Efficient Word Representations. In Advances in Neural Information Processing Systems 30 (NeurIPS)."},{"key":"e_1_2_1_88_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD).","author":"Vartak Manasi","year":"2018","unstructured":"Manasi Vartak, Joana M. F. da Trindade, Samuel Madden, and Matei Zaharia. 2018. MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis. In Proceedings of the International Conference on Management of Data (SIGMOD)."},{"key":"e_1_2_1_89_1","volume-title":"Proceedings of the Web Conference (WWW).","author":"Wang Qinyong","year":"2020","unstructured":"Qinyong Wang, Hongzhi Yin, Tong Chen, Zi Huang, Hao Wang, Yanchang Zhao, and Nguyen Quoc Viet Hung. 2020. Next Point-of-Interest Recommendation on Resource-Constrained Mobile Devices. In Proceedings of the Web Conference (WWW)."},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.1145\/3124749.3124754"},{"key":"e_1_2_1_91_1","unstructured":"Steve Wang and Will Cukierski. 2014. Avazu Click-Through Rate Prediction. 
https:\/\/kaggle.com\/competitions\/avazu-ctr-prediction."},{"key":"e_1_2_1_92_1","volume-title":"Proceedings of the Web Conference (WWW).","author":"Wang Yejing","year":"2022","unstructured":"Yejing Wang, Xiangyu Zhao, Tong Xu, and Xian Wu. 2022. AutoField: Automating Feature Selection in Deep Recommender Systems. In Proceedings of the Web Conference (WWW)."},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1145\/3523227.3547405"},{"key":"e_1_2_1_94_1","volume-title":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM).","author":"Wei Zhikun","year":"2021","unstructured":"Zhikun Wei, Xin Wang, and Wenwu Zhu. 2021. AutoIAS: Automatic Integrated Architecture Searcher for Click-Trough Rate Prediction. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM)."},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553516"},{"key":"e_1_2_1_96_1","volume-title":"Developing a Recommendation Benchmark for MLPerf Training and Inference. CoRR abs\/2003.07336","author":"Wu Carole-Jean","year":"2020","unstructured":"Carole-Jean Wu, Robin Burke, Ed H. Chi, Joseph A. Konstan, Julian J. McAuley, Yves Raimond, and Hao Zhang. 2020. Developing a Recommendation Benchmark for MLPerf Training and Inference. CoRR abs\/2003.07336 (2020)."},{"key":"e_1_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-022-2031-y"},{"key":"e_1_2_1_98_1","volume-title":"Proceedings of the Workshop on Deep Learning for Search and Recommendation (DL4SR) at CIKM.","author":"Xiao Tesi","year":"2022","unstructured":"Tesi Xiao, Xia Xiao, Ming Chen, and Youlong Chen. 2022. Field-wise Embedding Size Search via Structural Hard Auxiliary Mask Pruning for Click-Through Rate Prediction. 
In Proceedings of the Workshop on Deep Learning for Search and Recommendation (DL4SR) at CIKM."},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC41405.2020.00025"},{"key":"e_1_2_1_100_1","volume-title":"Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In 9th International Conference on Learning Representations (ICLR).","author":"Xiong Lee","year":"2021","unstructured":"Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, and Arnold Overwijk. 2021. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval. In 9th International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_1_101_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD).","author":"Xu Zhiqiang","year":"2021","unstructured":"Zhiqiang Xu, Dong Li, Weijie Zhao, Xing Shen, Tianbo Huang, Xiaoyun Li, and Ping Li. 2021. Agile and Accurate CTR Prediction Model Training for Massive-Scale Online Advertising Systems. In Proceedings of the International Conference on Management of Data (SIGMOD)."},{"key":"e_1_2_1_102_1","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482065"},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482130"},{"key":"e_1_2_1_104_1","volume-title":"Mixed-Precision Embedding Using a Cache. CoRR abs\/2010.11305","author":"Yang Jie Amy","year":"2020","unstructured":"Jie Amy Yang, Jianyu Huang, Jongsoo Park, Ping Tak Peter Tang, and Andrew Tulloch. 2020. Mixed-Precision Embedding Using a Cache. 
CoRR abs\/2010.11305 (2020)."},{"key":"e_1_2_1_105_1","doi-asserted-by":"publisher","DOI":"10.14778\/3421424.3421430"},{"key":"e_1_2_1_106_1","first-page":"1","article-title":"i-Razor: A Differentiable Neural Input Razor for Feature Selection and Dimension Search in DNN-Based Recommender Systems","volume":"01","author":"Yao Yao","year":"2023","unstructured":"Yao Yao, Bin Liu, Haoxun He, Dakui Sheng, Ke Wang, Li Xiao, and Huanhuan Cao. 2023. i-Razor: A Differentiable Neural Input Razor for Feature Selection and Dimension Search in DNN-Based Recommender Systems. IEEE Transactions on Knowledge & Data Engineering 01 (2023), 1--14.","journal-title":"IEEE Transactions on Knowledge & Data Engineering"},{"key":"e_1_2_1_107_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Yin Chunxing","year":"2021","unstructured":"Chunxing Yin, Bilge Acun, Carole-Jean Wu, and Xing Liu. 2021. TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_2_1_108_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41019-023-00217-8"},{"key":"e_1_2_1_109_1","doi-asserted-by":"publisher","DOI":"10.1145\/3383313.3412227"},{"key":"e_1_2_1_110_1","volume-title":"Model-enhanced Vector Index. CoRR abs\/2309.13335","author":"Zhang Hailin","year":"2023","unstructured":"Hailin Zhang, Yujing Wang, Qi Chen, Ruiheng Chang, Ting Zhang, Ziming Miao, Yingyan Hou, Yang Ding, Xupeng Miao, Haonan Wang, Bochen Pang, Yuefeng Zhan, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Xing Xie, Mao Yang, and Bin Cui. 2023. Model-enhanced Vector Index. CoRR abs\/2309.13335 (2023)."},{"key":"e_1_2_1_111_1","volume-title":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR).","author":"Zhang Jia-Dong","year":"2015","unstructured":"Jia-Dong Zhang and Chi-Yin Chow. 2015. 
GeoSoCa: Exploiting Geographical, Social and Categorical Correlations for Point-of-Interest Recommendations. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)."},{"key":"e_1_2_1_112_1","volume-title":"Workshop on Systems for ML at NeurIPS.","author":"Zhang Jian","year":"2018","unstructured":"Jian Zhang, Jiyan Yang, and Hector Yuen. 2018. Training with low-precision embedding tables. In Workshop on Systems for ML at NeurIPS."},{"key":"e_1_2_1_113_1","volume-title":"PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems. In 38th IEEE International Conference on Data Engineering (ICDE).","author":"Zhang Yuanxing","year":"2022","unstructured":"Yuanxing Zhang, Langshi Chen, Siran Yang, Man Yuan, Huimin Yi, Jie Zhang, Jiamang Wang, Jianbo Dong, Yunlong Xu, Yue Song, Yong Li, Di Zhang, Wei Lin, Lin Qu, and Bo Zheng. 2022. PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems. In 38th IEEE International Conference on Data Engineering (ICDE)."},{"key":"e_1_2_1_114_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Zhao Weijie","year":"2020","unstructured":"Weijie Zhao, Deping Xie, Ronglai Jia, Yulei Qian, Ruiquan Ding, Mingming Sun, and Ping Li. 2020. Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_2_1_115_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358045"},{"key":"e_1_2_1_116_1","volume-title":"AutoEmb: Automated Embedding Dimensionality Search in Streaming Recommendations. In IEEE International Conference on Data Mining (ICDM).","author":"Zhao Xiangyu","year":"2021","unstructured":"Xiangyu Zhao, Haochen Liu, Wenqi Fan, Hui Liu, Jiliang Tang, Chong Wang, Ming Chen, Xudong Zheng, Xiaobing Liu, and Xiwang Yang. 2021. 
AutoEmb: Automated Embedding Dimensionality Search in Streaming Recommendations. In IEEE International Conference on Data Mining (ICDM)."},{"key":"e_1_2_1_117_1","volume-title":"Proceedings of the Web Conference (WWW).","author":"Zhao Xiangyu","year":"2021","unstructured":"Xiangyu Zhao, Haochen Liu, Hui Liu, Jiliang Tang, Weiwei Guo, Jun Shi, Sida Wang, Huiji Gao, and Bo Long. 2021. AutoDim: Field-aware Embedding Dimension Search in Recommender Systems. In Proceedings of the Web Conference (WWW)."},{"key":"e_1_2_1_118_1","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529349"},{"key":"e_1_2_1_119_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219823"},{"key":"e_1_2_1_120_1","doi-asserted-by":"publisher","DOI":"10.14778\/3547305.3547325"},{"key":"e_1_2_1_121_1","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482486"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3636218.3636234","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,5]],"date-time":"2024-03-05T17:07:34Z","timestamp":1709658454000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3636218.3636234"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12]]},"references-count":121,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,12]]}},"alternative-id":["10.14778\/3636218.3636234"],"URL":"https:\/\/doi.org\/10.14778\/3636218.3636234","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2023,12]]},"assertion":[{"value":"2024-03-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}