{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T05:17:26Z","timestamp":1755839846352,"version":"3.41.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2025,3,22]],"date-time":"2025-03-22T00:00:00Z","timestamp":1742601600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["U22B2020"],"award-info":[{"award-number":["U22B2020"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Research Foundation Singapore and DSO National Laboratories under the AI Singapore Programme","award":["AISG2-RP-2020-019"],"award-info":[{"award-number":["AISG2-RP-2020-019"]}]},{"name":"RIE 2020 Advanced Manufacturing and Engineering Programmatic Fund","award":["A20G8b0102"],"award-info":[{"award-number":["A20G8b0102"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Recomm. Syst."],"published-print":{"date-parts":[[2025,9,30]]},"abstract":"<jats:p>\n            In recent years, recommender systems have advanced rapidly, where embedding learning for users and items plays a critical role. A standard method learns a unique embedding vector for each user and item. However, such a method has two important limitations in real-world applications: (1) it is hard to learn embeddings that generalize well for users and items that have rare interactions, and (2) it may incur unbearably high memory costs when the number of users and items scales up. Existing approaches either can only address one of the limitations or have flawed overall performances. In this article, we propose Clustered Embedding Learning (CEL) as an integrated solution to these two problems. CEL is a plug-and-play embedding learning framework that can be combined with any differentiable feature interaction model. It is capable of achieving improved performance, especially for cold users and items, with reduced memory cost. CEL enables automatic and dynamic clustering of users and items in a top-down fashion, where clustered entities could jointly learn a shared embedding. The accelerated version of CEL has an optimal time complexity, which supports efficient online updates. Theoretically, we prove the identifiability and the existence of a unique optimal number of clusters for CEL in the context of nonnegative matrix factorization. Empirically, we validate the effectiveness of CEL on three public datasets and one business dataset, showing its consistently superior performance against state-of-the-art methods. In particular, when incorporating CEL into the business model, it brings an improvement of\n            <jats:inline-formula content-type=\"math\/tex\">\n              <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(+0.6\\%\\)<\/jats:tex-math>\n            <\/jats:inline-formula>\n            in AUC, which translates into a significant revenue gain; meanwhile, the size of the embedding table gets 2,650 times smaller. Additionally, we demonstrate that if there is enough memory, learning a personalized embedding for each user and item around their clustering centers is feasible and can further boost performance. In this article, we enhance and extend the personalization technique we initially proposed in our earlier work\u00a0[\n            <jats:xref ref-type=\"bibr\">4<\/jats:xref>\n            ], which introduced an offset regularization to prevent personalized embeddings from drifting too far away from the central (cluster) embedding, thereby mitigating overfitting. However, in\u00a0[\n            <jats:xref ref-type=\"bibr\">4<\/jats:xref>\n            ], we simply applied a uniform regularization weight across all embeddings, which, given the considerable variation in the number of their associated interactions, is suboptimal. To address this, we investigate in this article the strategies for non-uniform offset regularization that adjusts regularization weights according to the number of associated interactions, which leads to significant improvements compared with uniform offset regularization. Furthermore, we extend CEL into Meta-CEL, factoring in future personalization during cluster optimization, which leads to additional enhancements in personalization performance.\n          <\/jats:p>","DOI":"10.1145\/3665933","type":"journal-article","created":{"date-parts":[[2024,5,27]],"date-time":"2024-05-27T11:56:53Z","timestamp":1716811013000},"page":"1-38","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Learning Personalizable Clustered Embedding for Recommender Systems"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3628-7555","authenticated-orcid":false,"given":"Yizhou","family":"Chen","sequence":"first","affiliation":[{"name":"Shopee Pte Ltd., Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4621-3127","authenticated-orcid":false,"given":"Guangda","family":"Huzhang","sequence":"additional","affiliation":[{"name":"Shopee Pte Ltd., Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3869-5357","authenticated-orcid":false,"given":"Anxiang","family":"Zeng","sequence":"additional","affiliation":[{"name":"SCSE, Nanyang Technological University, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9906-8969","authenticated-orcid":false,"given":"Qingtao","family":"Yu","sequence":"additional","affiliation":[{"name":"Shopee Pte Ltd., Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4701-5379","authenticated-orcid":false,"given":"Hui","family":"Sun","sequence":"additional","affiliation":[{"name":"Shopee Pte Ltd., Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5103-7001","authenticated-orcid":false,"given":"Heng-Yi","family":"Li","sequence":"additional","affiliation":[{"name":"Shopee Pte Ltd., Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1428-197X","authenticated-orcid":false,"given":"Jingyi","family":"Li","sequence":"additional","affiliation":[{"name":"Shopee Pte Ltd., Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7535-8125","authenticated-orcid":false,"given":"Yabo","family":"Ni","sequence":"additional","affiliation":[{"name":"SCSE, Nanyang Technological University, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6893-8650","authenticated-orcid":false,"given":"Han","family":"Yu","sequence":"additional","affiliation":[{"name":"SCSE, Nanyang Technological University, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2407-961X","authenticated-orcid":false,"given":"Zhiming","family":"Zhou","sequence":"additional","affiliation":[{"name":"Key Laboratory of Interdisciplinary Research of Computation and Economics, Shanghai University of Finance and Economics, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,3,22]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i8.16818"},{"volume-title":"Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'07)","year":"2007","key":"e_1_3_3_3_2","unstructured":"David Arthur and Sergei Vassilvitskii. 2007. k-means++: The advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'07). Vol. 7, 1027--1035."},{"key":"e_1_3_3_4_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009740529316"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3543507.3583362"},{"volume-title":"Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (Boston, MA, USA) (DLRS'16)","key":"e_1_3_3_6_2","unstructured":"Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, and Hemal Shah. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (Boston, MA, USA) (DLRS'16). Association for Computing Machinery, New York, NY, USA, 7--10. https:\/\/doi.org\/10.1145\/2988450.2988454"},{"key":"e_1_3_3_7_2","unstructured":"DeepRec. 2021. Adaptive Embedding. https:\/\/github.com\/alibaba\/DeepRec\/blob\/main\/README.md"},{"key":"e_1_3_3_8_2","volume-title":"Proceedings of the International Conference on Machine Learning (ICML'17)","author":"Finn Chelsea","year":"2017","unstructured":"Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning (ICML'17). PMLR, 1126--1135."},{"key":"e_1_3_3_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2018.2789405"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISIT45174.2021.9517710"},{"key":"e_1_3_3_11_2","first-page":"1","article-title":"Principal coordinates analysis","author":"Gower John C.","year":"2014","unstructured":"John C. Gower. 2014. Principal coordinates analysis. Wiley StatsRef: Statistics Reference Online, 1\u20137.","journal-title":"Wiley StatsRef: Statistics Reference Online"},{"key":"e_1_3_3_12_2","article-title":"DeepFM: A factorization-machine based neural network for CTR prediction","author":"Guo Huifeng","year":"2017","unstructured":"Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A factorization-machine based neural network for CTR prediction. arXiv preprint arXiv:1703.04247 (2017).","journal-title":"arXiv preprint arXiv:1703.04247"},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052569"},{"issue":"9","key":"e_1_3_3_14_2","first-page":"5149","article-title":"Meta-learning in neural networks: A survey","volume":"44","author":"Hospedales Timothy","year":"2021","unstructured":"Timothy Hospedales, Antreas Antoniou, Paul Micaelli, and Amos Storkey. 2021. Meta-learning in neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 9 (2021), 5149\u20135169.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_3_15_2","doi-asserted-by":"crossref","unstructured":"James Kirkpatrick Razvan Pascanu Neil Rabinowitz Joel Veness Guillaume Desjardins Andrei A. Rusu Kieran Milan John Quan Tiago Ramalho Agnieszka Grabska-Barwinska Demis Hassabis Claudia Clopath Dharshan Kumaran and Raia Hadsell. 2017. Overcoming catastrophic forgetting in neural networks. In Proceedings of the National Academy of Sciences (PNAS) 114 13 (2017) 3521--3526. https:\/\/doi.org\/10.1073\/pnas.1611835114","DOI":"10.1073\/pnas.1611835114"},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1038\/44565"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401436"},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220007"},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2014.02.018"},{"key":"e_1_3_3_20_2","article-title":"Experience replay for continual learning","author":"Rolnick David","year":"2019","unstructured":"David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy Lillicrap, and Gregory Wayne. 2019. Experience replay for continual learning. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch\u00e9-Buc, E. Fox, and R. Garnett (Eds.). Vol. 32, Curran Associates, Inc.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_21_2","article-title":"A comparative study of divisive hierarchical clustering algorithms","author":"Roux Maurice","year":"2015","unstructured":"Maurice Roux. 2015. A comparative study of divisive hierarchical clustering algorithms. arXiv preprint arXiv:1506.08977 (2015).","journal-title":"arXiv preprint arXiv:1506.08977"},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/0024-3795(94)00114-6"},{"key":"e_1_3_3_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403059"},{"key":"e_1_3_3_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401119"},{"key":"e_1_3_3_25_2","unstructured":"Michael Steinbach George Karypis and Vipin Kumar. 2000. A comparison of document clustering techniques. Technical Report. University of Minnesota Twin Cities Department of Computer Science and Engineering. Retrieved from the University Digital Conservancy https:\/\/hdl.handle.net\/11299\/215421"},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10491"},{"volume-title":"Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI'17)","key":"e_1_3_3_27_2","unstructured":"Zhu Sun, Jie Yang, Jie Zhang, Alessandro Bozzon, Yu Chen, and Chi Xu. 2017. MRLR: Multi-level representation learning for personalized ranking in recommendation. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI'17). 2807--2813. https:\/\/doi.org\/10.24963\/ijcai.2017\/391"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380266"},{"key":"e_1_3_3_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8683466"},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2789443"},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553516"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482065"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482130"},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2016.2614491"},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00748"},{"volume-title":"Proceedings of the 14th ACM Conference on Recommender Systems (Virtual Event, Brazil) (RecSys'20)","key":"e_1_3_3_36_2","unstructured":"Caojin Zhang, Yicun Liu, Yuanpu Xie, Sofia Ira Ktena, Alykhan Tejani, Akshay Gupta, Pranay Kumar Myana, Deepak Dilipkumar, Suvadip Paul, Ikuhiro Ihara, Prasang Upadhyaya, Ferenc Huszar, and Wenzhe Shi. 2020. Model size reduction using frequency based double hashing for recommender systems. In Proceedings of the 14th ACM Conference on Recommender Systems (Virtual Event, Brazil) (RecSys'20). Association for Computing Machinery, New York, NY, USA, 521--526. https:\/\/doi.org\/10.1145\/3383313.3412227"},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2911502"},{"key":"e_1_3_3_38_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015941"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219823"}],"container-title":["ACM Transactions on Recommender Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3665933","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3665933","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T23:44:27Z","timestamp":1750290267000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3665933"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,22]]},"references-count":38,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,9,30]]}},"alternative-id":["10.1145\/3665933"],"URL":"https:\/\/doi.org\/10.1145\/3665933","relation":{},"ISSN":["2770-6699"],"issn-type":[{"type":"electronic","value":"2770-6699"}],"subject":[],"published":{"date-parts":[[2025,3,22]]},"assertion":[{"value":"2023-11-12","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-05-10","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-22","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}