{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T11:51:45Z","timestamp":1780746705178,"version":"3.54.1"},"publisher-location":"New York, NY, USA","reference-count":48,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,6,21]],"date-time":"2023-06-21T00:00:00Z","timestamp":1687305600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Department of Energy","award":["DEAC02-06CH11357"],"award-info":[{"award-number":["DEAC02-06CH11357"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,6,21]]},"DOI":"10.1145\/3577193.3593717","type":"proceedings-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T18:47:05Z","timestamp":1687286825000},"page":"324-335","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3008-9133","authenticated-orcid":false,"given":"Chengming","family":"Zhang","sequence":"first","affiliation":[{"name":"Indiana University, Bloomington, IN, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4072-9990","authenticated-orcid":false,"given":"Shaden","family":"Smith","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, WA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9807-7978","authenticated-orcid":false,"given":"Baixi","family":"Sun","sequence":"additional","affiliation":[{"name":"Indiana University, Bloomington, IN, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1101-9148","authenticated-orcid":false,"given":"Jiannan","family":"Tian","sequence":"additional","affiliation":[{"name":"Indiana University, Bloomington, IN, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-4381-1899","authenticated-orcid":false,"given":"Jonathan","family":"Soifer","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, United States of America"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6244-1264","authenticated-orcid":false,"given":"Xiaodong","family":"Yu","sequence":"additional","affiliation":[{"name":"Argonne National Laboratory, Lemont, IL, United States of America"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8402-1436","authenticated-orcid":false,"given":"Shuaiwen Leon","family":"Song","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, WA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0478-8854","authenticated-orcid":false,"given":"Yuxiong","family":"He","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, WA, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5422-4497","authenticated-orcid":false,"given":"Dingwen","family":"Tao","sequence":"additional","affiliation":[{"name":"Indiana University, Bloomington, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,6,21]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"https:\/\/developer.arm.com\/Tools%20and%20Software\/Arm%20Compiler%20for%20Linux","year":"2023","unstructured":"arm. https:\/\/developer.arm.com\/Tools%20and%20Software\/Arm%20Compiler%20for%20Linux , 2023 . Online . arm. https:\/\/developer.arm.com\/Tools%20and%20Software\/Arm%20Compiler%20for%20Linux, 2023. Online."},{"key":"e_1_3_2_1_2_1","volume-title":"https:\/\/developer.arm.com\/Tools%20and%20Software\/Arm%20Performance%20Libraries","year":"2023","unstructured":"arm. https:\/\/developer.arm.com\/Tools%20and%20Software\/Arm%20Performance%20Libraries , 2023 . Online . arm. https:\/\/developer.arm.com\/Tools%20and%20Software\/Arm%20Performance%20Libraries, 2023. Online."},{"key":"e_1_3_2_1_3_1","volume-title":"https:\/\/instances.vantage.sh\/aws\/ec2\/p3.2xlarge","author":"AWS.","year":"2023","unstructured":"AWS. https:\/\/instances.vantage.sh\/aws\/ec2\/p3.2xlarge , 2023 . Online . AWS. https:\/\/instances.vantage.sh\/aws\/ec2\/p3.2xlarge, 2023. Online."},{"key":"e_1_3_2_1_4_1","volume-title":"https:\/\/instances.vantage.sh\/aws\/ec2\/c5a.16xlarge","author":"AWS.","year":"2023","unstructured":"AWS. https:\/\/instances.vantage.sh\/aws\/ec2\/c5a.16xlarge , 2023 . Online . AWS. https:\/\/instances.vantage.sh\/aws\/ec2\/c5a.16xlarge, 2023. Online."},{"key":"e_1_3_2_1_5_1","first-page":"711","volume-title":"2021 IEEE International Conference on Cluster Computing (CLUSTER)","author":"Shahneous Bari Md Abdullah","year":"2021","unstructured":"Md Abdullah Shahneous Bari , Barbara Chapman , Anthony Curtis , Robert J Harrison , Eva Siegmann , Nikolay A Simakov , and Matthew D Jones . A64fx performance : experience on ookami . In 2021 IEEE International Conference on Cluster Computing (CLUSTER) , pages 711 -- 718 . IEEE, 2021 . Md Abdullah Shahneous Bari, Barbara Chapman, Anthony Curtis, Robert J Harrison, Eva Siegmann, Nikolay A Simakov, and Matthew D Jones. A64fx performance: experience on ookami. In 2021 IEEE International Conference on Cluster Computing (CLUSTER), pages 711--718. IEEE, 2021."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052694"},{"key":"e_1_3_2_1_7_1","first-page":"46","volume-title":"Icml","volume":"98","author":"Billsus Daniel","year":"1998","unstructured":"Daniel Billsus , Michael J Pazzani , Learning collaborative information filters . In Icml , volume 98 , pages 46 -- 54 , 1998 . Daniel Billsus, Michael J Pazzani, et al. Learning collaborative information filters. In Icml, volume 98, pages 46--54, 1998."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-76361_2"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080797"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i01.5330"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2021.3061394"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959190"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304232"},{"key":"e_1_3_2_1_14_1","volume-title":"Proceedings of the AAAI Conference on artificial intelligence","volume":"31","author":"Dong Xin","year":"2017","unstructured":"Xin Dong , Lei Yu , Zhonghuo Wu , Yuxia Sun , Lingfeng Yuan , and Fangxi Zhang . A hybrid collaborative filtering model with deep structure for recommender systems . In Proceedings of the AAAI Conference on artificial intelligence , volume 31 , 2017 . Xin Dong, Lei Yu, Zhonghuo Wu, Yuxia Sun, Lingfeng Yuan, and Fangxi Zhang. A hybrid collaborative filtering model with deep structure for recommender systems. In Proceedings of the AAAI Conference on artificial intelligence, volume 31, 2017."},{"key":"e_1_3_2_1_15_1","volume-title":"https:\/\/eigen.tuxfamily.org\/index.php?title=Main_Page","year":"2023","unstructured":"Eigen. https:\/\/eigen.tuxfamily.org\/index.php?title=Main_Page , 2023 . Online . Eigen. https:\/\/eigen.tuxfamily.org\/index.php?title=Main_Page, 2023. Online."},{"key":"e_1_3_2_1_16_1","first-page":"249","volume-title":"Proceedings of the thirteenth international conference on artificial intelligence and statistics","author":"Glorot Xavier","year":"2010","unstructured":"Xavier Glorot and Yoshua Bengio . Understanding the difficulty of training deep feedforward neural networks . In Proceedings of the thirteenth international conference on artificial intelligence and statistics , pages 249 -- 256 . JMLR Workshop and Conference Proceedings , 2010 . Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the thirteenth international conference on artificial intelligence and statistics, pages 249--256. JMLR Workshop and Conference Proceedings, 2010."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2006.100"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIRD.2018.8376299"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2872427.2883037"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401063"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Il Im and Alexander Hars. Does a one-size recommendation system fit all? the effectiveness of collaborative filtering based recommendation systems across different domains and search modes. ACM Transactions on Information Systems (TOIS) 26(1):4--es 2007.  Il Im and Alexander Hars. Does a one-size recommendation system fit all? the effectiveness of collaborative filtering based recommendation systems across different domains and search modes. ACM Transactions on Information Systems (TOIS) 26(1):4--es 2007.","DOI":"10.1145\/1292591.1292595"},{"key":"e_1_3_2_1_22_1","volume-title":"https:\/\/www.intel.com\/content\/www\/us\/en\/developer\/tools\/oneapi\/base-toolkit.htm","year":"2023","unstructured":"intel. https:\/\/www.intel.com\/content\/www\/us\/en\/developer\/tools\/oneapi\/base-toolkit.htm , 2023 . Online . intel. https:\/\/www.intel.com\/content\/www\/us\/en\/developer\/tools\/oneapi\/base-toolkit.htm, 2023. Online."},{"key":"e_1_3_2_1_23_1","volume-title":"Recommendation systems: Principles, methods and evaluation. Egyptian informatics journal, 16(3):261--273","author":"Isinkaye Folasade Olubusola","year":"2015","unstructured":"Folasade Olubusola Isinkaye , Yetunde O Folajimi , and Bolande Adefowoke Ojokoh . Recommendation systems: Principles, methods and evaluation. Egyptian informatics journal, 16(3):261--273 , 2015 . Folasade Olubusola Isinkaye, Yetunde O Folajimi, and Bolande Adefowoke Ojokoh. Recommendation systems: Principles, methods and evaluation. Egyptian informatics journal, 16(3):261--273, 2015."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3523227.3547387"},{"key":"e_1_3_2_1_25_1","volume-title":"Fbgemm: Enabling high-performance low-precision deep learning inference. arXiv preprint arXiv:2101.05615","author":"Khudia Daya","year":"2021","unstructured":"Daya Khudia , Jianyu Huang , Protonu Basu , Summer Deng , Haixin Liu , Jongsoo Park , and Mikhail Smelyanskiy . Fbgemm: Enabling high-performance low-precision deep learning inference. arXiv preprint arXiv:2101.05615 , 2021 . Daya Khudia, Jianyu Huang, Protonu Basu, Summer Deng, Haixin Liu, Jongsoo Park, and Mikhail Smelyanskiy. Fbgemm: Enabling high-performance low-precision deep learning inference. arXiv preprint arXiv:2101.05615, 2021."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1145\/347090.347181","volume-title":"Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining","author":"Kitts Brendan","year":"2000","unstructured":"Brendan Kitts , David Freed , and Martin Vrieze . Cross-sell : a fast promotion-tunable customer-item recommendation method based on conditionally independent probabilities . In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining , pages 437 -- 446 , 2000 . Brendan Kitts, David Freed, and Martin Vrieze. Cross-sell: a fast promotion-tunable customer-item recommendation method based on conditionally independent probabilities. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 437--446, 2000."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.3390\/electronics11010141"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557072"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2009.263"},{"issue":"7","key":"e_1_3_2_1_30_1","doi-asserted-by":"crossref","first-page":"1530","DOI":"10.1109\/TPDS.2017.2718515","article-title":"A novel matrix factorization approach for large-scale collaborative filtering recommender systems on gpus","volume":"29","author":"Li Hao","year":"2017","unstructured":"Hao Li , Kenli Li , Jiyao An , and Keqin Li. Msgd : A novel matrix factorization approach for large-scale collaborative filtering recommender systems on gpus . IEEE Transactions on Parallel and Distributed Systems , 29 ( 7 ): 1530 -- 1544 , 2017 . Hao Li, Kenli Li, Jiyao An, and Keqin Li. Msgd: A novel matrix factorization approach for large-scale collaborative filtering recommender systems on gpus. IEEE Transactions on Parallel and Distributed Systems, 29(7):1530--1544, 2017.","journal-title":"IEEE Transactions on Parallel and Distributed Systems"},{"key":"e_1_3_2_1_31_1","first-page":"1243","volume-title":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management","author":"Mao Kelong","year":"2021","unstructured":"Kelong Mao , Jieming Zhu , Jinpeng Wang , Quanyu Dai , Zhenhua Dong , Xi Xiao , and Xiuqiang He. Simplex : A simple and strong baseline for collaborative filtering . In Proceedings of the 30th ACM International Conference on Information & Knowledge Management , pages 1243 -- 1252 , 2021 . Kelong Mao, Jieming Zhu, Jinpeng Wang, Quanyu Dai, Zhenhua Dong, Xi Xiao, and Xiuqiang He. Simplex: A simple and strong baseline for collaborative filtering. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 1243--1252, 2021."},{"key":"e_1_3_2_1_32_1","first-page":"V4","volume-title":"2010 International Conference On Computer Design and Applications","volume":"4","author":"Pan Chenguang","year":"2010","unstructured":"Chenguang Pan and Wenxin Li . Research paper recommendation with topic analysis . In 2010 International Conference On Computer Design and Applications , volume 4 , pages V4 -- 264 . IEEE, 2010 . Chenguang Pan and Wenxin Li. Research paper recommendation with topic analysis. In 2010 International Conference On Computer Design and Applications, volume 4, pages V4--264. IEEE, 2010."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240323.3240356"},{"key":"e_1_3_2_1_34_1","volume-title":"https:\/\/www.psc.edu\/resources\/bridges-2\/","year":"2023","unstructured":"psc. https:\/\/www.psc.edu\/resources\/bridges-2\/ , 2023 . Online . psc. https:\/\/www.psc.edu\/resources\/bridges-2\/, 2023. Online."},{"key":"e_1_3_2_1_35_1","volume-title":"https:\/\/pytorch.org\/docs\/stable\/profiler.html","year":"2023","unstructured":"PyTorch. https:\/\/pytorch.org\/docs\/stable\/profiler.html , 2023 . Online . PyTorch. https:\/\/pytorch.org\/docs\/stable\/profiler.html, 2023. Online."},{"key":"e_1_3_2_1_36_1","volume-title":"A lock-free approach to parallelizing stochastic gradient descent. Advances in neural information processing systems, 24","author":"Recht Benjamin","year":"2011","unstructured":"Benjamin Recht , Christopher Re , Stephen Wright , and Feng Niu . Hogwild! : A lock-free approach to parallelizing stochastic gradient descent. Advances in neural information processing systems, 24 , 2011 . Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. Hogwild!: A lock-free approach to parallelizing stochastic gradient descent. Advances in neural information processing systems, 24, 2011."},{"key":"e_1_3_2_1_37_1","volume-title":"Bpr: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618","author":"Rendle Steffen","year":"2012","unstructured":"Steffen Rendle , Christoph Freudenthaler , Zeno Gantner , and Lars Schmidt-Thieme . Bpr: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618 , 2012 . Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. Bpr: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618, 2012."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/192844.192905"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2006.4"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/371920.372071"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/1768197.1768208"},{"key":"e_1_3_2_1_42_1","volume-title":"Ngat4rec: Neighbor-aware graph attention network for recommendation. arXiv preprint arXiv:2010.12256","author":"Song Jinbo","year":"2020","unstructured":"Jinbo Song , Chao Chang , Fei Sun , Xinbo Song , and Peng Jiang . Ngat4rec: Neighbor-aware graph attention network for recommendation. arXiv preprint arXiv:2010.12256 , 2020 . Jinbo Song, Chao Chang, Fei Sun, Xinbo Song, and Peng Jiang. Ngat4rec: Neighbor-aware graph attention network for recommendation. arXiv preprint arXiv:2010.12256, 2020."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2005.251"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1145\/2907294.2907297","volume-title":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","author":"Tan Wei","year":"2016","unstructured":"Wei Tan , Liangliang Cao , and Liana Fong . Faster and cheaper: Parallelizing large-scale matrix factorization on gpus . In Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing , pages 219 -- 230 , 2016 . Wei Tan, Liangliang Cao, and Liana Fong. Faster and cheaper: Parallelizing large-scale matrix factorization on gpus. In Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, pages 219--230, 2016."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240323.3240369"},{"key":"e_1_3_2_1_46_1","first-page":"2495","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Wang Feng","year":"2021","unstructured":"Feng Wang and Huaping Liu . Understanding the behaviour of contrastive loss . In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition , pages 2495 -- 2504 , 2021 . Feng Wang and Huaping Liu. Understanding the behaviour of contrastive loss. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pages 2495--2504, 2021."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331267"},{"key":"e_1_3_2_1_48_1","volume-title":"Fast and scalable matrix factorization. arXiv preprint arXiv:1610.05838","author":"Xie Xiaolong","year":"2016","unstructured":"Xiaolong Xie , Wei Tan , Liana L Fong , and Yun Liang . Cumf_sgd : Fast and scalable matrix factorization. arXiv preprint arXiv:1610.05838 , 2016 . Xiaolong Xie, Wei Tan, Liana L Fong, and Yun Liang. Cumf_sgd: Fast and scalable matrix factorization. arXiv preprint arXiv:1610.05838, 2016."}],"event":{"name":"ICS '23: 37th International Conference on Supercomputing","location":"Orlando FL USA","acronym":"ICS '23","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture"]},"container-title":["Proceedings of the 37th International Conference on Supercomputing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3577193.3593717","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/abs\/10.1145\/3577193.3593717","content-type":"text\/html","content-version":"vor","intended-application":"syndication"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:47:31Z","timestamp":1750178851000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3577193.3593717"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,21]]},"references-count":48,"alternative-id":["10.1145\/3577193.3593717","10.1145\/3577193"],"URL":"https:\/\/doi.org\/10.1145\/3577193.3593717","relation":{},"subject":[],"published":{"date-parts":[[2023,6,21]]},"assertion":[{"value":"2023-06-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}