{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T01:18:00Z","timestamp":1773278280512,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":62,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Samsung Advanced Institute of Technology (SAIT)"},{"name":"National Research Foundation of Korea (NRF)","award":["NRF-2021R1A2C2091753, NRF-2018R1A5A1059921"],"award-info":[{"award-number":["NRF-2021R1A2C2091753, NRF-2018R1A5A1059921"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,18]]},"DOI":"10.1145\/3470496.3527386","type":"proceedings-article","created":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T19:06:01Z","timestamp":1654023961000},"page":"860-873","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":24,"title":["Training personalized recommendation systems from (GPU) scratch"],"prefix":"10.1145","author":[{"given":"Youngeun","family":"Kwon","sequence":"first","affiliation":[{"name":"KAIST"}]},{"given":"Minsoo","family":"Rhu","sequence":"additional","affiliation":[{"name":"KAIST"}]}],"member":"320","published-online":{"date-parts":[[2022,6,11]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001138"},{"key":"e_1_3_2_1_2_1","unstructured":"Alibaba. 2018. User Behavior Data from Taobao for Recommendation. https:\/\/tianchi.aliyun.com\/dataset\/dataDetail?dataId=649.  Alibaba. 2018. User Behavior Data from Taobao for Recommendation. https:\/\/tianchi.aliyun.com\/dataset\/dataDetail?dataId=649."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA51647.2021.00080"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541967"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.58"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001177"},{"key":"e_1_3_2_1_7_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Cho Minsik","year":"2018","unstructured":"Minsik Cho , Tung D Le , U Finkler , Haruiki Imai , Yasushi Negishi , Taro Sekiyama , Saritha Vinod , Vladimir Zolotov , Kiyokuni Kawachiya , David S Kung , and Hillery C Hunter . 2018 . Large Model Support for Deep Learning in Caffe and Chainer . In Proceedings of Machine Learning and Systems (MLSys). Minsik Cho, Tung D Le, U Finkler, Haruiki Imai, Yasushi Negishi, Taro Sekiyama, Saritha Vinod, Vladimir Zolotov, Kiyokuni Kawachiya, David S Kung, and Hillery C Hunter. 2018. Large Model Support for Deep Learning in Caffe and Chainer. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959190"},{"key":"e_1_3_2_1_9_1","unstructured":"Criteo. 2013. Criteo Terabyte Click Logs. https:\/\/labs.criteo.com\/2013\/12\/download-terabyte-click-logs\/.  Criteo. 2013. Criteo Terabyte Click Logs. https:\/\/labs.criteo.com\/2013\/12\/download-terabyte-click-logs\/."},{"key":"e_1_3_2_1_10_1","unstructured":"Facebook. 2019. Accelerating Facebook's Infrastructure with Application-Specific Hardware. https:\/\/code.fb.com\/data-center-engineering\/accelerating-infrastructure\/.  Facebook. 2019. Accelerating Facebook's Infrastructure with Application-Specific Hardware. https:\/\/code.fb.com\/data-center-engineering\/accelerating-infrastructure\/."},{"key":"e_1_3_2_1_11_1","unstructured":"Facebook. 2019. MLPerf Training Script for DLRM. https:\/\/github.com\/facebookresearch\/dlrm\/blob\/master\/bench\/run_and_time.sh.  Facebook. 2019. MLPerf Training Script for DLRM. https:\/\/github.com\/facebookresearch\/dlrm\/blob\/master\/bench\/run_and_time.sh."},{"key":"e_1_3_2_1_12_1","unstructured":"Google. 2020. Cloud TPUs: ML Accelerators for TensorFlow.  Google. 2020. Cloud TPUs: ML Accelerators for TensorFlow."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462976"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00084"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00047"},{"key":"e_1_3_2_1_16_1","volume":"201","author":"Han Song","unstructured":"Song Han , Xingyu Liu , Huizi Mao , Jing Pu , Ardavan Pedram , Mark Horowitz , and William J. Dally. 201 6. EIE: Efficient Inference Engine on Compressed Deep Neural Network. In Proceedings of the International Symposium on Computer Architecture (ISCA). Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark Horowitz, and William J. Dally. 2016. EIE: Efficient Inference Engine on Compressed Deep Neural Network. In Proceedings of the International Symposium on Computer Architecture (ISCA).","journal-title":"William J. Dally."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378530"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00083"},{"key":"e_1_3_2_1_19_1","unstructured":"Intel. 2016. Intel Processor Counter Monitor (PCM). https:\/\/github.com\/opcm\/pcm.  Intel. 2016. Intel Processor Counter Monitor (PCM). https:\/\/github.com\/opcm\/pcm."},{"key":"e_1_3_2_1_20_1","unstructured":"JEDEC. 2018. High Bandwidth Memory (HBM2) DRAM. (2018).  JEDEC. 2018. High Bandwidth Memory (HBM2) DRAM. (2018)."},{"key":"e_1_3_2_1_21_1","volume-title":"Layer-centric Memory Reuse and Data Migration for Extreme-scale Deep Learning on Many-core Architectures. ACM Transactions on Architecture and Code Optimization (TACO)","author":"Jin Hai","year":"2018","unstructured":"Hai Jin , Bo Liu , Wenbin Jiang , Yang Ma , Xuanhua Shi , Bingsheng He , and Shaofeng Zhao . 2018. Layer-centric Memory Reuse and Data Migration for Extreme-scale Deep Learning on Many-core Architectures. ACM Transactions on Architecture and Code Optimization (TACO) ( 2018 ). Hai Jin, Bo Liu, Wenbin Jiang, Yang Ma, Xuanhua Shi, Bingsheng He, and Shaofeng Zhao. 2018. Layer-centric Memory Reuse and Data Migration for Extreme-scale Deep Learning on Many-core Architectures. ACM Transactions on Architecture and Code Optimization (TACO) (2018)."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783722"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA52012.2021.00059"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00070"},{"key":"e_1_3_2_1_26_1","volume-title":"TRiM: Tensor Reduction in Memory","author":"Kim Byeongho","unstructured":"Byeongho Kim , Jaehyun Park , Eojin Lee , Minsoo Rhu , and Jung Ho Ahn . 2020. TRiM: Tensor Reduction in Memory , In IEEE Computer Architecture Letters. IEEE Computer Architecture Letters . Byeongho Kim, Jaehyun Park, Eojin Lee, Minsoo Rhu, and Jung Ho Ahn. 2020. TRiM: Tensor Reduction in Memory, In IEEE Computer Architecture Letters. IEEE Computer Architecture Letters."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001178"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173162.3173176"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358284"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA51647.2021.00029"},{"key":"e_1_3_2_1_31_1","volume-title":"A Case for Memory-Centric HPC System Architecture for Training Deep Neural Networks","author":"Kwon Youngeun","unstructured":"Youngeun Kwon and Minsoo Rhu . 2018. A Case for Memory-Centric HPC System Architecture for Training Deep Neural Networks . In IEEE Computer Architecture Letters . Youngeun Kwon and Minsoo Rhu. 2018. A Case for Memory-Centric HPC System Architecture for Training Deep Neural Networks. In IEEE Computer Architecture Letters."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2018.00021"},{"key":"e_1_3_2_1_33_1","volume-title":"A Disaggregated Memory System for Deep Learning","author":"Kwon Youngeun","unstructured":"Youngeun Kwon and Minsoo Rhu . 2019. A Disaggregated Memory System for Deep Learning . In IEEE Micro . Youngeun Kwon and Minsoo Rhu. 2019. A Disaggregated Memory System for Deep Learning. In IEEE Micro."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3445814.3446717"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037740"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001179"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS51385.2021.00033"},{"key":"e_1_3_2_1_38_1","volume-title":"Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI).","author":"Mohoney Jason","year":"2021","unstructured":"Jason Mohoney , Roger Waleffe , Henry Xu , Theodoros Rekatsinas , and Shivaram Venkataraman . 2021 . Marius: Learning Massive Graph Embeddings on a Single Machine . In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI). Jason Mohoney, Roger Waleffe, Henry Xu, Theodoros Rekatsinas, and Shivaram Venkataraman. 2021. Marius: Learning Massive Graph Embeddings on a Single Machine. In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI)."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Dheevatsa Mudigere Yuchen Hao Jianyu Huang Zhihao Jia Andrew Tulloch Srinivas Sridharan Xing Liu Mustafa Ozdal Jade Nie Jongsoo Park Liang Luo Jie Amy Yang Leon Gao Dmytro Ivchenko Aarti Basant Yuxi Hu Jiyan Yang Ehsan K. Ardestani Xiaodong Wang Rakesh Komuravelli Ching-Hsiang Chu Serhat Yilmaz Huayu Li Jiyuan Qian Zhuobo Feng Yinbin Ma Junjie Yang Ellie Wen Hong Li Lin Yang Chonglin Sun Whitney Zhao Dimitry Melts Krishna Dhulipala KR Kishore Tyler Graf Assaf Eisenman Kiran Kumar Matam Adi Gangidi Guoqiang Jerry Chen Manoj Krishnan Avinash Nayak Krishnakumar Nair Bharath Muthiah Mahmoud khorashadi Pallab Bhattacharya Petr Lapukhov Maxim Naumov Ajit Mathews Lin Qiao Mikhail Smelyanskiy Bill Jia and Vijay Rao. 2021. Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models. In arxiv.org.  Dheevatsa Mudigere Yuchen Hao Jianyu Huang Zhihao Jia Andrew Tulloch Srinivas Sridharan Xing Liu Mustafa Ozdal Jade Nie Jongsoo Park Liang Luo Jie Amy Yang Leon Gao Dmytro Ivchenko Aarti Basant Yuxi Hu Jiyan Yang Ehsan K. Ardestani Xiaodong Wang Rakesh Komuravelli Ching-Hsiang Chu Serhat Yilmaz Huayu Li Jiyuan Qian Zhuobo Feng Yinbin Ma Junjie Yang Ellie Wen Hong Li Lin Yang Chonglin Sun Whitney Zhao Dimitry Melts Krishna Dhulipala KR Kishore Tyler Graf Assaf Eisenman Kiran Kumar Matam Adi Gangidi Guoqiang Jerry Chen Manoj Krishnan Avinash Nayak Krishnakumar Nair Bharath Muthiah Mahmoud khorashadi Pallab Bhattacharya Petr Lapukhov Maxim Naumov Ajit Mathews Lin Qiao Mikhail Smelyanskiy Bill Jia and Vijay Rao. 2021. Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models. In arxiv.org.","DOI":"10.1145\/3470496.3533727"},{"key":"e_1_3_2_1_40_1","unstructured":"Dheevatsa Mudigere Yuchen Hao Jianyu Huang Andrew Tulloch Srinivas Sridharan Xing Liu Mustafa Ozdal Jade Nie Jongsoo Park Liang Luo etal 2021. High-performance Distributed Training of Large-scale Deep Learning Recommendation Models. In arxiv.org.  Dheevatsa Mudigere Yuchen Hao Jianyu Huang Andrew Tulloch Srinivas Sridharan Xing Liu Mustafa Ozdal Jade Nie Jongsoo Park Liang Luo et al. 2021. High-performance Distributed Training of Large-scale Deep Learning Recommendation Models. In arxiv.org."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359646"},{"key":"e_1_3_2_1_42_1","unstructured":"Maxim Naumov Dheevatsa Mudigere Hao-Jun Michael Shi Jianyu Huang Narayanan Sundaraman Jongsoo Park Xiaodong Wang Udit Gupta Carole-Jean Wu Alisson G. Azzolini Dmytro Dzhulgakov Andrey Mallevich Ilia Cherniavskii Yinghai Lu Raghuraman Krishnamoorthi Ansha Yu Volodymyr Kondratenko Stephanie Pereira Xianjie Chen Wenlin Chen Vijay Rao Bill Jia Liang Xiong and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. In arxiv.org.  Maxim Naumov Dheevatsa Mudigere Hao-Jun Michael Shi Jianyu Huang Narayanan Sundaraman Jongsoo Park Xiaodong Wang Udit Gupta Carole-Jean Wu Alisson G. Azzolini Dmytro Dzhulgakov Andrey Mallevich Ilia Cherniavskii Yinghai Lu Raghuraman Krishnamoorthi Ansha Yu Volodymyr Kondratenko Stephanie Pereira Xianjie Chen Wenlin Chen Vijay Rao Bill Jia Liang Xiong and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. In arxiv.org."},{"key":"e_1_3_2_1_43_1","unstructured":"NVIDIA. 2011. NVIDIA System Management Interface (nvidia-smi). https:\/\/developer.nvidia.com\/nvidia-system-management-interface.  NVIDIA. 2011. NVIDIA System Management Interface (nvidia-smi). https:\/\/developer.nvidia.com\/nvidia-system-management-interface."},{"key":"e_1_3_2_1_44_1","unstructured":"NVIDIA. 2016. NVIDIA CUDA Programming Guide.  NVIDIA. 2016. NVIDIA CUDA Programming Guide."},{"key":"e_1_3_2_1_45_1","unstructured":"NVIDIA. 2019. cuBLAS Library. (2019).  NVIDIA. 2019. cuBLAS Library. (2019)."},{"key":"e_1_3_2_1_46_1","unstructured":"NVIDIA. 2019. cuDNN: GPU Accelerated Deep Learning.  NVIDIA. 2019. cuDNN: GPU Accelerated Deep Learning."},{"key":"e_1_3_2_1_47_1","unstructured":"NVIDIA. 2020. NVIDIA Tesla A100.  NVIDIA. 2020. NVIDIA Tesla A100."},{"key":"e_1_3_2_1_48_1","volume":"201","author":"Parashar Angshuman","unstructured":"Angshuman Parashar , Minsoo Rhu , Anurag Mukkara , Antonio Puglielli , Rangharajan Venkatesan , Brucek Khailany , Joel Emer , Stephen W. Keckler , and William J. Dally. 201 7. SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks. In Proceedings of the International Symposium on Computer Architecture (ISCA). Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, and William J. Dally. 2017. SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks. In Proceedings of the International Symposium on Computer Architecture (ISCA).","journal-title":"William J. Dally."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3466752.3480080"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378505"},{"key":"e_1_3_2_1_51_1","unstructured":"PyTorch. 2019. http:\/\/pytorch.org.  PyTorch. 2019. http:\/\/pytorch.org."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"crossref","unstructured":"Samyam Rajbhandari Olatunji Ruwase Jeff Rasley Shaden Smith and Yuxiong He. 2021. ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning. In arxiv.org.  Samyam Rajbhandari Olatunji Ruwase Jeff Rasley Shaden Smith and Yuxiong He. 2021. ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning. In arxiv.org.","DOI":"10.1145\/3458817.3476205"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA51647.2021.00057"},{"key":"e_1_3_2_1_54_1","volume-title":"Proceedings of the International Symposium on Microarchitecture (MICRO).","author":"Rhu Minsoo","unstructured":"Minsoo Rhu , Natalia Gimelshein , Jason Clemons , Arslan Zulfiqar , and Stephen W. Keckler . 2016. vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design . In Proceedings of the International Symposium on Microarchitecture (MICRO). Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, and Stephen W. Keckler. 2016. vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design. In Proceedings of the International Symposium on Microarchitecture (MICRO)."},{"key":"e_1_3_2_1_55_1","volume-title":"Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA).","author":"Rhu Minsoo","unstructured":"Minsoo Rhu , Mike O'Connor , Niladrish Chatterjee , Jeff Pool , Youngeun Kwon , and Stephen W. Keckler . 2018. Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks . In Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA). Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Youngeun Kwon, and Stephen W. Keckler. 2018. Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks. In Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA)."},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178487.3178491"},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3445814.3446763"},{"key":"e_1_3_2_1_58_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Xie Deping","year":"2020","unstructured":"Deping Xie , Ronglai Jia , Yulei Qian , Ruiquan Ding , Mingming Sun , and Ping Li . 2020 . Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems . In Proceedings of Machine Learning and Systems (MLSys). Deping Xie, Ronglai Jia, Yulei Qian, Ruiquan Ding, Mingming Sun, and Ping Li. 2020. Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_3_2_1_59_1","volume-title":"Ping Tak Peter Tang, and Andrew Tulloch","author":"Yang Jie Amy","year":"2020","unstructured":"Jie Amy Yang , Jianyu Huang , Jongsoo Park , Ping Tak Peter Tang, and Andrew Tulloch . 2020 . Mixed-Precision Embedding Using a Cache. In arxiv.org. Jie Amy Yang, Jianyu Huang, Jongsoo Park, Ping Tak Peter Tang, and Andrew Tulloch. 2020. Mixed-Precision Embedding Using a Cache. In arxiv.org."},{"key":"e_1_3_2_1_60_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Yi Xinyang","year":"2018","unstructured":"Xinyang Yi , Yi-Fan Chen , Sukriti Ramesh , Vinu Rajashekhar , Lichan Hong , Noah Fiedel , Nandini Seshadri , Lukasz Heldt , Xiang Wu , and EH Chi . 2018 . Factorized Deep Retrieval and Distributed TensorFlow Serving . In Proceedings of Machine Learning and Systems (MLSys). Xinyang Yi, Yi-Fan Chen, Sukriti Ramesh, Vinu Rajashekhar, Lichan Hong, Noah Fiedel, Nandini Seshadri, Lukasz Heldt, Xiang Wu, and EH Chi. 2018. Factorized Deep Retrieval and Distributed TensorFlow Serving. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_3_2_1_61_1","volume-title":"Proceedings of Machine Learning and Systems (MLSys).","author":"Yin Chunxing","year":"2021","unstructured":"Chunxing Yin , Bilge Acun , Carole-Jean Wu , and Xing Liu . 2021 . TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models . In Proceedings of Machine Learning and Systems (MLSys). Chunxing Yin, Bilge Acun, Carole-Jean Wu, and Xing Liu. 2021. TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models. In Proceedings of Machine Learning and Systems (MLSys)."},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358045"}],"event":{"name":"ISCA '22: The 49th Annual International Symposium on Computer Architecture","location":"New York New York","acronym":"ISCA '22","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","IEEE CS TCAA IEEE CS technical committee on architectural acoustics"]},"container-title":["Proceedings of the 49th Annual International Symposium on Computer Architecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527386","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3527386","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:27Z","timestamp":1750188627000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527386"}},"subtitle":["look forward not backwards"],"short-title":[],"issued":{"date-parts":[[2022,6,11]]},"references-count":62,"alternative-id":["10.1145\/3470496.3527386","10.1145\/3470496"],"URL":"https:\/\/doi.org\/10.1145\/3470496.3527386","relation":{},"subject":[],"published":{"date-parts":[[2022,6,11]]},"assertion":[{"value":"2022-06-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}