{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T05:05:15Z","timestamp":1750309515299,"version":"3.41.0"},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2025,4,10]],"date-time":"2025-04-10T00:00:00Z","timestamp":1744243200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Recomm. Syst."],"published-print":{"date-parts":[[2025,12,31]]},"abstract":"<jats:p>\n            Deep learning-based recommender models (DLRMs) have become an essential component of many modern recommender systems. Several companies are now building large compute clusters reserved for DLRM training, driving new interest in cost- and time-saving optimizations. The systems challenges faced in this setting are unique; while typical deep learning (DL) training jobs are dominated by model execution times, the most important factor in DLRM training performance is often\n            <jats:italic>online data ingestion.<\/jats:italic>\n          <\/jats:p>\n          <jats:p>\n            In this article, we study real-world DLRM data processing pipelines taken from our compute cluster at Netflix to observe the performance impacts of online ingestion and identify shortfalls in existing data pipeline optimizers. Our studies lead us to design a new solution for data pipeline optimization,\n            <jats:sc>InTuneX<\/jats:sc>\n            .\n          <\/jats:p>\n          <jats:p>\n            <jats:sc>InTuneX<\/jats:sc>\n            \u00a0is designed for production-scale multi-node recommender data pipelines. 
It unifies and tackles the challenges of both\n            <jats:italic>intra-<\/jats:italic>\n            and\n            <jats:italic>inter-<\/jats:italic>\n            node pipeline optimization. We achieve this with a multi-agent reinforcement learning (RL) design, simultaneously optimizing node assignments at the cluster level and CPU assignments within nodes.\n          <\/jats:p>\n          <jats:p>\n            Our experiments show that\n            <jats:sc>InTuneX<\/jats:sc>\n            \u00a0can build optimized data pipeline configurations within minutes. We apply\n            <jats:sc>InTuneX<\/jats:sc>\n            \u00a0to our cluster and find that it increases single-node data ingestion throughput by as much as 2.29 \u00d7 versus state-of-the-art optimizers, while improving the cost-efficiency of multi-node pipelines by 15% to 25%.\n          <\/jats:p>","DOI":"10.1145\/3704923","type":"journal-article","created":{"date-parts":[[2024,12,9]],"date-time":"2024-12-09T10:57:07Z","timestamp":1733741827000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Reinforcement Learning for Intra- &amp; Inter-Node Recommender Data Pipeline Optimization"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0214-4812","authenticated-orcid":false,"given":"Kabir","family":"Nagrecha","sequence":"first","affiliation":[{"name":"Computer Science &amp; Engineering, University of California San Diego, La Jolla, United States and Machine Learning Platform, Netflix Inc, Los Gatos, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-1948-6407","authenticated-orcid":false,"given":"Lingyi","family":"Liu","sequence":"additional","affiliation":[{"name":"Machine Learning Platform, Netflix Inc, Los Gatos, United 
States"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-7783-9035","authenticated-orcid":false,"given":"Pablo","family":"Delgado","sequence":"additional","affiliation":[{"name":"Machine Learning Platform, Netflix Inc, Los Gatos, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,4,10]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"crossref","first-page":"802","DOI":"10.1109\/HPCA51647.2021.00072","volume-title":"2021 IEEE International Symposium on High-performance Computer Architecture (HPCA\u201921)","author":"Acun Bilge","year":"2021","unstructured":"Bilge Acun, Matthew Murphy, Xiaodong Wang, Jade Nie, Carole-Jean Wu, and Kim Hazelwood. 2021. Understanding training efficiency of deep learning recommendation models at scale. In 2021 IEEE International Symposium on High-performance Computer Architecture (HPCA\u201921). IEEE, 802\u2013814."},{"key":"e_1_3_1_3_2","article-title":"Heterogeneous acceleration pipeline for recommendation system training","author":"Adnan Muhammad","year":"2022","unstructured":"Muhammad Adnan, Yassaman Ebrahimzadeh Maboud, Divya Mahajan, and Prashant J. Nair. 2022. Heterogeneous acceleration pipeline for recommendation system training. arXiv preprint arXiv:2204.05436 (2022).","journal-title":"arXiv preprint arXiv:2204.05436"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3620678.3624666"},{"key":"e_1_3_1_5_2","unstructured":"Justin A. Boyan and Andrew W. Moore. 1994. Generalization in reinforcement learning: Safely approximating the value function. In Proceedings of the 7th International Conference on Neural Information Processing Systems (Denver Colorado) (NIPS\u201994). 
MIT Press Cambridge MA USA 369\u2013376."},{"key":"e_1_3_1_6_2","article-title":"Seed: Simple, efficient, and effective data management via large language models","author":"Chen Zui","year":"2023","unstructured":"Zui Chen, Lei Cao, Sam Madden, Ju Fan, Nan Tang, Zihui Gu, Zeyuan Shang, Chunwei Liu, Michael Cafarella, and Tim Kraska. 2023. Seed: Simple, efficient, and effective data management via large language models. arXiv preprint arXiv:2310.00749 (2023).","journal-title":"arXiv preprint arXiv:2310.00749"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/2988450.2988454"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/40.46766"},{"key":"e_1_3_1_9_2","first-page":"16344","article-title":"Flashattention: Fast and memory-efficient exact attention with io-awareness","volume":"35","author":"Dao Tri","year":"2022","unstructured":"Tri Dao, Dan Fu, Stefano Ermon, Atri Rudra, and Christopher R\u00e9. 2022. Flashattention: Fast and memory-efficient exact attention with io-awareness. Advances in Neural Information Processing Systems 35 (2022), 16344\u201316359.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3460231.3474255"},{"key":"e_1_3_1_11_2","article-title":"Random offset block embedding array (ROBE) for CriteoTB benchmark MLPerf DLRM model: 1000 times compression and 3.1 times faster inference","author":"Desai Aditya","year":"2021","unstructured":"Aditya Desai, Li Chou, and Anshumali Shrivastava. 2021. Random offset block embedding array (ROBE) for CriteoTB benchmark MLPerf DLRM model: 1000 times compression and 3.1 times faster inference. 
arXiv preprint arXiv:2108.02191 (2021).","journal-title":"arXiv preprint arXiv:2108.02191"},{"key":"e_1_3_1_12_2","first-page":"130","volume-title":"International Workshop on Explainable, Transparent Autonomous Agents and Multi-agent Systems","author":"Dey Sumanta","year":"2023","unstructured":"Sumanta Dey, Sharat Bhat, Pallab Dasgupta, and Soumyajit Dey. 2023. Imperative action masking for safe exploration in reinforcement learning. In International Workshop on Explainable, Transparent Autonomous Agents and Multi-agent Systems. Springer, 130\u2013142."},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10710-017-9314-z"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1109\/HPCA47549.2020.00047","volume-title":"2020 IEEE International Symposium on High Performance Computer Architecture (HPCA\u201920)","author":"Gupta Udit","year":"2020","unstructured":"Udit Gupta, Carole-Jean Wu, Xiaodong Wang, Maxim Naumov, Brandon Reagen, David Brooks, Bradford Cottel, Kim Hazelwood, Mark Hempstead, Bill Jia, et\u00a0al. 2020. The architectural implications of Facebook\u2019s DNN-based personalized recommendation. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA\u201920). IEEE, 488\u2013501."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","unstructured":"Ameer Haj-Ali Nesreen K. Ahmed Ted Willke Joseph Gonzalez Krste Asanovic and Ion Stoica. 2019. A View on Deep Reinforcement Learning in System Optimization. 10.48550\/ARXIV.1908.01275","DOI":"10.48550\/ARXIV.1908.01275"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","unstructured":"Aaron Harlap Deepak Narayanan Amar Phanishayee Vivek Seshadri Nikhil Devanur Greg Ganger and Phil Gibbons. 2018. PipeDream: Fast and Efficient Pipeline Parallel DNN Training. 
10.48550\/ARXIV.1806.03377","DOI":"10.48550\/ARXIV.1806.03377"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2017.1700246"},{"key":"e_1_3_1_18_2","article-title":"Gpipe: Efficient training of giant neural networks using pipeline parallelism","volume":"32","author":"Huang Yanping","year":"2019","unstructured":"Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Dehao Chen, Mia Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, et\u00a0al. 2019. Gpipe: Efficient training of giant neural networks using pipeline parallelism. Advances in Neural Information Processing Systems 32 (2019), 103\u2013112.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","unstructured":"Yanping Huang Youlong Cheng Ankur Bapna Orhan Firat Mia Xu Chen Dehao Chen HyoukJoong Lee Jiquan Ngiam Quoc V. Le Yonghui Wu and Zhifeng Chen. 2018. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. 10.48550\/ARXIV.1811.06965","DOI":"10.48550\/ARXIV.1811.06965"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517848"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","unstructured":"Zhihao Jia Matei Zaharia and Alex Aiken. 2018. Beyond Data and Model Parallelism for Deep Neural Networks. 10.48550\/ARXIV.1807.05358","DOI":"10.48550\/ARXIV.1807.05358"},{"key":"e_1_3_1_22_2","volume-title":"On the Sample Complexity of Reinforcement Learning","author":"Kakade Sham Machandranath","year":"2003","unstructured":"Sham Machandranath Kakade. 2003. On the Sample Complexity of Reinforcement Learning. Ph. D. Dissertation. University College London."},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","unstructured":"Anssi Kanervisto Christian Scheller and Ville Hautam\u00e4ki. 2020. Action Space Shaping in Deep Reinforcement Learning. 
10.48550\/ARXIV.2004.00980","DOI":"10.48550\/ARXIV.2004.00980"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","unstructured":"Sanjay Krishnan Zongheng Yang Ken Goldberg Joseph Hellerstein and Ion Stoica. 2018. Learning to Optimize Join Queries with Deep Reinforcement Learning. 10.48550\/ARXIV.1808.03196","DOI":"10.48550\/ARXIV.1808.03196"},{"key":"e_1_3_1_25_2","unstructured":"Kubernetes. 2023. Updating the Pod\u2019s resources. https:\/\/kubernetes.io\/docs\/tasks\/configure-pod-container\/resize-container-resources\/#updating-the-pod-s-resources"},{"key":"e_1_3_1_26_2","first-page":"33","article-title":"Plumber: Diagnosing and removing performance bottlenecks in machine learning data pipelines","volume":"4","author":"Kuchnik Michael","year":"2022","unstructured":"Michael Kuchnik, Ana Klimovic, Jiri Simsa, Virginia Smith, and George Amvrosiadis. 2022. Plumber: Diagnosing and removing performance bottlenecks in machine learning data pipelines. Proceedings of Machine Learning and Systems 4 (2022), 33\u201351.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_1_27_2","volume-title":"11th Annual Conference on Innovative Data Systems Research (CIDR\u201921)","author":"Kumar Arun","year":"2021","unstructured":"Arun Kumar, Supun Nakandala, Yuhao Zhang, Side Li, Advitya Gemawat, and Kabir Nagrecha. 2021. Cerebro: A layered data platform for scalable deep learning. In 11th Annual Conference on Innovative Data Systems Research (CIDR\u201921)."},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","unstructured":"Dmitry Lepikhin HyoukJoong Lee Yuanzhong Xu Dehao Chen Orhan Firat Yanping Huang Maxim Krikun Noam Shazeer and Zhifeng Chen. 2020. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. 
10.48550\/ARXIV.2006.16668","DOI":"10.48550\/ARXIV.2006.16668"},{"key":"e_1_3_1_29_2","article-title":"Massively parallel hyperparameter tuning","volume":"5","author":"Li Liam","year":"2018","unstructured":"Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. 2018. Massively parallel hyperparameter tuning. arXiv preprint arXiv:1810.05934 5 (2018).","journal-title":"arXiv preprint arXiv:1810.05934"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","unstructured":"Zhuohan Li Siyuan Zhuang Shiyuan Guo Danyang Zhuo Hao Zhang Dawn Song and Ion Stoica. 2021. TeraPipe: Token-Level Pipeline Parallelism for Training Large-scale Language Models. 10.48550\/ARXIV.2102.07988","DOI":"10.48550\/ARXIV.2102.07988"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342644"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3211954.3211957"},{"key":"e_1_3_1_33_2","first-page":"336","article-title":"Mlperf training benchmark","volume":"2","author":"Mattson Peter","year":"2020","unstructured":"Peter Mattson, Christine Cheng, Gregory Diamos, Cody Coleman, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, et\u00a0al. 2020. Mlperf training benchmark. Proceedings of Machine Learning and Systems 2 (2020), 336\u2013349.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","unstructured":"Jayashree Mohan Amar Phanishayee Ashish Raniwala and Vijay Chidambaram. 2020. Analyzing and Mitigating Data Stalls in DNN Training. 10.48550\/ARXIV.2007.06775","DOI":"10.48550\/ARXIV.2007.06775"},{"key":"e_1_3_1_35_2","unstructured":"Philipp Moritz Robert Nishihara Stephanie Wang Alexey Tumanov Richard Liaw Eric Liang Melih Elibol Zongheng Yang William Paul Michael I. Jordan and Ion Stoica. 2018. Ray: A Distributed Framework for Emerging AI Applications. 
arxiv:1712.05889 [cs.DC]"},{"key":"e_1_3_1_36_2","unstructured":"Wenlong Mou Zheng Wen and Xi Chen. 2020. On the Sample Complexity of Reinforcement Learning with Policy Space Generalization. arxiv:2008.07353 [cs.LG] https:\/\/arxiv.org\/abs\/2008.07353"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","unstructured":"Dheevatsa Mudigere Yuchen Hao Jianyu Huang Zhihao Jia Andrew Tulloch Srinivas Sridharan Xing Liu Mustafa Ozdal Jade Nie Jongsoo Park Liang Luo Jie Amy Yang Leon Gao Dmytro Ivchenko Aarti Basant Yuxi Hu Jiyan Yang Ehsan K. Ardestani Xiaodong Wang Rakesh Komuravelli Ching-Hsiang Chu Serhat Yilmaz Huayu Li Jiyuan Qian Zhuobo Feng Yinbin Ma Junjie Yang Ellie Wen Hong Li Lin Yang Chonglin Sun Whitney Zhao Dimitry Melts Krishna Dhulipala K. R. Kishore Tyler Graf Assaf Eisenman Kiran Kumar Matam Adi Gangidi Guoqiang Jerry Chen Manoj Krishnan Avinash Nayak Krishnakumar Nair Bharath Muthiah Mahmoud khorashadi Pallab Bhattacharya Petr Lapukhov Maxim Naumov Ajit Mathews Lin Qiao Mikhail Smelyanskiy Bill Jia and Vijay Rao. 2021. Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models. 10.48550\/ARXIV.2104.05158","DOI":"10.48550\/ARXIV.2104.05158"},{"key":"e_1_3_1_38_2","article-title":"tf. data: A machine learning data processing framework","author":"Murray Derek G.","year":"2021","unstructured":"Derek G. Murray, Jiri Simsa, Ana Klimovic, and Ihor Indyk. 2021. tf. data: A machine learning data processing framework. arXiv preprint arXiv:2101.12127 (2021).","journal-title":"arXiv preprint arXiv:2101.12127"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3450571"},{"key":"e_1_3_1_40_2","unstructured":"Kabir Nagrecha. 2023. Systems for Parallel and Distributed Large-Model Deep Learning Training. arXiv:2301.02691 [cs.DC] https:\/\/arxiv.org\/abs\/2301.02691"},{"key":"e_1_3_1_41_2","unstructured":"Kabir Nagrecha and Arun Kumar. 2022. Hydra: A System for Large Multi-model Deep Learning. 
arxiv:2110.08633 [cs.DC]"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","unstructured":"Kabir Nagrecha and Arun Kumar. 2024. Saturn: An optimized data system for multi-large-model deep learning workloads. Proc. VLDB Endow. 17 4 (March 2024) 712\u2013725. 10.14778\/3636218.3636227","DOI":"10.14778\/3636218.3636227"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3604915.3608778"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517846"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397461"},{"key":"e_1_3_1_46_2","first-page":"481","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201920)","author":"Narayanan Deepak","year":"2020","unstructured":"Deepak Narayanan, Keshav Santhanam, Fiodar Kazhamiaka, Amar Phanishayee, and Matei Zaharia. 2020. Heterogeneity-aware cluster scheduling policies for deep learning workloads. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201920). 481\u2013498."},{"key":"e_1_3_1_47_2","unstructured":"Maxim Naumov Dheevatsa Mudigere Hao-Jun Michael Shi Jianyu Huang Narayanan Sundaraman Jongsoo Park Xiaodong Wang Udit Gupta Carole-Jean Wu Alisson G. Azzolini Dmytro Dzhulgakov Andrey Mallevich Ilia Cherniavskii Yinghai Lu Raghuraman Krishnamoorthi Ansha Yu Volodymyr Kondratenko Stephanie Pereira Xianjie Chen Wenlin Chen Vijay Rao Bill Jia Liang Xiong and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. arXiv:1906.00091 [cs.IR] https:\/\/arxiv.org\/abs\/1906.00091"},{"key":"e_1_3_1_48_2","unstructured":"Kunle Olukotun. 2022. Systems for ML and ML for Systems: A Virtuous Cycle. 
https:\/\/mlsys.org\/virtual\/2022\/invited-talk\/2065MLSys"},{"key":"e_1_3_1_49_2","first-page":"126","article-title":"Virtualflow: Decoupling deep learning models from the underlying hardware","volume":"4","author":"Or Andrew","year":"2022","unstructured":"Andrew Or, Haoyu Zhang, and Michael Freedman. 2022. Virtualflow: Decoupling deep learning models from the underlying hardware. Proceedings of Machine Learning and Systems 4 (2022), 126\u2013140.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","unstructured":"Jennifer Ortiz Magdalena Balazinska Johannes Gehrke and S. Sathiya Keerthi. 2018. Learning State Representations for Query Optimization with Deep Reinforcement Learning. 10.48550\/ARXIV.1803.08604","DOI":"10.48550\/ARXIV.1803.08604"},{"key":"e_1_3_1_51_2","unstructured":"Guilherme Penedo Quentin Malartic Daniel Hesslow Ruxandra Cojocaru Alessandro Cappelli Hamza Alobeidli Baptiste Pannier Ebtesam Almazrouei and Julien Launay. 2023. The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data and Web Data Only. arxiv:2306.01116 [cs.CL]"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3190508.3190517"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2021.3052895"},{"key":"e_1_3_1_54_2","first-page":"1","volume-title":"OSDI","author":"Qiao Aurick","year":"2021","unstructured":"Aurick Qiao, Sang Keun Choe, Suhas Jayaram Subramanya, Willie Neiswanger, Qirong Ho, Hao Zhang, Gregory R. Ganger, and Eric P. Xing. 2021. Pollux: Co-adaptive cluster scheduling for goodput-optimized deep learning. In OSDI, Vol. 21. 1\u201318."},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","unstructured":"Jie Ren Samyam Rajbhandari Reza Yazdani Aminabadi Olatunji Ruwase Shuangyan Yang Minjia Zhang Dong Li and Yuxiong He. 2021. ZeRO-Offload: Democratizing Billion-Scale Model Training. 
10.48550\/ARXIV.2101.06840","DOI":"10.48550\/ARXIV.2101.06840"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507777"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","unstructured":"Mohammad Shoeybi Mostofa Patwary Raul Puri Patrick LeGresley Jared Casper and Bryan Catanzaro. 2019. Megatron-LM: Training Multi-billion Parameter Language Models Using Model Parallelism. 10.48550\/ARXIV.1909.08053","DOI":"10.48550\/ARXIV.1909.08053"},{"key":"e_1_3_1_58_2","unstructured":"David Silver Thomas Hubert Julian Schrittwieser Ioannis Antonoglou Matthew Lai Arthur Guez Marc Lanctot Laurent Sifre Dharshan Kumaran Thore Graepel Timothy Lillicrap Karen Simonyan and Demis Hassabis. 2017. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815 [cs.AI]"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","unstructured":"G. Tesauro N. K. Jong R. Das and M. N. Bennani. 2006. A hybrid reinforcement learning approach to autonomic resource allocation. In 2006 IEEE International Conference on Autonomic Computing. 65\u201373. 10.1109\/ICAC.2006.1662383","DOI":"10.1109\/ICAC.2006.1662383"},{"key":"e_1_3_1_60_2","volume-title":"Proceedings of the 4th Connectionist Models Summer School","author":"Thrun Sebastian","year":"1993","unstructured":"Sebastian Thrun and Anton Schwartz. 1993. Issues in using function approximation for reinforcement learning. In Proceedings of the 4th Connectionist Models Summer School, Vol. 255. Hillsdale, NJ, 263."},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3064029"},{"key":"e_1_3_1_62_2","first-page":"664","article-title":"sensai: Convnets decomposition via class parallelism for fast inference on live data","volume":"3","author":"Wang Guanhua","year":"2021","unstructured":"Guanhua Wang, Zhuang Liu, Brandon Hsieh, Siyuan Zhuang, Joseph Gonzalez, Trevor Darrell, and Ion Stoica. 2021. 
sensai: Convnets decomposition via class parallelism for fast inference on live data. Proceedings of Machine Learning and Systems 3 (2021), 664\u2013679.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_1_63_2","first-page":"1387","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Pei","year":"2021","unstructured":"Pei Wang, Kabir Nagrecha, and Nuno Vasconcelos. 2021. Gradient-based algorithms for machine teaching. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1387\u20131396."},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3523227.3547405"},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3523227.3546765"},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","unstructured":"Carole-Jean Wu Ramya Raghavendra Udit Gupta Bilge Acun Newsha Ardalani Kiwan Maeng Gloria Chang Fiona Aga Behram James Huang Charles Bai Michael Gschwind Anurag Gupta Myle Ott Anastasia Melnikov Salvatore Candido David Brooks Geeta Chauhan Benjamin Lee Hsien-Hsin S. Lee Bugra Akyildiz Maximilian Balandat Joe Spisak Ravi Jain Mike Rabbat and Kim Hazelwood. 2021. Sustainable AI: Environmental Implications Challenges and Opportunities. 10.48550\/ARXIV.2111.00364","DOI":"10.48550\/ARXIV.2111.00364"},{"key":"e_1_3_1_67_2","first-page":"595","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201918)","author":"Xiao Wencong","year":"2018","unstructured":"Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, et\u00a0al. 2018. Gandiva: Introspective cluster scheduling for deep learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201918). 
595\u2013610."},{"key":"e_1_3_1_68_2","first-page":"533","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201920)","author":"Xiao Wencong","year":"2020","unstructured":"Wencong Xiao, Shiru Ren, Yong Li, Yang Zhang, Pengyang Hou, Zhi Li, Yihui Feng, Wei Lin, and Yangqing Jia. 2020. AntMan: Dynamic scaling on GPU clusters for deep learning. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201920). 533\u2013548."},{"key":"e_1_3_1_69_2","doi-asserted-by":"publisher","unstructured":"Mark Zhao Niket Agarwal Aarti Basant Bu\u01e7ra Gedik Satadru Pan Mustafa Ozdal Rakesh Komuravelli Jerry Pan Tianshu Bao Haowei Lu Sundaram Narayanan Jack Langman Kevin Wilfong Harsha Rastogi Carole-Jean Wu Christos Kozyrakis and Parik Pol. 2022. Understanding data storage and ingestion for large-scale deep recommendation model training. In Proceedings of the 49th Annual International Symposium on Computer Architecture. ACM. 10.1145\/3470496.3533044","DOI":"10.1145\/3470496.3533044"},{"key":"e_1_3_1_70_2","doi-asserted-by":"publisher","unstructured":"Victor Zhong Caiming Xiong and Richard Socher. 2017. Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. 
10.48550\/ARXIV.1709.00103","DOI":"10.48550\/ARXIV.1709.00103"}],"container-title":["ACM Transactions on Recommender Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3704923","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3704923","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:01Z","timestamp":1750295881000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3704923"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,10]]},"references-count":69,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,12,31]]}},"alternative-id":["10.1145\/3704923"],"URL":"https:\/\/doi.org\/10.1145\/3704923","relation":{},"ISSN":["2770-6699"],"issn-type":[{"type":"electronic","value":"2770-6699"}],"subject":[],"published":{"date-parts":[[2025,4,10]]},"assertion":[{"value":"2024-02-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-09","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-10","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}