{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T01:24:28Z","timestamp":1776993868770,"version":"3.51.4"},"reference-count":73,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2022,5,26]],"date-time":"2022-05-26T00:00:00Z","timestamp":1653523200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["NRF-2021R1A6A1A13044830"],"award-info":[{"award-number":["NRF-2021R1A6A1A13044830"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Institute of Information & communications Technology Planning & Evaluation","award":["2015-0-00280"],"award-info":[{"award-number":["2015-0-00280"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Meas. Anal. Comput. Syst."],"published-print":{"date-parts":[[2022,5,26]]},"abstract":"<jats:p>The prediction of the resource consumption for the distributed training of deep learning models is of paramount importance, as it can inform a priori users how long their training would take and also enable users to manage the cost of training. Yet, no such prediction is available for users because the resource consumption itself varies significantly according to \"settings\" such as GPU types and also by \"workloads\" like deep learning models. Previous studies have aimed to derive or model such a prediction, but they fall short of accommodating the various combinations of settings and workloads together. This study presents Driple that designs graph neural networks to predict the resource consumption of diverse workloads. 
Driple also designs transfer learning to extend the graph neural networks to adapt to differences in settings. The evaluation results show that Driple can effectively predict a wide range of workloads and settings. At the same time, Driple can efficiently reduce the time required to tailor the prediction for different settings by up to 7.3\u00d7.<\/jats:p>","DOI":"10.1145\/3530895","type":"journal-article","created":{"date-parts":[[2022,6,6]],"date-time":"2022-06-06T17:16:18Z","timestamp":1654535778000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Prediction of the Resource Consumption of Distributed Deep Learning Systems"],"prefix":"10.1145","volume":"6","author":[{"given":"Gyeongsik","family":"Yang","sequence":"first","affiliation":[{"name":"Korea University, Seoul, South Korea"}]},{"given":"Changyong","family":"Shin","sequence":"additional","affiliation":[{"name":"Korea University, Seoul, South Korea"}]},{"given":"Jeunghwan","family":"Lee","sequence":"additional","affiliation":[{"name":"Korea University, Seoul, South Korea"}]},{"given":"Yeonho","family":"Yoo","sequence":"additional","affiliation":[{"name":"Korea University, Seoul, South Korea"}]},{"given":"Chuck","family":"Yoo","sequence":"additional","affiliation":[{"name":"Korea University, Seoul, South Korea"}]}],"member":"320","published-online":{"date-parts":[[2022,6,6]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2020. NVIDIA Titan RTX is here. https:\/\/www.nvidia.com\/en-us\/deep-learning-ai\/products\/titan-rtx\/ Accessed: 2022-01-02."},{"key":"e_1_2_1_2_1","unstructured":"2020. NVIDIA V100 | NVIDIA. https:\/\/www.nvidia.com\/en-us\/data-center\/v100\/. Accessed: 2021-12-09."},{"key":"e_1_2_1_3_1","unstructured":"2021. Benchmarks\/scripts\/tf_cnn_benchmarks \u00b7 TENSORFLOW\/benchmarks. https:\/\/github.com\/tensorflow\/benchmarks\/tree\/master\/scripts\/tf_cnn_benchmarks Accessed: 2021-09-25."},{"key":"e_1_2_1_4_1","unstructured":"2021. Graphics reinvented: NVIDIA GeForce RTX 2080 Ti graphics card. https:\/\/www.nvidia.com\/en-us\/geforce\/graphics-cards\/rtx-2080-ti\/ Accessed: 2021-12-07."},{"key":"e_1_2_1_5_1","unstructured":"2021. NVIDIA A100 GPUs. https:\/\/www.nvidia.com\/en-us\/data-center\/a100\/ Accessed: 2021-12-21."},{"key":"e_1_2_1_6_1","unstructured":"2021. NVIDIA Collective Communications Library (NCCL). https:\/\/developer.nvidia.com\/nccl Accessed: 2022-01-03."},{"key":"e_1_2_1_7_1","unstructured":"2021. NVML API Reference Guide :: GPU Deployment and Management Documentation. https:\/\/docs.nvidia.com\/deploy\/nvml-api\/structnvmlUtilization__t.html#structnvmlUtilization__t Accessed: 2021-09-24."},{"key":"e_1_2_1_8_1","unstructured":"2021. TCPDUMP & LIBPCAP. https:\/\/www.tcpdump.org\/. Accessed: 2021-09-24."},{"key":"e_1_2_1_9_1","unstructured":"2022. Driple. https:\/\/github.com\/gsyang33\/Driple. Accessed: 2022-04-07."},{"key":"e_1_2_1_10_1","unstructured":"2022. A library of sklearn compatible categorical variable encoders. https:\/\/github.com\/scikit-learn-contrib\/category_encoders Accessed: 2022-01-13."},{"key":"e_1_2_1_11_1","unstructured":"2022. ONNX. https:\/\/onnx.ai\/. Accessed: 2022-04-06."},{"key":"e_1_2_1_12_1","unstructured":"2022. tf.io.write_graph | TensorFlow Core. https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/io\/write_graph. Accessed: 2022-04-06."},{"key":"e_1_2_1_13_1","volume-title":"12th USENIX symposium on operating systems design and implementation (OSDI 16)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"
Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A system for large-scale machine learning. In 12th USENIX symposium on operating systems design and implementation (OSDI 16). 265--283."},{"key":"e_1_2_1_14_1","volume-title":"arXiv preprint arXiv:1906.08879","author":"Addanki Ravichandra","year":"2019","unstructured":"Ravichandra Addanki, Shaileshh Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, and Mohammad Alizadeh. 2019. Placeto: Learning generalizable device placement algorithms for distributed machine learning. arXiv preprint arXiv:1906.08879 (2019)."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330693"},{"key":"e_1_2_1_16_1","volume-title":"Impact of dataset size and variety on the effectiveness of deep learning and transfer learning for plant disease classification. Computers and electronics in agriculture 153","author":"Arnal Barbedo Jayme Garcia","year":"2018","unstructured":"Jayme Garcia Arnal Barbedo. 2018. Impact of dataset size and variety on the effectiveness of deep learning and transfer learning for plant disease classification. Computers and electronics in agriculture 153 (2018), 46--53."},{"key":"e_1_2_1_17_1","unstructured":"Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, et al. 2018. 
Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261 (2018)."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3320060"},{"key":"e_1_2_1_19_1","unstructured":"Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)."},{"key":"e_1_2_1_20_1","volume-title":"MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274 (2015)."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3419111.3421307"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"e_1_2_1_23_1","first-page":"13260","article-title":"
Principal Neighbourhood Aggregation for Graph Nets","volume":"33","author":"Corso Gabriele","year":"2020","unstructured":"Gabriele Corso, Luca Cavalleri, Dominique Beaini, Pietro Li\u00f2, et al. 2020. Principal Neighbourhood Aggregation for Graph Nets. In Advances in Neural Information Processing Systems, Vol. 33. 13260--13271.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_2_1_25_1","volume-title":"BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/2832415.2832572"},{"key":"e_1_2_1_27_1","volume-title":"Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21)","author":"Geoffrey X Yu","year":"2021","unstructured":"Geoffrey X Yu, Yubo Gao, Pavel Golikov, and Gennady Pekhimenko. 2021. 
Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21). 503--521."},{"key":"e_1_2_1_28_1","volume-title":"International conference on machine learning. PMLR, 1263--1272","author":"Gilmer Justin","year":"2017","unstructured":"Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. 2017. Neural message passing for quantum chemistry. In International conference on machine learning. PMLR, 1263--1272."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-2070(99)00007-2"},{"key":"e_1_2_1_30_1","volume-title":"Advances in Neural Information Processing Systems","volume":"29","author":"Harwath David","year":"2016","unstructured":"David Harwath, Antonio Torralba, and James Glass. 2016. Unsupervised Learning of Spoken Language with Visual Context. In Advances in Neural Information Processing Systems, Vol. 29. Curran Associates, Inc."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_32_1","volume-title":"Long short-term memory. 
Neural computation 9, 8","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735--1780."},{"key":"e_1_2_1_33_1","volume-title":"Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146","author":"Howard Jeremy","year":"2018","unstructured":"Jeremy Howard and Sebastian Ruder. 2018. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146 (2018)."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Huang Gao","unstructured":"Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. 2017. Densely Connected Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2018.8622396"},{"key":"e_1_2_1_37_1","volume-title":"TensorExpress: In-Network Communication Scheduling for Distributed Deep Learning. In 2020 IEEE 13th International Conference on Cloud Computing. 25--27","author":"Kang Minkoo","year":"2020","unstructured":"Minkoo Kang, Gyeongsik Yang, Yeonho Yoo, and Chuck Yoo. 2020. TensorExpress: In-Network Communication Scheduling for Distributed Deep Learning. 
In 2020 IEEE 13th International Conference on Cloud Computing. 25--27."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.3390\/s21010174"},{"key":"e_1_2_1_39_1","volume-title":"Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907","author":"Kipf Thomas N","year":"2016","unstructured":"Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-4012"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of Machine Translation Summit X: Papers","author":"Koehn Philipp","year":"2005","unstructured":"Philipp Koehn. 2005. Europarl: A Parallel Corpus for Statistical Machine Translation. In Proceedings of Machine Translation Summit X: Papers. Phuket, Thailand, 79--86."},{"key":"e_1_2_1_43_1","volume-title":"ImageNet classification with deep convolutional neural networks. Advances in neural information processing systems 25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. ImageNet classification with deep convolutional neural networks. 
Advances in neural information processing systems 25 (2012), 1097--1105."},{"key":"e_1_2_1_44_1","volume-title":"ATP: In-network Aggregation for Multi-tenant Learning. In 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21)","author":"Lao ChonLam","year":"2021","unstructured":"ChonLam Lao, Yanfang Le, Kshiteej Mahajan, Yixi Chen, Wenfei Wu, Aditya Akella, and Michael Swift. 2021. ATP: In-network Aggregation for Multi-tenant Learning. In 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21). USENIX Association, 741--761."},{"key":"e_1_2_1_45_1","volume-title":"Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493","author":"Li Yujia","year":"2015","unstructured":"Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. 2015. Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493 (2015)."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData50022.2020.9378252"},{"key":"e_1_2_1_47_1","volume-title":"17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20)","author":"Mahajan Kshiteej","year":"2020","unstructured":"
Kshiteej Mahajan, Arjun Balasubramanian, Arjun Singhvi, Shivaram Venkataraman, Aditya Akella, Amar Phanishayee, and Shuchi Chawla. 2020. Themis: Fair and efficient GPU cluster scheduling. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20). 289--304."},{"key":"e_1_2_1_48_1","volume-title":"Advances in Neural Information Processing Systems","volume":"34","author":"Moosbauer Julia","year":"2021","unstructured":"Julia Moosbauer, Julia Herbinger, Giuseppe Casalicchio, Marius Lindauer, and Bernd Bischl. 2021. Explaining Hyperparameter Optimization via Partial Dependence Plots. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (Eds.), Vol. 34. Curran Associates, Inc., 2280--2291."},{"key":"e_1_2_1_49_1","volume-title":"REGAL: Transfer learning for fast optimization of computation graphs. arXiv preprint arXiv:1905.02494","author":"Paliwal Aditya","year":"2019","unstructured":"Aditya Paliwal, Felix Gimeno, Vinod Nair, Yujia Li, Miles Lubin, Pushmeet Kohli, and Oriol Vinyals. 2019. REGAL: Transfer learning for fast optimization of computation graphs. 
arXiv preprint arXiv:1905.02494 (2019)."},{"key":"e_1_2_1_50_1","volume-title":"Complex Networks & Their Applications VI","author":"Par\u00e9s Ferran","year":"2018","unstructured":"Ferran Par\u00e9s, Dario Garcia Gasulla, Armand Vilalta, Jonatan Moreno, Eduard Ayguad\u00e9, Jes\u00fas Labarta, Ulises Cort\u00e9s, and Toyotaro Suzumura. 2018. Fluid Communities: A Competitive, Scalable and Diverse Community Detection Algorithm. In Complex Networks & Their Applications VI, Chantal Cherifi, Hocine Cherifi, M\u00e1rton Karsai, and Mirco Musolesi (Eds.). Springer International Publishing, Cham, 229--240."},{"key":"e_1_2_1_51_1","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An imperative style high-performance deep learning library. 
Advances in neural information processing systems 32 (2019), 8026--8037."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3190508.3190517"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359642"},{"key":"e_1_2_1_54_1","volume-title":"Paleo: A performance model for deep neural networks.","author":"Qi Hang","year":"2017","unstructured":"Hang Qi, Evan R Sparks, and Ameet Talwalkar. 2017. Paleo: A performance model for deep neural networks. (2017)."},{"key":"e_1_2_1_55_1","volume-title":"Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning. In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21)","author":"Qiao Aurick","year":"2021","unstructured":"Aurick Qiao, Sang Keun Choe, Suhas Jayaram Subramanya, Willie Neiswanger, Qirong Ho, Hao Zhang, Gregory R. Ganger, and Eric P. Xing. 2021. Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning. In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). 
USENIX Association, 1--18."},{"key":"e_1_2_1_56_1","volume-title":"NIPS 2005 workshop on transfer learning","volume":"898","author":"Rosenstein Michael T","year":"2005","unstructured":"Michael T Rosenstein, Zvika Marx, Leslie Pack Kaelbling, and Thomas G Dietterich. 2005. To transfer or not to transfer. In NIPS 2005 workshop on transfer learning, Vol. 898. 1--4."},{"key":"e_1_2_1_57_1","volume-title":"An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747","author":"Ruder Sebastian","year":"2016","unstructured":"Sebastian Ruder. 2016. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747 (2016)."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_2_1_59_1","volume-title":"Theano: A Python framework for fast computation of mathematical expressions. arXiv preprint arXiv:1605.02688","author":"Development Team The Theano","year":"2016","unstructured":"The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Fr\u00e9d\u00e9ric Bastien, Justin Bayer, Anatoly Belikov, et al. 2016. Theano: A Python framework for fast computation of mathematical expressions. 
arXiv preprint arXiv:1605.02688 (2016)."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTI52880.2021.00018"},{"key":"e_1_2_1_61_1","volume-title":"Graph Attention Networks. In International Conference on Learning Representations.","author":"Cucurull Guillem","year":"2018","unstructured":"Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Li\u00f2, and Yoshua Bengio. 2018. Graph Attention Networks. In International Conference on Learning Representations."},{"key":"e_1_2_1_62_1","volume-title":"Order matters: Sequence to sequence for sets. arXiv preprint arXiv:1511.06391","author":"Vinyals Oriol","year":"2015","unstructured":"Oriol Vinyals, Samy Bengio, and Manjunath Kudlur. 2015. Order matters: Sequence to sequence for sets. arXiv preprint arXiv:1511.06391 (2015)."},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research","volume":"139","author":"Wang Xin","year":"2021","unstructured":"Xin Wang, Shuyi Fan, Kun Kuang, and Wenwu Zhu. 2021. Explainable Automated Graph Representation Learning with Hyperparameter Importance. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). 
PMLR, 10727--10737."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2978386"},{"key":"e_1_2_1_65_1","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Xiao Wencong","year":"2018","unstructured":"Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, et al. 2018. Gandiva: Introspective cluster scheduling for deep learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 595--610."},{"key":"e_1_2_1_66_1","volume-title":"AntMan: Dynamic Scaling on GPU Clusters for Deep Learning. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)","author":"Xiao Wencong","year":"2020","unstructured":"Wencong Xiao, Shiru Ren, Yong Li, Yang Zhang, Pengyang Hou, Zhi Li, Yihui Feng, Wei Lin, and Yangqing Jia. 2020. AntMan: Dynamic Scaling on GPU Clusters for Deep Learning. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). 533--548."},{"key":"e_1_2_1_67_1","volume-title":"International Conference on Learning Representations.","author":"Xu Keyulu","year":"2019","unstructured":"Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How Powerful are Graph Neural Networks? 
In International Conference on Learning Representations."},{"key":"e_1_2_1_68_1","volume-title":"Transfer learning for sequence tagging with hierarchical recurrent networks. arXiv preprint arXiv:1703.06345","author":"Yang Zhilin","year":"2017","unstructured":"Zhilin Yang, Ruslan Salakhutdinov, and William W Cohen. 2017. Transfer learning for sequence tagging with hierarchical recurrent networks. arXiv preprint arXiv:1703.06345 (2017)."},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3386367.3432728"},{"key":"e_1_2_1_70_1","volume-title":"International conference on machine learning. PMLR, 5085--5094","author":"Ying Wei","year":"2018","unstructured":"Ying Wei, Yu Zhang, Junzhou Huang, and Qiang Yang. 2018. Transfer learning via learning to transfer. In International conference on machine learning. PMLR, 5085--5094."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.aiopen.2021.01.001"},{"key":"e_1_2_1_72_1","volume-title":"GDP: Generalized device placement for dataflow graphs. arXiv preprint arXiv:1910.01578","author":"Zhou Yanqi","year":"2019","unstructured":"Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter C Ma, Qiumin Xu, Ming Zhong, Hanxiao Liu, Anna Goldie, Azalia Mirhoseini, et al. 2019. GDP: Generalized device placement for dataflow graphs. arXiv preprint arXiv:1910.01578 (2019).
"},{"key":"e_1_2_1_73_1","volume-title":"Daydream: Accurately Estimating the Efficacy of Optimizations for DNN Training. In 2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Zhu Hongyu","year":"2020","unstructured":"Hongyu Zhu, Amar Phanishayee, and Gennady Pekhimenko. 2020. Daydream: Accurately Estimating the Efficacy of Optimizations for DNN Training. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). 337--352."},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2020.3004555"}],"container-title":["Proceedings of the ACM on Measurement and Analysis of Computing
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3530895","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3530895","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:26Z","timestamp":1750183766000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3530895"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,26]]},"references-count":73,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,5,26]]}},"alternative-id":["10.1145\/3530895"],"URL":"https:\/\/doi.org\/10.1145\/3530895","relation":{},"ISSN":["2476-1249"],"issn-type":[{"value":"2476-1249","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,26]]},"assertion":[{"value":"2022-06-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}