{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T19:50:35Z","timestamp":1770493835441,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":55,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T00:00:00Z","timestamp":1602460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["1909577"],"award-info":[{"award-number":["1909577"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,12]]},"DOI":"10.1145\/3419111.3421302","type":"proceedings-article","created":{"date-parts":[[2020,10,13]],"date-time":"2020-10-13T04:40:25Z","timestamp":1602564025000},"page":"416-430","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["Baechi"],"prefix":"10.1145","author":[{"given":"Beomyeol","family":"Jeon","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Linda","family":"Cai","sequence":"additional","affiliation":[{"name":"Princeton University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pallavi","family":"Srivastava","sequence":"additional","affiliation":[{"name":"Microsoft"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jintao","family":"Jiang","sequence":"additional","affiliation":[{"name":"University of California"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaolan","family":"Ke","sequence":"additional","affiliation":[{"name":"University of Illinois at 
Urbana-Champaign"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yitao","family":"Meng","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cong","family":"Xie","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Indranil","family":"Gupta","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,10,12]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16). 
USENIX Association, 265--283."},{"key":"e_1_3_2_2_2_1","volume-title":"Shreyan Gupta, Hongzi Mao, and Mohammad Alizadeh.","author":"Addanki Ravichandra","year":"2019","unstructured":"Ravichandra Addanki, Shaileshh Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, and Mohammad Alizadeh. 2019. Learning Generalizable Device Placement Algorithms for Distributed Machine Learning. In Advances in Neural Information Processing Systems 32 (NeurIPS '19). Curran Associates, Inc., 3981--3991."},{"key":"e_1_3_2_2_3_1","volume-title":"Allen and John Cocke","author":"E.","year":"1972","unstructured":"Frances E. Allen and John Cocke. 1972. A Catalogue of Optimizing Transformations. Design and Optimization of Compilers (1972), 1--30."},{"key":"e_1_3_2_2_4_1","unstructured":"Amazon. 2020. Amazon Web Services (AWS). https:\/\/aws.amazon.com"},{"key":"e_1_3_2_2_5_1","volume-title":"Andersen","author":"Andersen Erling D.","year":"2000","unstructured":"Erling D. Andersen and Knud D. Andersen. 2000. The Mosek Interior Point Optimizer for Linear Programming: An Implementation of the Homogeneous Algorithm. In High Performance Optimization. 
Springer, 197--232."},{"key":"e_1_3_2_2_6_1","volume-title":"3rd International Conference on Learning Representations (ICLR '15)","author":"Bahdanau Dzmitry","year":"2015","unstructured":"Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In 3rd International Conference on Learning Representations (ICLR '15)."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/362946.362974"},{"key":"e_1_3_2_2_8_1","volume-title":"Nested Data-parallelism on the GPU. In 17th ACM SIGPLAN International Conference on Functional Programming (ICFP '12)","author":"Bergstrom Lars","year":"2012","unstructured":"Lars Bergstrom and John Reppy. 2012. Nested Data-parallelism on the GPU. In 17th ACM SIGPLAN International Conference on Functional Programming (ICFP '12). ACM, 247--258."},{"key":"e_1_3_2_2_9_1","volume-title":"Scale: System Design. arXiv preprint arXiv:1902.01046","author":"Bonawitz Keith","year":"2019","unstructured":"Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konecny, Stefano Mazzocchi, H Brendan McMahan, et al. 2019. Towards Federated Learning at Scale: System Design. 
arXiv preprint arXiv:1902.01046 (2019)."},{"key":"e_1_3_2_2_10_1","volume-title":"Practical Secure Aggregation for Privacy-Preserving Machine Learning. In 24th ACM SIGSAC Conference on Computer and Communications Security (CCS '17)","author":"Bonawitz Keith","year":"2017","unstructured":"Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. 2017. Practical Secure Aggregation for Privacy-Preserving Machine Learning. In 24th ACM SIGSAC Conference on Computer and Communications Security (CCS '17). ACM, 1175--1191."},{"key":"e_1_3_2_2_11_1","volume-title":"MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arXiv Preprint arXiv:1512.01274","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arXiv Preprint arXiv:1512.01274 (2015)."},{"key":"e_1_3_2_2_12_1","volume-title":"Training Deep Nets With Sublinear Memory Cost. arXiv preprint arXiv:1604.06174","author":"Chen Tianqi","year":"2016","unstructured":"Tianqi Chen, Bing Xu, Chiyuan Zhang, and Carlos Guestrin. 2016. Training Deep Nets With Sublinear Memory Cost. 
arXiv preprint arXiv:1604.06174 (2016)."},{"key":"e_1_3_2_2_13_1","first-page":"12","volume-title":"Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads. Proceedings of the VLDB Endowment 5","author":"Chen Yanpei","year":"2012","unstructured":"Yanpei Chen, Sara Alspaugh, and Randy Katz. 2012. Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads. Proceedings of the VLDB Endowment 5, 12 (2012)."},{"key":"e_1_3_2_2_14_1","volume-title":"Ng","author":"Dean Jeffrey","year":"2012","unstructured":"Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng. 2012. Large Scale Distributed Deep Networks. In Advances in Neural Information Processing Systems 25 (NIPS '12). Curran Associates Inc., 1223--1231."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2779052"},{"key":"e_1_3_2_2_16_1","unstructured":"Google. 2020. Google Cloud Platform. 
https:\/\/cloud.google.com\/gcp\/"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ETFA.1995.496773"},{"key":"e_1_3_2_2_18_1","unstructured":"Geoffrey Hinton, Nitish Srivastava, and Kevin Swersky. 2012. Neural Networks for Machine Learning Lecture 6a Overview of Mini-Batch Gradient Descent. https:\/\/www.cs.toronto.edu\/~tijmen\/csc321\/slides\/lecture_slides_lec6.pdf"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/0167-6377(94)90024-8"},{"key":"e_1_3_2_2_20_1","volume-title":"3rd International Symposium on Parallel Architectures, Algorithms and Programming (PAAP '10)","author":"Hu Jinhua","year":"2010","unstructured":"Jinhua Hu, Jianhua Gu, Guofei Sun, and Tianhai Zhao. 2010. A Scheduling Strategy on Load Balancing of Virtual Machine Resources in Cloud Computing Environment. In 3rd International Symposium on Parallel Architectures, Algorithms and Programming (PAAP '10). IEEE, 89--96."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1137\/0218016"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_3_2_2_23_1","volume-title":"Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks. In 35th International Conference on Machine Learning (ICML '18)","author":"Jia Zhihao","year":"2018","unstructured":"Zhihao Jia, Sina Lin, Charles R. Qi, and Alex Aiken. 2018. Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks. 
In 35th International Conference on Machine Learning (ICML '18). PMLR, 2274--2283."},{"key":"e_1_3_2_2_24_1","volume-title":"Beyond Data and Model Parallelism for Deep Neural Networks. In 2nd Conference on Machine Learning and Systems (MLSys '19)","author":"Jia Zhihao","year":"2019","unstructured":"Zhihao Jia, Matei Zaharia, and Alex Aiken. 2019. Beyond Data and Model Parallelism for Deep Neural Networks. In 2nd Conference on Machine Learning and Systems (MLSys '19). 1--13."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/368996.369025"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/800057.808695"},{"key":"e_1_3_2_2_27_1","volume-title":"STRADS: A Distributed Framework for Scheduled Model Parallel Machine Learning. In 11th European Conference on Computer Systems (EuroSys '16)","author":"Kim Jin Kyu","unstructured":"Jin Kyu Kim, Qirong Ho, Seunghak Lee, Xun Zheng, Wei Dai, Garth A. Gibson, and Eric P. Xing. 2016. STRADS: A Distributed Framework for Scheduled Model Parallel Machine Learning. In 11th European Conference on Computer Systems (EuroSys '16). ACM, Article 5, 16 pages."},{"key":"e_1_3_2_2_28_1","volume-title":"MLbase: A Distributed Machine-Learning System. 
In 6th Biennial Conference on Innovative Data Systems Research (CIDR '13)","author":"Kraska Tim","year":"2013","unstructured":"Tim Kraska, Ameet Talwalkar, and John Duchi. 2013. MLbase: A Distributed Machine-Learning System. In 6th Biennial Conference on Innovative Data Systems Research (CIDR '13)."},{"key":"e_1_3_2_2_29_1","volume-title":"One Weird Trick for Parallelizing Convolutional Neural Networks. arXiv preprint arXiv:1404.5997","author":"Krizhevsky Alex","year":"2014","unstructured":"Alex Krizhevsky. 2014. One Weird Trick for Parallelizing Convolutional Neural Networks. arXiv preprint arXiv:1404.5997 (2014)."},{"key":"e_1_3_2_2_30_1","volume-title":"Building High-Level Features Using Large Scale Unsupervised Learning. In 38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '13)","author":"Le Quoc V","year":"2013","unstructured":"Quoc V Le. 2013. Building High-Level Features Using Large Scale Unsupervised Learning. In 38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '13). IEEE, 8595--8598."},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2741948.2741965"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dcan.2017.10.002"},{"key":"e_1_3_2_2_33_1","unstructured":"Microsoft. 2020. Microsoft Azure. 
https:\/\/azure.microsoft.com\/"},{"key":"e_1_3_2_2_34_1","volume-title":"Hierarchical Planning for Device Placement. In 6th International Conference on Learning Representations (ICLR '18)","author":"Mirhoseini Azalia","year":"2018","unstructured":"Azalia Mirhoseini, Anna Goldie, Hieu Pham, Benoit Steiner, Quoc V Le, and Jeff Dean. 2018. Hierarchical Planning for Device Placement. In 6th International Conference on Learning Representations (ICLR '18)."},{"key":"e_1_3_2_2_35_1","volume-title":"Device Placement Optimization with Reinforcement Learning. In 34th International Conference on Machine Learning (ICML '17)","author":"Mirhoseini Azalia","year":"2017","unstructured":"Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, and Jeff Dean. 2017. Device Placement Optimization with Reinforcement Learning. In 34th International Conference on Machine Learning (ICML '17). PMLR, 2430--2439."},{"key":"e_1_3_2_2_36_1","volume-title":"Schulz","author":"M\u00f6hring Rolf H.","year":"1996","unstructured":"Rolf H. M\u00f6hring, Markus W. Sch\u00e4ffter, and Andreas S. Schulz. 1996. 
Scheduling Jobs with Communication Delays: Using Infeasible Solutions for Approximation. In 4th Annual European Symposium on Algorithms (ESA '96). Springer, 76--90."},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1287\/opre.45.1.145"},{"key":"e_1_3_2_2_38_1","unstructured":"NVIDIA. 2020. NVLink and NVSwitch. https:\/\/www.nvidia.com\/en-us\/data-center\/nvlink\/"},{"key":"e_1_3_2_2_39_1","volume-title":"High-Performance Deep Learning Library. In 33rd Conference on Neural Information Processing Systems (NeurIPS '19)","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In 33rd Conference on Neural Information Processing Systems (NeurIPS '19). Curran Associates, Inc., 8024--8035."},{"key":"e_1_3_2_2_40_1","volume-title":"7th Python in Science Conference (SciPy '08)","author":"Schult Daniel A.","year":"2008","unstructured":"Daniel A. Schult. 2008. Exploring Network Structure, Dynamics, and Function Using NetworkX. 
In 7th Python in Science Conference (SciPy '08). 11--15."},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2945397"},{"key":"e_1_3_2_2_42_1","volume-title":"MLI: An API for Distributed Machine Learning. In 13th IEEE International Conference on Data Mining (ICDM '13)","author":"Sparks E. R.","unstructured":"E. R. Sparks, A. Talwalkar, V. Smith, J. Kottalam, X. Pan, J. Gonzalez, M. J. Franklin, M. I. Jordan, and T. Kraska. 2013. MLI: An API for Distributed Machine Learning. In 13th IEEE International Conference on Data Mining (ICDM '13). IEEE, 1187--1192."},{"key":"e_1_3_2_2_43_1","volume-title":"OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning. In 28th International Conference on Machine Learning (ICML '11)","author":"Sujeeth Arvind","year":"2011","unstructured":"Arvind Sujeeth, HyoukJoong Lee, Kevin Brown, Tiark Rompf, Hassan Chafi, Michael Wu, Anand Atreya, Martin Odersky, and Kunle Olukotun. 2011. OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning. In 28th International Conference on Machine Learning (ICML '11). PMLR, 609--616."},{"key":"e_1_3_2_2_44_1","volume-title":"Rethinking the Inception Architecture for Computer Vision. 
In 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '16)","author":"Szegedy Christian","year":"2016","unstructured":"Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna. 2016. Rethinking the Inception Architecture for Computer Vision. In 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '16). IEEE, 2818--2826."},{"key":"e_1_3_2_2_45_1","unstructured":"TensorFlow Community. 2020. TensorFlow Graph Optimization with Grappler. https:\/\/www.tensorflow.org\/guide\/graph_optimization"},{"key":"e_1_3_2_2_46_1","volume-title":"XLA: Optimizing Compiler for Machine Learning. https:\/\/www.tensorflow.org\/xla","author":"Community TensorFlow","year":"2020","unstructured":"TensorFlow Community. 2020. XLA: Optimizing Compiler for Machine Learning. https:\/\/www.tensorflow.org\/xla"},{"key":"e_1_3_2_2_47_1","volume-title":"Theano: A Python Framework for Fast Computation of Mathematical Expressions. arXiv Preprint arXiv:1605.02688","author":"Team Theano Development","year":"2016","unstructured":"Theano Development Team. 2016. Theano: A Python Framework for Fast Computation of Mathematical Expressions. arXiv Preprint arXiv:1605.02688 (2016)."},{"key":"e_1_3_2_2_48_1","volume-title":"Progress in Mathematical Programming","author":"Tomlin John A.","unstructured":"John A. 
Tomlin. 1989. A Note on Comparing Simplex and Interior Methods for Linear Programming. In Progress in Mathematical Programming. Springer, 91--103."},{"key":"e_1_3_2_2_49_1","volume-title":"Lenstra","author":"Veltman Bart","year":"1990","unstructured":"Bart Veltman, B. J. Lageweg, and Jan K. Lenstra. 1990. Multiprocessor Scheduling with Communication Delays. Parallel Computing 16, 2--3 (1990), 173--182."},{"key":"e_1_3_2_2_50_1","volume-title":"Supporting Very Large Models Using Automatic Dataflow Graph Partitioning. In 14th European Conference on Computer Systems (EuroSys '19)","author":"Wang Minjie","year":"2019","unstructured":"Minjie Wang, Chien-chin Huang, and Jinyang Li. 2019. Supporting Very Large Models Using Automatic Dataflow Graph Partitioning. In 14th European Conference on Computer Systems (EuroSys '19). ACM, Article 26, 17 pages."},{"key":"e_1_3_2_2_51_1","volume-title":"Characterizing Deep Learning Training Workloads on Alibaba-PAI. In 22nd IEEE International Symposium on Workload Characterization (IISWC '19)","author":"Wang Mengdi","year":"2019","unstructured":"Mengdi Wang, Chen Meng, Guoping Long, Chuan Wu, Jun Yang, Wei Lin, and Yangqing Jia. 2019. 
Characterizing Deep Learning Training Workloads on Alibaba-PAI. In 22nd IEEE International Symposium on Workload Characterization (IISWC '19). IEEE, 189--202."},{"key":"e_1_3_2_2_52_1","unstructured":"Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv preprint arXiv:1609.08144 (2016)."},{"key":"e_1_3_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783323"},{"key":"e_1_3_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.308533"},{"key":"e_1_3_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2016.7565185"}],"event":{"name":"SoCC '20: ACM Symposium on Cloud Computing","location":"Virtual Event USA","acronym":"SoCC '20","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGOPS ACM Special Interest Group on Operating Systems"]},"container-title":["Proceedings of the 11th ACM Symposium on Cloud Computing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3419111.3421302","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3419111.3421302","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:32:05Z","timestamp":1750195925000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3419111.3421302"}},"subtitle":["fast device placement of machine learning 
graphs"],"short-title":[],"issued":{"date-parts":[[2020,10,12]]},"references-count":55,"alternative-id":["10.1145\/3419111.3421302","10.1145\/3419111"],"URL":"https:\/\/doi.org\/10.1145\/3419111.3421302","relation":{},"subject":[],"published":{"date-parts":[[2020,10,12]]},"assertion":[{"value":"2020-10-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}