{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,24]],"date-time":"2025-09-24T08:34:27Z","timestamp":1758702867617,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":33,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,9]],"date-time":"2021-08-09T00:00:00Z","timestamp":1628467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["OIA-2019511"],"award-info":[{"award-number":["OIA-2019511"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006952","name":"Huawei Technologies Canada Co., Ltd.","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006952","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Louisiana Board of Regents"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,9]]},"DOI":"10.1145\/3472456.3472523","type":"proceedings-article","created":{"date-parts":[[2021,10,5]],"date-time":"2021-10-05T18:39:57Z","timestamp":1633459197000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Accelerated Device Placement Optimization with Contrastive Learning"],"prefix":"10.1145","author":[{"given":"Hao","family":"Lan","sequence":"first","affiliation":[{"name":"University of Toronto, Canada"}]},{"given":"Li","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Louisiana at Lafayette, United States of America"}]},{"given":"Baochun","family":"Li","sequence":"additional","affiliation":[{"name":"University of Toronto, 
Canada"}]}],"member":"320","published-online":{"date-parts":[[2021,10,5]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS).","author":"Addanki Ravichandra","year":"2019","unstructured":"Ravichandra Addanki, Shaileshh\u00a0Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, and Mohammad Alizadeh. 2019. Learning Generalizable Device Placement Algorithms for Distributed Machine Learning. In Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS)."},{"volume-title":"Proc.\u00a0International Conference on Learning Representations (ICLR).","author":"Bahdanau D.","key":"e_1_3_2_1_2_1","unstructured":"D. Bahdanau, K. Cho, and Y. Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In Proc.\u00a0International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_3_1","volume-title":"Proc.\u00a0International Conference on Machine Learning (ICML).","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey\u00a0E. Hinton. 2020. A Simple Framework for Contrastive Learning of Visual Representations. 
In Proc.\u00a0International Conference on Machine Learning (ICML)."},{"key":"e_1_3_2_1_4_1","volume-title":"Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS).","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, and Geoffrey\u00a0E. Hinton. 2020. Big Self-Supervised Models are Strong Semi-Supervised Learners. In Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS)."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q.\u00a0V. Le, and R. Salakhutdinov. 2019. Transformer-XL: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860(2019).","DOI":"10.18653\/v1\/P19-1285"},{"key":"e_1_3_2_1_6_1","volume-title":"Proc.\u00a0North American Chapter of the Association for Computational Linguistics: Human Language Technologies.","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 
In Proc.\u00a0North American Chapter of the Association for Computational Linguistics: Human Language Technologies."},{"volume-title":"Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS).","author":"Gao Y.","key":"e_1_3_2_1_7_1","unstructured":"Y. Gao, L. Chen, and B. Li. 2018. Post: Device placement with cross-entropy minimization and proximal policy optimization. In Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS)."},{"volume-title":"Proc.\u00a0International Conference on Machine Learning (ICML).","author":"Gao Y.","key":"e_1_3_2_1_8_1","unstructured":"Y. Gao, L. Chen, and B. Li. 2018. Spotlight: Optimizing Device Placement for Training Deep Neural Networks. In Proc.\u00a0International Conference on Machine Learning (ICML)."},{"key":"e_1_3_2_1_9_1","volume-title":"Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS).","author":"Hamilton W.","year":"2017","unstructured":"W. Hamilton, Z. Ying, and J. Leskovec. 2017. Inductive representation learning on large graphs. In Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS)."},{"key":"e_1_3_2_1_10_1","volume-title":"Pipedream: Fast and efficient pipeline parallel dnn training. arXiv preprint arXiv:1806.03377(2018).","author":"Harlap A.","year":"2018","unstructured":"A. Harlap, D. Narayanan, A. Phanishayee, V. Seshadri, N. Devanur, G. Ganger, and P. Gibbons. 2018. Pipedream: Fast and efficient pipeline parallel dnn training. 
arXiv preprint arXiv:1806.03377(2018)."},{"key":"e_1_3_2_1_11_1","unstructured":"S.\u00a0H. Hashemi, S.\u00a0A. Jyothi, and R.\u00a0H. Campbell. 2018. TicTac: Accelerating distributed deep learning with communication scheduling. arXiv preprint arXiv:1803.03288(2018)."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.123"},{"key":"e_1_3_2_1_13_1","volume-title":"Proc.\u00a0IEEE Computer Vision and Pattern Recognition (CVPR).","author":"He K.","year":"2016","unstructured":"K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep Residual Learning for Image Recognition. In Proc.\u00a0IEEE Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_1_14_1","volume-title":"Proc.\u00a0International Conference on Learning Representations (ICLR).","author":"Hjelm Devon","year":"2019","unstructured":"R.\u00a0Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Philip Bachman, Adam Trischler, and Yoshua Bengio. 2019. Learning deep representations by mutual information estimation and maximization. 
In Proc.\u00a0International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_15_1","volume-title":"Proc.\u00a0International Conference on Learning Representations (ICLR).","author":"Hu Weihua","year":"2020","unstructured":"Weihua Hu, Bowen Liu, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay\u00a0S. Pande, and Jure Leskovec. 2020. Strategies for Pre-training Graph Neural Networks. In Proc.\u00a0International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403237"},{"key":"e_1_3_2_1_17_1","unstructured":"A. Jayarajan, J. Wei, G. Gibson, A. Fedorova, and G. Pekhimenko. 2019. Priority-based parameter propagation for distributed DNN training. arXiv preprint arXiv:1905.03960(2019)."},{"volume-title":"Proc.\u00a0International Conference on Learning Representations (ICLR).","author":"Kipf Thomas N.","key":"e_1_3_2_1_18_1","unstructured":"Thomas\u00a0N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In Proc.\u00a0International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS49936.2021.00068"},{"key":"e_1_3_2_1_20_1","year":"2018","author":"Mirhoseini A.","unstructured":"A. Mirhoseini, A. Goldie, H. Pham, B. Steiner, Quoc\u00a0V. Le, and J. Dean. 2018. 
A Hierarchical Model for Device Placement. In Proc.\u00a0International Conference on Learning Representations (ICLR).","volume-title":"Proc.\u00a0International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_21_1","volume-title":"Proc.\u00a0International Conference on Machine Learning (ICML).","author":"Mirhoseini A.","year":"2017","unstructured":"A. Mirhoseini, H. Pham, Q. Le, B. Steiner, R. Larsen, Y. Zhou, N. Kumar, M. Norouzi, S. Bengio, and J. Dean. 2017. Device Placement Optimization with Reinforcement Learning. In Proc.\u00a0International Conference on Machine Learning (ICML)."},{"key":"e_1_3_2_1_22_1","volume-title":"REGAL: Transfer Learning For Fast Optimization of Computation Graphs. arXiv preprint arXiv:1905.02494(2019).","author":"Paliwal Aditya","year":"2019","unstructured":"Aditya Paliwal, Felix Gimeno, Vinod Nair, Yujia Li, Miles Lubin, Pushmeet Kohli, and Oriol Vinyals. 2019. REGAL: Transfer Learning For Fast Optimization of Computation Graphs. arXiv preprint arXiv:1905.02494(2019)."},{"key":"e_1_3_2_1_23_1","volume-title":"Proc.\u00a0the 2020 USENIX Annual Technical Conference (ATC).","author":"Park H.","year":"2020","unstructured":"Jay\u00a0H. Park, Gyeongchan Yun, Chang\u00a0M. Yi, Nguyen\u00a0T. Nguyen, Seungmin Lee, Jaesik Choi, Sam\u00a0H. Noh, and Young-ri Choi. 2020. 
HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism. In Proc.\u00a0the 2020 USENIX Annual Technical Conference (ATC)."},{"key":"e_1_3_2_1_24_1","unstructured":"F. Pellegrini. 2009. Distillating Knowledge about SCOTCH. In Combinatorial Scientific Computing. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403168"},{"key":"e_1_3_2_1_26_1","volume-title":"Proc.\u00a0International Conference on Learning Representations (ICLR).","author":"Shazeer Noam","year":"2017","unstructured":"Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. 2017. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. In Proc.\u00a0International Conference on Learning Representations (ICLR)."},{"volume-title":"Proc.\u00a0International Conference on Machine Learning (ICML).","author":"Schulman J.","key":"e_1_3_2_1_27_1","unstructured":"J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. 2017. Proximal Policy Optimization Algorithms. 
https:\/\/arxiv.org\/pdf\/1707.06347. In Proc.\u00a0International Conference on Machine Learning (ICML)."},{"volume-title":"Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS).","author":"Sutskever I.","key":"e_1_3_2_1_28_1","unstructured":"I. Sutskever, O. Vinyals, and Q. Le. 2014. Sequence to Sequence Learning with Neural Networks. In Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS)."},{"key":"e_1_3_2_1_29_1","unstructured":"A\u00e4ron van\u00a0den Oord, Yazhe Li, and Oriol Vinyals. 2018. Representation Learning with Contrastive Predictive Coding. CoRR abs\/1807.03748(2018). arXiv:1807.03748 http:\/\/arxiv.org\/abs\/1807.03748"},{"key":"e_1_3_2_1_30_1","unstructured":"P. Veli\u010dkovi\u0107, W. Fedus, W.\u00a0L. Hamilton, P. Li\u00f2, Y. Bengio, and R.\u00a0D. Hjelm. 2018. Deep graph infomax. arXiv preprint arXiv:1809.10341(2018)."},{"key":"e_1_3_2_1_31_1","volume-title":"Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS).","author":"Xie Qizhe","year":"2020","unstructured":"Qizhe Xie, Zihang Dai, Eduard\u00a0H. Hovy, Thang Luong, and Quoc Le. 2020. Unsupervised Data Augmentation for Consistency Training. 
In Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS)."},{"key":"e_1_3_2_1_32_1","volume-title":"Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS).","author":"You Yuning","year":"2020","unstructured":"Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen. 2020. Graph Contrastive Learning with Augmentations. In Proc.\u00a0Advances in Neural Information Processing Systems (NeurIPS)."},{"key":"e_1_3_2_1_33_1","volume-title":"GDP: Generalized Device Placement for Dataflow Graphs. arXiv preprint arXiv:1910.01578(2019).","author":"Zhou Y.","year":"2019","unstructured":"Y. Zhou, S. Roy, A. Abdolrashidi, D. Wong, P.\u00a0C. Ma, Q. Xu, M. Zhong, H. Liu, A. Goldie, A. Mirhoseini, 2019. GDP: Generalized Device Placement for Dataflow Graphs. 
arXiv preprint arXiv:1910.01578(2019)."}],"event":{"name":"ICPP 2021: 50th International Conference on Parallel Processing","acronym":"ICPP 2021","location":"Lemont IL USA"},"container-title":["50th International Conference on Parallel Processing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472523","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3472456.3472523","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3472456.3472523","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:23Z","timestamp":1750191443000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472523"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,9]]},"references-count":33,"alternative-id":["10.1145\/3472456.3472523","10.1145\/3472456"],"URL":"https:\/\/doi.org\/10.1145\/3472456.3472523","relation":{},"subject":[],"published":{"date-parts":[[2021,8,9]]},"assertion":[{"value":"2021-10-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}