{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,2,24]],"date-time":"2023-02-24T11:54:21Z","timestamp":1677239661509},"publisher-location":"New York, NY, USA","reference-count":48,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,12,7]]},"DOI":"10.1145\/3423211.3425675","type":"proceedings-article","created":{"date-parts":[[2020,12,11]],"date-time":"2020-12-11T23:03:11Z","timestamp":1607727791000},"update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Fast Training of Deep Learning Models over Multiple GPUs"],"prefix":"10.1145","author":[{"given":"Xiaodong","family":"Yi","sequence":"first","affiliation":[{"name":"The University of Hong Kong"}]},{"given":"Ziyue","family":"Luo","sequence":"additional","affiliation":[{"name":"The University of Hong Kong"}]},{"given":"Chen","family":"Meng","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Mengdi","family":"Wang","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Guoping","family":"Long","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Chuan","family":"Wu","sequence":"additional","affiliation":[{"name":"The University of Hong Kong"}]},{"given":"Jun","family":"Yang","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Wei","family":"Lin","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]}],"member":"320","published-online":{"date-parts":[[2020,12,11]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2015. LeNet-5 convolutional neural networks. \"http:\/\/yann.lecun.com\/exdb\/lenet\". 2015. LeNet-5 convolutional neural networks. \"http:\/\/yann.lecun.com\/exdb\/lenet\"."},{"key":"e_1_3_2_1_2_1","unstructured":"2016. A New Lightweight Modular and Scalable Deep Learning Framework. \"https:\/\/caffe2.ai\". 2016. A New Lightweight Modular and Scalable Deep Learning Framework. \"https:\/\/caffe2.ai\"."},{"key":"e_1_3_2_1_3_1","unstructured":"2016. Tensorflow slim. \"https:\/\/github.com\/tensorflow\/tensorflow\/tree\/master\/tensorflow\/contrib\/slim\". 2016. Tensorflow slim. \"https:\/\/github.com\/tensorflow\/tensorflow\/tree\/master\/tensorflow\/contrib\/slim\"."},{"key":"e_1_3_2_1_4_1","unstructured":"2017. Tensorflow in-graph implementation. \"https:\/\/github.com\/tensorflow\/examples\/blob\/master\/community\/en\/docs\/deploy\/distributed.md\". 2017. Tensorflow in-graph implementation. \"https:\/\/github.com\/tensorflow\/examples\/blob\/master\/community\/en\/docs\/deploy\/distributed.md\"."},{"key":"e_1_3_2_1_5_1","unstructured":"2017. Tensorflow RunMetadata. \"https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/RunMetadata\". 2017. Tensorflow RunMetadata. \"https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/RunMetadata\"."},{"key":"e_1_3_2_1_6_1","unstructured":"2017. Tensors and Dynamic neural networks in Python with strong GPU acceleration. \"https:\/\/pytorch.org\". 2017. Tensors and Dynamic neural networks in Python with strong GPU acceleration. \"https:\/\/pytorch.org\"."},{"key":"e_1_3_2_1_7_1","unstructured":"2018. Tensorflow Mesh. https:\/\/github.com\/tensorflow\/mesh. 2018. Tensorflow Mesh. https:\/\/github.com\/tensorflow\/mesh."},{"key":"e_1_3_2_1_8_1","unstructured":"2019. GNMT v2 For TensorFlow. \"https:\/\/github.com\/NVIDIA\/DeepLearningExamples\/tree\/master\/TensorFlow\/Translation\/GNMT\". 2019. GNMT v2 For TensorFlow. \"https:\/\/github.com\/NVIDIA\/DeepLearningExamples\/tree\/master\/TensorFlow\/Translation\/GNMT\"."},{"key":"e_1_3_2_1_9_1","unstructured":"Mart\u00edn Abadi Paul Barham Jianmin Chen Zhifeng Chen Andy Davis Jeffrey Dean Matthieu Devin Sanjay Ghemawat Geoffrey Irving Michael Isard etal 2016. Tensorflow: a system for large-scale machine learning.. In OSDI. Mart\u00edn Abadi Paul Barham Jianmin Chen Zhifeng Chen Andy Davis Jeffrey Dean Matthieu Devin Sanjay Ghemawat Geoffrey Irving Michael Isard et al. 2016. Tensorflow: a system for large-scale machine learning.. In OSDI."},{"key":"e_1_3_2_1_10_1","volume-title":"Placeto: Efficient Progressive Device Placement Optimization. In NIPS Machine Learning for Systems Workshop.","author":"Addanki Ravichandra","year":"2018","unstructured":"Ravichandra Addanki , Shaileshh Bojja Venkatakrishnan , Shreyan Gupta , Hongzi Mao , and Mohammad Alizadeh . 2018 . Placeto: Efficient Progressive Device Placement Optimization. In NIPS Machine Learning for Systems Workshop. Ravichandra Addanki, Shaileshh Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, and Mohammad Alizadeh. 2018. Placeto: Efficient Progressive Device Placement Optimization. In NIPS Machine Learning for Systems Workshop."},{"key":"e_1_3_2_1_11_1","volume-title":"Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473","author":"Bahdanau Dzmitry","year":"2014","unstructured":"Dzmitry Bahdanau , Kyunghyun Cho , and Yoshua Bengio . 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 ( 2014 ). Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)."},{"key":"e_1_3_2_1_12_1","volume-title":"Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform. arXiv preprint arXiv:1809.02839","author":"Chen Chi-Chung","year":"2018","unstructured":"Chi-Chung Chen , Chia-Lin Yang , and Hsiang-Yun Cheng . 2018. Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform. arXiv preprint arXiv:1809.02839 ( 2018 ). Chi-Chung Chen, Chia-Lin Yang, and Hsiang-Yun Cheng. 2018. Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform. arXiv preprint arXiv:1809.02839 (2018)."},{"key":"e_1_3_2_1_13_1","volume-title":"Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen , Mu Li , Yutian Li , Min Lin , Naiyan Wang , Minjie Wang , Tianjun Xiao , Bing Xu , Chiyuan Zhang , and Zheng Zhang . 2015 . Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274 (2015). Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274 (2015)."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2012-7"},{"key":"e_1_3_2_1_15_1","volume-title":"Analysis of dawnbench, a time-to-accuracy machine learning performance benchmark. ACM SIGOPS Operating Systems Review","author":"Coleman Cody","year":"2019","unstructured":"Cody Coleman , Daniel Kang , Deepak Narayanan , Luigi Nardi , Tian Zhao , Jian Zhang , Peter Bailis , Kunle Olukotun , Chris R\u00e9 , and Matei Zaharia . 2019. Analysis of dawnbench, a time-to-accuracy machine learning performance benchmark. ACM SIGOPS Operating Systems Review ( 2019 ). Cody Coleman, Daniel Kang, Deepak Narayanan, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, Kunle Olukotun, Chris R\u00e9, and Matei Zaharia. 2019. Analysis of dawnbench, a time-to-accuracy machine learning performance benchmark. ACM SIGOPS Operating Systems Review (2019)."},{"key":"e_1_3_2_1_16_1","unstructured":"Jeffrey Dean Greg Corrado Rajat Monga Kai Chen Matthieu Devin Mark Mao Andrew Senior Paul Tucker Ke Yang Quoc V Le etal 2012. Large scale distributed deep networks. In Advances in neural information processing systems. 1223--1231. Jeffrey Dean Greg Corrado Rajat Monga Kai Chen Matthieu Devin Mark Mao Andrew Senior Paul Tucker Ke Yang Quoc V Le et al. 2012. Large scale distributed deep networks. In Advances in neural information processing systems. 1223--1231."},{"key":"e_1_3_2_1_17_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018). Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_1_18_1","volume-title":"Post: Device placement with cross-entropy minimization and proximal policy optimization. In Advances in Neural Information Processing Systems. 9971--9980.","author":"Gao Yuanxiang","year":"2018","unstructured":"Yuanxiang Gao , Li Chen , and Baochun Li . 2018 . Post: Device placement with cross-entropy minimization and proximal policy optimization. In Advances in Neural Information Processing Systems. 9971--9980. Yuanxiang Gao, Li Chen, and Baochun Li. 2018. Post: Device placement with cross-entropy minimization and proximal policy optimization. In Advances in Neural Information Processing Systems. 9971--9980."},{"key":"e_1_3_2_1_19_1","volume-title":"International Conference on Machine Learning.","author":"Gao Yuanxiang","year":"2018","unstructured":"Yuanxiang Gao , Li Chen , and Baochun Li . 2018 . Spotlight: Optimizing device placement for training deep neural networks . In International Conference on Machine Learning. Yuanxiang Gao, Li Chen, and Baochun Li. 2018. Spotlight: Optimizing device placement for training deep neural networks. In International Conference on Machine Learning."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Apostolos Gerasoulis and Tao Yang. 1992. A comparison of clustering heuristics for scheduling directed acyclic graphs on multiprocessors. J. Parallel and Distrib. Comput. (1992). Apostolos Gerasoulis and Tao Yang. 1992. A comparison of clustering heuristics for scheduling directed acyclic graphs on multiprocessors. J. Parallel and Distrib. Comput. (1992).","DOI":"10.1016\/0743-7315(92)90012-C"},{"key":"e_1_3_2_1_21_1","volume-title":"Tiresias: A {GPU} Cluster Manager for Distributed Deep Learning. In 16th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 19). 485--500.","author":"Gu Juncheng","year":"2019","unstructured":"Juncheng Gu , Mosharaf Chowdhury , Kang G Shin , Yibo Zhu , Myeongjae Jeon , Junjie Qian , Hongqiang Liu , and Chuanxiong Guo . 2019 . Tiresias: A {GPU} Cluster Manager for Distributed Deep Learning. In 16th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 19). 485--500. Juncheng Gu, Mosharaf Chowdhury, Kang G Shin, Yibo Zhu, Myeongjae Jeon, Junjie Qian, Hongqiang Liu, and Chuanxiong Guo. 2019. Tiresias: A {GPU} Cluster Manager for Distributed Deep Learning. In 16th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 19). 485--500."},{"key":"e_1_3_2_1_22_1","volume-title":"Pipedream: Fast and efficient pipeline parallel dnn training. arXiv preprint arXiv:1806.03377","author":"Harlap Aaron","year":"2018","unstructured":"Aaron Harlap , Deepak Narayanan , Amar Phanishayee , Vivek Seshadri , Nikhil Devanur , Greg Ganger , and Phil Gibbons . 2018 . Pipedream: Fast and efficient pipeline parallel dnn training. arXiv preprint arXiv:1806.03377 (2018). Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Nikhil Devanur, Greg Ganger, and Phil Gibbons. 2018. Pipedream: Fast and efficient pipeline parallel dnn training. arXiv preprint arXiv:1806.03377 (2018)."},{"key":"e_1_3_2_1_23_1","volume-title":"Sangeetha Abdu Jyothi, and Roy H Campbell","author":"Hashemi Sayed Hadi","year":"2018","unstructured":"Sayed Hadi Hashemi , Sangeetha Abdu Jyothi, and Roy H Campbell . 2018 . TicTac: Accelerating Distributed Deep Learning with Communication Scheduling . arXiv preprint arXiv:1803.03288 (2018). Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, and Roy H Campbell. 2018. TicTac: Accelerating Distributed Deep Learning with Communication Scheduling. arXiv preprint arXiv:1803.03288 (2018)."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_25_1","volume-title":"Gpipe: Efficient training of giant neural networks using pipeline parallelism. arXiv preprint arXiv:1811.06965","author":"Huang Yanping","year":"2018","unstructured":"Yanping Huang , Yonglong Cheng , Dehao Chen , HyoukJoong Lee , Jiquan Ngiam , Quoc V Le , and Zhifeng Chen . 2018 . Gpipe: Efficient training of giant neural networks using pipeline parallelism. arXiv preprint arXiv:1811.06965 (2018). Yanping Huang, Yonglong Cheng, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V Le, and Zhifeng Chen. 2018. Gpipe: Efficient training of giant neural networks using pipeline parallelism. arXiv preprint arXiv:1811.06965 (2018)."},{"key":"e_1_3_2_1_26_1","volume-title":"Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks. In International Conference on Machine Learning. 2279--2288","author":"Jia Zhihao","year":"2018","unstructured":"Zhihao Jia , Sina Lin , Charles R Qi , and Alex Aiken . 2018 . Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks. In International Conference on Machine Learning. 2279--2288 . Zhihao Jia, Sina Lin, Charles R Qi, and Alex Aiken. 2018. Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks. In International Conference on Machine Learning. 2279--2288."},{"key":"e_1_3_2_1_27_1","volume-title":"Beyond data and model parallelism for deep neural networks. arXiv preprint arXiv:1807.05358","author":"Jia Zhihao","year":"2018","unstructured":"Zhihao Jia , Matei Zaharia , and Alex Aiken . 2018. Beyond data and model parallelism for deep neural networks. arXiv preprint arXiv:1807.05358 ( 2018 ). Zhihao Jia, Matei Zaharia, and Alex Aiken. 2018. Beyond data and model parallelism for deep neural networks. arXiv preprint arXiv:1807.05358 (2018)."},{"key":"e_1_3_2_1_28_1","volume-title":"One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997","author":"Krizhevsky Alex","year":"2014","unstructured":"Alex Krizhevsky . 2014. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997 ( 2014 ). Alex Krizhevsky. 2014. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997 (2014)."},{"key":"e_1_3_2_1_29_1","unstructured":"Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097--1105. Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097--1105."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3152434.3152435"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2640087.2644155"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/3305890.3305932"},{"key":"e_1_3_2_1_33_1","volume-title":"Hal Daum\u00e9 III, and Jordan Boyd-Graber","author":"Nguyen Khanh","year":"2017","unstructured":"Khanh Nguyen , Hal Daum\u00e9 III, and Jordan Boyd-Graber . 2017 . Reinforcement learning for bandit neural machine translation with simulated human feedback. arXiv preprint arXiv:1707.07402 (2017). Khanh Nguyen, Hal Daum\u00e9 III, and Jordan Boyd-Graber. 2017. Reinforcement learning for bandit neural machine translation with simulated human feedback. arXiv preprint arXiv:1707.07402 (2017)."},{"key":"e_1_3_2_1_34_1","volume-title":"Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training. arXiv preprint arXiv:1907.13257","author":"Pal Saptadeep","year":"2019","unstructured":"Saptadeep Pal , Eiman Ebrahimi , Arslan Zulfiqar , Yaosheng Fu , Victor Zhang , Szymon Migacz , David Nellans , and Puneet Gupta . 2019. Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training. arXiv preprint arXiv:1907.13257 ( 2019 ). Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, David Nellans, and Puneet Gupta. 2019. Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training. arXiv preprint arXiv:1907.13257 (2019)."},{"key":"e_1_3_2_1_35_1","volume-title":"REGAL: Transfer Learning For Fast Optimization of Computation Graphs. arXiv preprint arXiv:1905.02494","author":"Paliwal Aditya","year":"2019","unstructured":"Aditya Paliwal , Felix Gimeno , Vinod Nair , Yujia Li , Miles Lubin , Pushmeet Kohli , and Oriol Vinyals . 2019 . REGAL: Transfer Learning For Fast Optimization of Computation Graphs. arXiv preprint arXiv:1905.02494 (2019). Aditya Paliwal, Felix Gimeno, Vinod Nair, Yujia Li, Miles Lubin, Pushmeet Kohli, and Oriol Vinyals. 2019. REGAL: Transfer Learning For Fast Optimization of Computation Graphs. arXiv preprint arXiv:1905.02494 (2019)."},{"key":"e_1_3_2_1_36_1","volume-title":"Accelerated Training for CNN Distributed Deep Learning through Automatic Resource-Aware Layer Placement. arXiv preprint arXiv:1901.05803","author":"Park Jay H","year":"2019","unstructured":"Jay H Park , Sunghwan Kim , Jinwon Lee , Myeongjae Jeon , and Sam H Noh . 2019. Accelerated Training for CNN Distributed Deep Learning through Automatic Resource-Aware Layer Placement. arXiv preprint arXiv:1901.05803 ( 2019 ). Jay H Park, Sunghwan Kim, Jinwon Lee, Myeongjae Jeon, and Sam H Noh. 2019. Accelerated Training for CNN Distributed Deep Learning through Automatic Resource-Aware Layer Placement. arXiv preprint arXiv:1901.05803 (2019)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3190508.3190517"},{"key":"e_1_3_2_1_38_1","volume-title":"Mesh-tensorflow: Deep learning for supercomputers. In Advances in Neural Information Processing Systems.","author":"Shazeer Noam","year":"2018","unstructured":"Noam Shazeer , Youlong Cheng , Niki Parmar , Dustin Tran , Ashish Vaswani , Penporn Koanantakool , Peter Hawkins , HyoukJoong Lee , Mingsheng Hong , Cliff Young , 2018 . Mesh-tensorflow: Deep learning for supercomputers. In Advances in Neural Information Processing Systems. Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, et al. 2018. Mesh-tensorflow: Deep learning for supercomputers. In Advances in Neural Information Processing Systems."},{"key":"e_1_3_2_1_39_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman . 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 ( 2014 ). Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.993206"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0022-0000(75)80008-0"},{"key":"e_1_3_2_1_43_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems. Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303953"},{"key":"e_1_3_2_1_45_1","volume-title":"Stanza: Distributed Deep Learning with Small Communication Footprint. arXiv preprint arXiv:1812.10624","author":"Wu Xiaorui","year":"2018","unstructured":"Xiaorui Wu , Hong Xu , Bo Li , and Yongqiang Xiong . 2018 . Stanza: Distributed Deep Learning with Small Communication Footprint. arXiv preprint arXiv:1812.10624 (2018). Xiaorui Wu, Hong Xu, Bo Li, and Yongqiang Xiong. 2018. Stanza: Distributed Deep Learning with Small Communication Footprint. arXiv preprint arXiv:1812.10624 (2018)."},{"key":"e_1_3_2_1_46_1","unstructured":"Yonghui Wu Mike Schuster Zhifeng Chen Quoc V Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey etal 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016). Yonghui Wu Mike Schuster Zhifeng Chen Quoc V Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016)."},{"key":"e_1_3_2_1_47_1","volume-title":"Recurrent neural network regularization. arXiv preprint arXiv:1409.2329","author":"Zaremba Wojciech","year":"2014","unstructured":"Wojciech Zaremba , Ilya Sutskever , and Oriol Vinyals . 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 ( 2014 ). Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014)."},{"key":"e_1_3_2_1_48_1","volume-title":"GDP: Generalized Device Placement for Dataflow Graphs. arXiv preprint arXiv:1910.01578","author":"Zhou Yanqi","year":"2019","unstructured":"Yanqi Zhou , Sudip Roy , Amirali Abdolrashidi , Daniel Wong , Peter C Ma , Qiumin Xu , Ming Zhong , Hanxiao Liu , Anna Goldie , Azalia Mirhoseini , 2019 . GDP: Generalized Device Placement for Dataflow Graphs. arXiv preprint arXiv:1910.01578 (2019). Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter C Ma, Qiumin Xu, Ming Zhong, Hanxiao Liu, Anna Goldie, Azalia Mirhoseini, et al. 2019. GDP: Generalized Device Placement for Dataflow Graphs. arXiv preprint arXiv:1910.01578 (2019)."}],"event":{"name":"Middleware '20: 21st International Middleware Conference","location":"Delft Netherlands","acronym":"Middleware '20","sponsor":["ACM Association for Computing Machinery","IFIP"]},"container-title":["Proceedings of the 21st International Middleware Conference"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3423211.3425675","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,11]],"date-time":"2023-01-11T17:55:11Z","timestamp":1673459711000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3423211.3425675"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,7]]},"references-count":48,"alternative-id":["10.1145\/3423211.3425675","10.1145\/3423211"],"URL":"http:\/\/dx.doi.org\/10.1145\/3423211.3425675","relation":{},"published":{"date-parts":[[2020,12,7]]},"assertion":[{"value":"2020-12-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}