{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T16:48:01Z","timestamp":1775666881971,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":30,"publisher":"ACM","license":[{"start":{"date-parts":[[2018,7,11]],"date-time":"2018-07-11T00:00:00Z","timestamp":1531267200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Department of Energy","award":["DE-AC02-05CH11231"],"award-info":[{"award-number":["DE-AC02-05CH11231"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2018,7,11]]},"DOI":"10.1145\/3210377.3210394","type":"proceedings-article","created":{"date-parts":[[2018,7,12]],"date-time":"2018-07-12T17:46:44Z","timestamp":1531417604000},"page":"77-86","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":47,"title":["Integrated Model, Batch, and Domain Parallelism in Training Neural Networks"],"prefix":"10.1145","author":[{"given":"Amir","family":"Gholami","sequence":"first","affiliation":[{"name":"University of California, Berkeley, Berkeley, CA, USA"}]},{"given":"Ariful","family":"Azad","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, Berkeley, CA, USA"}]},{"given":"Peter","family":"Jin","sequence":"additional","affiliation":[{"name":"University of California, Berkeley, Berkeley, CA, USA"}]},{"given":"Kurt","family":"Keutzer","sequence":"additional","affiliation":[{"name":"University of California, Berkeley, Berkeley, CA, USA"}]},{"given":"Aydin","family":"Buluc","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, Berkeley, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2018,7,11]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1137\/090769156"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1285358.1285359"},{"key":"e_1_3_2_2_4_1","first-page":"571","volume-title":"OSDI","volume":"14","author":"Chilimbi Trishul M","year":"2014","unstructured":"Trishul M Chilimbi , Yutaka Suzue , Johnson Apacible , and Karthik Kalyanaraman . Project adam : Building an efficient and scalable deep learning training system . In OSDI , volume 14 , pages 571 -- 582 , 2014 . Trishul M Chilimbi, Yutaka Suzue, Johnson Apacible, and Karthik Kalyanaraman. Project adam: Building an efficient and scalable deep learning training system. In OSDI, volume 14, pages 571--582, 2014."},{"key":"e_1_3_2_2_5_1","first-page":"1337","volume-title":"International Conference on Machine Learning","author":"Coates Adam","year":"2013","unstructured":"Adam Coates , Brody Huval , Tao Wang , David Wu , Bryan Catanzaro , and Ng Andrew . Deep learning with COTS HPC systems . In International Conference on Machine Learning , pages 1337 -- 1345 , 2013 . Adam Coates, Brody Huval, Tao Wang, David Wu, Bryan Catanzaro, and Ng Andrew. Deep learning with COTS HPC systems. In International Conference on Machine Learning, pages 1337--1345, 2013."},{"key":"e_1_3_2_2_6_1","volume-title":"Distributed deep learning using synchronous stochastic gradient descent. arXiv preprint arXiv:1602.06709","author":"Das Dipankar","year":"2016","unstructured":"Dipankar Das , Sasikanth Avancha , Dheevatsa Mudigere , Karthikeyan Vaidynathan , Srinivas Sridharan , Dhiraj Kalamkar , Bharat Kaul , and Pradeep Dubey . Distributed deep learning using synchronous stochastic gradient descent. arXiv preprint arXiv:1602.06709 , 2016 . Dipankar Das, Sasikanth Avancha, Dheevatsa Mudigere, Karthikeyan Vaidynathan, Srinivas Sridharan, Dhiraj Kalamkar, Bharat Kaul, and Pradeep Dubey. Distributed deep learning using synchronous stochastic gradient descent. arXiv preprint arXiv:1602.06709, 2016."},{"key":"e_1_3_2_2_7_1","first-page":"1223","volume-title":"Advances in neural information processing systems","author":"Dean Jeffrey","year":"2012","unstructured":"Jeffrey Dean , Greg Corrado , Rajat Monga , Kai Chen , Matthieu Devin , Mark Mao , Andrew Senior , Paul Tucker , Ke Yang , Quoc V Le , Large scale distributed deep networks . In Advances in neural information processing systems , pages 1223 -- 1231 , 2012 . Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Andrew Senior, Paul Tucker, Ke Yang, Quoc V Le, et al. Large scale distributed deep networks. In Advances in neural information processing systems, pages 1223--1231, 2012."},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3210377.3210394"},{"key":"e_1_3_2_2_9_1","volume-title":"large minibatch SGD: Training ImageNet in 1 hour. arXiv preprint arXiv:1706.02677","author":"Goyal Priya","year":"2017","unstructured":"Priya Goyal , Piotr Doll\u00e1r , Ross Girshick , Pieter Noordhuis , Lukasz Wesolowski , Aapo Kyrola , Andrew Tulloch , Yangqing Jia , and Kaiming He. Accurate , large minibatch SGD: Training ImageNet in 1 hour. arXiv preprint arXiv:1706.02677 , 2017 . Priya Goyal, Piotr Doll\u00e1r, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Accurate, large minibatch SGD: Training ImageNet in 1 hour. arXiv preprint arXiv:1706.02677, 2017."},{"key":"e_1_3_2_2_10_1","volume-title":"Brain tumor segmentation with deep neural networks. Medical image analysis, 35:18--31","author":"Havaei Mohammad","year":"2017","unstructured":"Mohammad Havaei , Axel Davy , David Warde-Farley , Antoine Biard , Aaron Courville , Yoshua Bengio , Chris Pal , Pierre-Marc Jodoin , and Hugo Larochelle . Brain tumor segmentation with deep neural networks. Medical image analysis, 35:18--31 , 2017 . Mohammad Havaei, Axel Davy, David Warde-Farley, Antoine Biard, Aaron Courville, Yoshua Bengio, Chris Pal, Pierre-Marc Jodoin, and Hugo Larochelle. Brain tumor segmentation with deep neural networks. Medical image analysis, 35:18--31, 2017."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_2_12_1","volume-title":"ICLR 2018 Workshop","author":"Jin Peter","year":"2018","unstructured":"Peter Jin , Boris Ginsburg , and Kurt Keutzer . Spatially parallel convolutions . ICLR 2018 Workshop , 2018 . Peter Jin, Boris Ginsburg, and Kurt Keutzer. Spatially parallel convolutions. ICLR 2018 Workshop, 2018."},{"key":"e_1_3_2_2_13_1","volume-title":"How to scale distributed deep learning? arXiv preprint arXiv:1611.04581","author":"Jin Peter H","year":"2016","unstructured":"Peter H Jin , Qiaochu Yuan , Forrest Iandola , and Kurt Keutzer . How to scale distributed deep learning? arXiv preprint arXiv:1611.04581 , 2016 . Peter H Jin, Qiaochu Yuan, Forrest Iandola, and Kurt Keutzer. How to scale distributed deep learning? arXiv preprint arXiv:1611.04581, 2016."},{"key":"e_1_3_2_2_14_1","volume-title":"On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836","author":"Keskar Nitish Shirish","year":"2016","unstructured":"Nitish Shirish Keskar , Dheevatsa Mudigere , Jorge Nocedal , Mikhail Smelyanskiy , and Ping Tak Peter Tang . On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836 , 2016 . Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, and Ping Tak Peter Tang. On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836, 2016."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.182"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2016.117"},{"key":"e_1_3_2_2_17_1","first-page":"1097","volume-title":"Advances in neural information processing systems","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E Hinton . Imagenet classification with deep convolutional neural networks . In Advances in neural information processing systems , pages 1097 - 1105 , 2012 . Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097-1105, 2012."},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_3_2_2_19_1","volume-title":"MICCAI","author":"Mang A.","year":"2017","unstructured":"A. Mang , S. Tharakan A. Gholami , N. Himthani , S. Subramanian , J. Levitt , M. Azmat , K. Scheufele , M. Mehl , C. Davatzikos , B. Barth , and G. Biros . SIBIA-GlS: Scalable biophysics-based image analysis for glioma segmentation. The multimodal brain tumor image segmentation benchmark (BRATS) , MICCAI , 2017 . A. Mang, S. Tharakan A. Gholami, N. Himthani, S. Subramanian, J. Levitt, M. Azmat, K. Scheufele, M. Mehl, C. Davatzikos, B. Barth, and G. Biros. SIBIA-GlS: Scalable biophysics-based image analysis for glioma segmentation. The multimodal brain tumor image segmentation benchmark (BRATS), MICCAI, 2017."},{"key":"e_1_3_2_2_20_1","first-page":"693","volume-title":"Advances in neural information processing systems","author":"Recht Benjamin","year":"2011","unstructured":"Benjamin Recht , Christopher Re , Stephen Wright , and Feng Niu . Hogwild: A lock-free approach to parallelizing stochastic gradient descent . In Advances in neural information processing systems , pages 693 - 701 , 2011 . Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in neural information processing systems, pages 693-701, 2011."},{"key":"e_1_3_2_2_21_1","first-page":"91","volume-title":"Advances in neural information processing systems","author":"Ren Shaoqing","year":"2015","unstructured":"Shaoqing Ren , Kaiming He , Ross Girshick , and Jian Sun . Faster R-CNN: Towards real-time object detection with region proposal networks . In Advances in neural information processing systems , pages 91 - 99 , 2015 . Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91-99, 2015."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-739X(98)00043-0"},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1137\/140993478"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342005051521"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1096-9128(199704)9:4<255::AID-CPE250>3.0.CO;2-2"},{"key":"e_1_3_2_2_27_1","volume-title":"Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. arXiv preprint arXiv:1612.01051","author":"Wu Bichen","year":"2016","unstructured":"Bichen Wu , Forrest Iandola , Peter H Jin , and Kurt Keutzer . Squeezedet : Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. arXiv preprint arXiv:1612.01051 , 2016 . Bichen Wu, Forrest Iandola, Peter H Jin, and Kurt Keutzer. Squeezedet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. arXiv preprint arXiv:1612.01051, 2016."},{"key":"e_1_3_2_2_28_1","volume-title":"In Review","author":"Wu Bichen","year":"2017","unstructured":"Bichen Wu , Alvin Wan , Xiangyu Yue , and Kurt Keutzer . Squeezeseg : Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3d lidar point cloud . In In Review , 2017 . Bichen Wu, Alvin Wan, Xiangyu Yue, and Kurt Keutzer. Squeezeseg: Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3d lidar point cloud. In In Review, 2017."},{"key":"e_1_3_2_2_29_1","volume-title":"Scaling SGD batch size to 32k for ImageNet training. arXiv preprint arXiv:1708.03888","author":"You Yang","year":"2017","unstructured":"Yang You , Igor Gitman , and Boris Ginsburg . Scaling SGD batch size to 32k for ImageNet training. arXiv preprint arXiv:1708.03888 , 2017 . Yang You, Igor Gitman, and Boris Ginsburg. Scaling SGD batch size to 32k for ImageNet training. arXiv preprint arXiv:1708.03888, 2017."},{"key":"e_1_3_2_2_30_1","volume-title":"ImageNet training in minutes. CoRR, abs\/1709.05011","author":"You Yang","year":"2017","unstructured":"Yang You , Zhao Zhang , C Hsieh , James Demmel , and Kurt Keutzer . ImageNet training in minutes. CoRR, abs\/1709.05011 , 2017 . Yang You, Zhao Zhang, C Hsieh, James Demmel, and Kurt Keutzer. ImageNet training in minutes. CoRR, abs\/1709.05011, 2017."},{"key":"e_1_3_2_2_31_1","first-page":"685","volume-title":"Advances in Neural Information Processing Systems","author":"Zhang Sixin","year":"2015","unstructured":"Sixin Zhang , Anna E Choromanska , and Yann LeCun . Deep learning with elastic averaging SGD . In Advances in Neural Information Processing Systems , pages 685 - 693 , 2015 . Sixin Zhang, Anna E Choromanska, and Yann LeCun. Deep learning with elastic averaging SGD. In Advances in Neural Information Processing Systems, pages 685-693, 2015."}],"event":{"name":"SPAA '18: 30th ACM Symposium on Parallelism in Algorithms and Architectures","location":"Vienna Austria","acronym":"SPAA '18","sponsor":["SIGACT ACM Special Interest Group on Algorithms and Computation Theory","SIGARCH ACM Special Interest Group on Computer Architecture","EATCS European Association for Theoretical Computer Science"]},"container-title":["Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3210377.3210394","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3210377.3210394","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:08:13Z","timestamp":1750208893000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3210377.3210394"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,7,11]]},"references-count":30,"alternative-id":["10.1145\/3210377.3210394","10.1145\/3210377"],"URL":"https:\/\/doi.org\/10.1145\/3210377.3210394","relation":{},"subject":[],"published":{"date-parts":[[2018,7,11]]},"assertion":[{"value":"2018-07-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}