{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T03:46:44Z","timestamp":1772164004674,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":26,"publisher":"ACM","license":[{"start":{"date-parts":[[2016,2,27]],"date-time":"2016-02-27T00:00:00Z","timestamp":1456531200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Ministry of Science and Innovation of Spain (CICYT)","award":["TIN-2012-34557"],"award-info":[{"award-number":["TIN-2012-34557"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2016,2,27]]},"DOI":"10.1145\/2851141.2851158","type":"proceedings-article","created":{"date-parts":[[2016,2,22]],"date-time":"2016-02-22T08:18:49Z","timestamp":1456129129000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Coarse grain parallelization of deep neural networks"],"prefix":"10.1145","author":[{"given":"Marc Gonzalez","family":"Tallada","sequence":"first","affiliation":[{"name":"Universitat Politecnica de Catalunya-BarcelonaTech"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,2,27]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop","author":"Bastien F.","year":"2012","unstructured":"F. Bastien, P. Lamblin, R. Pascanu, J. Bergstra, I. J. Goodfellow, A. Bergeron, N. Bouchard, and Y. Bengio. 
Theano: new features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop, 2012."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2007.370397"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/567806.567807"},{"key":"e_1_3_2_1_4_1","volume-title":"Advances in Neural Information Processing Systems","author":"Bottou L.","year":"2008","unstructured":"L. Bottou. The tradeoffs of large scale learning. Advances in Neural Information Processing Systems, 2008."},{"key":"e_1_3_2_1_5_1","volume-title":"cudnn: Efficient primitives for deep learning. arXiv preprint arXiv:1410.0759","author":"Chetlur S.","year":"2014","unstructured":"S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer. cudnn: Efficient primitives for deep learning. arXiv preprint arXiv:1410.0759, 2014."},{"key":"e_1_3_2_1_6_1","volume-title":"11th USENIX Symposium on Operating Systems Design and Implementation","author":"Chilimbi T.","year":"2014","unstructured":"T. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman. Project adam: Building an efficient and scalable deep learning training system. 11th USENIX Symposium on Operating Systems Design and Implementation, 2014."},{"key":"e_1_3_2_1_7_1","volume-title":"Multicolumn deep neural networks for image classification. Computer Vision and Pattern Recognition. CVPR12","author":"Ciresan D. C.","year":"2012","unstructured":"D. C. 
Ciresan, U. Meier, and J. Schmidhuber. Multicolumn deep neural networks for image classification. Computer Vision and Pattern Recognition. CVPR12., 2012."},{"key":"e_1_3_2_1_8_1","first-page":"1337","volume-title":"Proceedings of the 30th International Conference on Machine Learning, ICML 2013","author":"Coates A.","year":"2013","unstructured":"A. Coates, B. Huval, T. Wang, D. J. Wu, B. C. Catanzaro, and A. Y. Ng. Deep learning with COTS HPC systems. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013, pages 1337--1345, 2013."},{"key":"e_1_3_2_1_9_1","volume-title":"NIPS Workshop","author":"Collobert R.","year":"2011","unstructured":"R. Collobert, K. Kavukcuoglu, and C. Farabet. Torch7: A matlab-like environment for machine learning. In BigLearn, NIPS Workshop, 2011. URL https:\/\/publidiap.idiap.ch\/downloads\/\/papers\/2011\/Collobert_NIPSWORKSHOP_2011.pdf."},{"key":"e_1_3_2_1_10_1","first-page":"1232","volume-title":"NIPS","author":"Dean J.","year":"2012","unstructured":"J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. V. Le, M. Z. Mao, M. 
Ranzato, A. W. Senior, P. A. Tucker, et al. Large scale distributed deep networks. In NIPS, pages 1232--1240, 2012."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2211477"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1177\/10943420020160020101"},{"key":"e_1_3_2_1_13_1","volume-title":"The Journal of Machine Learning Research","author":"Duchi J.","year":"2011","unstructured":"J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. The Journal of Machine Learning Research, 2011."},{"key":"e_1_3_2_1_14_1","volume-title":"Protocol buffers. https:\/\/developers.google.com\/protocol-buffers\/","year":"2015","unstructured":"Google. Protocol buffers. https:\/\/developers.google.com\/protocol-buffers\/, 2015."},{"key":"e_1_3_2_1_15_1","volume-title":"Deep speech: Scaling up end-to-end speech recognition. CoRR, abs\/1412.5567","author":"Hannun A. Y.","year":"2014","unstructured":"A. Y. Hannun, C. Case, J. Casper, B. C. Catanzaro, G. Diamos, E. Elsen, R. Prenger, S. Satheesh, S. Sengupta, A. Coates, and A. Y. Ng. Deep speech: Scaling up end-to-end speech recognition. CoRR, abs\/1412.5567, 2014. URL http:\/\/arxiv.org\/abs\/1412.5567."},{"key":"e_1_3_2_1_16_1","volume-title":"Deep neural networks for acoustic modeling in speech recognition","author":"Hinton G.","year":"2012","unstructured":"G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. 
Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine, 2012."},{"key":"e_1_3_2_1_17_1","volume-title":"Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093","author":"Jia Y.","year":"2014","unstructured":"Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014."},{"key":"e_1_3_2_1_18_1","volume-title":"Learning multiple layers of features from tiny images. Technical report","author":"Krizhevsky A.","year":"2009","unstructured":"A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009. URL http:\/\/www.cs.toronto.edu\/~kriz\/cifar.html."},{"key":"e_1_3_2_1_19_1","volume-title":"Advances in Neural Information Processing Systems","author":"Krizhevsky A.","year":"2012","unstructured":"A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. 
Advances in Neural Information Processing Systems, 2012."},{"key":"e_1_3_2_1_20_1","first-page":"1097","volume-title":"Advances in Neural Information Processing Systems 25","author":"Krizhevsky A.","year":"2012","unstructured":"A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25, pages 1097--1105, 2012."},{"key":"e_1_3_2_1_21_1","first-page":"306","volume-title":"Gradient Based Learning Applied to Document Recognition","author":"Lecun Y.","year":"2001","unstructured":"Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner. Gradient Based Learning Applied to Document Recognition. IEEE Press, pages 306--351, 2001."},{"key":"e_1_3_2_1_22_1","unstructured":"Y. LeCun, C. Cortes, and C. Burges. The mnist database of handwritten digits. http:\/\/yann.lecun.com\/exdb\/mnist\/ 2015."},{"key":"e_1_3_2_1_23_1","volume-title":"Soviet Mathematics Doklady","author":"Nesterov Y.","year":"1983","unstructured":"Y. Nesterov. A method of solving a convex programming problem with convergence rate o(1\/k). Soviet Mathematics Doklady, 1983."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553486"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_26_1","unstructured":"M. Zeiler and R. Fergus. 
Visualizing and understanding convolutional networks. In Arxiv 1311.2901. http:\/\/arxiv.org\/abs\/1311.2901 2013."}],"event":{"name":"PPoPP '16: 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming","location":"Barcelona Spain","acronym":"PPoPP '16","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages","ACM Association for Computing Machinery"]},"container-title":["Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2851141.2851158","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2851141.2851158","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:43:28Z","timestamp":1750211008000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2851141.2851158"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,2,27]]},"references-count":26,"alternative-id":["10.1145\/2851141.2851158","10.1145\/2851141"],"URL":"https:\/\/doi.org\/10.1145\/2851141.2851158","relation":{"is-identical-to":[{"id-type":"doi","id":"10.1145\/3016078.2851158","asserted-by":"object"}]},"subject":[],"published":{"date-parts":[[2016,2,27]]},"assertion":[{"value":"2016-02-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}