{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T18:01:39Z","timestamp":1775671299790,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":46,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,3,25]],"date-time":"2019-03-25T00:00:00Z","timestamp":1553472000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,3,25]]},"DOI":"10.1145\/3302424.3303957","type":"proceedings-article","created":{"date-parts":[[2019,3,22]],"date-time":"2019-03-22T13:10:03Z","timestamp":1553260203000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":58,"title":["Parallax"],"prefix":"10.1145","author":[{"given":"Soojeong","family":"Kim","sequence":"first","affiliation":[{"name":"Seoul National University"}]},{"given":"Gyeong-In","family":"Yu","sequence":"additional","affiliation":[{"name":"Seoul National University"}]},{"given":"Hojin","family":"Park","sequence":"additional","affiliation":[{"name":"Seoul National University"}]},{"given":"Sungwoo","family":"Cho","sequence":"additional","affiliation":[{"name":"Seoul National University"}]},{"given":"Eunji","family":"Jeong","sequence":"additional","affiliation":[{"name":"Seoul National University"}]},{"given":"Hyeonmin","family":"Ha","sequence":"additional","affiliation":[{"name":"Seoul National University"}]},{"given":"Sanha","family":"Lee","sequence":"additional","affiliation":[{"name":"Seoul National University"}]},{"given":"Joo Seong","family":"Jeong","sequence":"additional","affiliation":[{"name":"Seoul National University"}]},{"given":"Byung-Gon","family":"Chun","sequence":"additional","affiliation":[{"name":"Seoul National 
University"}]}],"member":"320","published-online":{"date-parts":[[2019,3,25]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation. USENIX Association, 265--283","author":"Abadi Mart\u00edn","year":"2016"},{"key":"e_1_3_2_1_2_1","unstructured":"Takuya Akiba Shuji Suzuki and Keisuke Fukuda. 2017. Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes. (2017). arXiv:1711.04325 https:\/\/arxiv.org\/abs\/1711.04325"},{"key":"e_1_3_2_1_3_1","volume-title":"Proceedings of Advances in Neural Information Processing Systems. Curran Associates, Inc., 1709--1720","author":"Alistarh Dan","year":"2017"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.25080\/Majora-92bf1922-003"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472621"},{"key":"e_1_3_2_1_6_1","unstructured":"Ciprian Chelba Tomas Mikolov Mike Schuster Qi Ge Thorsten Brants Phillipp Koehn and Tony Robinson. 2013. One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling. (2013). arXiv:1312.3005 https:\/\/arxiv.org\/abs\/1312.3005"},{"key":"e_1_3_2_1_7_1","unstructured":"Jianmin Chen Rajat Monga Samy Bengio and Rafal J\u00f3zefowicz. 2016. Revisiting Distributed Synchronous SGD. (2016).
arXiv:1604.00981 https:\/\/arxiv.org\/abs\/1604.00981"},{"key":"e_1_3_2_1_8_1","unstructured":"Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. (2015). arXiv:1512.01274 https:\/\/arxiv.org\/abs\/1512.01274"},{"key":"e_1_3_2_1_9_1","volume-title":"Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation. USENIX Association, 571--582","author":"Chilimbi Trishul","year":"2014"},{"key":"e_1_3_2_1_10_1","unstructured":"Jan Chorowski Dzmitry Bahdanau Kyunghyun Cho and Yoshua Bengio. 2014. End-to-end continuous speech recognition using attention-based recurrent NN: first results. (2014). arXiv:1412.1602 http:\/\/arxiv.org\/abs\/1412.1602"},{"key":"e_1_3_2_1_11_1","unstructured":"Facebook. 2017. Caffe2. https:\/\/caffe2.ai"},{"key":"e_1_3_2_1_12_1","unstructured":"Priya Goyal Piotr Doll\u00e1r Ross Girshick Pieter Noordhuis Lukasz Wesolowski Aapo Kyrola Andrew Tulloch Yangqing Jia and Kaiming He. 2017. Accurate Large Minibatch SGD: Training ImageNet in 1 Hour. (2017).
arXiv:1706.02677 https:\/\/arxiv.org\/abs\/1706.02677"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/3171837.3171972"},{"key":"e_1_3_2_1_14_1","volume-title":"Proceedings of International Conference on Learning Representations.","author":"Han Song","year":"2016"},{"key":"e_1_3_2_1_15_1","unstructured":"Aaron Harlap Deepak Narayanan Amar Phanishayee Vivek Seshadri Nikhil Devanur Greg Ganger and Phil Gibbons. 2018. PipeDream: Fast and Efficient Pipeline Parallel DNN Training. arXiv:1806.03377 http:\/\/arxiv.org\/abs\/1806.03377"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings of Workshop on Machine Learning Systems in The 32nd Annual Conference on Neural Information Processing Systems. IEEE.","author":"Jia Xianyan","year":"2018"},{"key":"e_1_3_2_1_18_1","unstructured":"Rafal Jozefowicz Oriol Vinyals Mike Schuster Noam Shazeer and Yonghui Wu. 2016. Exploring the Limits of Language Modeling. (2016). arXiv:1602.02410v2 https:\/\/arxiv.org\/abs\/1602.02410"},{"key":"e_1_3_2_1_19_1","unstructured":"Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. (2016). arXiv:1609.02907 http:\/\/arxiv.org\/abs\/1609.02907"},{"key":"e_1_3_2_1_20_1","unstructured":"Sameer Kumar Dheeraj Sreedhar Vaibhav Saxena Yogish Sabharwal and Ashish Verma. 2017.
Efficient Training of Convolutional Neural Nets on Large Distributed Systems. (2017). arXiv:1711.00705 http:\/\/arxiv.org\/abs\/1711.00705"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/2685048.2685095"},{"key":"e_1_3_2_1_22_1","volume-title":"Proceedings of Advances in Neural Information Processing Systems. Curran Associates, Inc., 2181--2191","author":"Lin Ji","year":"2017"},{"key":"e_1_3_2_1_23_1","unstructured":"Jian-Hao Luo and Jianxin Wu. 2018. AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference. (2018). arXiv:1805.08941 http:\/\/arxiv.org\/abs\/1805.08941"},{"key":"e_1_3_2_1_24_1","unstructured":"Amith R Mamidala Georgios Kollias Chris Ward and Fausto Artico. 2018. MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning. (2018). arXiv:1801.03855 https:\/\/arxiv.org\/abs\/1801.03855"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/1111682.1111715"},{"key":"e_1_3_2_1_26_1","unstructured":"NVIDIA. 2013. NVIDIA GPUDirect. https:\/\/developer.nvidia.com\/gpudirect"},{"key":"e_1_3_2_1_27_1","unstructured":"NVIDIA. 2017. NCCL.
https:\/\/developer.nvidia.com\/nccl"},{"key":"e_1_3_2_1_28_1","volume-title":"Proceedings of the 30th International Conference on Neural Information Processing Systems. Curran Associates Inc., 4797--4805","author":"van den Oord A\u00e4ron","year":"2016"},{"key":"e_1_3_2_1_29_1","unstructured":"Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in PyTorch. (2017)."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2007.370405"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2008.09.002"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Yousef Saad. 2003. Iterative methods for sparse linear systems. SIAM.","DOI":"10.1137\/1.9780898718003"},{"key":"e_1_3_2_1_34_1","unstructured":"Alexander Sergeev and Mike Del Balso. 2018. Horovod. (2018). arXiv:1802.05799 http:\/\/arxiv.org\/abs\/1802.05799"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/DASC\/PiCom\/DataCom\/CyberSciTec.2018.000-4"},{"key":"e_1_3_2_1_36_1","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. (2014).
arXiv:1409.1556 http:\/\/arxiv.org\/abs\/1409.1556"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_2_1_38_1","volume-title":"Workshop on Machine Learning Systems in The 29th Annual Conference on Neural Information Processing Systems.","author":"Tokui Seiya","year":"2015"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342009359013"},{"key":"e_1_3_2_1_40_1","unstructured":"Statistical Machine Translation. 2014. wmt. http:\/\/www.statmt.org\/wmt14"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303953"},{"key":"e_1_3_2_1_42_1","unstructured":"Minjie Wang Chien-chin Huang and Jinyang Li. 2018. Unifying Data Model and Hybrid Parallelism in Deep Learning via Tensor Tiling. (2018). arXiv:1805.04170 http:\/\/arxiv.org\/abs\/1805.04170"},{"key":"e_1_3_2_1_43_1","unstructured":"Yonghui Wu Mike Schuster Zhifeng Chen Quoc V Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. (2016). arXiv:1609.08144 https:\/\/arxiv.org\/abs\/1609.08144"},{"key":"e_1_3_2_1_44_1","volume-title":"Proceedings of the 2017 USENIX Conference on Usenix Annual Technical Conference.
USENIX Association, 181--193","author":"Zhang Hao"},{"key":"e_1_3_2_1_45_1","volume-title":"Proceedings of the 25th International Joint Conference on Artificial Intelligence. AAAI Press, 2350--2356","author":"Zhang Wei","year":"2016"},{"key":"e_1_3_2_1_46_1","unstructured":"Aojun Zhou Anbang Yao Yiwen Guo Lin Xu and Yurong Chen. 2017. Incremental network quantization: Towards lossless cnns with low-precision weights. (2017). arXiv:1702.03044 http:\/\/arxiv.org\/abs\/1702.03044"}],"event":{"name":"EuroSys '19: Fourteenth EuroSys Conference 2019","location":"Dresden Germany","acronym":"EuroSys '19","sponsor":["SIGOPS ACM Special Interest Group on Operating Systems"]},"container-title":["Proceedings of the Fourteenth EuroSys Conference 2019"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3302424.3303957","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3302424.3303957","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:01:48Z","timestamp":1750208508000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3302424.3303957"}},"subtitle":["Sparsity-aware Data Parallel Training of Deep Neural Networks"],"short-title":[],"issued":{"date-parts":[[2019,3,25]]},"references-count":46,"alternative-id":["10.1145\/3302424.3303957","10.1145\/3302424"],"URL":"https:\/\/doi.org\/10.1145\/3302424.3303957","relation":{},"subject":[],"published":{"date-parts":[[2019,3,25]]},"assertion":[{"value":"2019-03-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}