{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,16]],"date-time":"2025-12-16T12:38:37Z","timestamp":1765888717393,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,9]],"date-time":"2021-08-09T00:00:00Z","timestamp":1628467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,9]]},"DOI":"10.1145\/3472456.3472508","type":"proceedings-article","created":{"date-parts":[[2021,10,5]],"date-time":"2021-10-05T18:46:04Z","timestamp":1633459564000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation"],"prefix":"10.1145","author":[{"given":"Enda","family":"Yu","sequence":"first","affiliation":[{"name":"National University of Defense Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dezun","family":"Dong","sequence":"additional","affiliation":[{"name":"National University of Defense Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yemao","family":"Xu","sequence":"additional","affiliation":[{"name":"National University of Defense Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuo","family":"Ouyang","sequence":"additional","affiliation":[{"name":"National University of Defense Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiangke","family":"Liao","sequence":"additional","affiliation":[{"name":"National University of Defense Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,5]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1045"},{"key":"e_1_3_2_1_2_1","volume-title":"Proc. NIPS","author":"Alistarh Dan","year":"2017","unstructured":"Dan Alistarh , Demjan Grubic , Jerry Li , Ryota Tomioka , and Milan Vojnovic . 2017 . QSGD: Communication-efficient SGD via gradient quantization and encoding . In Proc. NIPS , (2017). 1709\u20131720. Dan Alistarh, Demjan Grubic, Jerry Li, Ryota Tomioka, and Milan Vojnovic. 2017. QSGD: Communication-efficient SGD via gradient quantization and encoding. In Proc. NIPS, (2017). 1709\u20131720."},{"key":"e_1_3_2_1_3_1","volume-title":"Proc. ICML","author":"Assran Mahmoud","year":"2019","unstructured":"Mahmoud Assran , Nicolas Loizou , Nicolas Ballas , and Mike Rabbat . 2019 . Stochastic gradient push for distributed deep learning . In Proc. ICML , (2019). 344\u2013353. Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, and Mike Rabbat. 2019. Stochastic gradient push for distributed deep learning. In Proc. ICML, (2019). 344\u2013353."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018743.3018769"},{"key":"e_1_3_2_1_5_1","volume-title":"Proc. NIPS","author":"Banner Ron","year":"2018","unstructured":"Ron Banner , Itay Hubara , Elad Hoffer , and Daniel Soudry . 2018 . Scalable methods for 8-bit training of neural networks . In Proc. NIPS , (2018). 5151\u20135159. Ron Banner, Itay Hubara, Elad Hoffer, and Daniel Soudry. 2018. Scalable methods for 8-bit training of neural networks. In Proc. NIPS, (2018). 5151\u20135159."},{"key":"e_1_3_2_1_6_1","volume-title":"Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274(2015).","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen , Mu Li , Yutian Li , Min Lin , Naiyan Wang , Minjie Wang , Tianjun Xiao , Bing Xu , Chiyuan Zhang , and Zheng Zhang . 2015 . Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274(2015). Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274(2015)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00056"},{"key":"e_1_3_2_1_9_1","volume-title":"RedSync: reducing synchronization bandwidth for distributed deep learning training system","author":"Fang Jiarui","year":"2019","unstructured":"Jiarui Fang , Haohuan Fu , Guangwen Yang , and Cho-Jui Hsieh . 2019. RedSync: reducing synchronization bandwidth for distributed deep learning training system . In Elsevier JPDC , ( 2019 ), 30\u201339. Jiarui Fang, Haohuan Fu, Guangwen Yang, and Cho-Jui Hsieh. 2019. RedSync: reducing synchronization bandwidth for distributed deep learning training system. In Elsevier JPDC, (2019), 30\u201339."},{"key":"e_1_3_2_1_10_1","unstructured":"Priya Goyal Piotr Doll\u00e1r Ross Girshick Pieter Noordhuis Lukasz Wesolowski Aapo Kyrola Andrew Tulloch Yangqing Jia and Kaiming He. 2017. Accurate large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677(2017).  Priya Goyal Piotr Doll\u00e1r Ross Girshick Pieter Noordhuis Lukasz Wesolowski Aapo Kyrola Andrew Tulloch Yangqing Jia and Kaiming He. 2017. Accurate large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677(2017)."},{"key":"e_1_3_2_1_11_1","volume-title":"Proc. NIPS","author":"Haddadpour Farzin","year":"2019","unstructured":"Farzin Haddadpour , Mohammad\u00a0Mahdi Kamani , Mehrdad Mahdavi , and Viveck\u00a0 R Cadambe . 2019 . Local SGD with periodic averaging: Tighter analysis and adaptive synchronization . In Proc. NIPS , (2019). 11080\u201311092. Farzin Haddadpour, Mohammad\u00a0Mahdi Kamani, Mehrdad Mahdavi, and Viveck\u00a0R Cadambe. 2019. Local SGD with periodic averaging: Tighter analysis and adaptive synchronization. In Proc. NIPS, (2019). 11080\u201311092."},{"key":"e_1_3_2_1_12_1","volume":"201","author":"Han Song","unstructured":"Song Han , Jeff Pool , John Tran , and William\u00a0 J Dally. 201 5. Learning both weights and connections for efficient neural networks. In Proc. ICONIP, (2015). 1135\u20131143. Song Han, Jeff Pool, John Tran, and William\u00a0J Dally. 2015. Learning both weights and connections for efficient neural networks. In Proc. ICONIP, (2015). 1135\u20131143.","journal-title":"J Dally."},{"key":"e_1_3_2_1_13_1","volume-title":"MLSys","author":"Hashemi Sayed\u00a0Hadi","year":"2019","unstructured":"Sayed\u00a0Hadi Hashemi , Sangeetha\u00a0Abdu Jyothi , and Roy\u00a0 H Campbell . 2019 . TicTac: Accelerating Distributed Deep Learning with Communication Scheduling . In MLSys , (2019), 1\u201313. Sayed\u00a0Hadi Hashemi, Sangeetha\u00a0Abdu Jyothi, and Roy\u00a0H Campbell. 2019. TicTac: Accelerating Distributed Deep Learning with Communication Scheduling. In MLSys, (2019), 1\u201313."},{"volume-title":"Learning Multiple Layers of Features from Tiny Images. Master\u2019s thesis","author":"Krizhevsky A","key":"e_1_3_2_1_14_1","unstructured":"A Krizhevsky . 2009. Learning Multiple Layers of Features from Tiny Images. Master\u2019s thesis , University of Tront(2009) , 1\u201360. A Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. Master\u2019s thesis, University of Tront(2009), 1\u201360."},{"key":"e_1_3_2_1_15_1","unstructured":"Yann LeCun. 1998. The MNIST database of handwritten digits. http:\/\/yann. lecun. com\/exdb\/mnist\/ .  Yann LeCun. 1998. The MNIST database of handwritten digits. http:\/\/yann. lecun. com\/exdb\/mnist\/ ."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3332466.3374528"},{"key":"e_1_3_2_1_17_1","volume-title":"Proc. NIPS","author":"Li Youjie","year":"2018","unstructured":"Youjie Li , Mingchao Yu , Songze Li , Salman Avestimehr , Nam\u00a0Sung Kim , and Alexander Schwing . 2018 . PIPE-SGD: A decentralized pipelined SGD framework for distributed deep net training . In Proc. NIPS , (2018). 8045\u20138056. Youjie Li, Mingchao Yu, Songze Li, Salman Avestimehr, Nam\u00a0Sung Kim, and Alexander Schwing. 2018. PIPE-SGD: A decentralized pipelined SGD framework for distributed deep net training. In Proc. NIPS, (2018). 8045\u20138056."},{"key":"e_1_3_2_1_18_1","volume-title":"Use Local SGD. In Proc. ICLR","author":"Lin Tao","year":"2019","unstructured":"Tao Lin , Sebastian\u00a0 U Stich , Kumar\u00a0Kshitij Patel , and Martin Jaggi . 2019 . Don\u2019t Use Large Mini-batches , Use Local SGD. In Proc. ICLR , (2019). 1\u201340. Tao Lin, Sebastian\u00a0U Stich, Kumar\u00a0Kshitij Patel, and Martin Jaggi. 2019. Don\u2019t Use Large Mini-batches, Use Local SGD. In Proc. ICLR, (2019). 1\u201340."},{"key":"e_1_3_2_1_19_1","volume-title":"Proc. ICLR","author":"Lin Yujun","year":"2018","unstructured":"Yujun Lin , Song Han , Huizi Mao , Yu Wang , and Bill Dally . 2018 . Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training . In Proc. ICLR , (2018). 1\u201314. Yujun Lin, Song Han, Huizi Mao, Yu Wang, and Bill Dally. 2018. Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training. In Proc. ICLR, (2018). 1\u201314."},{"key":"e_1_3_2_1_20_1","volume-title":"Proc. ICLR","author":"Liu Zhuang","year":"2018","unstructured":"Zhuang Liu , Mingjie Sun , Tinghui Zhou , Gao Huang , and Trevor Darrell . 2018 . Rethinking the Value of Network Pruning . In Proc. ICLR , (2018). 1\u201321. Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, and Trevor Darrell. 2018. Rethinking the Value of Network Pruning. In Proc. ICLR, (2018). 1\u201321."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2020.11.005"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359642"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295500.3356222"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2014-274"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2019.8737367"},{"key":"e_1_3_2_1_26_1","volume-title":"Proc. ECAI","author":"Shi Shaohuai","year":"2020","unstructured":"Shaohuai Shi , Zhenheng Tang , Qiang Wang , Kaiyong Zhao , and Xiaowen Chu . 2020 . Layer-Wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees . In Proc. ECAI , (2020). 1467\u20131474. Shaohuai Shi, Zhenheng Tang, Qiang Wang, Kaiyong Zhao, and Xiaowen Chu. 2020. Layer-Wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees. In Proc. ECAI, (2020). 1467\u20131474."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM41043.2020.9155269"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2015-354"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Peng Sun Wansen Feng Ruobing Han Shengen Yan and Yonggang Wen. 2019. Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet\/AlexNet Training in 1.5 Minutes. arXiv e-prints arXiv:1902.06855(2019).  Peng Sun Wansen Feng Ruobing Han Shengen Yan and Yonggang Wen. 2019. Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet\/AlexNet Training in 1.5 Minutes. arXiv e-prints arXiv:1902.06855(2019).","DOI":"10.1109\/TBDATA.2019.2957478"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015289"},{"key":"e_1_3_2_1_31_1","volume-title":"Proc. NIPS","author":"Wen Wei","year":"2017","unstructured":"Wei Wen , Cong Xu , Feng Yan , Chunpeng Wu , Yandan Wang , Yiran Chen , and Hai Li . 2017 . TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning . In Proc. NIPS , (2017). 1509\u20131519. Wei Wen, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, and Hai Li. 2017. TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning. In Proc. NIPS, (2017). 1509\u20131519."},{"key":"e_1_3_2_1_32_1","volume-title":"Proc. ICLR","author":"Wu Shuang","year":"2018","unstructured":"Shuang Wu , Guoqi Li , Feng Chen , and Luping Shi . 2018 . Training and Inference with Integers in Deep Neural Networks . In Proc. ICLR , (2018). 1\u201314. Shuang Wu, Guoqi Li, Feng Chen, and Luping Shi. 2018. Training and Inference with Integers in Deep Neural Networks. In Proc. ICLR, (2018). 1\u201314."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3312570"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3417607"},{"key":"e_1_3_2_1_35_1","unstructured":"Yang You Igor Gitman and Boris Ginsburg. 2017. Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888(2017).  Yang You Igor Gitman and Boris Ginsburg. 2017. Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888(2017)."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295500.3356137"},{"key":"e_1_3_2_1_37_1","unstructured":"Enda Yu Dezun Dong Yemao Xu Shuo Ouyang and Xiangke Liao. 2021. CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation.  Enda Yu Dezun Dong Yemao Xu Shuo Ouyang and Xiangke Liao. 2021. CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation."},{"key":"e_1_3_2_1_38_1","volume-title":"Proc. USENIX ATC","author":"Zhang Hao","year":"2017","unstructured":"Hao Zhang , Zeyu Zheng , Shizhen Xu , Wei Dai , Qirong Ho , Xiaodan Liang , Zhiting Hu , Jinliang Wei , Pengtao Xie , and Eric\u00a0 P Xing . 2017 . Poseidon: an efficient communication architecture for distributed deep learning on GPU clusters . In Proc. USENIX ATC , (2017). 181\u2013193. Hao Zhang, Zeyu Zheng, Shizhen Xu, Wei Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jinliang Wei, Pengtao Xie, and Eric\u00a0P Xing. 2017. Poseidon: an efficient communication architecture for distributed deep learning on GPU clusters. In Proc. USENIX ATC, (2017). 181\u2013193."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2018\/447"},{"key":"e_1_3_2_1_40_1","volume-title":"IEEE TPDS","author":"Zhou Qihua","year":"2020","unstructured":"Qihua Zhou , Kun Wang , Haodong Lu , Wenyao Xu , Yanfei Sun , and Song Guo . 2020 . Canary: Decentralized Distributed Deep Learning Via Gradient Sketch and Partition in Multi-Interface Networks . In IEEE TPDS , (2020), 900\u2013917. Qihua Zhou, Kun Wang, Haodong Lu, Wenyao Xu, Yanfei Sun, and Song Guo. 2020. Canary: Decentralized Distributed Deep Learning Via Gradient Sketch and Partition in Multi-Interface Networks. In IEEE TPDS, (2020), 900\u2013917."}],"event":{"name":"ICPP 2021: 50th International Conference on Parallel Processing","acronym":"ICPP 2021","location":"Lemont IL USA"},"container-title":["50th International Conference on Parallel Processing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472508","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3472456.3472508","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:12Z","timestamp":1750193292000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472508"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,9]]},"references-count":40,"alternative-id":["10.1145\/3472456.3472508","10.1145\/3472456"],"URL":"https:\/\/doi.org\/10.1145\/3472456.3472508","relation":{},"subject":[],"published":{"date-parts":[[2021,8,9]]},"assertion":[{"value":"2021-10-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}