{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T13:53:37Z","timestamp":1770990817094,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,26]],"date-time":"2021-10-26T00:00:00Z","timestamp":1635206400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["NRF-2020R1A2B5B03001960"],"award-info":[{"award-number":["NRF-2020R1A2B5B03001960"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF (National Science Foundation)","doi-asserted-by":"publisher","award":["#212114824"],"award-info":[{"award-number":["#212114824"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Institute of Information & Communications Technology Planning & Evaluation","award":["2020-0-01373"],"award-info":[{"award-number":["2020-0-01373"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,26]]},"DOI":"10.1145\/3459637.3482412","type":"proceedings-article","created":{"date-parts":[[2021,11,15]],"date-time":"2021-11-15T15:31:14Z","timestamp":1636990274000},"page":"863-872","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["ALADDIN"],"prefix":"10.1145","author":[{"given":"Yunyong","family":"Ko","sequence":"first","affiliation":[{"name":"Hanyang University, Seoul, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kibong","family":"Choi","sequence":"additional","affiliation":[{"name":"Hanyang University, Seoul, Republic 
of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hyunseung","family":"Jei","sequence":"additional","affiliation":[{"name":"SK Telecom, Seoul, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dongwon","family":"Lee","sequence":"additional","affiliation":[{"name":"The Pennsylvania State University, University Park, PA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sang-Wook","family":"Kim","sequence":"additional","affiliation":[{"name":"Hanyang University, Seoul, South Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,30]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). PMLR, 344--353","author":"Assran Mahmoud","year":"2019","unstructured":"Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, and Mike Rabbat. 2019. Stochastic gradient push for distributed deep learning. In Proceedings of the International Conference on Machine Learning (ICML). PMLR, 344--353."},{"key":"e_1_3_2_1_2_1","volume-title":"GoSGD: Distributed Optimization for Deep Learning with Gossip Exchange. arXiv preprint arXiv:1804.01852","author":"Blot Michael","year":"2018","unstructured":"Michael Blot, David Picard, and Matthieu Cord. 2018. GoSGD: Distributed Optimization for Deep Learning with Gossip Exchange. 
arXiv preprint arXiv:1804.01852 (2018)."},{"key":"e_1_3_2_1_3_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems Workshop on Optimization for Machine Learning","author":"Blot Michael","year":"2016","unstructured":"Michael Blot, David Picard, Matthieu Cord, and Nicolas Thome. 2016. Gossip training for deep learning. Proceedings of the Advances in Neural Information Processing Systems Workshop on Optimization for Machine Learning (2016)."},{"key":"e_1_3_2_1_4_1","unstructured":"Tom B Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020)."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/2685048.2685094"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999134.2999271"},{"key":"e_1_3_2_1_7_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_1_8_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). PMLR, 2933--2942","author":"Eshraghi Nima","year":"2020","unstructured":"Nima Eshraghi and Ben Liang. 2020. Distributed Online Optimization over a Heterogeneous Network with Any-Batch Mirror Descent. In Proceedings of the International Conference on Machine Learning (ICML). PMLR, 2933--2942."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1994.1085"},{"key":"e_1_3_2_1_10_1","volume-title":"large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677","author":"Goyal Priya","year":"2017","unstructured":"Priya Goyal, Piotr Doll\u00e1r, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 2017. Accurate, large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677 (2017)."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999611.2999748"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1090\/S0273-0979-06-01126-8"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/3454287.3454297"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3035933"},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). 
PMLR, 4911--4920","author":"Johnson Tyler","year":"2020","unstructured":"Tyler Johnson, Pulkit Agrawal, Haijie Gu, and Carlos Guestrin. 2020. AdaScale SGD: A User-Friendly Algorithm for Distributed Training. In Proceedings of the International Conference on Machine Learning (ICML). PMLR, 4911--4920."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/946243.946317"},{"key":"e_1_3_2_1_18_1","unstructured":"Alex Krizhevsky Geoffrey Hinton et al. 2009. Learning multiple layers of features from tiny images. (2009)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/2685048.2685095"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/3327757.3327900"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295285"},{"key":"e_1_3_2_1_22_1","volume-title":"Proceedings of the international Conference on Machine Learning (ICML). 3049--3058","author":"Lian Xiangru","year":"2018","unstructured":"Xiangru Lian, Wei Zhang, Ce Zhang, and Ji Liu. 2018. Asynchronous decentralized parallel stochastic gradient descent. In Proceedings of the international Conference on Machine Learning (ICML). 
3049--3058."},{"key":"e_1_3_2_1_23_1","volume-title":"Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent.","author":"Nadiradze Giorgi","year":"2021","unstructured":"Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, and Dan Alistarh. 2021. Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent. (2021)."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359646"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/2986459.2986537"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_27_1","volume-title":"Horovod: fast and easy distributed deep learning in TensorFlow. arXiv preprint arXiv:1802.05799","author":"Sergeev Alexander","year":"2018","unstructured":"Alexander Sergeev and Mike Del Balso. 2018. Horovod: fast and easy distributed deep learning in TensorFlow. arXiv preprint arXiv:1802.05799 (2018)."},{"key":"e_1_3_2_1_28_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR).","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. 
In Proceedings of the International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_29_1","volume-title":"Cooperative SGD: A unified framework for the design and analysis of communication-efficient SGD algorithms. arXiv preprint arXiv:1808.07576","author":"Wang Jianyu","year":"2018","unstructured":"Jianyu Wang and Gauri Joshi. 2018. Cooperative SGD: A unified framework for the design and analysis of communication-efficient SGD algorithms. arXiv preprint arXiv:1808.07576 (2018)."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015289"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2020.2974267"},{"key":"e_1_3_2_1_32_1","volume-title":"Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888","author":"You Yang","year":"2017","unstructured":"Yang You, Igor Gitman, and Boris Ginsburg. 2017. Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888 (2017)."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295500.3356137"},{"key":"e_1_3_2_1_34_1","volume-title":"Large batch optimization for deep learning: Training bert in 76 minutes. arXiv preprint arXiv:1904.00962","author":"You Yang","year":"2019","unstructured":"
Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, and Cho-Jui Hsieh. 2019. Large batch optimization for deep learning: Training bert in 76 minutes. arXiv preprint arXiv:1904.00962 (2019)."},{"key":"e_1_3_2_1_35_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). PMLR, 7202--7212","author":"Yu Chen","year":"2019","unstructured":"Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, and Ji Liu. 2019. Distributed learning over unreliable networks. In Proceedings of the International Conference on Machine Learning (ICML). PMLR, 7202--7212."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/3154690.3154708"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.5555\/2969239.2969316"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2019.00150"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2019.00198"},{"key":"e_1_3_2_1_40_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). 5970--5979","author":"Zhou Zhengyuan","year":"2018","unstructured":"Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter Glynn, Yinyu Ye, Li-Jia Li, and Li Fei-Fei. 2018. Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?. 
In Proceedings of the International Conference on Machine Learning (ICML). 5970--5979."}],"event":{"name":"CIKM '21: The 30th ACM International Conference on Information and Knowledge Management","location":"Virtual Event Queensland Australia","acronym":"CIKM '21","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 30th ACM International Conference on Information & Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3459637.3482412","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3459637.3482412","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3459637.3482412","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:12:23Z","timestamp":1750191143000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3459637.3482412"}},"subtitle":["Asymmetric Centralized Training for Distributed Deep Learning"],"short-title":[],"issued":{"date-parts":[[2021,10,26]]},"references-count":40,"alternative-id":["10.1145\/3459637.3482412","10.1145\/3459637"],"URL":"https:\/\/doi.org\/10.1145\/3459637.3482412","relation":{},"subject":[],"published":{"date-parts":[[2021,10,26]]},"assertion":[{"value":"2021-10-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}