{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T19:14:16Z","timestamp":1774120456325,"version":"3.50.1"},"reference-count":74,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2022,9]]},"abstract":"<jats:p>Existing general purpose frameworks for gigantic model training, i.e., dense models with billions of parameters, cannot scale efficiently on cloud environment with various networking conditions due to large communication overheads. In this paper, we propose MiCS, which Minimizes the Communication Scale to bring down communication overhead. Specifically, by decreasing the number of participants in a communication collective, MiCS can utilize heterogeneous network bandwidth, reduce network traffic over slower links, reduce the latency of communications for maintaining high network bandwidth utilization, and amortize expensive global gradient synchronization overhead. Our evaluation on AWS shows that the system throughput of MiCS is up to 2.89\u00d7 that of the state-of-the-art large model training systems. MiCS achieves near-linear scaling efficiency, which is up to 1.27\u00d7 that of DeepSpeed. 
MiCS allows us to train a proprietary model with 100 billion parameters on 512 GPUs with 99.4% weak-scaling efficiency, and it is able to saturate over 54.5% theoretical computation power of each GPU on a public cloud with less GPU memory and more restricted networks than DGX-A100 clusters.<\/jats:p>","DOI":"10.14778\/3561261.3561265","type":"journal-article","created":{"date-parts":[[2022,11,16]],"date-time":"2022-11-16T15:32:50Z","timestamp":1668612770000},"page":"37-50","source":"Crossref","is-referenced-by-count":15,"title":["MiCS"],"prefix":"10.14778","volume":"16","author":[{"given":"Zhen","family":"Zhang","sequence":"first","affiliation":[{"name":"Johns Hopkins University"}]},{"given":"Shuai","family":"Zheng","sequence":"additional","affiliation":[{"name":"Amazon Web Services"}]},{"given":"Yida","family":"Wang","sequence":"additional","affiliation":[{"name":"Amazon Web Services"}]},{"given":"Justin","family":"Chiu","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"George","family":"Karypis","sequence":"additional","affiliation":[{"name":"Amazon Web Services"}]},{"given":"Trishul","family":"Chilimbi","sequence":"additional","affiliation":[{"name":"Amazon"}]},{"given":"Mu","family":"Li","sequence":"additional","affiliation":[{"name":"Amazon Web Services"}]},{"given":"Xin","family":"Jin","sequence":"additional","affiliation":[{"name":"Peking University"}]}],"member":"320","published-online":{"date-parts":[[2022,11,16]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1088149.1088183"},{"key":"e_1_2_1_2_1","volume-title":"Varuna: Scalable, Low-cost Training of Massive Deep Learning Models. arXiv preprint arXiv:2111.04007","author":"Athlur Sanjith","year":"2021","unstructured":"
Sanjith Athlur, Nitika Saran, Muthian Sivathanu, Ramachandran Ramjee, and Nipun Kwatra. 2021. Varuna: Scalable, Low-cost Training of Massive Deep Learning Models. arXiv preprint arXiv:2111.04007 (2021)."},{"key":"e_1_2_1_3_1","unstructured":"AWS-P3-Instances 2022. Amazon EC2 P3 Instances. https:\/\/aws.amazon.com\/ec2\/instance-types\/p3\/."},{"key":"e_1_2_1_4_1","unstructured":"azure-gpu-ncv3-series 2022. Azure NCv3-series. https:\/\/docs.microsoft.com\/en-us\/azure\/virtual-machines\/ncv3-series."},{"key":"e_1_2_1_5_1","unstructured":"azure-gpu-ndv2-series 2022. Azure Updated NDv2-series. https:\/\/docs.microsoft.com\/en-us\/azure\/virtual-machines\/ndv2-series."},{"key":"e_1_2_1_6_1","unstructured":"Azure-Spot-VM 2022. Azure Spot Virtual Machines. https:\/\/azure.microsoft.com\/en-us\/services\/virtual-machines\/spot\/#overview."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2005.48"},{"key":"e_1_2_1_8_1","volume-title":"Optical interconnection networks for high-performance computing systems. Reports on Progress in Physics","author":"Biberman Aleksandr","year":"2012","unstructured":"Aleksandr Biberman and Keren Bergman. 2012. Optical interconnection networks for high-performance computing systems. 
Reports on Progress in Physics (2012)."},{"key":"e_1_2_1_9_1","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020), 1877--1901."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/1285358.1285359"},{"key":"e_1_2_1_11_1","volume-title":"SpeechStew: Simply mix all available speech recognition data to train one large neural network. arXiv preprint arXiv:2104.02133","author":"Chan William","year":"2021","unstructured":"William Chan, Daniel Park, Chris Lee, Yu Zhang, Quoc Le, and Mohammad Norouzi. 2021. SpeechStew: Simply mix all available speech recognition data to train one large neural network. arXiv preprint arXiv:2104.02133 (2021)."},{"key":"e_1_2_1_12_1","volume-title":"Conference on Machine Learning and Systems.","author":"Cho Minsik","year":"2019","unstructured":"Minsik Cho, Ulrich Finkler, and David Kung. 2019. BlueConnect: Novel hierarchical all-reduce on multi-tired network for deep learning. 
In Conference on Machine Learning and Systems."},{"key":"e_1_2_1_13_1","volume-title":"W2v-bert: Combining contrastive learning and masked language modeling for self-supervised speech pre-training. arXiv preprint arXiv:2108.06209","author":"Chung Yu-An","year":"2021","unstructured":"Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, and Yonghui Wu. 2021. W2v-bert: Combining contrastive learning and masked language modeling for self-supervised speech pre-training. arXiv preprint arXiv:2108.06209 (2021)."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352020.3352024"},{"key":"e_1_2_1_15_1","volume-title":"CoAtNet: Marrying Convolution and Attention for All Data Sizes. arXiv preprint arXiv:2106.04803","author":"Dai Zihang","year":"2021","unstructured":"Zihang Dai, Hanxiao Liu, Quoc V Le, and Mingxing Tan. 2021. CoAtNet: Marrying Convolution and Attention for All Data Sizes. arXiv preprint arXiv:2106.04803 (2021)."},{"key":"e_1_2_1_16_1","volume-title":"Taming the wild: A unified analysis of hogwild-style algorithms. Advances in Neural Information Processing Systems 28","author":"De Sa Christopher M","year":"2015","unstructured":"Christopher M De Sa, Ce Zhang, Kunle Olukotun, and Christopher R\u00e9. 2015. Taming the wild: A unified analysis of hogwild-style algorithms. 
Advances in Neural Information Processing Systems 28 (2015)."},{"key":"e_1_2_1_17_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_2_1_18_1","unstructured":"FairScale 2022. PyTorch extensions for high performance and large scale training. https:\/\/github.com\/facebookresearch\/fairscale."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3437801.3441593"},{"key":"e_1_2_1_20_1","unstructured":"gcloud-gpu-bandwidths 2022. Google Cloud: Network bandwidths and GPUs. https:\/\/cloud.google.com\/compute\/docs\/gpus\/gpu-network-bandwidth#vm-configurations."},{"key":"e_1_2_1_21_1","volume-title":"large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677","author":"Goyal Priya","year":"2017","unstructured":"Priya Goyal, Piotr Doll\u00e1r, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 
2017. Accurate, large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677 (2017)."},{"key":"e_1_2_1_22_1","volume-title":"Advances in Neural Information Processing Systems","volume":"32","author":"Huang Yanping","year":"2019","unstructured":"Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Dehao Chen, Mia Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V Le, Yonghui Wu, et al. 2019. Gpipe: Efficient training of giant neural networks using pipeline parallelism. In Advances in Neural Information Processing Systems, Vol. 32."},{"key":"e_1_2_1_23_1","volume-title":"Priority-based parameter propagation for distributed DNN training. arXiv preprint arXiv:1905.03960","author":"Jayarajan Anand","year":"2019","unstructured":"Anand Jayarajan, Jinliang Wei, Garth Gibson, Alexandra Fedorova, and Gennady Pekhimenko. 2019. Priority-based parameter propagation for distributed DNN training. arXiv preprint arXiv:1905.03960 (2019)."},{"key":"e_1_2_1_24_1","volume-title":"International Conference on Machine Learning (ICML). 2279--2288","author":"Jia Zhihao","year":"2018","unstructured":"Zhihao Jia, Sina Lin, Charles R Qi, and Alex Aiken. 2018. Exploring hidden dimensions in parallelizing convolutional neural networks. 
In International Conference on Machine Learning (ICML). 2279--2288."},{"key":"e_1_2_1_25_1","volume-title":"Conference on Machine Learning and Systems","volume":"1","author":"Jia Zhihao","year":"2018","unstructured":"Zhihao Jia, Matei Zaharia, and Alex Aiken. 2018. Beyond data and model parallelism for deep neural networks. In Conference on Machine Learning and Systems, Vol. 1. 1--13."},{"key":"e_1_2_1_26_1","unstructured":"Yimin Jiang, Yibo Zhu, Chang Lan, Bairen Yi, Yong Cui, and Chuanxiong Guo. 2020. A Unified Architecture for Accelerating Distributed {DNN} Training in Heterogeneous GPU\/CPU Clusters. In USENIX OSDI. 463--479."},{"key":"e_1_2_1_27_1","volume-title":"ATP: In-network Aggregation for Multi-tenant Learning. In USENIX NSDI. 741--761.","author":"Lao ChonLam","year":"2021","unstructured":"ChonLam Lao, Yanfang Le, Kshiteej Mahajan, Yixi Chen, Wenfei Wu, Aditya Akella, and Michael Swift. 2021. ATP: In-network Aggregation for Multi-tenant Learning. In USENIX NSDI. 741--761."},{"key":"e_1_2_1_28_1","volume-title":"Gshard: Scaling giant models with conditional computation and automatic sharding. arXiv preprint arXiv:2006.16668","author":"Lepikhin Dmitry","year":"2020","unstructured":"Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. 2020. 
Gshard: Scaling giant models with conditional computation and automatic sharding. arXiv preprint arXiv:2006.16668 (2020)."},{"key":"e_1_2_1_29_1","volume-title":"Hanlin Tang, Samyam Rajbhandari, and Yuxiong He.","author":"Li Conglong","year":"2021","unstructured":"Conglong Li, Ammar Ahmad Awan, Hanlin Tang, Samyam Rajbhandari, and Yuxiong He. 2021. 1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed. arXiv preprint arXiv:2104.06069 (2021)."},{"key":"e_1_2_1_30_1","volume-title":"Deep gradient compression: Reducing the communication bandwidth for distributed training. arXiv preprint arXiv:1712.01887","author":"Lin Yujun","year":"2017","unstructured":"Yujun Lin, Song Han, Huizi Mao, Yu Wang, and William J Dally. 2017. Deep gradient compression: Reducing the communication bandwidth for distributed training. arXiv preprint arXiv:1712.01887 (2017)."},{"key":"e_1_2_1_31_1","volume-title":"Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019)."},{"key":"e_1_2_1_32_1","volume-title":"Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering. arXiv preprint arXiv:2105.14088","author":"Luo Liang","year":"2021","unstructured":"Liang Luo, Jacob Nelson, Arvind Krishnamurthy, and Luis Ceze. 2021. Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering. arXiv preprint arXiv:2105.14088 (2021)."},{"key":"e_1_2_1_33_1","volume-title":"PLink: Discovering and Exploiting Datacenter Network Locality for Efficient Cloud-based Distributed Training. In Conference on Machine Learning and Systems","volume":"2","author":"Luo Liang","year":"2020","unstructured":"Liang Luo, Peter West, Arvind Krishnamurthy, Luis Ceze, and Jacob Nelson. 2020. PLink: Discovering and Exploiting Datacenter Network Locality for Efficient Cloud-based Distributed Training. In Conference on Machine Learning and Systems, Vol. 2. 82--97."},{"key":"e_1_2_1_34_1","volume-title":"Device Placement Optimization with Reinforcement Learning. 
In International Conference on Machine Learning","volume":"70","author":"Mirhoseini Azalia","year":"2017","unstructured":"Azalia Mirhoseini, Hieu Pham, Quoc V Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, and Jeff Dean. 2017. Device Placement Optimization with Reinforcement Learning. In International Conference on Machine Learning, Vol. 70. 2430--2439."},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R Devanur, Gregory R Ganger, Phillip B Gibbons, and Matei Zaharia. 2019. PipeDream: generalized pipeline parallelism for DNN training. In ACM SOSP. 1--15.","DOI":"10.1145\/3341301.3359646"},{"key":"e_1_2_1_36_1","volume-title":"Memory-Efficient Pipeline-Parallel DNN Training. arXiv preprint arXiv:2006.09503","author":"Narayanan Deepak","year":"2020","unstructured":"Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, and Matei Zaharia. 2020. Memory-Efficient Pipeline-Parallel DNN Training. 
arXiv preprint arXiv:2006.09503 (2020)."},{"key":"e_1_2_1_37_1","volume-title":"Dmitri Vainbrand, Prethvi Kashinkunti, Julie Bernauer, Bryan Catanzaro, Amar Phanishayee, and Matei Zaharia.","author":"Narayanan Deepak","year":"2021","unstructured":"Deepak Narayanan, Mohammad Shoeybi, Jared Casper, Patrick LeGresley, Mostofa Patwary, Vijay Anand Korthikanti, Dmitri Vainbrand, Prethvi Kashinkunti, Julie Bernauer, Bryan Catanzaro, Amar Phanishayee, and Matei Zaharia. 2021. Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM. arXiv preprint arXiv:2104.04473 (2021)."},{"key":"e_1_2_1_38_1","unstructured":"Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G. Azzolini, Dmytro Dzhulgakov, Andrey Mallevich, Ilia Cherniavskii, Yinghai Lu, Raghuraman Krishnamoorthi, Ansha Yu, Volodymyr Kondratenko, Stephanie Pereira, Xianjie Chen, Wenlin Chen, Vijay Rao, Bill Jia, Liang Xiong, and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. 
arXiv preprint arXiv:1906.00091 (2019)."},{"key":"e_1_2_1_39_1","unstructured":"NCCL 2022. NVIDIA Collective Communications Library (NCCL). https:\/\/developer.nvidia.com\/nccl."},{"key":"e_1_2_1_40_1","unstructured":"NVIDIA-DGX-A100 2022. NVIDIA DGX A100. https:\/\/images.nvidia.com\/aem-dam\/Solutions\/Data-Center\/nvidia-dgx-a100-datasheet.pdf."},{"key":"e_1_2_1_41_1","doi-asserted-by":"crossref","unstructured":"Yanghua Peng, Yibo Zhu, Yangrui Chen, Yixin Bao, Bairen Yi, Chang Lan, Chuan Wu, and Chuanxiong Guo. 2019. A generic communication scheduler for distributed DNN training acceleration. In ACM SOSP. 16--29.","DOI":"10.1145\/3341301.3359642"},{"key":"e_1_2_1_42_1","unstructured":"PS-lite 2022. lightweight implementation of the parameter server framework. https:\/\/github.com\/dmlc\/ps-lite."},{"key":"e_1_2_1_43_1","unstructured":"PyTorch 2022. PyTorch. https:\/\/pytorch.org\/."},{"key":"e_1_2_1_44_1","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. 2019. Language models are unsupervised multitask learners. OpenAI blog (2019)."},{"key":"e_1_2_1_45_1","volume-title":"Zero: Memory optimization towards training a trillion parameter models. 
arXiv preprint arXiv:1910.02054","author":"Rajbhandari Samyam","year":"2019","unstructured":"Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, and Yuxiong He. 2019. Zero: Memory optimization towards training a trillion parameter models. arXiv preprint arXiv:1910.02054 (2019)."},{"key":"e_1_2_1_46_1","volume-title":"ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning. arXiv preprint arXiv:2104.07857","author":"Rajbhandari Samyam","year":"2021","unstructured":"Samyam Rajbhandari, Olatunji Ruwase, Jeff Rasley, Shaden Smith, and Yuxiong He. 2021. ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning. arXiv preprint arXiv:2104.07857 (2021)."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3406703"},{"key":"e_1_2_1_48_1","volume-title":"A lock-free approach to parallelizing stochastic gradient descent. Advances in Neural Information Processing Systems 24","author":"Recht Benjamin","year":"2011","unstructured":"Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. 2011. Hogwild!: A lock-free approach to parallelizing stochastic gradient descent. 
Advances in Neural Information Processing Systems 24 (2011)."},{"key":"e_1_2_1_49_1","volume-title":"Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, and Yuxiong He.","author":"Ren Jie","year":"2021","unstructured":"Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, and Yuxiong He. 2021. Zero-offload: Democratizing billion-scale model training. arXiv preprint arXiv:2101.06840 (2021)."},{"key":"e_1_2_1_50_1","volume-title":"Dan RK Ports, and Peter Richt\u00e1rik","author":"Sapio Amedeo","year":"2019","unstructured":"Amedeo Sapio, Marco Canini, Chen-Yu Ho, Jacob Nelson, Panos Kalnis, Changhoon Kim, Arvind Krishnamurthy, Masoud Moshref, Dan RK Ports, and Peter Richt\u00e1rik. 2019. Scaling distributed machine learning with in-network aggregation. arXiv preprint arXiv:1903.06701 (2019)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"crossref","unstructured":"Frank Seide, Hao Fu, Jasha Droppo, Gang Li, and Dong Yu. 2014. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech dnns. In INTERSPEECH. 1058--1062.","DOI":"10.21437\/Interspeech.2014-274"},{"key":"e_1_2_1_52_1","volume-title":"Horovod: fast and easy distributed deep learning in TensorFlow. 
arXiv preprint arXiv:1802.05799","author":"Sergeev Alexander","year":"2018","unstructured":"Alexander Sergeev and Mike Del Balso. 2018. Horovod: fast and easy distributed deep learning in TensorFlow. arXiv preprint arXiv:1802.05799 (2018)."},{"key":"e_1_2_1_53_1","volume-title":"Mesh-tensorflow: Deep learning for supercomputers. arXiv preprint arXiv:1811.02084","author":"Shazeer Noam","year":"2018","unstructured":"Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, et al. 2018. Mesh-tensorflow: Deep learning for supercomputers. arXiv preprint arXiv:1811.02084 (2018)."},{"key":"e_1_2_1_54_1","volume-title":"Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. arXiv preprint arXiv:1909.08053","author":"Shoeybi Mohammad","year":"2020","unstructured":"Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. 2020. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. arXiv preprint arXiv:1909.08053 (2020)."},{"key":"e_1_2_1_55_1","unstructured":"
Ankit Singla, P Brighten Godfrey, and Alexandra Kolla. 2014. High throughput data center topology design. In USENIX NSDI. 29--41."},{"key":"e_1_2_1_56_1","unstructured":"Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, et al. 2022. Using deepspeed and megatron to train megatron-turing nlg 530b, a large-scale generative language model. arXiv preprint arXiv:2201.11990 (2022)."},{"key":"e_1_2_1_57_1","unstructured":"TensorFlow 2022. TensorFlow. https:\/\/www.tensorflow.org\/."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342005051521"},{"key":"e_1_2_1_59_1","unstructured":"Torch CUDA Memory Stats 2022. Torch CUDA Memory Stats. https:\/\/pytorch.org\/docs\/stable\/generated\/torch.cuda.memory_stats.html."},{"key":"e_1_2_1_60_1","volume-title":"Vinay Ramakrishnaiah, Nirmal Prajapati, Pat McCormick, Jamaludin Mohd-Yusof, Xi Luo, Dheevatsa Mudigere, Jongsoo Park, Misha Smelyanskiy, and Alex Aiken.","author":"Unger Colin","year":"2022","unstructured":"Colin Unger, Zhihao Jia, Wei Wu, Sina Lin, Mandeep Baines, Carlos Efrain Quintero Narvaez, Vinay Ramakrishnaiah, Nirmal Prajapati, Pat McCormick, Jamaludin Mohd-Yusof, Xi Luo, Dheevatsa Mudigere, Jongsoo Park, Misha Smelyanskiy, and Alex Aiken. 2022. Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization. In USENIX OSDI. 
Carlsbad, CA, 267--284."},{"key":"e_1_2_1_61_1","volume-title":"arXiv preprint arXiv:1706.03762","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need. arXiv preprint arXiv:1706.03762 (2017)."},{"key":"e_1_2_1_62_1","volume-title":"Conference on Machine Learning and Systems","volume":"2","author":"Wang Guanhua","year":"2020","unstructured":"Guanhua Wang, Shivaram Venkataraman, Amar Phanishayee, Jorgen Thelin, Nikhil Devanur, and Ion Stoica. 2020. Blink: Fast and generic collectives for distributed ML. In Conference on Machine Learning and Systems, Vol. 2. 172--186."},{"key":"e_1_2_1_63_1","unstructured":"
Yuanzhong Xu HyoukJoong Lee Dehao Chen Blake Hechtman Yanping Huang Rahul Joshi Maxim Krikun Dmitry Lepikhin Andy Ly Marcello Maggioni et al. 2021. GSPMD: general and scalable parallelization for ML computation graphs. arXiv preprint arXiv:2105.04663 (2021)."},{"key":"e_1_2_1_64_1","volume-title":"PipeMare: Asynchronous Pipeline Parallel DNN Training. arXiv preprint arXiv:1910.05124","author":"Yang Bowen","year":"2020","unstructured":"Bowen Yang , Jian Zhang , Jonathan Li , Christopher R\u00e9 , Christopher R. Aberger , and Christopher De Sa. 2020. PipeMare: Asynchronous Pipeline Parallel DNN Training. arXiv preprint arXiv:1910.05124 ( 2020 ). Bowen Yang, Jian Zhang, Jonathan Li, Christopher R\u00e9, Christopher R. Aberger, and Christopher De Sa. 2020. PipeMare: Asynchronous Pipeline Parallel DNN Training. arXiv preprint arXiv:1910.05124 (2020)."},{"key":"e_1_2_1_65_1","volume-title":"Large batch optimization for deep learning: Training bert in 76 minutes. arXiv preprint arXiv:1904.00962","author":"You Yang","year":"2019","unstructured":"Yang You , Jing Li , Sashank Reddi , Jonathan Hseu , Sanjiv Kumar , Srinadh Bhojanapalli , Xiaodan Song , James Demmel , Kurt Keutzer , and Cho-Jui Hsieh . 2019. Large batch optimization for deep learning: Training bert in 76 minutes. arXiv preprint arXiv:1904.00962 ( 2019 ). Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, and Cho-Jui Hsieh. 2019. Large batch optimization for deep learning: Training bert in 76 minutes. arXiv preprint arXiv:1904.00962 (2019)."},{"key":"e_1_2_1_66_1","volume-title":"Imagenet training in minutes. arXiv preprint arXiv:1709.05011","author":"You Yang","year":"2018","unstructured":"Yang You , Zhao Zhang , Cho-Jui Hsieh , James Demmel , and Kurt Keutzer . 2018. Imagenet training in minutes. arXiv preprint arXiv:1709.05011 ( 2018 ). Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, and Kurt Keutzer. 2018. 
Imagenet training in minutes. arXiv preprint arXiv:1709.05011 (2018)."},{"key":"e_1_2_1_67_1","volume-title":"Wide Residual Networks. arXiv preprint arXiv:1605.07146","author":"Zagoruyko Sergey","year":"2017","unstructured":"Sergey Zagoruyko and Nikos Komodakis . 2017. Wide Residual Networks. arXiv preprint arXiv:1605.07146 ( 2017 ). Sergey Zagoruyko and Nikos Komodakis. 2017. Wide Residual Networks. arXiv preprint arXiv:1605.07146 (2017)."},{"key":"e_1_2_1_68_1","volume-title":"Scaling vision transformers. arXiv preprint arXiv:2106.04560","author":"Zhai Xiaohua","year":"2021","unstructured":"Xiaohua Zhai , Alexander Kolesnikov , Neil Houlsby , and Lucas Beyer . 2021. Scaling vision transformers. arXiv preprint arXiv:2106.04560 ( 2021 ). Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, and Lucas Beyer. 2021. Scaling vision transformers. arXiv preprint arXiv:2106.04560 (2021)."},{"key":"e_1_2_1_69_1","volume-title":"Xi Victoria Lin, et al","author":"Zhang Susan","year":"2022","unstructured":"Susan Zhang , Stephen Roller , Naman Goyal , Mikel Artetxe , Moya Chen , Shuohui Chen , Christopher Dewan , Mona Diab , Xian Li , Xi Victoria Lin, et al . 2022 . Opt : Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068 (2022). Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, et al. 2022. Opt: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068 (2022)."},{"key":"e_1_2_1_70_1","volume-title":"Pushing the limits of semi-supervised learning for automatic speech recognition. arXiv preprint arXiv:2010.10504","author":"Zhang Yu","year":"2020","unstructured":"Yu Zhang , James Qin , Daniel S Park , Wei Han , Chung-Cheng Chiu , Ruoming Pang , Quoc V Le , and Yonghui Wu. 2020. Pushing the limits of semi-supervised learning for automatic speech recognition. arXiv preprint arXiv:2010.10504 ( 2020 ). 
Yu Zhang, James Qin, Daniel S Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V Le, and Yonghui Wu. 2020. Pushing the limits of semi-supervised learning for automatic speech recognition. arXiv preprint arXiv:2010.10504 (2020)."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3405671.3405810"},{"key":"e_1_2_1_72_1","volume-title":"Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. In USENIX OSDI. 559--578.","author":"Zheng Lianmin","year":"2022","unstructured":"Lianmin Zheng , Zhuohan Li , Hao Zhang , Yonghao Zhuang , Zhifeng Chen , Yanping Huang , Yida Wang , Yuanzhong Xu , Danyang Zhuo , Eric P. Xing , Joseph E. Gonzalez , and Ion Stoica . 2022 . Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. In USENIX OSDI. 559--578. Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Eric P. Xing, Joseph E. Gonzalez, and Ion Stoica. 2022. Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. In USENIX OSDI. 559--578."},{"key":"e_1_2_1_73_1","volume-title":"Accelerated large batch optimization of bert pretraining in 54 minutes. arXiv preprint arXiv:2006.13484","author":"Zheng Shuai","year":"2020","unstructured":"Shuai Zheng , Haibin Lin , Sheng Zha , and Mu Li. 2020. Accelerated large batch optimization of bert pretraining in 54 minutes. arXiv preprint arXiv:2006.13484 ( 2020 ). Shuai Zheng, Haibin Lin, Sheng Zha, and Mu Li. 2020. Accelerated large batch optimization of bert pretraining in 54 minutes. arXiv preprint arXiv:2006.13484 (2020)."},{"key":"e_1_2_1_74_1","volume-title":"Viktor Leis, and Carsten Binnig.","author":"Ziegler Tobias","year":"2022","unstructured":"Tobias Ziegler , Dwarakanandan Bindiganavile Mohan , Viktor Leis, and Carsten Binnig. 2022 . EFA : A Viable Alternative to RDMA over InfiniBand for DBMSs?. In Data Management on New Hardware . 
Tobias Ziegler, Dwarakanandan Bindiganavile Mohan, Viktor Leis, and Carsten Binnig. 2022. EFA: A Viable Alternative to RDMA over InfiniBand for DBMSs?. In Data Management on New Hardware."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3561261.3561265","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:20:07Z","timestamp":1672219207000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3561261.3561265"}},"subtitle":["near-linear scaling for training gigantic model on public cloud"],"short-title":[],"issued":{"date-parts":[[2022,9]]},"references-count":74,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,9]]}},"alternative-id":["10.14778\/3561261.3561265"],"URL":"https:\/\/doi.org\/10.14778\/3561261.3561265","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2022,9]]}}}