{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,15]],"date-time":"2026-07-15T16:02:16Z","timestamp":1784131336400,"version":"3.55.0"},"publisher-location":"New York, NY, USA","reference-count":42,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,3,28]],"date-time":"2022-03-28T00:00:00Z","timestamp":1648425600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,3,28]]},"DOI":"10.1145\/3492321.3519584","type":"proceedings-article","created":{"date-parts":[[2022,3,28]],"date-time":"2022-03-28T14:28:18Z","timestamp":1648477698000},"page":"472-487","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":76,"title":["Varuna"],"prefix":"10.1145","author":[{"given":"Sanjith","family":"Athlur","sequence":"first","affiliation":[{"name":"Carnegie Mellon University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nitika","family":"Saran","sequence":"additional","affiliation":[{"name":"Cornell University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Muthian","family":"Sivathanu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Bangalore, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ramachandran","family":"Ramjee","sequence":"additional","affiliation":[{"name":"Microsoft Research, Bangalore, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nipun","family":"Kwatra","sequence":"additional","affiliation":[{"name":"Microsoft Research, Bangalore, India"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2022,3,28]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"BERT pre-training performance. https:\/\/github.com\/NVIDIA\/DeepLearningExamples\/blob\/master\/PyTorch\/LanguageModeling\/BERT\/README.md#pre-training-nvidia-dgx-1-with-16g.  BERT pre-training performance. https:\/\/github.com\/NVIDIA\/DeepLearningExamples\/blob\/master\/PyTorch\/LanguageModeling\/BERT\/README.md#pre-training-nvidia-dgx-1-with-16g."},{"key":"e_1_3_2_1_2_1","unstructured":"Dynamic Training with Apache MXNet. https:\/\/github.com\/awslabs\/dynamic-training-with-apache-mxnet-on-aws.  Dynamic Training with Apache MXNet. https:\/\/github.com\/awslabs\/dynamic-training-with-apache-mxnet-on-aws."},{"key":"e_1_3_2_1_3_1","unstructured":"GPipe Source code. https:\/\/github.com\/tensorflow\/lingvo\/blob\/master\/lingvo\/core\/gpipe.py.  GPipe Source code. https:\/\/github.com\/tensorflow\/lingvo\/blob\/master\/lingvo\/core\/gpipe.py."},{"key":"e_1_3_2_1_4_1","unstructured":"Nvidia apex library. https:\/\/github.com\/NVIDIA\/apex.  Nvidia apex library. https:\/\/github.com\/NVIDIA\/apex."},{"key":"e_1_3_2_1_5_1","unstructured":"NVIDIA DGX-2. https:\/\/www.nvidia.com\/en-in\/data-center\/dgx-2\/.  NVIDIA DGX-2. https:\/\/www.nvidia.com\/en-in\/data-center\/dgx-2\/."},{"key":"e_1_3_2_1_6_1","unstructured":"NVLAMB Optimizer. https:\/\/developer.nvidia.com\/blog\/pretraining-bert-with-layer-wise-adaptive-learning-rates\/.  NVLAMB Optimizer. https:\/\/developer.nvidia.com\/blog\/pretraining-bert-with-layer-wise-adaptive-learning-rates\/."},{"key":"e_1_3_2_1_7_1","unstructured":"PyTorch Elastic. https:\/\/github.com\/pytorch\/elastic.  PyTorch Elastic. https:\/\/github.com\/pytorch\/elastic."},{"key":"e_1_3_2_1_8_1","unstructured":"Turing-NLG: A 17-billion-parameter language model by Microsoft. https:\/\/www.microsoft.com\/en-us\/research\/blog\/turing-nlg-a-17-billion-parameter-language-model-by-microsoft\/.  Turing-NLG: A 17-billion-parameter language model by Microsoft. https:\/\/www.microsoft.com\/en-us\/research\/blog\/turing-nlg-a-17-billion-parameter-language-model-by-microsoft\/."},{"key":"e_1_3_2_1_9_1","first-page":"265","volume-title":"TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","volume":"16","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) , volume 16 , pages 265 -- 283 . USENIX Association , 2016 . Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), volume 16, pages 265--283. USENIX Association, 2016."},{"key":"e_1_3_2_1_10_1","unstructured":"Amazon. Amazon ec2 spot instances. run fault-tolerant workloads for up to 90% off. In https:\/\/aws.amazon.com\/ec2\/spot\/.  Amazon. Amazon ec2 spot instances. run fault-tolerant workloads for up to 90% off. In https:\/\/aws.amazon.com\/ec2\/spot\/."},{"key":"e_1_3_2_1_11_1","volume-title":"Language models are few-shot learners. arXiv preprint arXiv:2005.14165","author":"Brown Tom B","year":"2020","unstructured":"Tom B Brown , Benjamin Mann , Nick Ryder , Melanie Subbiah , Jared Kaplan , Prafulla Dhariwal , Arvind Neelakantan , Pranav Shyam , Girish Sastry , Amanda Askell , Language models are few-shot learners. arXiv preprint arXiv:2005.14165 , 2020 . Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020."},{"key":"e_1_3_2_1_12_1","volume-title":"Training deep nets with sublinear memory cost. arXiv preprint arXiv:1604.06174","author":"Chen Tianqi","year":"2016","unstructured":"Tianqi Chen , Bing Xu , Chiyuan Zhang , and Carlos Guestrin . Training deep nets with sublinear memory cost. arXiv preprint arXiv:1604.06174 , 2016 . Tianqi Chen, Bing Xu, Chiyuan Zhang, and Carlos Guestrin. Training deep nets with sublinear memory cost. arXiv preprint arXiv:1604.06174, 2016."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2901318.2901323"},{"key":"e_1_3_2_1_14_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 , 2018 . Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3437801.3441593"},{"key":"e_1_3_2_1_16_1","volume-title":"wages, and profits","author":"Gantt Henry Laurence","year":"1913","unstructured":"Henry Laurence Gantt . Work , wages, and profits . Engineering Magazine Co. , 1913 . Henry Laurence Gantt. Work, wages, and profits. Engineering Magazine Co., 1913."},{"key":"e_1_3_2_1_17_1","unstructured":"Google. Mesh tensorflow - model parallelism made easier. In https:\/\/github.com\/tensorflow\/mesh.  Google. Mesh tensorflow - model parallelism made easier. In https:\/\/github.com\/tensorflow\/mesh."},{"key":"e_1_3_2_1_18_1","unstructured":"Google. Mesh-tensorflow: Model parallelism for supercomputers (tf dev summit '19). In https:\/\/www.youtube.com\/watch?v=HgGyWS40g-g.  Google. Mesh-tensorflow: Model parallelism for supercomputers (tf dev summit '19). In https:\/\/www.youtube.com\/watch?v=HgGyWS40g-g."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3064176.3064182"},{"key":"e_1_3_2_1_20_1","first-page":"103","volume-title":"Advances in Neural Information Processing Systems","author":"Huang Yanping","year":"2019","unstructured":"Yanping Huang , Youlong Cheng , Ankur Bapna , Orhan Firat , Dehao Chen , Mia Chen , HyoukJoong Lee , Jiquan Ngiam , Quoc V Le , Yonghui Wu , : Efficient training of giant neural networks using pipeline parallelism . In Advances in Neural Information Processing Systems , pages 103 -- 112 , 2019 . Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Dehao Chen, Mia Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V Le, Yonghui Wu, et al. Gpipe: Efficient training of giant neural networks using pipeline parallelism. In Advances in Neural Information Processing Systems, pages 103--112, 2019."},{"key":"e_1_3_2_1_21_1","volume-title":"Checkmate: Breaking the memory wall with optimal tensor rematerialization. arXiv preprint arXiv:1910.02653","author":"Jain Paras","year":"2019","unstructured":"Paras Jain , Ajay Jain , Aniruddha Nrusimha , Amir Gholami , Pieter Abbeel , Kurt Keutzer , Ion Stoica , and Joseph E Gonzalez . Checkmate: Breaking the memory wall with optimal tensor rematerialization. arXiv preprint arXiv:1910.02653 , 2019 . Paras Jain, Ajay Jain, Aniruddha Nrusimha, Amir Gholami, Pieter Abbeel, Kurt Keutzer, Ion Stoica, and Joseph E Gonzalez. Checkmate: Breaking the memory wall with optimal tensor rematerialization. arXiv preprint arXiv:1910.02653, 2019."},{"key":"e_1_3_2_1_22_1","volume-title":"Proc. of ML Systems Workshop in NIPS","author":"Meng Chen","year":"2017","unstructured":"Chen Meng , Minmin Sun , Jun Yang , Minghui Qiu , and Yang Gu . Training deeper models by gpu memory optimization on tensorflow . In Proc. of ML Systems Workshop in NIPS , 2017 . Chen Meng, Minmin Sun, Jun Yang, Minghui Qiu, and Yang Gu. Training deeper models by gpu memory optimization on tensorflow. In Proc. of ML Systems Workshop in NIPS, 2017."},{"key":"e_1_3_2_1_23_1","unstructured":"Microsoft. Use low-priority azure vms with batch. In https:\/\/docs.microsoft.com\/en-us\/azure\/batch\/batch-low-pri-vms.  Microsoft. Use low-priority azure vms with batch. In https:\/\/docs.microsoft.com\/en-us\/azure\/batch\/batch-low-pri-vms."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359646"},{"key":"e_1_3_2_1_25_1","volume-title":"Memory-efficient pipeline-parallel dnn training. arXiv preprint arXiv:2006.09503","author":"Narayanan Deepak","year":"2020","unstructured":"Deepak Narayanan , Amar Phanishayee , Kaiyu Shi , Xie Chen , and Matei Zaharia . Memory-efficient pipeline-parallel dnn training. arXiv preprint arXiv:2006.09503 , 2020 . Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, and Matei Zaharia. Memory-efficient pipeline-parallel dnn training. arXiv preprint arXiv:2006.09503, 2020."},{"key":"e_1_3_2_1_26_1","volume-title":"Efficient large-scale language model training on gpu clusters. arXiv preprint arXiv:2104.04473","author":"Narayanan Deepak","year":"2021","unstructured":"Deepak Narayanan , Mohammad Shoeybi , Jared Casper , Patrick LeGresley , Mostofa Patwary , Vijay Korthikanti , Dmitri Vainbrand , Prethvi Kashinkunti , Julie Bernauer , Bryan Catanzaro , Efficient large-scale language model training on gpu clusters. arXiv preprint arXiv:2104.04473 , 2021 . Deepak Narayanan, Mohammad Shoeybi, Jared Casper, Patrick LeGresley, Mostofa Patwary, Vijay Korthikanti, Dmitri Vainbrand, Prethvi Kashinkunti, Julie Bernauer, Bryan Catanzaro, et al. Efficient large-scale language model training on gpu clusters. arXiv preprint arXiv:2104.04473, 2021."},{"key":"e_1_3_2_1_27_1","first-page":"400","article-title":"Resource elasticity in distributed deep learning","volume":"2","author":"Or Andrew","year":"2020","unstructured":"Andrew Or , Haoyu Zhang , and Michael Freedman . Resource elasticity in distributed deep learning . Proceedings of Machine Learning and Systems , 2 : 400 -- 411 , 2020 . Andrew Or, Haoyu Zhang, and Michael Freedman. Resource elasticity in distributed deep learning. Proceedings of Machine Learning and Systems, 2:400--411, 2020.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_1_28_1","unstructured":"Adam Paszke Sam Gross Soumith Chintala and Gregory Chanan. Pytorch 2017.  Adam Paszke Sam Gross Soumith Chintala and Gregory Chanan. Pytorch 2017."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2008.09.002"},{"issue":"8","key":"e_1_3_2_1_30_1","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford , Jeffrey Wu , Rewon Child , David Luan , Dario Amodei , and Ilya Sutskever . Language models are unsupervised multitask learners . OpenAI Blog , 1 ( 8 ): 9 , 2019 . Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9, 2019.","journal-title":"OpenAI Blog"},{"key":"e_1_3_2_1_31_1","volume-title":"ZeRO: Memory Optimization Towards Training A Trillion Parameter Models. arXiv preprint arXiv:1910.02054","author":"Rajbhandari Samyam","year":"2019","unstructured":"Samyam Rajbhandari , Jeff Rasley , Olatunji Ruwase , and Yuxiong He . ZeRO: Memory Optimization Towards Training A Trillion Parameter Models. arXiv preprint arXiv:1910.02054 , 2019 . Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, and Yuxiong He. ZeRO: Memory Optimization Towards Training A Trillion Parameter Models. arXiv preprint arXiv:1910.02054, 2019."},{"key":"e_1_3_2_1_32_1","volume-title":"Zero-infinity: Breaking the gpu memory wall for extreme scale deep learning. arXiv preprint arXiv:2104.07857","author":"Rajbhandari Samyam","year":"2021","unstructured":"Samyam Rajbhandari , Olatunji Ruwase , Jeff Rasley , Shaden Smith , and Yuxiong He . Zero-infinity: Breaking the gpu memory wall for extreme scale deep learning. arXiv preprint arXiv:2104.07857 , 2021 . Samyam Rajbhandari, Olatunji Ruwase, Jeff Rasley, Shaden Smith, and Yuxiong He. Zero-infinity: Breaking the gpu memory wall for extreme scale deep learning. arXiv preprint arXiv:2104.07857, 2021."},{"key":"e_1_3_2_1_33_1","volume-title":"The cost of training nlp models: A concise overview. arXiv preprint arXiv:2004.08900","author":"Sharir Or","year":"2020","unstructured":"Or Sharir , Barak Peleg , and Yoav Shoham . The cost of training nlp models: A concise overview. arXiv preprint arXiv:2004.08900 , 2020 . Or Sharir, Barak Peleg, and Yoav Shoham. The cost of training nlp models: A concise overview. arXiv preprint arXiv:2004.08900, 2020."},{"key":"e_1_3_2_1_34_1","first-page":"10414","volume-title":"Advances in Neural Information Processing Systems","author":"Shazeer Noam","year":"2018","unstructured":"Noam Shazeer , Youlong Cheng , Niki Parmar , Dustin Tran , Ashish Vaswani , Penporn Koanantakool , Peter Hawkins , HyoukJoong Lee , Mingsheng Hong , Cliff Young , : Deep learning for supercomputers . In Advances in Neural Information Processing Systems , pages 10414 -- 10423 , 2018 . Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, et al. Mesh-tensorflow: Deep learning for supercomputers. In Advances in Neural Information Processing Systems, pages 10414--10423, 2018."},{"key":"e_1_3_2_1_35_1","volume-title":"Megatron-lm: Training multi-billion parameter language models using gpu model parallelism. arXiv preprint arXiv:1909.08053","author":"Shoeybi Mohammad","year":"2019","unstructured":"Mohammad Shoeybi , Mostofa Patwary , Raul Puri , Patrick LeGresley , Jared Casper , and Bryan Catanzaro . Megatron-lm: Training multi-billion parameter language models using gpu model parallelism. arXiv preprint arXiv:1909.08053 , 2019 . Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. Megatron-lm: Training multi-billion parameter language models using gpu model parallelism. arXiv preprint arXiv:1909.08053, 2019."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304072"},{"key":"e_1_3_2_1_37_1","volume-title":"Efficientnet: Rethinking model scaling for convolutional neural networks","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan and Quoc V. Le . Efficientnet: Rethinking model scaling for convolutional neural networks . 2019 . Mingxing Tan and Quoc V. Le. Efficientnet: Rethinking model scaling for convolutional neural networks. 2019."},{"key":"e_1_3_2_1_38_1","first-page":"5998","volume-title":"Advances in neural information processing systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmar , Jakob Uszkoreit , Llion Jones , Aidan N Gomez , \u0141ukasz Kaiser , and Illia Polosukhin . Attention is all you need . In Advances in neural information processing systems , pages 5998 -- 6008 , 2017 . Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in neural information processing systems, pages 5998--6008, 2017."},{"key":"e_1_3_2_1_39_1","volume-title":"12th {USENIX} Workshop on Hot Topics in Cloud Computing (HotCloud 20)","author":"Wagenl\u00e4nder Marcel","year":"2020","unstructured":"Marcel Wagenl\u00e4nder , Luo Mai , Guo Li , and Peter Pietzuch . Spotnik: Designing distributed machine learning for transient cloud resources . In 12th {USENIX} Workshop on Hot Topics in Cloud Computing (HotCloud 20) , 2020 . Marcel Wagenl\u00e4nder, Luo Mai, Guo Li, and Peter Pietzuch. Spotnik: Designing distributed machine learning for transient cloud resources. In 12th {USENIX} Workshop on Hot Topics in Cloud Computing (HotCloud 20), 2020."},{"key":"e_1_3_2_1_40_1","first-page":"595","volume-title":"13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18)","author":"Xiao Wencong","year":"2018","unstructured":"Wencong Xiao , Romil Bhardwaj , Ramachandran Ramjee , Muthian Sivathanu , Nipun Kwatra , Zhenhua Han , Pratyush Patel , Xuan Peng , Hanyu Zhao , Quanlu Zhang , et al. Gandiva: Introspective cluster scheduling for deep learning . In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18) , pages 595 -- 610 , 2018 . Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, et al. Gandiva: Introspective cluster scheduling for deep learning. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18), pages 595--610, 2018."},{"key":"e_1_3_2_1_41_1","volume-title":"Pipemare: Asynchronous pipeline parallel dnn training. arXiv preprint arXiv:1910.05124","author":"Yang Bowen","year":"2019","unstructured":"Bowen Yang , Jian Zhang , Jonathan Li , Christopher R\u00e9 , Christopher R Aberger , and Christopher De Sa . Pipemare: Asynchronous pipeline parallel dnn training. arXiv preprint arXiv:1910.05124 , 2019 . Bowen Yang, Jian Zhang, Jonathan Li, Christopher R\u00e9, Christopher R Aberger, and Christopher De Sa. Pipemare: Asynchronous pipeline parallel dnn training. arXiv preprint arXiv:1910.05124, 2019."},{"key":"e_1_3_2_1_42_1","volume-title":"International Conference on Learning Representations","author":"You Yang","year":"2019","unstructured":"Yang You , Jing Li , Sashank Reddi , Jonathan Hseu , Sanjiv Kumar , Srinadh Bhojanapalli , Xiaodan Song , James Demmel , Kurt Keutzer , and Cho-Jui Hsieh . Large batch optimization for deep learning: Training bert in 76 minutes . In International Conference on Learning Representations , 2019 . Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, and Cho-Jui Hsieh. Large batch optimization for deep learning: Training bert in 76 minutes. In International Conference on Learning Representations, 2019."}],"event":{"name":"EuroSys '22: Seventeenth European Conference on Computer Systems","location":"Rennes France","acronym":"EuroSys '22","sponsor":["SIGOPS ACM Special Interest Group on Operating Systems"]},"container-title":["Proceedings of the Seventeenth European Conference on Computer Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3492321.3519584","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3492321.3519584","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:31:08Z","timestamp":1750188668000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3492321.3519584"}},"subtitle":["scalable, low-cost training of massive deep learning models"],"short-title":[],"issued":{"date-parts":[[2022,3,28]]},"references-count":42,"alternative-id":["10.1145\/3492321.3519584","10.1145\/3492321"],"URL":"https:\/\/doi.org\/10.1145\/3492321.3519584","relation":{},"subject":[],"published":{"date-parts":[[2022,3,28]]},"assertion":[{"value":"2022-03-28","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}