{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T19:17:06Z","timestamp":1777576626606,"version":"3.51.4"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T00:00:00Z","timestamp":1720396800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T00:00:00Z","timestamp":1720396800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Discov Artif Intell"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Prominent works in the space of Natural Language Processing (NLP) have long attempted to create new innovative models by improving upon previous model training approaches, altering model architecture, and developing more in-depth datasets to better their performance. However, with the quickly advancing field of NLP comes increased greenhouse gas emissions, posing concerns over the environmental damage caused by training LLMs. Gaining a comprehensive understanding of the various costs, particularly those pertaining to environmental aspects, that are associated with artificial intelligence serves as the foundational basis for ensuring safe AI models. Currently, investigations into the CO2 emissions of AI models remain an emerging area of research, and as such, we evaluate the CO2 emissions of well-known large language models, which have an especially high carbon footprint due to their significant amount of model parameters. We argue for the training of LLMs in a way that is responsible and sustainable by suggesting measures for reducing carbon emissions. Furthermore, we discuss how the choice of hardware affects CO2 emissions by contrasting the CO2 emissions during model training for two widely used GPUs. Based on our results, we present the benefits and drawbacks of our proposed solutions and make the argument for the possibility of training more environmentally safe AI models without sacrificing their robustness and performance.<\/jats:p>","DOI":"10.1007\/s44163-024-00149-w","type":"journal-article","created":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T17:01:47Z","timestamp":1720458107000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":52,"title":["Green AI: exploring carbon footprints, mitigation strategies, and trade offs in large language model training"],"prefix":"10.1007","volume":"4","author":[{"given":"Vivian","family":"Liu","sequence":"first","affiliation":[]},{"given":"Yiqiao","family":"Yin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,7,8]]},"reference":[{"key":"149_CR1","doi-asserted-by":"crossref","unstructured":"Bender EM, Gebru T, McMillan-Major A, Shmitchell S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, 2021:610\u2013623.","DOI":"10.1145\/3442188.3445922"},{"issue":"12","key":"149_CR2","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1145\/3381831","volume":"63","author":"R Schwartz","year":"2020","unstructured":"Schwartz R, Dodge J. Noah a Smith, and Oren Etzioni. 
Green ai Commun ACM. 2020;63(12):54\u201363.","journal-title":"Green ai. Commun ACM"},{"key":"149_CR3","unstructured":"Luccioni AS, Hernandez-Garcia A. Counting carbon: a survey of factors influencing the emissions of machine learning. arXiv preprintarXiv:2302.08476, 2023."},{"key":"149_CR4","first-page":"1877","volume":"33","author":"T Brown","year":"2020","unstructured":"Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, et al. Language models are few-shot learners. Adv Neural Inf Process Syst. 2020;33:1877\u2013901.","journal-title":"Adv Neural Inf Process Syst"},{"key":"149_CR5","unstructured":"Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V. Roberta: A robustly optimized bert pretraining approach. arXiv preprintarXiv:1907.11692, 2019."},{"key":"149_CR6","unstructured":"Clark K, Luong MT, Le QV, Manning CD. Electra: pre-training text encoders as discriminators rather than generators. arXiv preprintarXiv:2003.10555, 2020."},{"issue":"8","key":"149_CR7","first-page":"9","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford A, Jeffrey W, Child R, Luan D, Amodei D, Sutskever I, et al. Language models are unsupervised multitask learners. OpenAI blog. 2019;1(8):9.","journal-title":"OpenAI blog"},{"key":"149_CR8","unstructured":"Radford A, Narasimhan K, Salimans T, Sutskever I, et al. Improving language understanding by generative pre-training. OpenAI, 2018."},{"key":"149_CR9","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I. Attention is all you need. Adv Neural Inf Process Syst, 2017;30."},{"issue":"140","key":"149_CR10","first-page":"1","volume":"21","author":"R Colin","year":"2020","unstructured":"Colin R. Exploring the limits of transfer learning with a unified text-to-text transformer. JMLR. 2020;21(140):1.","journal-title":"JMLR"},{"key":"149_CR11","unstructured":"Lan Z, Chen M, Goodman S, Gimpel K, Sharma P, Soricut R. Albert: a lite bert for self-supervised learning of language representations. arXiv preprintarXiv:1909.11942, 2019."},{"key":"149_CR12","unstructured":"Sanh V, Debut L, Chaumond J, Wolf T. Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprintarXiv:1910.01108, 2019."},{"key":"149_CR13","unstructured":"Lacoste A, Luccioni A, Schmidt V, Dandres T. Quantifying the carbon emissions of machine learning. arXiv preprintarXiv:1910.09700, 2019."},{"key":"149_CR14","doi-asserted-by":"crossref","unstructured":"Budennyy SA, Lazarev VD, Zakharenko NN, Korovin AN, Plosskaya OA, Dimitrov DV, Akhripkin VS, Pavlov IV, Oseledets IV, Barsola IS, Egorov IV, et\u00a0al. Eco2ai: carbon emissions tracking of machine learning models as the first step towards sustainable ai. In Doklady Mathematics, volume 106, pages S118\u2013S128. Springer, 2022.","DOI":"10.1134\/S1064562422060230"},{"issue":"1","key":"149_CR15","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1038\/s41368-023-00239-y","volume":"15","author":"O Hanyao Huang","year":"2023","unstructured":"Hanyao Huang O, Zheng DW, Yin J, Wang Z, Ding S, Yin H, Chuan X, Yang R, Zheng Q, et al. Chatgpt for shaping the future of dentistry: the potential of multi-modal large language model. Int J Oral Sci. 
2023;15(1):29.","journal-title":"Int J Oral Sci"},{"key":"149_CR16","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1016\/j.ins.2015.02.024","volume":"307","author":"Peipei Xia","year":"2015","unstructured":"Xia Peipei, Zhang Li, Li Fanzhang. Learning similarity with cosine similarity ensemble. Inf Sci. 2015;307:39\u201352.","journal-title":"Inf Sci"},{"key":"149_CR17","unstructured":"Agirre E, Bos J, Diab M, Manandhar S, Marton Y, Yuret D. Semeval-2012 task 6: a pilot on semantic textual similarity. In * SEM 2012: The First Joint Conference on Lexical and Computational Semantics\u2013Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), 2012:385\u2013393."},{"key":"149_CR18","doi-asserted-by":"crossref","unstructured":"Rajpurkar P, Zhang J, Lopyrev K, Liang P. SQuAD: 100,000+ questions for machine comprehension of text. In Jian Su, Kevin Duh, and Xavier Carreras, editors, Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2383\u20132392, Austin, Texas, November 2016. Association for Computational Linguistics.","DOI":"10.18653\/v1\/D16-1264"},{"key":"149_CR19","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-02181-7","volume-title":"Pretrained transformers for text ranking: Bert and beyond","author":"Jimmy Lin","year":"2022","unstructured":"Lin Jimmy, Nogueira Rodrigo, Yates Andrew. Pretrained transformers for text ranking: Bert and beyond. Berlin: Springer Nature; 2022."},{"key":"149_CR20","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprintarXiv:1810.04805, 2018."},{"key":"149_CR21","doi-asserted-by":"crossref","unstructured":"Clark K, Khandelwal U, Levy O, Manning CD. What does bert look at? an analysis of bert\u2019s attention. arXiv preprintarXiv:1906.04341, 2019.","DOI":"10.18653\/v1\/W19-4828"},{"issue":"4","key":"149_CR22","doi-asserted-by":"publisher","first-page":"3129","DOI":"10.1007\/s12652-021-03439-8","volume":"14","author":"JJ Bird","year":"2023","unstructured":"Bird JJ, Ek\u00e1rt A, Faria DR. Chatbot interaction with artificial intelligence: human data augmentation with t5 and language transformer ensemble for text classification. J Amb Intell Human Comput. 2023;14(4):3129\u201344.","journal-title":"J Amb Intell Human Comput"},{"key":"149_CR23","doi-asserted-by":"publisher","first-page":"662","DOI":"10.1162\/tacl_a_00338","volume":"8","author":"M Bartolo","year":"2020","unstructured":"Bartolo M, Roberts A, Welbl J, Riedel S, Stenetorp P. Beat the ai: investigating adversarial human annotation for reading comprehension. Trans Assoc Comput Linguistics. 2020;8:662\u201378.","journal-title":"Trans Assoc Comput Linguistics"},{"key":"149_CR24","unstructured":"Courty B, Schmidt V, Luccioni S, Goyal-Kamal, Coutarel M, Feld B, Lecourt J, Connell J, Saboni A, Inimaz, supatomic, Mathilde L\u00e9val, Blanche L, Cruveiller A, ouminasara, Zhao F, Joshi A, Bogroff A, de\u00a0Lavoreille H, Laskaris N, Abati E, Blank D, Wang Z, ArminCatovic, Marc Alencon, Micha\u0142 St\u0229ch\u0142y, Bauer C, Otavio L, JPW, and MinervaBooks. mlco2\/codecarbon: v2.4.1, May 2024."},{"key":"149_CR25","doi-asserted-by":"crossref","unstructured":"Dai Z, Yang Z, Yang Y, Carbonell J, Le QV, Salakhutdinov R. Transformer-xl: attentive language models beyond a fixed-length context. 
arXiv preprintarXiv:1901.02860, 2019.","DOI":"10.18653\/v1\/P19-1285"}],"container-title":["Discover Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-024-00149-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44163-024-00149-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-024-00149-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T17:03:30Z","timestamp":1720458210000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44163-024-00149-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,8]]},"references-count":25,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["149"],"URL":"https:\/\/doi.org\/10.1007\/s44163-024-00149-w","relation":{},"ISSN":["2731-0809"],"issn-type":[{"value":"2731-0809","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,8]]},"assertion":[{"value":"15 December 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 June 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 July 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"49"}}
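The abstract above describes measuring CO2 emissions during model training and contrasting two GPUs, and the record cites the mlco2/codecarbon tracker (reference 149_CR24). A minimal sketch of how such a measurement can be wired around a training loop with codecarbon follows; the project name and the placeholder workload are illustrative assumptions, not taken from the paper.

```python
# Sketch: tracking training-time emissions with the codecarbon library
# cited in reference 149_CR24. Requires `pip install codecarbon`.
from codecarbon import EmissionsTracker


def train_step() -> int:
    # Placeholder CPU-bound workload; in the paper's setting this would be
    # a forward/backward pass of the language model being trained.
    return sum(i * i for i in range(1_000_000))


# "llm-carbon-demo" is a hypothetical project name for this sketch.
tracker = EmissionsTracker(project_name="llm-carbon-demo")
tracker.start()
try:
    for _ in range(10):
        train_step()
finally:
    # stop() returns the estimated emissions in kilograms of CO2-equivalent.
    emissions_kg = tracker.stop()

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```

Under this setup, the hardware comparison the abstract mentions amounts to running the same tracked training loop on each GPU and comparing the kg CO2eq figures the tracker reports.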