{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T00:18:50Z","timestamp":1775089130717,"version":"3.50.1"},"reference-count":46,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T00:00:00Z","timestamp":1762473600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Software"],"abstract":"<jats:p>Root cause analysis (RCA) identifies the faults and vulnerabilities underlying software failures, informing better design and maintenance decisions. Earlier approaches typically framed RCA as a classification task, predicting coarse categories of root causes. With recent advances in large language models (LLMs), RCA can be treated as a generative task that produces natural language explanations of faults. We introduce RCEGen, a framework that leverages state-of-the-art open-source LLMs to generate root cause explanations (RCEs) directly from bug reports. Using 298 reports, we evaluated five LLMs in conjunction with human developers and LLM judges across three key aspects: correctness, clarity, and reasoning depth. Qwen2.5-Coder-Instruct achieved the strongest performance (correctness \u2248 0.89, clarity \u2248 0.88, reasoning \u2248 0.65, overall \u2248 0.79), and RCEs exhibited high semantic fidelity (CodeBERTScore \u2248 0.98) to developer-written references despite low lexical overlap. 
The results demonstrated that LLMs achieve high accuracy in root cause identification from bug report titles and descriptions, particularly when reports contained error logs and reproduction steps.<\/jats:p>","DOI":"10.3390\/software4040029","type":"journal-article","created":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T14:55:15Z","timestamp":1762527315000},"page":"29","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["RCEGen: A Generative Approach for Automated Root Cause Analysis Using Large Language Models (LLMs)"],"prefix":"10.3390","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-9714-1877","authenticated-orcid":false,"given":"Rubel Hassan","family":"Mollik","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, College of Engineering, University of North Texas, Denton, TX 76207, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-4875-3762","authenticated-orcid":false,"given":"Arup","family":"Datta","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, College of Engineering, University of North Texas, Denton, TX 76207, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-5867-8699","authenticated-orcid":false,"given":"Anamul Haque","family":"Mollah","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, College of Engineering, University of North Texas, Denton, TX 76207, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6700-719X","authenticated-orcid":false,"given":"Wajdi","family":"Aljedaani","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, College of Engineering, University of North Texas, Denton, TX 76207, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,7]]},"reference":[{"key":"ref_1","first-page":"45","article-title":"Root cause analysis for beginners","volume":"37","author":"Rooney","year":"2004","journal-title":"Qual. Prog."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Van Moll, J., Jacobs, J., Freimut, B., and Trienekens, J. (2002, January 6\u20138). The importance of life cycle modeling to defect detection and prevention. Proceedings of the 10th International Workshop on Software Technology and Engineering Practice, Montreal, QC, Canada.","DOI":"10.1109\/STEP.2002.1267624"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Adeel, K., Ahmad, S., and Akhtar, S. (2005, January 27). Defect prevention techniques and its usage in requirements gathering-industry practices. Proceedings of the 2005 Student Conference on Engineering Sciences and Technology, Karachi, Pakistan.","DOI":"10.1109\/SCONEST.2005.4382875"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Davies, S., and Roper, M. (2014, January 18\u201319). What\u2019s in a bug report?. Proceedings of the 8th ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement, Torino, Italy.","DOI":"10.1145\/2652524.2652541"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Xia, X., Lo, D., Wang, X., and Zhou, B. (2013, January 14\u201317). Accurate developer recommendation for bug resolution. Proceedings of the 2013 20th Working Conference on Reverse Engineering (WCRE), Koblenz, Germany.","DOI":"10.1109\/WCRE.2013.6671282"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Bettenburg, N., Just, S., Schr\u00f6ter, A., Weiss, C., Premraj, R., and Zimmermann, T. (2008, January 9\u201314). What makes a good bug report?. 
Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Atlanta, GA, USA.","DOI":"10.1145\/1453101.1453146"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2492248.2492263","article-title":"Empirical study of root cause analysis of software failure","volume":"38","author":"Dalal","year":"2013","journal-title":"ACM SIGSOFT Softw. Eng. Notes"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Baysal, O., Holmes, R., and Godfrey, M.W. (2012, January 5). Revisiting bug triage and resolution practices. Proceedings of the 2012 First International Workshop on User Evaluation for Software Engineering Researchers (USER), Zurich, Switzerland.","DOI":"10.1109\/USER.2012.6226578"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Lal, H., and Pahwa, G. (2017, January 12\u201313). Root cause analysis of software bugs using machine learning techniques. Proceedings of the 2017 7th International Conference on Cloud Computing, Data Science & Engineering-Confluence, Noida, India.","DOI":"10.1109\/CONFLUENCE.2017.7943132"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Hirsch, T., and Hofer, B. (2020, January 12\u201315). Root cause prediction based on bug reports. 
Proceedings of the 2020 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Coimbra, Portugal.","DOI":"10.1109\/ISSREW51248.2020.00067"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"100189","DOI":"10.1016\/j.array.2022.100189","article-title":"Using textual bug reports to predict the fault category of software bugs","volume":"15","author":"Hirsch","year":"2022","journal-title":"Array"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"63916","DOI":"10.1109\/ACCESS.2023.3288156","article-title":"Nature-based prediction model of bug reports based on Ensemble Machine Learning Model","volume":"11","author":"Alsaedi","year":"2023","journal-title":"IEEE Access"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"985","DOI":"10.1007\/s11219-024-09675-3","article-title":"LLM-BRC: A large language model-based bug report classification framework","volume":"32","author":"Du","year":"2024","journal-title":"Softw. Qual. J."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.1007\/s10664-013-9258-8","article-title":"Bug characteristics in open source software","volume":"19","author":"Tan","year":"2014","journal-title":"Empir. Softw. Eng."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Thung, F., Lo, D., and Jiang, L. (2013, January 14\u201317). Automatic recovery of root causes from bug-fixing changes. Proceedings of the 2013 20th Working Conference on Reverse Engineering (WCRE), Koblenz, Germany.","DOI":"10.1109\/WCRE.2013.6671284"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Kawrykow, D., and Robillard, M.P. (2011, January 21\u201328). Non-essential changes in version histories. 
Proceedings of the 33rd International Conference on Software Engineering, Honolulu, HI, USA.","DOI":"10.1145\/1985793.1985842"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"110538","DOI":"10.1016\/j.jss.2020.110538","article-title":"Analyzing bug fix for automatic bug cause classification","volume":"163","author":"Ni","year":"2020","journal-title":"J. Syst. Softw."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1109\/32.177364","article-title":"Orthogonal defect classification-a concept for in-process measurements","volume":"18","author":"Chillarege","year":"1992","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1109\/TSE.2007.70731","article-title":"Change distilling: Tree differencing for fine-grained source code change extraction","volume":"33","author":"Fluri","year":"2007","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Falleri, J.R., Morandat, F., Blanc, X., Martinez, M., and Monperrus, M. (2014, January 15\u201319). Fine-grained and accurate source code differencing. Proceedings of the 29th ACM\/IEEE International Conference on Automated Software Engineering, V\u00e4ster\u00e5s, Sweden.","DOI":"10.1145\/2642937.2642982"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhou, B., Neamtiu, I., and Gupta, R. (2015, January 27\u201329). Predicting concurrency bugs: How many, what kind and where are they?. 
Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering, Nanjing, China.","DOI":"10.1145\/2745802.2745807"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"50496","DOI":"10.1109\/ACCESS.2021.3069248","article-title":"Capbug-a framework for automatic bug categorization and prioritization using nlp and machine learning algorithms","volume":"9","author":"Ahmed","year":"2021","journal-title":"IEEE Access"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Tabassum, N., Namoun, A., Alyas, T., Tufail, A., Taqi, M., and Kim, K.H. (2023). Classification of bugs in cloud computing applications using machine learning techniques. Appl. Sci., 13.","DOI":"10.3390\/app13052880"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Limsettho, N., Hata, H., Monden, A., and Matsumoto, K. (2014, January 12\u201313). Automatic unsupervised bug report categorization. Proceedings of the 2014 6th International Workshop on Empirical Software Engineering in Practice, Osaka, Japan.","DOI":"10.1109\/IWESEP.2014.8"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1027","DOI":"10.1142\/S0218194016500352","article-title":"Unsupervised bug report categorization using clustering and labeling algorithm","volume":"26","author":"Limsettho","year":"2016","journal-title":"Int. J. Softw. Eng. Knowl. Eng."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1049\/sfw2.12073","article-title":"An unsupervised cross project model for crashing fault residence identification","volume":"16","author":"Liu","year":"2022","journal-title":"IET Softw."},{"key":"ref_27","unstructured":"(2010). IEEE Standard Classification for Software Anomalies (Standard No. IEEE Std 1044-2009)."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Catolino, G., Palomba, F., Zaidman, A., and Ferrucci, F. (2019). Not all bugs are the same: Understanding, characterizing, and classifying the root cause of bugs. 
arXiv.","DOI":"10.1016\/j.jss.2019.03.002"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Ahmed, T., Pai, K.S., Devanbu, P., and Barr, E. (2024, January 14\u201320). Automatic semantic augmentation of language model prompts (for code summarization). Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering, Lisbon, Portugal.","DOI":"10.1145\/3597503.3639183"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Jin, M., Shahriar, S., Tufano, M., Shi, X., Lu, S., Sundaresan, N., and Svyatkovskiy, A. (2023, January 3\u20139). Inferfix: End-to-end program repair with llms. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, San Francisco, CA, USA.","DOI":"10.1145\/3611643.3613892"},{"key":"ref_31","unstructured":"Plein, L., and Bissyand\u00e9, T.F. (2023). Can llms demystify bug reports?. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"78562","DOI":"10.1109\/ACCESS.2024.3397326","article-title":"SUMLLAMA: Efficient Contrastive Representations and Fine-Tuned Adapters for Bug Report Summarization","volume":"12","author":"Xiang","year":"2024","journal-title":"IEEE Access"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, X., Ghosh, S., Bansal, C., Wang, R., Ma, M., Kang, Y., and Rajmohan, S. (2024, January 15\u201319). Automated root causing of cloud incidents using in-context learning with GPT-4. Proceedings of the Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, Porto de Galinhas, Brazil.","DOI":"10.1145\/3663529.3663846"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Du, X., Li, C., Ma, X., and Zheng, Z. (2024, January 14\u201320). How Does Pre-trained Language Model Perform on Deep Learning Framework Bug Prediction?. 
Proceedings of the 2024 IEEE\/ACM 46th International Conference on Software Engineering: Companion Proceedings, Lisbon, Portugal.","DOI":"10.1145\/3639478.3643113"},{"key":"ref_35","unstructured":"Kumar, A., Haiduc, S., Das, P.P., and Chakrabarti, P.P. (2024). LLMs as Evaluators: A Novel Approach to Evaluate Bug Report Summarization. arXiv."},{"key":"ref_36","first-page":"21558","article-title":"Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation","volume":"36","author":"Liu","year":"2023","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_37","unstructured":"Hui, B., Yang, J., Cui, Z., Yang, J., Liu, D., Zhang, L., Liu, T., Zhang, J., Yu, B., and Lu, K. (2024). Qwen2.5-Coder Technical Report. arXiv."},{"key":"ref_38","unstructured":"Guo, D., Zhu, Q., Yang, D., Xie, Z., Dong, K., Zhang, W., Chen, G., Bi, X., Wu, Y., and Li, Y.K. (2024). DeepSeek-Coder: When the Large Language Model Meets Programming\u2014The Rise of Code Intelligence. arXiv."},{"key":"ref_39","unstructured":"AI, M. (2025, July 03). Codestral: A State-of-the-Art Code Language Model. Available online: https:\/\/mistral.ai\/news\/codestral\/."},{"key":"ref_40","unstructured":"Rozi\u00e8re, B., Gehring, J., Gloeckle, F., Sootla, S., Gat, I., Tan, X.E., Adi, Y., Liu, J., Sauvestre, R., and Remez, T. (2024). Code Llama: Open Foundation Models for Code. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Huang, S., Cheng, T., Liu, J.K., Hao, J., Song, L., Xu, Y., Yang, J., Liu, J., Zhang, C., and Chai, L. (2025). OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models. arXiv.","DOI":"10.18653\/v1\/2025.acl-long.1591"},{"key":"ref_42","unstructured":"Hurst, A., Lerer, A., Goucher, A.P., Perelman, A., Ramesh, A., Clark, A., Ostrow, A., Welihinda, A., Hayes, A., and Radford, A. (2024). GPT-4o System Card. 
arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1955","DOI":"10.1145\/3728963","article-title":"Can llms replace human evaluators? an empirical study of llm-as-a-judge in software engineering","volume":"2","author":"Wang","year":"2025","journal-title":"Proc. ACM Softw. Eng."},{"key":"ref_44","unstructured":"Tan, S., Zhuang, S., Montgomery, K., Tang, W.Y., Cuadron, A., Wang, C., Popa, R.A., and Stoica, I. (2024). Judgebench: A benchmark for evaluating llm-based judges. arXiv."},{"key":"ref_45","unstructured":"Liu, A., Feng, B., Xue, B., Wang, B., Wu, B., Lu, C., Zhao, C., Deng, C., Zhang, C., and Ruan, C. (2025). DeepSeek-V3 Technical Report. arXiv."},{"key":"ref_46","unstructured":"Yamane, T. (1973). Statistics: An Introductory Analysis, John Weatherhill, Inc."}],"container-title":["Software"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2674-113X\/4\/4\/29\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,11]],"date-time":"2025-11-11T05:14:06Z","timestamp":1762838046000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2674-113X\/4\/4\/29"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,7]]},"references-count":46,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["software4040029"],"URL":"https:\/\/doi.org\/10.3390\/software4040029","relation":{},"ISSN":["2674-113X"],"issn-type":[{"value":"2674-113X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,7]]}}}