{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T17:59:39Z","timestamp":1777485579186,"version":"3.51.4"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,6,13]],"date-time":"2024-06-13T00:00:00Z","timestamp":1718236800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,13]],"date-time":"2024-06-13T00:00:00Z","timestamp":1718236800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100016135","name":"Universit\u00e4t Passau","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100016135","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2024,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Context<\/jats:title>\n                <jats:p>The identification of bugs within issues reported to an issue tracking system is crucial for triage. Machine learning models have shown promising results for this task. However, we have only limited knowledge of how such models identify bugs. Explainable AI methods like LIME and SHAP can be used to increase this knowledge.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Objective<\/jats:title>\n                <jats:p>We want to understand if explainable AI provides explanations that are reasonable to us as humans and align with our assumptions about the model\u2019s decision-making. We also want to know if the quality of predictions is correlated with the quality of explanations.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>We conduct a study where we rate LIME and SHAP explanations based on their quality of explaining the outcome of an issue type prediction model. For this, we rate the quality of the explanations, i.e., if they align with our expectations and help us understand the underlying machine learning model.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We found that both LIME and SHAP give reasonable explanations and that correct predictions are well explained. Further, we found that SHAP outperforms LIME due to a lower ambiguity and a higher contextuality that can be attributed to the ability of the deep SHAP variant to capture sentence fragments.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>We conclude that the model finds explainable signals for both bugs and non-bugs. Also, we recommend that research dealing with the quality of explanations for classification tasks reports and investigates rater agreement, since the rating of explanations is highly subjective.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1007\/s10664-024-10469-1","type":"journal-article","created":{"date-parts":[[2024,6,13]],"date-time":"2024-06-13T08:01:51Z","timestamp":1718265711000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Studying the explanations for the automated prediction of bug and non-bug issues using LIME and SHAP"],"prefix":"10.1007","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9336-2075","authenticated-orcid":false,"given":"Lukas","family":"Schulte","sequence":"first","affiliation":[]},{"given":"Benjamin","family":"Ledel","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9765-2803","authenticated-orcid":false,"given":"Steffen","family":"Herbold","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,6,13]]},"reference":[{"key":"10469_CR1","unstructured":"Alvarez-Melis D, Jaakkola TS (2018) On the robustness of interpretability methods. 1806.08049"},{"key":"10469_CR2","unstructured":"Biran O, Cotton C (2017) Explanation and justification in machine learning: a survey. In: IJCAI-17 workshop on explainable AI (XAI), vol\u00a08. pp 8\u201313"},{"key":"10469_CR3","volume-title":"Statistical Power Analysis for the Behavioral Sciences","author":"J Cohen","year":"1988","unstructured":"Cohen J (1988) Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates"},{"key":"10469_CR4","unstructured":"Cook TD, Campbell DT, Day A (1979) Quasi-experimentation: Design & analysis issues for field settings, vol 351. Houghton Mifflin Boston"},{"issue":"3","key":"10469_CR5","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1007\/BF02310555","volume":"16","author":"LJ Cronbach","year":"1951","unstructured":"Cronbach LJ (1951) Coefficient alpha and the internal structure of tests. Psychometrika 16(3):297\u2013334","journal-title":"Psychometrika"},{"issue":"5","key":"10469_CR6","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1037\/h0031619","volume":"76","author":"JL Fleiss","year":"1971","unstructured":"Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378","journal-title":"Psychol Bull"},{"key":"10469_CR7","unstructured":"Garreau D, von Luxburg U (2020) Looking deeper into tabular lime. 2008.11092"},{"key":"10469_CR8","doi-asserted-by":"crossref","unstructured":"Goyal Y, Khot T, Summers-Stay D, Batra D, Parikh D (2017) Making the v in vqa matter: Elevating the role of image understanding in visual question answering. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 6904\u20136913","DOI":"10.1109\/CVPR.2017.670"},{"issue":"5","key":"10469_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3236009","volume":"51","author":"R Guidotti","year":"2018","unstructured":"Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D (2018) A survey of methods for explaining black box models. ACM Comput Survey 51(5):1\u201342","journal-title":"ACM Comput Survey"},{"issue":"6","key":"10469_CR10","doi-asserted-by":"publisher","first-page":"5333","DOI":"10.1007\/s10664-020-09885-w","volume":"25","author":"S Herbold","year":"2020","unstructured":"Herbold S, Trautsch A, Trautsch F (2020) On the feasibility of automated prediction of bug and non-bug issues. Empir Softw Eng 25(6):5333\u20135369","journal-title":"Empir Softw Eng"},{"issue":"2","key":"10469_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10664-021-10092-4","volume":"27","author":"S Herbold","year":"2022","unstructured":"Herbold S, Trautsch A, Trautsch F, Ledel B (2022) Problems with szz and features: An empirical study of the state of practice of defect prediction data collection. Empir Softw Eng 27(2):1\u201349","journal-title":"Empir Softw Eng"},{"key":"10469_CR12","doi-asserted-by":"crossref","unstructured":"Herzig K, Just S, Zeller A (2013) It\u2019s not a bug, it\u2019s a feature: how misclassification impacts bug prediction. In: 2013 35th international conference on software engineering (ICSE). IEEE, pp 392\u2013401","DOI":"10.1109\/ICSE.2013.6606585"},{"key":"10469_CR13","doi-asserted-by":"crossref","unstructured":"Jiarpakdee J, Tantithamthavorn CK, Grundy J (2021) Practitioner\u2019s perceptions of the goals and visual explanations of defect prediction models. In: 2021 IEEE\/ACM 18th International Conference on Mining Software Repositories (MSR). IEEE, pp 432\u2013443","DOI":"10.1109\/MSR52588.2021.00055"},{"key":"10469_CR14","unstructured":"Kokalj E, \u0160krlj B, Lavra\u010d N, Pollak S, Robnik-\u0160ikonja M (2021) Bert meets shapley: Extending shap explanations to transformer-based classifiers. In: Proceedings of the EACL hackashop on news media content analysis and automated report generation. pp 16\u201321"},{"key":"10469_CR15","unstructured":"Ledel B, Herbold S (2022) Studying the explanations for the automated prediction of bug and non-bug issues using lime and shap. arXiv:2209.07623"},{"key":"10469_CR16","unstructured":"Likert R (1932) A technique for the measurement of attitudes. Archives of psychology"},{"key":"10469_CR17","unstructured":"Lundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. Adv Neural Inform Process Syst 30"},{"issue":"4","key":"10469_CR18","doi-asserted-by":"publisher","first-page":"1982","DOI":"10.1007\/s10664-017-9574-5","volume":"23","author":"L Madeyski","year":"2018","unstructured":"Madeyski L, Kitchenham B (2018) Effect sizes and their variance for ab\/ba crossover design studies. Empir Softw Eng 23(4):1982\u20132017","journal-title":"Empir Softw Eng"},{"key":"10469_CR19","doi-asserted-by":"publisher","first-page":"104041","DOI":"10.1016\/j.compbiomed.2020.104041","volume":"126","author":"PR Magesh","year":"2020","unstructured":"Magesh PR, Myloth RD, Tom RJ (2020) An explainable machine learning model for early detection of parkinson\u2019s disease using lime on datscan imagery. Comput Biol Med 126:104041","journal-title":"Comput Biol Med"},{"issue":"1","key":"10469_CR20","doi-asserted-by":"publisher","first-page":"127","DOI":"10.3905\/jfds.2020.1.047","volume":"3","author":"X Man","year":"2021","unstructured":"Man X, Chan EP (2021) The best way to select features? comparing mda, lime, and shap. J Financial Data Sci 3(1):127\u2013139","journal-title":"J Financial Data Sci"},{"key":"10469_CR21","doi-asserted-by":"crossref","unstructured":"Mills C, Pantiuchina J, Parra E, Bavota G, Haiduc S (2018) Are bug reports enough for text retrieval-based bug localization? In: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, pp 381\u2013392","DOI":"10.1109\/ICSME.2018.00046"},{"key":"10469_CR22","unstructured":"Mishra S, Sturm BL, Dixon S (2017) Local interpretable model-agnostic explanations for music content analysis. In: ISMIR. pp 537\u2013543"},{"key":"10469_CR23","doi-asserted-by":"crossref","unstructured":"Palacio DN, McCrystal D, Moran K, Bernal-C\u00e1rdenas C, Poshyvanyk D, Shenefiel C (2019) Learning to identify security-related issues using convolutional neural networks. In: 2019 IEEE International conference on software maintenance and evolution (ICSME). IEEE, pp 140\u2013144","DOI":"10.1109\/ICSME.2019.00024"},{"key":"10469_CR24","doi-asserted-by":"crossref","unstructured":"Palatnik de Sousa I, Vellasco Maria Bernardes Rebuzzi, M, Costa da Silva E (2019) Local interpretable model-agnostic explanations for classification of lymph node metastases. Sensors 19(13):2969","DOI":"10.3390\/s19132969"},{"key":"10469_CR25","doi-asserted-by":"crossref","unstructured":"Pornprasit C, Tantithamthavorn C, Jiarpakdee J, Fu M, Thongtanunam P (2021) Pyexplainer: Explaining the predictions of just-in-time defect models. In: 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE), IEEE, pp 407\u2013418","DOI":"10.1109\/ASE51524.2021.9678763"},{"key":"10469_CR26","doi-asserted-by":"crossref","unstructured":"Ribeiro MT, Singh S, Guestrin C (2016a) \" why should i trust you?\" explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. pp 1135\u20131144","DOI":"10.1145\/2939672.2939778"},{"key":"10469_CR27","unstructured":"Ribeiro MT, Singh S, Guestrin C (2016b) Model-agnostic interpretability of machine learning. 1606.05386"},{"key":"10469_CR28","doi-asserted-by":"crossref","unstructured":"Roy S, Laberge G, Roy B, Khomh F, Nikanjam A, Mondal S (2022) Why don\u2019t xai techniques agree? characterizing the disagreements between post-hoc explanations of defect predictions. In: 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, pp 444\u2013448","DOI":"10.1109\/ICSME55016.2022.00056"},{"issue":"2","key":"10469_CR29","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/s10664-008-9102-8","volume":"14","author":"P Runeson","year":"2009","unstructured":"Runeson P, H\u00f6st M (2009) Guidelines for conducting and reporting case study research in software engineering. Empir Softw Eng 14(2):131\u2013164","journal-title":"Empir Softw Eng"},{"key":"10469_CR30","doi-asserted-by":"publisher","first-page":"34","DOI":"10.3389\/fpsyg.2012.00034","volume":"3","author":"Y Sheng","year":"2012","unstructured":"Sheng Y, Sheng Z (2012) Is coefficient alpha robust to non-normal data? Front Psychol 3:34","journal-title":"Front Psychol"},{"key":"10469_CR31","doi-asserted-by":"crossref","unstructured":"Slack D, Hilgard S, Jia E, Singh S, Lakkaraju H (2020) Fooling lime and shap: Adversarial attacks on post hoc explanation methods. In: Proceedings of the AAAI\/ACM conference on AI, ethics, and society. pp 180\u2013186","DOI":"10.1145\/3375627.3375830"},{"issue":"1","key":"10469_CR32","doi-asserted-by":"publisher","first-page":"72","DOI":"10.2307\/1412159","volume":"15","author":"C Spearman","year":"1904","unstructured":"Spearman C (1904) The proof and measurement of association between two things. Am J Psychol 15(1):72. https:\/\/doi.org\/10.2307\/1412159","journal-title":"Am J Psychol"},{"issue":"6","key":"10469_CR33","doi-asserted-by":"publisher","first-page":"1273","DOI":"10.1007\/s11165-016-9602-2","volume":"48","author":"KS Taber","year":"2017","unstructured":"Taber KS (2017) The use of cronbach\u2019s alpha when developing and reporting research instruments in science education. Res Sci Educ 48(6):1273\u20131296. https:\/\/doi.org\/10.1007\/s11165-016-9602-2","journal-title":"Res Sci Educ"},{"key":"10469_CR34","doi-asserted-by":"crossref","unstructured":"Taherdoost H (2016) Sampling methods in research methodology; how to choose a sampling technique for research. How to choose a sampling technique for research (April 10, 2016)","DOI":"10.2139\/ssrn.3205035"},{"key":"10469_CR35","doi-asserted-by":"crossref","unstructured":"Tantithamthavorn CK, Jiarpakdee J (2021) Explainable ai for software engineering. In: 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). IEEE, pp 1\u20132","DOI":"10.1109\/ASE51524.2021.9678580"},{"key":"10469_CR36","doi-asserted-by":"crossref","unstructured":"Trautsch A, Herbold S (2022) Predicting issue types with sebert. In: Proceedings of the 1st international workshop on natural language-based software engineering. pp 37\u201339","DOI":"10.1145\/3528588.3528661"},{"key":"10469_CR37","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Lu, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, Curran Associates, Inc., vol\u00a030. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2017\/file\/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf"},{"key":"10469_CR38","doi-asserted-by":"crossref","unstructured":"Visani G, Bagli E, Chesani F, Poluzzi A, Capuzzo D (2020) Statistical stability indices for lime: obtaining reliable explanations for machine learning models. Journal of the Operational Research Society pp 1\u201311","DOI":"10.1080\/01605682.2020.1865846"},{"issue":"4","key":"10469_CR39","doi-asserted-by":"publisher","first-page":"1487","DOI":"10.1109\/TSE.2022.3178469","volume":"49","author":"J Von der Mosel","year":"2022","unstructured":"Von der Mosel J, Trautsch A, Herbold S (2022) On the validity of pre-trained transformers for natural language processing in the software engineering domain. IEEE Trans Software Eng 49(4):1487\u20131507","journal-title":"IEEE Trans Software Eng"},{"key":"10469_CR40","unstructured":"Wattanakriengkrai S, Thongtanunam P, Tantithamthavorn C, Hata H, ichi Matsumoto K (2020) Predicting defective lines using a model-agnostic technique. arXiv:2009.03612"},{"key":"10469_CR41","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-29044-2","volume-title":"Experimentation in Software Engineering","author":"C Wohlin","year":"2012","unstructured":"Wohlin C, Runeson P, H\u00f6st M, Ohlsson MC, Regnell B, Wesslen A (2012) Experimentation in Software Engineering. Springer Publishing Company, Incorporated"},{"key":"10469_CR42","unstructured":"Zhang Y, Song K, Sun Y, Tan S, Udell M (2019) Why should you trust my explanation? Understanding Uncertainty in LIME Explanations"}],"container-title":["Empirical Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-024-10469-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10664-024-10469-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-024-10469-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,5]],"date-time":"2024-07-05T15:21:16Z","timestamp":1720192876000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10664-024-10469-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,13]]},"references-count":42,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,7]]}},"alternative-id":["10469"],"URL":"https:\/\/doi.org\/10.1007\/s10664-024-10469-1","relation":{},"ISSN":["1382-3256","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,13]]},"assertion":[{"value":"27 February 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 June 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no competing interests to declare that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflicts of interest\/Competing interests"}}],"article-number":"93"}}