{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T22:56:18Z","timestamp":1773442578152,"version":"3.50.1"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,4,18]],"date-time":"2024-04-18T00:00:00Z","timestamp":1713398400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62032004, 61902050, 62202079"],"award-info":[{"award-number":["62032004, 61902050, 62202079"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Dalian Excellent Young Project","award":["2022RY35"],"award-info":[{"award-number":["2022RY35"]}]},{"name":"Postgraduate Education Reform Project of Liaoning Province","award":["Liao Jiao Tong [2023] No. 385-151"],"award-info":[{"award-number":["Liao Jiao Tong [2023] No. 385-151"]}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","award":["3132023255, 3132023257, DUT22RC(3)028, DUT22ZD101"],"award-info":[{"award-number":["3132023255, 3132023257, DUT22RC(3)028, DUT22ZD101"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Dalian Maritime University Teacher Development Project","award":["JF2023Y05"],"award-info":[{"award-number":["JF2023Y05"]}]},{"name":"China Higher Education Association 2023 Higher Education Scientific Research","award":["23LK0408"],"award-info":[{"award-number":["23LK0408"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. 
Methodol."],"published-print":{"date-parts":[[2024,5,31]]},"abstract":"<jats:p>The aim of Just-In-Time (JIT) defect prediction is to predict software changes that are prone to defects in a project in a timely manner, thereby improving the efficiency of software development and ensuring software quality. Identifying changes that introduce bugs is a critical task in just-in-time defect prediction, and researchers have introduced the SZZ approach and its variants to label these changes. However, it has been shown that different SZZ algorithms introduce noise to the dataset to a certain extent, which may reduce the predictive performance of the model. To address this limitation, we propose the Confident Learning Imbalance (CLI) model. The model identifies and excludes samples whose labels may be corrupted by estimating the joint distribution of noisy labels and true labels, and mitigates the impact of noisy data on the performance of the prediction model. The CLI consists of two components: identifying noisy data (Confident Learning Component) and generating a predicted probability matrix for imbalanced data (Imbalanced Data Probabilistic Prediction Component). The IDPP component generates precise predicted probabilities for each instance in the training set, while the CL component uses the generated predicted probability matrix and noise labels to clean up the noise and build a classification model. 
We evaluate the performance of our model through extensive experiments on a total of 126,526 changes from ten Apache open source projects, and the results show that our model outperforms the baseline methods.<\/jats:p>","DOI":"10.1145\/3637226","type":"journal-article","created":{"date-parts":[[2023,12,11]],"date-time":"2023-12-11T11:26:45Z","timestamp":1702294005000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Estimating Uncertainty in Labeled Changes by SZZ Tools on Just-In-Time Defect Prediction"],"prefix":"10.1145","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8554-6365","authenticated-orcid":false,"given":"Shikai","family":"Guo","sequence":"first","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-6489-8541","authenticated-orcid":false,"given":"Dongmin","family":"Li","sequence":"additional","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3219-1131","authenticated-orcid":false,"given":"Lin","family":"Huang","sequence":"additional","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-2943-4113","authenticated-orcid":false,"given":"Sijia","family":"Lv","sequence":"additional","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5848-6398","authenticated-orcid":false,"given":"Rong","family":"Chen","sequence":"additional","affiliation":[{"name":"Dalian Maritime University, Dalian, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1923-0669","authenticated-orcid":false,"given":"Hui","family":"Li","sequence":"additional","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5068-1938","authenticated-orcid":false,"given":"Xiaochen","family":"Li","sequence":"additional","affiliation":[{"name":"Dalian University of Technology, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8674-4948","authenticated-orcid":false,"given":"He","family":"Jiang","sequence":"additional","affiliation":[{"name":"Dalian University of Technology, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,4,18]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"2021. CLI Details. (2021). Retrieved from https:\/\/github.com\/Andyldm\/CLI\/"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00116829"},{"issue":"5","key":"e_1_3_1_4_2","first-page":"1288","article-title":"Just-in-time software defect prediction: Literature review.","volume":"30","author":"R. Yan M Xia X. Cai L., Fan Y.","year":"2019","unstructured":"Yan M Xia X. Cai L., Fan Y. R.. 2019. Just-in-time software defect prediction: Literature review. Ruan Jian Xue Bao\/Journal of Software 30, 5 (2019), 1288\u20131307. 
Retrieved from http:\/\/www.jos.org.cn\/1000-9825\/5713.html","journal-title":"Ruan Jian Xue Bao\/Journal of Software"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2016.2616306"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-011-9173-9"},{"key":"e_1_3_1_7_2","article-title":"The impact of dormant defects on defect prediction: A study of 19 apache projects.","author":"Davide Falessi","year":"2022","unstructured":"Falessi Davide, Ahluwalia Aalok, and Penta Massimiliano Di. 2022. The impact of dormant defects on defect prediction: A study of 19 apache projects. ACM Transactions on Software Engineering and Methodology (2022).","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"key":"e_1_3_1_8_2","first-page":"973","volume-title":"Proceedings of the 17th International Joint Conference on Artificial Intelligence, IJCAI 2001","author":"Elkan Charles","year":"2001","unstructured":"Charles Elkan. 2001. The foundations of cost-sensitive learning. In Proceedings of the 17th International Joint Conference on Artificial Intelligence, IJCAI 2001. Bernhard Nebel (Ed.), Morgan Kaufmann, 973\u2013978."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2019.2929761"},{"key":"e_1_3_1_10_2","volume-title":"Data Mining: Concepts and Techniques, Second Edition","author":"Han Jiawei","year":"2006","unstructured":"Jiawei Han and Micheline Kamber. 2006. Data Mining: Concepts and Techniques, Second Edition. 
Elsevier."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00524"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2008.239"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSR.2019.00016"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380361"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00342"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2013.6693087"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSM.2010.5609530"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/ESEM.2007.28"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2012.70"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009896203228"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2007.70773"},{"key":"e_1_3_1_22_2","article-title":"Dealing with noise in defect prediction.","author":"Kim Sunghun","year":"2011","unstructured":"Sunghun Kim, Hongyu Zhang, Rongxin Wu, and Liang Gong. 2011. Dealing with noise in defect prediction. ACM International Conference on Software Engineering (2011).","journal-title":"ACM International Conference on Software Engineering"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2006.23"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2008.90"},{"key":"e_1_3_1_25_2","series-title":"Proceedings of Machine Learning Research","first-page":"3128","volume-title":"Proceedings of the 35th International Conference on Machine Learning, ICML 2018.","volume":"80","author":"Lipton Zachary C.","year":"2018","unstructured":"Zachary C. Lipton, Yu-Xiang Wang, and Alexander J. Smola. 2018. Detecting and correcting for label shift with black box predictors. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018. Jennifer G. 
Dy and Andreas Krause (Eds.), Proceedings of Machine Learning Research, Vol. 80, PMLR, 3128\u20133136. Retrieved from http:\/\/proceedings.mlr.press\/v80\/lipton18a.html"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.17"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10515-010-0069-5"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2015.12"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1002\/bltj.2229"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/SANER.2018.8330225"},{"key":"e_1_3_1_31_2","unstructured":"Curtis G. Northcutt Lu Jiang and Isaac L. Chuang. 2019. Confident learning: Estimating uncertainty in dataset labels. arXiv:1911.00068. Retrieved from https:\/\/arxiv.org\/abs\/1911.00068"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSR52588.2021.00049"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-09970-8"},{"key":"e_1_3_1_34_2","article-title":"Watch out for extrinsic bugs! a case study of their impact in just-in-time bug prediction models on the openstack project.","author":"Rodr\u00edguez-P\u00e9rez G.","year":"2020","unstructured":"G. Rodr\u00edguez-P\u00e9rez, M. Nagappan, and G. Robles. 2020. Watch out for extrinsic bugs! a case study of their impact in just-in-time bug prediction models on the openstack project. IEEE Transactions on Software Engineering (2020).","journal-title":"IEEE Transactions on Software Engineering"},{"issue":"6","key":"e_1_3_1_35_2","first-page":"662","article-title":"Investigation of the effects of factor analysis based dimension reduction on classification performance","volume":"28","author":"Shi Hongbo","year":"2007","unstructured":"Hongbo Shi and Yali L\u00fc. 2007. Investigation of the effects of factor analysis based dimension reduction on classification performance. 
Zhongbei Daxue Xuebao (Ziran Kexue Ban)\/Journal of North University of China (Natural Science Edition) 28, 6 (2007), 662\u2013677.","journal-title":"Zhongbei Daxue Xuebao (Ziran Kexue Ban)\/Journal of North University of China (Natural Science Edition)"},{"key":"e_1_3_1_36_2","article-title":"An extensive empirical study of inconsistent labels in multi-version-project defect datasets.","author":"Shiran Liu","year":"2022","unstructured":"Liu Shiran, Zhaoqiang Guo, Yanhui Li, Chuanqi Wang, Lin Chen, Zhongbin Sun, and Yuming Zhou. 2022. An extensive empirical study of inconsistent labels in multi-version-project defect datasets. arXiv.org (2022).","journal-title":"arXiv.org"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/1082983.1083147"},{"key":"e_1_3_1_38_2","volume-title":"Software Maintenance Management:","unstructured":"Swanson and E. Burton. Software Maintenance Management:. Software Maintenance Management:."},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-008-9103-7"},{"key":"e_1_3_1_40_2","unstructured":"Colin Wei Jason D. Lee Qiang Liu and Tengyu Ma. 2018. On the margin theory of feedforward neural networks. arXiv:1810.05369. 
Retrieved from https:\/\/arxiv.org\/abs\/1810.05369"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2017.03.007"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/QRS.2015.14"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950353"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-004-0751-8"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/1368088.1368161"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637226","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3637226","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T23:43:42Z","timestamp":1750290222000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637226"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,18]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,5,31]]}},"alternative-id":["10.1145\/3637226"],"URL":"https:\/\/doi.org\/10.1145\/3637226","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,18]]},"assertion":[{"value":"2022-02-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-04-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}