{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T02:32:59Z","timestamp":1768012379408,"version":"3.49.0"},"reference-count":60,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,6,27]],"date-time":"2024-06-27T00:00:00Z","timestamp":1719446400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP210100041"],"award-info":[{"award-number":["DP210100041"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Faculty Postgraduate Publications Award"},{"name":"Faculty of Information Technology of Monash University"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2024,7,31]]},"abstract":"<jats:p>Defect predictors, static bug detectors, and humans inspecting the code can propose locations in the program that are more likely to be buggy before they are discovered through testing. Automated test generators such as search-based software testing (SBST) techniques can use this information to direct their search for test cases to likely buggy code, thus speeding up the process of detecting existing bugs in those locations. Often the predictions given by these tools or humans are imprecise, which can misguide the SBST technique and may deteriorate its performance. In this article, we study the impact of imprecision in defect prediction on the bug detection effectiveness of SBST.<\/jats:p>\n          <jats:p>Our study finds that the recall of the defect predictor, i.e., the proportion of correctly identified buggy code, has a significant impact on bug detection effectiveness of SBST with a large effect size. More precisely, the SBST technique detects 7.5 fewer bugs on average (out of 420 bugs) for every 5% decrements of the recall. However, the effect of precision, a measure for false alarms, is not of meaningful practical significance, as indicated by a very small effect size.<\/jats:p>\n          <jats:p>In the context of combining defect prediction and SBST, our recommendation is to increase the recall of defect predictors as a primary objective and precision as a secondary objective. In our experiments, we find that 75% precision is as good as 100% precision. To account for the imprecision of defect predictors, in particular low recall values, SBST techniques should be designed to search for test cases that also cover the predicted non-buggy parts of the program, while prioritising the parts that have been predicted as buggy.<\/jats:p>","DOI":"10.1145\/3655022","type":"journal-article","created":{"date-parts":[[2024,4,4]],"date-time":"2024-04-04T12:18:39Z","timestamp":1712233119000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["On the Impact of Lower Recall and Precision in Defect Prediction for Guiding Search-based Software Testing"],"prefix":"10.1145","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5080-9276","authenticated-orcid":false,"given":"Anjana","family":"Perera","sequence":"first","affiliation":[{"name":"Faculty of Information Technology, Monash University, Melbourne, Australia and Oracle Labs, Brisbane, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1511-2163","authenticated-orcid":false,"given":"Burak","family":"Turhan","sequence":"additional","affiliation":[{"name":"Faculty of Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland and Monash University, Melbourne, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1716-690X","authenticated-orcid":false,"given":"Aldeida","family":"Aleti","sequence":"additional","affiliation":[{"name":"Faculty of Information Technology, Monash University, Melbourne, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4470-1824","authenticated-orcid":false,"given":"Marcel","family":"B\u00f6hme","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Security and Privacy, Bochum, Germany and Monash University, Melbourne, Australia"}]}],"member":"320","published-online":{"date-parts":[[2024,6,27]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/SCAM.2012.28"},{"key":"e_1_3_1_3_2","article-title":"E-APR: Mapping the effectiveness of automated program repair","author":"Aleti Aldeida","year":"2020","unstructured":"Aldeida Aleti and Matias Martinez. 2020. E-APR: Mapping the effectiveness of automated program repair. arXiv preprint arXiv:2002.03968 (2020).","journal-title":"arXiv preprint arXiv:2002.03968"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-SEIP.2017.27"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-013-9249-9"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/MS.2008.130"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11219-016-9353-3"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2006.888386"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/2642937.2643002"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1177\/001316447503500304"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-019-09778-7"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1037\/0033-2909.112.1.155"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSR.2019.00017"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/32.92910"},{"key":"e_1_3_1_15_2","unstructured":"EvoSuite. 2019. EvoSuite - Automated Generation of JUnit Test Suites for Java Classes. Retrieved from https:\/\/github.com\/EvoSuite\/evosuite"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.3758\/BF03193146"},{"key":"e_1_3_1_17_2","unstructured":"Martin Fowler and Matthew Foemmel. 2006. Continuous Integration. Retrieved from https:\/\/www.martinfowler.com\/articles\/continuousIntegration.html"},{"key":"e_1_3_1_18_2","unstructured":"Gordon Fraser. 2018. EvoSuite - Automatic Test Suite Generation for Java. Retrieved from http:\/\/www.evosuite.org\/"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/QSIC.2011.19"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2012.14"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICST.2017.38"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/2372251.2372285"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2011.103"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.5555\/2337223.2337247"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2019.2957794"},{"key":"e_1_3_1_26_2","article-title":"Large-scale manual validation of bug fixing commits: A fine-grained analysis of tangling","volume":"2011","author":"Herbold Steffen","year":"2020","unstructured":"Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Krist\u00f3f Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodr\u00edguez-P\u00e9rez, Ricardo Colomo Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu, Diego Marcilio, Omar Alam, Abdullah Aldaeej, Idan Amit, Burak Turhan, Simon Eismann, Anna-Katharina Wickert, Ivano Malavolta, Mat\u00fas Sul\u00edr, Fatemeh Fard, Austin Z. Henley, Stratos Kourtzanidis, Eray Tuzun, Christoph Treude, Simin Maleki Shamasbi, Ivan Pashchenko, Marvin Wyrich, James Davis, Alexander Serebrenik, Ella Albrecht, Ethem Utku Aktas, Daniel Str\u00fcber, and Johannes Erbel. 2020. Large-scale manual validation of bug fixing commits: A fine-grained analysis of tangling. CoRR abs\/2011.06244 (2020).","journal-title":"CoRR"},{"key":"e_1_3_1_27_2","volume-title":"Proceedings of the 30th International Workshop on Principles of Diagnosis DX\u201919","author":"Hershkovich Eran","year":"2019","unstructured":"Eran Hershkovich, Roni Stern, Rui Abreu, and Amir Elmishali. 2019. Prediction-guided software test generation. In Proceedings of the 30th International Workshop on Principles of Diagnosis DX\u201919."},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2770124"},{"key":"e_1_3_1_29_2","unstructured":"Rene Just. 2019. Defects4J\u2014A Database of Real Faults and an Experimental Infrastructure to Enable Controlled Experiments in Software Engineering Research. Retrieved from https:\/\/github.com\/rjust\/defects4j"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/2610384.2628055"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/2635868.2635929"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/MS.2005.149"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2013.6606583"},{"key":"e_1_3_1_34_2","unstructured":"Chris Lewis and Rong Ou. 2011. Bug Prediction at Google. Retrieved from http:\/\/google-engtools.blogspot.com\/2011\/12\/"},{"key":"e_1_3_1_35_2","volume-title":"Comparing Welch ANOVA, a Kruskal-Wallis Test, and Traditional ANOVA in Case of Heterogeneity of Variance","author":"Liu Hangcheng","year":"2015","unstructured":"Hangcheng Liu. 2015. Comparing Welch ANOVA, a Kruskal-Wallis Test, and Traditional ANOVA in Case of Heterogeneity of Variance. Virginia Commonwealth University."},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/SANER.2019.8667991"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1951.10500769"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2007.70721"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.21236\/ADA143533"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/32.57623"},{"key":"e_1_3_1_41_2","unstructured":"A. J. V. Offutt. 1989. Automatic test data generation. (1989). Georgia Institute of Technology Tech. Rep."},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICST.2015.7102604"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2663435"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2018.08.009"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICST.2019.00041"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2017.62"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3324884.3416612"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2022.3147008"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-015-9424-2"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSM.1999.792604"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2015.76"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3196398.3196473"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1002\/stvr.1701"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2015.86"},{"key":"e_1_3_1_55_2","article-title":"Empirical evaluation of fault localisation using code and change metrics","author":"Sohn Joengju","year":"2019","unstructured":"Joengju Sohn and Shin Yoo. 2019. Empirical evaluation of fault localisation using code and change metrics. IEEE Trans. Softw. Eng. 47, 8 (2019).","journal-title":"IEEE Trans. Softw. Eng."},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.2307\/3001913"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2018.2877678"},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/3383219.3383232"},{"issue":"3","key":"e_1_3_1_59_2","first-page":"295","article-title":"Which effect size measure is appropriate for one-way and two-way ANOVA models? A Monte Carlo simulation study","volume":"16","author":"Yigit Soner","year":"2018","unstructured":"Soner Yigit and Mehmet Mendes. 2018. Which effect size measure is appropriate for one-way and two-way ANOVA models? A Monte Carlo simulation study. Revstat Stat. J. 16, 3 (2018), 295\u2013313.","journal-title":"Revstat Stat. J."},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2007.70706"},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/1595696.1595713"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3655022","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3655022","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:51Z","timestamp":1750291431000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3655022"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,27]]},"references-count":60,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,7,31]]}},"alternative-id":["10.1145\/3655022"],"URL":"https:\/\/doi.org\/10.1145\/3655022","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,27]]},"assertion":[{"value":"2022-09-06","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-02-29","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}