{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T21:44:26Z","timestamp":1774129466519,"version":"3.50.1"},"reference-count":61,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2025,2,20]],"date-time":"2025-02-20T00:00:00Z","timestamp":1740009600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,3,31]]},"abstract":"<jats:p>\n            In this article, we empirically study the suitability of tests as acceptance criteria for automated program fixes, by checking patches produced by automated repair tools using a bug-finding tool, as opposed to previous works that used tests or manual inspections. We develop a number of experiments in which faulty programs from\n            <jats:italic>IntroClass<\/jats:italic>\n            , a known benchmark for program repair techniques, are fed to the program repair tools GenProg, Angelix, AutoFix, and Nopol, using test suites of varying quality, including those accompanying the benchmark. We then check the produced patches against formal specifications using a bug-finding tool. Our results show that, in the studied scenarios, automated program repair tools are significantly more likely to accept a spurious program fix than producing an actual one. Using bounded-exhaustive suites larger than the originally given ones (with about 100 and 1,000 tests) we verify that overfitting is reduced but (a) few new correct repairs are generated and (b) some tools see their performance reduced by the larger suites and fewer correct repairs are produced. 
Finally, by comparing with previous work, we show that overfitting is underestimated in semantics-based tools and that patches not discarded using held-out tests may be discarded using a bug-finding tool.\n          <\/jats:p>","DOI":"10.1145\/3702971","type":"journal-article","created":{"date-parts":[[2024,11,12]],"date-time":"2024-11-12T15:53:09Z","timestamp":1731426789000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["An Empirical Study on the Suitability of Test-based Patch Acceptance Criteria"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-8804-7331","authenticated-orcid":false,"given":"Luciano","family":"Zem\u00edn","sequence":"first","affiliation":[{"name":"Instituto Tecnol\u00f3gico de Buenos Aires (ITBA), Buenos Aires, Argentina"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9198-185X","authenticated-orcid":false,"given":"Ariel","family":"Godio","sequence":"additional","affiliation":[{"name":"Instituto Tecnol\u00f3gico de Buenos Aires (ITBA), Buenos Aires, Argentina"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3716-3607","authenticated-orcid":false,"given":"C\u00e9sar","family":"Cornejo","sequence":"additional","affiliation":[{"name":"National Council for Scientific and Technical Research (CONICET) and Department of Computer Science, National University of R\u00edo Cuarto, R\u00edo Cuarto, Argentina"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1611-3969","authenticated-orcid":false,"given":"Renzo","family":"Degiovanni","sequence":"additional","affiliation":[{"name":"Luxembourg Institute of Science and Technology, Esch-sur-Alzette, 
Luxembourg"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3115-2739","authenticated-orcid":false,"given":"Sim\u00f3n","family":"Guti\u00e9rrez Brida","sequence":"additional","affiliation":[{"name":"Department of Computer Science, National University of R\u00edo Cuarto, R\u00edo Cuarto, Argentina"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0979-4623","authenticated-orcid":false,"given":"Germ\u00e1n","family":"Regis","sequence":"additional","affiliation":[{"name":"Department of Computer Science, National University of R\u00edo Cuarto, R\u00edo Cuarto, Argentina"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0532-5296","authenticated-orcid":false,"given":"Nazareno","family":"Aguirre","sequence":"additional","affiliation":[{"name":"National Council for Scientific and Technical Research (CONICET) and Department of Computer Science, National University of R\u00edo Cuarto, R\u00edo Cuarto, Argentina"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5592-1355","authenticated-orcid":false,"given":"Marcelo Fabi\u00e1n","family":"Frias","sequence":"additional","affiliation":[{"name":"The University of Texas at El Paso, El Paso, TX, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,2,20]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2008.4630793"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.3233\/978-1-58603-929-5-825"},{"key":"e_1_3_1_4_2","first-page":"37","volume-title":"Automated Software Engineering","author":"Bharadwaj R.","year":"1999","unstructured":"R. Bharadwaj and C. Heitmeyer. 1999. Model Checking Complete Requirements Specifications Using Abstraction. 
Automated Software Engineering 6, 1 (1999), 37\u201368."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/566172.566191"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10009-004-0167-4"},{"key":"e_1_3_1_7_2","first-page":"209","volume-title":"OSDI \u201908","author":"Cadar C.","year":"2008","unstructured":"C. Cadar, D. Dunbar, and D. Engler. 2008. KLEE: Unassisted and Automatic Generation of High-coverage Tests for Complex Systems Programs. In OSDI \u201908, 209\u2013224."},{"key":"e_1_3_1_8_2","first-page":"3","volume-title":"Summaries of Talks Presented at the Summer Institute for Symbolic Logic","author":"Church A.","year":"1960","unstructured":"A. Church. 1960. Application of Recursive Arithmetic to the Problem of Circuit Synthesis In Summaries of Talks Presented at the Summer Institute for Symbolic Logic Cornell University, 2nd edn. Communications Research Division, Institute for Defense Analyses, Princeton, NJ, 3\u201350."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.5555\/332656"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.5555\/1986308.1986347"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICST.2010.66"},{"key":"e_1_3_1_12_2","volume-title":"Evolutionary Computation: A Unified Approach","author":"De Jong K. A.","year":"2006","unstructured":"K. A. De Jong. 2006. Evolutionary Computation: A Unified Approach. MIT Press."},{"key":"e_1_3_1_13_2","volume-title":"IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs","author":"Durieux T.","year":"2016","unstructured":"T. Durieux and M. Monperrus. 2016. IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs. Research Report, Universite Lille."},{"key":"e_1_3_1_14_2","first-page":"2","article-title":"Static Verification for Code Contracts","author":"F\u00e4hndrich M.","year":"2010","unstructured":"M. F\u00e4hndrich. 2010. Static Verification for Code Contracts. 
In SAS \u201910, 2\u20135.","journal-title":"SAS \u201910"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TOPI.2012.6229809"},{"key":"e_1_3_1_16_2","first-page":"174","volume-title":"RE \u201901","author":"Fuxman A.","year":"2001","unstructured":"A. Fuxman, M. Pistore, J. Mylopoulos, and P. Traverso. 2001. Model Checking Early Requirements Specification in Tropos. In RE \u201901, 174\u2013181."},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2013.15"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-19835-9_15"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2011.104"},{"key":"e_1_3_1_20_2","first-page":"1135","article-title":"Bounded Exhaustive Search of Alloy Specification Repairs","author":"Brida Sim\u00f3n Guti\u00e9rrez","year":"2021","unstructured":"Sim\u00f3n Guti\u00e9rrez Brida, Germ\u00e1n Regis, Guolong Zheng, Hamid Bagheri, ThanhVu Nguyen, Nazareno Aguirre, and Marcelo F. Frias. 2021. Bounded Exhaustive Search of Alloy Specification Repairs. In ICSE \u201921, 1135\u20131147.","journal-title":"ICSE \u201921"},{"key":"e_1_3_1_21_2","volume-title":"Software Abstractions: Logic, Language and Analysis","author":"Jackson D.","year":"2006","unstructured":"D. Jackson. 2006. Software Abstractions: Logic, Language and Analysis. The MIT Press."},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/347324.383378"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2009.5090029"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/1592434.1592438"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/11513988_23"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/2568225.2568258"},{"key":"e_1_3_1_27_2","first-page":"295","volume-title":"ASE \u201913","author":"Ke Y.","year":"2013","unstructured":"Y. Ke, K. T. Stolee, C. Le Goues, and Y. Brun. 2013. 
Repairing Programs with Semantic Code Search. In ASE \u201913, 295\u2013306."},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.31274\/etd-180810-4399"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15898-8_5"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.5555\/2486788.2486893"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSRE.2015.7381813"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2015.2454513"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3318162"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/HICSS.2007.462"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/2786805.2786811"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884872"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.5555\/2818754.2818811"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884807"},{"key":"e_1_3_1_39_2","first-page":"40","volume-title":"IEEE Computer","author":"Meyer B.","year":"1992","unstructured":"B. Meyer. 1992. Applying \u201cDesign by Contract\u201d. IEEE Computer 25 (1992), 40\u201351."},{"key":"e_1_3_1_40_2","volume-title":"A Touch of Class","author":"Meyer B.","year":"2013","unstructured":"B. Meyer. 2013. A Touch of Class, 2nd corrected ed. Springer."},{"key":"e_1_3_1_41_2","first-page":"22","article-title":"Programs That Test Themselves","volume":"42","author":"Meyer B.","year":"2009","unstructured":"B. Meyer, A. Fiva, I. Ciupa, A. Leitner, Y. Wei, and E. Stapf. 2009. Programs That Test Themselves. 
IEEE Software 42 (2009), 22\u201324.","journal-title":"IEEE Software"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.5555\/2486788.2486890"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/1217856.1217859"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/11814948_18"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICST49551.2021.00033"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/2568225.2568254"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/2771783.2771791"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/2786805.2786825"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1007\/11560548_6"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-79124-9_10"},{"key":"e_1_3_1_51_2","first-page":"501","volume-title":"WCRE \u201912","author":"Trudel M.","unstructured":"M. Trudel, C. Furia, and M. Nordio. Automatic C to O-O Translation with C2Eiffel. In WCRE \u201912. IEEE, 501\u2013502."},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-39799-8_64"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1145\/1831708.1831716"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2009.5070536"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/1735223.1735249"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2016.2560811"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-017-9577-2"},{"key":"e_1_3_1_58_2","unstructured":"Xuan-Bach D. Le Ferdian Thung David Lo and Claire Le Goues. Paper [56] reproducibility package. 
Retrieved September 2 2021 from https:\/\/doi.org\/10.5281\/zenodo.1012686"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-020-09920-w"},{"key":"e_1_3_1_60_2","first-page":"14","author":"Zem\u00edn Luciano","year":"2017","unstructured":"Luciano Zem\u00edn, Sim\u00f3n Guti\u00e9rrez Brida, Ariel Godio, C\u00e9sar Cornejo, Renzo Degiovanni, Germ\u00e1n Regis, Nazareno Aguirre, and Marcelo F. Frias. 2017. An Analysis of the Suitability of Test-based Patch Acceptance Criteria. In SBST@ICSE \u201917, 14\u201320.","journal-title":"An Analysis of the Suitability of Test-based Patch Acceptance Criteria. In SBST@ICSE \u201917"},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3533767.3534369"},{"key":"e_1_3_1_62_2","first-page":"637","article-title":"FLACK: Counterexample-guided Fault Localization for Alloy Models","author":"Zheng Guolong","year":"2021","unstructured":"Guolong Zheng, ThanhVu Nguyen, Sim\u00f3n Guti\u00e9rrez Brida, Germ\u00e1n Regis, Marcelo F. Frias, Nazareno Aguirre, and Hamid Bagheri. 2021. FLACK: Counterexample-guided Fault Localization for Alloy Models. 
In ICSE \u201921, 637\u2013648.","journal-title":"ICSE \u201921"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3702971","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3702971","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:04Z","timestamp":1750295884000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3702971"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,20]]},"references-count":61,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,3,31]]}},"alternative-id":["10.1145\/3702971"],"URL":"https:\/\/doi.org\/10.1145\/3702971","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,20]]},"assertion":[{"value":"2022-06-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-08-30","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-02-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}