{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,16]],"date-time":"2026-05-16T16:20:11Z","timestamp":1778948411395,"version":"3.51.4"},"reference-count":85,"publisher":"Association for Computing Machinery (ACM)","issue":"8","funder":[{"DOI":"10.13039\/501100001321","name":"National Research Foundation","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001321","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Cyber Security Agency","award":["NCRP25-P04-TAICeN"],"award-info":[{"award-number":["NCRP25-P04-TAICeN"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>\n            Automated Program Repair (APR) aims to automatically generate patches for rectifying software bugs. Recent strides in Large Language Models (LLM), such as ChatGPT, have yielded encouraging outcomes in APR, especially within the conversation-driven APR framework. Nevertheless, the efficacy of conversation-driven APR is contingent on the quality of the feedback information. In this article, we propose\n            <jats:italic toggle=\"yes\">ContrastRepair<\/jats:italic>\n            , a novel conversation-based APR approach that augments conversation-driven APR by providing LLMs with contrastive test pairs. A test pair consists of a failing test and a passing test, which offer contrastive feedback to the LLM. Our key insight is to minimize the difference between the generated passing test and the given failing test, which can better isolate the root causes of bugs. By providing such informative feedback,\n            <jats:italic toggle=\"yes\">ContrastRepair<\/jats:italic>\n            enables the LLM to produce effective bug fixes. 
The implementation of\n            <jats:italic toggle=\"yes\">ContrastRepair<\/jats:italic>\n            is based on the state-of-the-art LLM, ChatGPT, and it iteratively interacts with ChatGPT until plausible patches are generated. We evaluate\n            <jats:italic toggle=\"yes\">ContrastRepair<\/jats:italic>\n            on multiple benchmark datasets, including Defects4J, QuixBugs, and HumanEval-Java. The results demonstrate that\n            <jats:italic toggle=\"yes\">ContrastRepair<\/jats:italic>\n            significantly outperforms existing methods, achieving a new state-of-the-art in program repair. For instance, among Defects4J 1.2 and 2.0,\n            <jats:italic toggle=\"yes\">ContrastRepair<\/jats:italic>\n            correctly repairs 143 out of all 337 bug cases, while the best-performing baseline fixes 124 bugs.\n          <\/jats:p>","DOI":"10.1145\/3719345","type":"journal-article","created":{"date-parts":[[2025,3,4]],"date-time":"2025-03-04T08:29:31Z","timestamp":1741076971000},"page":"1-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["<i>ContrastRepair<\/i>\n            : Enhancing Conversation-Based Automated Program Repair via Contrastive Test Case Pairs"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-8248-1981","authenticated-orcid":false,"given":"Jiaolong","family":"Kong","sequence":"first","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1288-6502","authenticated-orcid":false,"given":"Xiaofei","family":"Xie","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, 
Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8982-1483","authenticated-orcid":false,"given":"Mingfei","family":"Cheng","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5598-4006","authenticated-orcid":false,"given":"Shangqing","family":"Liu","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3728-9541","authenticated-orcid":false,"given":"Xiaoning","family":"Du","sequence":"additional","affiliation":[{"name":"Monash University, Melbourne, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-8002-8068","authenticated-orcid":false,"given":"Qi","family":"Guo","sequence":"additional","affiliation":[{"name":"Tianjin University, Tianjin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,10,3]]},"reference":[{"key":"e_1_3_2_1_2","first-page":"1","volume-title":"Proceedings of the 37th IEEE\/ACM International Conference on Automated Software Engineering","author":"Ahmed Toufique","year":"2022","unstructured":"Toufique Ahmed and Premkumar Devanbu. 2022. Few-shot training LLMs for project-specific code-summarization. In Proceedings of the 37th IEEE\/ACM International Conference on Automated Software Engineering, 1\u20135."},{"key":"e_1_3_2_2_2","unstructured":"Fatih Kadir Akin. 2023. The art of ChatGPT prompting. Retrieved from https:\/\/fka.gumroad.com\/l\/art-of-chatgpt-prompting"},{"key":"e_1_3_2_3_2","volume-title":"Quantify the Time and Cost Saved Using Reversible Debuggers","author":"Britton Tom","year":"2012","unstructured":"Tom Britton, Lisa Jeng, Graham Carver, and Paul Cheak. 2012. 
Quantify the Time and Cost Saved Using Reversible Debuggers. Technical Report. Cambridge Judge Business School."},{"key":"e_1_3_2_4_2","volume-title":"Reversible Debugging Software-Quantify the Time and Cost Saved Using Reversible Debuggers","author":"Britton Tom","year":"2013","unstructured":"Tom Britton, Lisa Jeng, Graham Carver, Paul Cheak, and Tomer Katzenellenbogen. 2013. Reversible Debugging Software-Quantify the Time and Cost Saved Using Reversible Debuggers. University of Cambridge."},{"key":"e_1_3_2_5_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems, Vol. 33, 1877\u20131901.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_6_2","unstructured":"Volodymyr Buell. 2023. javaobj-py3. Retrieved from https:\/\/pypi.org\/project\/javaobj-py3\/"},{"key":"e_1_3_2_7_2","first-page":"209","volume-title":"Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation","author":"Cadar Cristian","year":"2008","unstructured":"Cristian Cadar, Daniel Dunbar, Dawson R. Engler, et al. 2008. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, Vol. 8, 209\u2013224."},{"key":"e_1_3_2_8_2","first-page":"725","volume-title":"Proceedings of the 2015 IEEE Symposium on Security and Privacy","author":"Cha Sang Kil","year":"2015","unstructured":"Sang Kil Cha, Maverick Woo, and David Brumley. 2015. Program-adaptive mutational fuzzing. In Proceedings of the 2015 IEEE Symposium on Security and Privacy. 
IEEE, 725\u2013741."},{"key":"e_1_3_2_9_2","doi-asserted-by":"crossref","unstructured":"Yupeng Chang Xu Wang Jindong Wang Yuan Wu Kaijie Zhu Hao Chen Linyi Yang Xiaoyuan Yi Cunxiang Wang Yidong Wang et al. 2023. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology 15 3 (2024) 1\u201345.","DOI":"10.1145\/3641289"},{"key":"e_1_3_2_10_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman et al. 2021. Evaluating large language models trained on code. arXiv:2107.03374. https:\/\/arxiv.org\/pdf\/2107.03374"},{"key":"e_1_3_2_11_2","unstructured":"Shigeru Chiba and Muga Nishizawa. 2023. Javassist. Retrieved from https:\/\/www.javassist.org\/"},{"key":"e_1_3_2_12_2","first-page":"4302","article-title":"Deep reinforcement learning from human preferences","author":"Christiano Paul F.","year":"2017","unstructured":"Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. 2017. Deep reinforcement learning from human preferences. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS\u201917). Curran Associates Inc., Red Hook, NY, 4302\u20134310.","journal-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS\u201917)"},{"issue":"3","key":"e_1_3_2_13_2","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1145\/363958.363994","article-title":"A technique for computer detection and correction of spelling errors","volume":"7","author":"Damerau Fred J.","year":"1964","unstructured":"Fred J. Damerau. 1964. A technique for computer detection and correction of spelling errors. Communications of the ACM 7, 3 (1964), 171\u2013176.","journal-title":"Communications of the ACM"},{"key":"e_1_3_2_14_2","doi-asserted-by":"crossref","unstructured":"Yihong Dong Xue Jiang Zhi Jin and Ge Li. 2024. 
Self-collaboration code generation via ChatGPT. ACM Transactions on Software Engineering and Methodology 33 7 (2024) 1\u201338.","DOI":"10.1145\/3672459"},{"key":"e_1_3_2_15_2","first-page":"1469","volume-title":"Proceedings of the 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE)","author":"Fan Zhiyu","year":"2023","unstructured":"Zhiyu Fan, Xiang Gao, Martin Mirchev, Abhik Roychoudhury, and Shin Hwei Tan. 2023. Automated repair of programs from large language models. In Proceedings of the 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). IEEE, 1469\u20131481."},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","unstructured":"Zhangyin Feng Daya Guo Duyu Tang Nan Duan Xiaocheng Feng Ming Gong Linjun Shou Bing Qin Ting Liu Daxin Jiang et al. 2020. Codebert: A pre-trained model for programming and natural languages. arXiv:2002.08155. Retrieved from https:\/\/arxiv.org\/pdf\/2002.08155","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1007\/s11023-020-09548-1","article-title":"GPT-3: Its nature, scope, limits, and consequences","volume":"30","author":"Floridi Luciano","year":"2020","unstructured":"Luciano Floridi and Massimo Chiriatti. 2020. GPT-3: Its nature, scope, limits, and consequences. Minds and Machines 30 (2020), 681\u2013694.","journal-title":"Minds and Machines"},{"key":"e_1_3_2_18_2","first-page":"8","volume-title":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","author":"Gao Xiang","year":"2019","unstructured":"Xiang Gao, Sergey Mechtaev, and Abhik Roychoudhury. 2019. Crash-avoiding program repair. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, 8\u201318."},{"key":"e_1_3_2_19_2","doi-asserted-by":"crossref","unstructured":"Daya Guo Shuai Lu Nan Duan Yanlin Wang Ming Zhou and Jian Yin. 2022. 
Unixcoder: Unified cross-modal pre-training for code representation. arXiv:2203.03850. Retrieved from https:\/\/arxiv.org\/pdf\/2203.03850","DOI":"10.18653\/v1\/2022.acl-long.499"},{"key":"e_1_3_2_20_2","doi-asserted-by":"crossref","unstructured":"Qi Guo Junming Cao Xiaofei Xie Shangqing Liu Xiaohong Li Bihuan Chen and Xin Peng. 2024. Exploring the potential of ChatGPT in automated code refinement: An empirical study. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering 1\u201313.","DOI":"10.1145\/3597503.3623306"},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1145\/3650212.3652130","volume-title":"Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis","author":"Guo Qi","year":"2024","unstructured":"Qi Guo, Xiaohong Li, Xiaofei Xie, Shangqing Liu, Ze Tang, Ruitao Feng, Junjie Wang, Jidong Ge, and Lei Bu. 2024. FT2Ra: A fine-tuning-inspired approach to retrieval-augmented code completion. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 313\u2013324."},{"key":"e_1_3_2_22_2","first-page":"5549","volume-title":"Proceedings of the International Conference on Artificial Intelligence and Statistics","author":"Hegselmann Stefan","year":"2023","unstructured":"Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag. 2023. Tabllm: Few-shot classification of tabular data with large language models. In Proceedings of the International Conference on Artificial Intelligence and Statistics. PMLR, 5549\u20135581."},{"key":"e_1_3_2_23_2","first-page":"445","volume-title":"Proceedings of the 21st USENIX Security Symposium (USENIX Security\u201912)","author":"Holler Christian","year":"2012","unstructured":"Christian Holler, Kim Herzig, and Andreas Zeller. 2012. Fuzzing with code fragments. 
In Proceedings of the 21st USENIX Security Symposium (USENIX Security\u201912), 445\u2013458."},{"key":"e_1_3_2_24_2","doi-asserted-by":"crossref","unstructured":"Nan Jiang Kevin Liu Thibaud Lutellier and Lin Tan. 2023. Impact of code language models on automated program repair. In 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). IEEE 1430\u20131442.","DOI":"10.1109\/ICSE48619.2023.00125"},{"key":"e_1_3_2_25_2","first-page":"1161","volume-title":"Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE)","author":"Jiang Nan","year":"2021","unstructured":"Nan Jiang, Thibaud Lutellier, and Lin Tan. 2021. Cure: Code-aware neural machine translation for automatic program repair. In Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 1161\u20131173."},{"key":"e_1_3_2_26_2","doi-asserted-by":"crossref","unstructured":"Vladimir Karpukhin Barlas O\u011fuz Sewon Min Patrick Lewis Ledell Wu Sergey Edunov Danqi Chen and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. arXiv:2004.04906. Retrieved from https:\/\/arxiv.org\/pdf\/2004.04906v2\/1000","DOI":"10.18653\/v1\/2020.emnlp-main.550"},{"issue":"7","key":"e_1_3_2_27_2","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1145\/360248.360252","article-title":"Symbolic execution and program testing","volume":"19","author":"King James C.","year":"1976","unstructured":"James C. King. 1976. Symbolic execution and program testing. Communications of the ACM 19, 7 (1976), 385\u2013394.","journal-title":"Communications of the ACM"},{"key":"e_1_3_2_28_2","unstructured":"Jiaolong Kong. 2025. ContrastRepair. 
Retrieved from https:\/\/github.com\/kjl960913\/ContrastRepair\/tree\/main"},{"key":"e_1_3_2_29_2","doi-asserted-by":"crossref","first-page":"492","DOI":"10.1145\/1134285.1134355","volume-title":"Proceedings of the 28th international conference on Software engineering","author":"LaToza Thomas D.","year":"2006","unstructured":"Thomas D. LaToza, Gina Venolia, and Robert DeLine. 2006. Maintaining mental models: A study of developer work habits. In Proceedings of the 28th international conference on Software engineering, 492\u2013501."},{"key":"e_1_3_2_30_2","first-page":"593","volume-title":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","author":"Le Xuan-Bach D","year":"2017","unstructured":"Xuan-Bach D Le, Duc-Hiep Chu, David Lo, Claire Le Goues, and Willem Visser. 2017. S3: Syntax-and semantic-guided repair synthesis via programming by examples. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 593\u2013604."},{"key":"e_1_3_2_31_2","first-page":"213","volume-title":"Proceedings of the 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)","volume":"1","author":"Le Xuan Bach D.","year":"2016","unstructured":"Xuan Bach D. Le, David Lo, and Claire Le Goues. 2016. History driven program repair. In Proceedings of the 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), Vol. 1. IEEE, 213\u2013224."},{"issue":"1","key":"e_1_3_2_32_2","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1109\/TSE.2011.104","article-title":"Genprog: A generic method for automatic software repair","volume":"38","author":"Le Goues Claire","year":"2011","unstructured":"Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. 2011. Genprog: A generic method for automatic software repair. 
IEEE Transactions on Software Engineering 38, 1 (2011), 54\u201372.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_33_2","first-page":"707","volume-title":"Soviet Physics Doklady","author":"Levenshtein Vladimir I.","year":"1966","unstructured":"Vladimir I. Levenshtein et al. 1966. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet Physics Doklady, Vol. 10, Soviet Union, 707\u2013710."},{"key":"e_1_3_2_34_2","unstructured":"Jiawei Liu Chunqiu Steven Xia Yuyao Wang and Lingming Zhang. 2023. Is your code generated by ChatGPT really correct? rigorous evaluation of large language models for code generation. Advances in Neural Information Processing Systems 36 (2023) 21558\u201321572."},{"key":"e_1_3_2_35_2","first-page":"1","volume-title":"Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER)","author":"Liu Kui","year":"2019","unstructured":"Kui Liu, Anil Koyuncu, Dongsun Kim, and Tegawend\u00e9 F. Bissyand\u00e9. 2019. Avatar: Fixing semantic bugs with fix patterns of static analysis violations. In Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 1\u201312."},{"key":"e_1_3_2_36_2","first-page":"31","volume-title":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","author":"Liu Kui","year":"2019","unstructured":"Kui Liu, Anil Koyuncu, Dongsun Kim, and Tegawend\u00e9 F Bissyand\u00e9. 2019. TBar: Revisiting template-based automated program repair. 
In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, 31\u201342."},{"issue":"9","key":"e_1_3_2_37_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3560815","article-title":"Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing","volume":"55","author":"Liu Pengfei","year":"2023","unstructured":"Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2023. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys 55, 9 (2023), 1\u201335.","journal-title":"ACM Computing Surveys"},{"key":"e_1_3_2_38_2","unstructured":"Shangqing Liu Yu Chen Xiaofei Xie Jingkai Siow and Yang Liu. 2020. Retrieval-augmented generation for code summarization via hybrid GNN. arXiv:2006.05405. Retrieved from https:\/\/arxiv.org\/pdf\/2006.05405"},{"issue":"8","key":"e_1_3_2_39_2","first-page":"1","article-title":"Automated commit intelligence by pre-training","volume":"33","author":"Liu Shangqing","year":"2024","unstructured":"Shangqing Liu, Yanzhou Li, Xiaofei Xie, Wei Ma, Guozhu Meng, and Yang Liu. 2024. Automated commit intelligence by pre-training. ACM Transactions on Software Engineering and Methodology 33, 8 (2024), 1\u201330.","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"key":"e_1_3_2_40_2","unstructured":"Yang Liu. 2019. Fine-tune BERT for extractive summarization. arXiv:1903.10318. Retrieved from https:\/\/arxiv.org\/pdf\/1903.10318"},{"key":"e_1_3_2_41_2","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1145\/2786805.2786811","volume-title":"Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering","author":"Long Fan","year":"2015","unstructured":"Fan Long and Martin Rinard. 2015. Staged program repair with condition synthesis. 
In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, 166\u2013178."},{"key":"e_1_3_2_42_2","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1145\/3395363.3397369","volume-title":"Proceedings of the 29th ACM SIGSOFT international symposium on software testing and analysis","author":"Lutellier Thibaud","year":"2020","unstructured":"Thibaud Lutellier, Hung Viet Pham, Lawrence Pang, Yitong Li, Moshi Wei, and Lin Tan. 2020. Coconut: Combining context-aware neural translation models using ensemble for program repair. In Proceedings of the 29th ACM SIGSOFT international symposium on software testing and analysis, 101\u2013114."},{"key":"e_1_3_2_43_2","unstructured":"Lezhi Ma Shangqing Liu Lei Bu Shangru Li Yida Wang and Yang Liu. 2024. SpecEval: Evaluating code comprehension in large language models via program specifications. arXiv:2409.12866. Retrieved from https:\/\/arxiv.org\/pdf\/2409.12866"},{"key":"e_1_3_2_44_2","unstructured":"Lezhi Ma Shangqing Liu Yi Li Xiaofei Xie and Lei Bu. 2024. SpecGen: Automated generation of formal program specifications via large language models. arXiv:2401.08807. Retrieved from https:\/\/arxiv.org\/pdf\/2401.08807"},{"key":"e_1_3_2_45_2","first-page":"468","volume-title":"Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER)","author":"Madeiral Fernanda","year":"2019","unstructured":"Fernanda Madeiral, Simon Urli, Marcelo Maia, and Martin Monperrus. 2019. Bears: An extensible java bug benchmark for automatic program repair studies. In Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 468\u2013478."},{"key":"e_1_3_2_46_2","first-page":"416","volume-title":"Proceedings of the 29th International Conference on Software Engineering (ICSE\u201907)","author":"Majumdar Rupak","year":"2007","unstructured":"Rupak Majumdar and Koushik Sen. 2007. Hybrid concolic testing. 
In Proceedings of the 29th International Conference on Software Engineering (ICSE\u201907). IEEE, 416\u2013426."},{"key":"e_1_3_2_47_2","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1145\/2931037.2948705","volume-title":"Proceedings of the 25th International Symposium on Software Testing and Analysis","author":"Martinez Matias","year":"2016","unstructured":"Matias Martinez and Martin Monperrus. 2016. Astor: A program repair library for java. In Proceedings of the 25th International Symposium on Software Testing and Analysis, 441\u2013444."},{"key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"691","DOI":"10.1145\/2884781.2884807","volume-title":"Proceedings of the 38th International Conference on Software Engineering","author":"Mechtaev Sergey","year":"2016","unstructured":"Sergey Mechtaev, Jooyong Yi, and Abhik Roychoudhury. 2016. Angelix: Scalable multiline program patch synthesis via symbolic analysis. In Proceedings of the 38th International Conference on Software Engineering, 691\u2013701."},{"issue":"12","key":"e_1_3_2_49_2","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1145\/96267.96279","article-title":"An empirical study of the reliability of UNIX utilities","volume":"33","author":"Miller Barton P.","year":"1990","unstructured":"Barton P. Miller, Lars Fredriksen, and Bryan So. 1990. An empirical study of the reliability of UNIX utilities. Communications of the ACM 33, 12 (1990), 32\u201344.","journal-title":"Communications of the ACM"},{"issue":"2","key":"e_1_3_2_50_2","first-page":"1","article-title":"Recent advances in natural language processing via large pre-trained language models: A survey","volume":"56","author":"Min Bonan","year":"2023","unstructured":"Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, and Dan Roth. 2023. Recent advances in natural language processing via large pre-trained language models: A survey. 
ACM Computing Surveys 56, 2 (2023), 1\u201340.","journal-title":"ACM Computing Surveys"},{"issue":"2","key":"e_1_3_2_51_2","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1109\/MSP.2005.55","article-title":"Violating assumptions with fuzzing","volume":"3","author":"Oehlert Peter","year":"2005","unstructured":"Peter Oehlert. 2005. Violating assumptions with fuzzing. IEEE Security & Privacy 3, 2 (2005), 58\u201362.","journal-title":"IEEE Security & Privacy"},{"key":"e_1_3_2_52_2","unstructured":"OpenAI. 2023. ChatGPT. Retrieved from https:\/\/chat.openai.com\/"},{"key":"e_1_3_2_53_2","unstructured":"OpenAI. 2023. GPT-4 Technical Report. Retrieved from https:\/\/cdn.openai.com\/papers\/gpt-4.pdf"},{"key":"e_1_3_2_54_2","first-page":"27730","article-title":"Training language models to follow instructions with human feedback","volume":"35","author":"Ouyang Long","year":"2022","unstructured":"Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training language models to follow instructions with human feedback. In Advances in Neural Information Processing Systems, Vol. 35, 27730\u201327744.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_55_2","first-page":"181","volume-title":"Proceedings of the 29th USENIX Security Symposium (USENIX Security\u201920)","author":"Poeplau Sebastian","year":"2020","unstructured":"Sebastian Poeplau and Aur\u00e9lien Francillon. 2020. Symbolic execution with SymCC: Don\u2019t interpret, compile! 
In Proceedings of the 29th USENIX Security Symposium (USENIX Security\u201920), 181\u2013198."},{"key":"e_1_3_2_56_2","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1145\/3524459.3527351","volume-title":"Proceedings of the 3rd International Workshop on Automated Program Repair","author":"Julian Aron Prenner","year":"2022","unstructured":"Julian Aron Prenner, Hlib Babii, and Romain Robbes. 2022. Can OpenAI\u2019s codex fix bugs? An evaluation on QuixBugs. In Proceedings of the 3rd International Workshop on Automated Program Repair, 69\u201375."},{"issue":"4","key":"e_1_3_2_57_2","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1561\/1500000019","article-title":"The probabilistic relevance framework: BM25 and beyond","volume":"3","author":"Robertson Stephen","year":"2009","unstructured":"Stephen Robertson and Hugo Zaragoza. 2009. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends\u00ae in Information Retrieval 3, 4 (2009), 333\u2013389.","journal-title":"Foundations and Trends\u00ae in Information Retrieval"},{"issue":"5","key":"e_1_3_2_58_2","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1145\/1095430.1081750","article-title":"CUTE: A concolic unit testing engine for C","volume":"30","author":"Sen Koushik","year":"2005","unstructured":"Koushik Sen, Darko Marinov, and Gul Agha. 2005. CUTE: A concolic unit testing engine for C. ACM SIGSOFT Software Engineering Notes 30, 5 (2005), 263\u2013272.","journal-title":"ACM SIGSOFT Software Engineering Notes"},{"key":"e_1_3_2_59_2","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1109\/SecDev.2016.043","volume-title":"Proceedings of the 2016 IEEE Cybersecurity Development (SecDev)","author":"Serebryany Kosta","year":"2016","unstructured":"Kosta Serebryany. 2016. Continuous fuzzing with libfuzzer and addresssanitizer. In Proceedings of the 2016 IEEE Cybersecurity Development (SecDev). 
IEEE, 157\u2013157."},{"key":"e_1_3_2_60_2","doi-asserted-by":"crossref","unstructured":"Dominik Sobania Martin Briesch Carol Hanna and Justyna Petke. 2023. An analysis of the automatic bug fixing performance of ChatGPT. arXiv:2301.08653. Retrieved from https:\/\/arxiv.org\/pdf\/2301.08653","DOI":"10.1109\/APR59189.2023.00012"},{"key":"e_1_3_2_61_2","first-page":"3104","article-title":"Sequence to sequence learning with neural networks","author":"Sutskever Ilya","year":"2014","unstructured":"Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS\u201914). Vol. 2, MIT Press, Cambridge, MA, 3104\u20133112.","journal-title":"Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS\u201914)"},{"key":"e_1_3_2_62_2","first-page":"6000","article-title":"Attention is all you need","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS\u201917). Curran Associates Inc., Red Hook, NY, 6000\u20136010.","journal-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS\u201917)"},{"key":"e_1_3_2_63_2","doi-asserted-by":"crossref","first-page":"8696","DOI":"10.18653\/v1\/2021.emnlp-main.685","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Wang Yue","year":"2021","unstructured":"Yue Wang, Weishi Wang, Shafiq Joty, and Steven C. H. Hoi. 2021. CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. 
In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 8696\u20138708."},{"key":"e_1_3_2_64_2","first-page":"1","volume-title":"Proceedings of the 40th International Conference on Software Engineering","author":"Wen Ming","year":"2018","unstructured":"Ming Wen, Junjie Chen, Rongxin Wu, Dan Hao, and Shing-Chi Cheung. 2018. Context-aware patch generation for better automated program repair. In Proceedings of the 40th International Conference on Software Engineering, 1\u201311."},{"key":"e_1_3_2_65_2","unstructured":"Jules White Quchen Fu Sam Hays Michael Sandborn Carlos Olea Henry Gilbert Ashraf Elnashar Jesse Spencer-Smith and Douglas C Schmidt. 2023. A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv:2302.11382. Retrieved from https:\/\/file.mixpaper.cn\/paper_store\/2023\/681177f8-cd15-4e0f-a23b-997c6b9f9dd2.pdf"},{"key":"e_1_3_2_66_2","first-page":"1","volume-title":"Proceedings of the ACM on Programming Languages 4","author":"Winterer Dominik","year":"2020","unstructured":"Dominik Winterer, Chengyu Zhang, and Zhendong Su. 2020. On the unusual effectiveness of type-aware operator mutations for testing SMT solvers. Proceedings of the ACM on Programming Languages 4, OOPSLA (2020), 1\u201325."},{"key":"e_1_3_2_67_2","first-page":"1482","volume-title":"2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE)","author":"Xia Chunqiu Steven","year":"2023","unstructured":"Chunqiu Steven Xia, Yuxiang Wei, and Lingming Zhang. 2023. Automated program repair in the era of large pre-trained language models. In 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). 
IEEE, 1482\u20131494."},{"key":"e_1_3_2_68_2","first-page":"959","volume-title":"Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201922)","author":"Xia Chunqiu Steven","year":"2022","unstructured":"Chunqiu Steven Xia and Lingming Zhang. 2022. Less training, more repairing please: Revisiting automated program repair via zero-shot learning. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201922). ACM, New York, NY, 959\u2013971. DOI: 10.1145\/3540250.3549101"},{"key":"e_1_3_2_69_2","unstructured":"Chunqiu Steven Xia and Lingming Zhang. 2023. Conversational automated program repair. arXiv:2301.13246. Retrieved from https:\/\/arxiv.org\/pdf\/2301.13246"},{"key":"e_1_3_2_70_2","first-page":"819","volume-title":"Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis","author":"Xia Chunqiu Steven","year":"2024","unstructured":"Chunqiu Steven Xia and Lingming Zhang. 2024. Automated program repair via conversation: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 819\u2013831."},{"key":"e_1_3_2_71_2","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1145\/3106237.3106274","volume-title":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","author":"Yang Jinqiu","year":"2017","unstructured":"Jinqiu Yang, Alexey Zhikhartsev, Yuefei Liu, and Lin Tan. 2017. Better test cases for better automated program repair. 
In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 831\u2013841."},{"key":"e_1_3_2_72_2","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1145\/1993498.1993532","volume-title":"Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation","author":"Yang Xuejun","year":"2011","unstructured":"Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 283\u2013294."},{"key":"e_1_3_2_73_2","unstructured":"Xianjun Yang Yan Li Xinlu Zhang Haifeng Chen and Wei Cheng. 2023. Exploring the limits of ChatGPT for query or aspect-based text summarization. arXiv:2302.08081. Retrieved from https:\/\/arxiv.org\/pdf\/2302.08081"},{"key":"e_1_3_2_74_2","first-page":"5753","article-title":"XLNet: Generalized autoregressive pretraining for language understanding","volume":"32","author":"Yang Zhilin","year":"2019","unstructured":"Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems, Vol. 32, 5753\u20135763.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_75_2","first-page":"1","volume-title":"Proceedings of the 37th IEEE\/ACM International Conference on Automated Software Engineering","author":"Ye He","year":"2022","unstructured":"He Ye, Matias Martinez, Xiapu Luo, Tao Zhang, and Martin Monperrus. 2022. SelfAPR: Self-supervised program repair with test execution diagnostics. 
In Proceedings of the 37th IEEE\/ACM International Conference on Automated Software Engineering, 1\u201313."},{"key":"e_1_3_2_76_2","doi-asserted-by":"crossref","first-page":"1506","DOI":"10.1145\/3510003.3510222","volume-title":"Proceedings of the 44th International Conference on Software Engineering","author":"Ye He","year":"2022","unstructured":"He Ye, Matias Martinez, and Martin Monperrus. 2022. Neural program repair with execution-based backpropagation. In Proceedings of the 44th International Conference on Software Engineering, 1506\u20131518."},{"key":"e_1_3_2_77_2","first-page":"1","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering","author":"Ye He","year":"2024","unstructured":"He Ye and Martin Monperrus. 2024. ITER: Iterative neural repair for multi-location patches. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering, 1\u201313."},{"key":"e_1_3_2_78_2","unstructured":"M. Zalewski. 2018. American fuzzy lop (AFL). Retrieved from https:\/\/lcamtuf.coredump.cx\/afl\/"},{"issue":"2","key":"e_1_3_2_79_2","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1109\/32.988498","article-title":"Simplifying and isolating failure-inducing input","volume":"28","author":"Zeller Andreas","year":"2002","unstructured":"Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and isolating failure-inducing input. IEEE Transactions on Software Engineering 28, 2 (2002), 183\u2013200.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_80_2","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1145\/3533767.3534390","volume-title":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis","author":"Zeng Zhengran","year":"2022","unstructured":"Zhengran Zeng, Hanzhuo Tan, Haotian Zhang, Jing Li, Yuqun Zhang, and Lingming Zhang. 2022. An extensive study on pre-trained models for program understanding and generation. 
In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, 39\u201351."},{"key":"e_1_3_2_81_2","unstructured":"Biao Zhang Barry Haddow and Alexandra Birch. 2023. Prompting large language model for machine translation: A case study. In International Conference on Machine Learning. PMLR 41092\u201341110."},{"key":"e_1_3_2_82_2","unstructured":"Quanjun Zhang Chunrong Fang Yuxiang Ma Weisong Sun and Zhenyu Chen. 2023. A survey of learning-based automated program repair. arXiv:2301.03270. Retrieved from https:\/\/arxiv.org\/pdf\/2301.03270"},{"key":"e_1_3_2_83_2","unstructured":"Wayne Xin Zhao Kun Zhou Junyi Li Tianyi Tang Xiaolei Wang Yupeng Hou Yingqian Min Beichen Zhang Junjie Zhang Zican Dong et al. 2023. A survey of large language models. arXiv:2303.18223. Retrieved from https:\/\/paper-notes.zhjwpku.com\/assets\/pdfs\/llm_survey_2303.18223.pdf"},{"key":"e_1_3_2_84_2","first-page":"341","volume-title":"Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering","author":"Zhu Qihao","year":"2021","unstructured":"Qihao Zhu, Zeyu Sun, Yuan-an Xiao, Wenjie Zhang, Kang Yuan, Yingfei Xiong, and Lu Zhang. 2021. A syntax-guided edit decoder for neural program repair. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 341\u2013353."},{"key":"e_1_3_2_85_2","unstructured":"Daniel M Ziegler Nisan Stiennon Jeffrey Wu Tom B Brown Alec Radford Dario Amodei Paul Christiano and Geoffrey Irving. 2019. Fine-tuning language models from human preferences. arXiv:1909.08593. 
Retrieved from https:\/\/arxiv.org\/pdf\/1909.08593"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3719345","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,3]],"date-time":"2025-10-03T15:07:16Z","timestamp":1759504036000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719345"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,3]]},"references-count":85,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3719345"],"URL":"https:\/\/doi.org\/10.1145\/3719345","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,3]]},"assertion":[{"value":"2024-03-03","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-02-04","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-03","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}