{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T22:56:19Z","timestamp":1776120979590,"version":"3.50.1"},"reference-count":161,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2025,5,24]],"date-time":"2025-05-24T00:00:00Z","timestamp":1748044800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Singapore Ministry of Education (MoE) Tier3","award":["MOE-MOET32021-0001"],"award-info":[{"award-number":["MOE-MOET32021-0001"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,6,30]]},"abstract":"<jats:p>Automatic programming has seen increasing popularity due to the emergence of tools like GitHub Copilot which rely on Large Language Models (LLMs). At the same time, automatically generated code faces challenges during deployment due to concerns around quality and trust. In this article, we study automated coding in a general sense and study the concerns around code quality, security, and related issues of programmer responsibility. These are key issues for organizations while deciding on the usage of automatically generated code. We discuss how advances in software engineering such as program repair and analysis can enable automatic programming. We conclude with a forward looking view, focusing on the programming environment of the near future, where programmers may need to switch to different roles to fully utilize the power of automatic programming. 
Automated repair of automatically generated programs from LLMs can help produce higher assurance code from LLMs, along with evidence of assurance.<\/jats:p>","DOI":"10.1145\/3708519","type":"journal-article","created":{"date-parts":[[2024,12,14]],"date-time":"2024-12-14T14:52:21Z","timestamp":1734187941000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":29,"title":["Automatic Programming: Large Language Models and Beyond"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3666-5798","authenticated-orcid":false,"given":"Michael R.","family":"Lyu","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3406-5235","authenticated-orcid":false,"given":"Baishakhi","family":"Ray","sequence":"additional","affiliation":[{"name":"Computer Science, Columbia University, New York, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7127-1137","authenticated-orcid":false,"given":"Abhik","family":"Roychoudhury","sequence":"additional","affiliation":[{"name":"Computer Science, National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8633-3372","authenticated-orcid":false,"given":"Shin Hwei","family":"Tan","sequence":"additional","affiliation":[{"name":"Computer Science and Software Engineering, Concordia University, Montreal, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6328-8839","authenticated-orcid":false,"given":"Patanamon","family":"Thongtanunam","sequence":"additional","affiliation":[{"name":"School of Computing and Information Systems, University of Melbourne, Melbourne, Australia"}]}],"member":"320","published-online":{"date-parts":[[2025,5,24]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Quinn Radich Kent Sharkey David Coulter Dan Mabee Drew Batchelor and Michael 
Satran. 2021. Application compatibility toolkit (ACT). Retrieved from https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/win7appqual\/application-compatibility-toolkit\u2013act-"},{"key":"e_1_3_2_3_2","series-title":"Association for Computational Linguistics","first-page":"2655","volume-title":"In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT \u201921)","author":"Ahmad Wasi Uddin","year":"2021","unstructured":"Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang. 2021. Unified pre-training for program understanding and generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT \u201921). Association for Computational Linguistics, 2655\u20132668."},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.449"},{"key":"e_1_3_2_5_2","unstructured":"Toufique Ahmed Kunal Suresh Pai Premkumar Devanbu and Earl T. Barr. 2023. Automatic semantic augmentation of language model prompts. arXiv:2304.06815. Retrieved from http:\/\/arxiv.org\/abs\/2304.06815"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3208071"},{"key":"e_1_3_2_7_2","first-page":"1","volume-title":"Syntax-guided Synthesis","author":"Alur Rajeev","year":"2013","unstructured":"Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. 2013. Syntax-guided Synthesis. IEEE, 1\u20138."},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","unstructured":"Shushan Arakelyan Rocktim Jyoti Das Yi Mao and Xiang Ren. 2023. Exploring distributional shifts in large language models for code analysis. arXiv:2303.09128. 
Retrieved from http:\/\/arxiv.org\/abs\/2303.09128","DOI":"10.18653\/v1\/2023.emnlp-main.1013"},{"issue":"6","key":"e_1_3_2_9_2","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1007\/s10664-023-10380-1","article-title":"Is Github\u2019s copilot as bad as humans at introducing vulnerabilities in code?","volume":"28","author":"Asare Owura","year":"2023","unstructured":"Owura Asare, Meiyappan Nagappan, and N. Asokan. 2023. Is Github\u2019s copilot as bad as humans at introducing vulnerabilities in code? Empirical Software Engineering 28, 6 (2023), 129.","journal-title":"Empirical Software Engineering"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","unstructured":"Jacob Austin Augustus Odena Maxwell Nye Maarten Bosma Henryk Michalewski David Dohan Ellen Jiang Carrie Cai Michael Terry Quoc Le et al. 2021. Program synthesis with large language models. arXiv:2108.07732. Retrieved from 10.48550\/arXiv.2108.07732","DOI":"10.48550\/arXiv.2108.07732"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3586030"},{"issue":"6","key":"e_1_3_2_12_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1218776.1218781","article-title":"Trustworthy software systems: A discussion of basic concepts and terminology","volume":"31","author":"Becker Steffen","year":"2006","unstructured":"Steffen Becker, Wilhelm Hasselbring, Alexandra Paul, Marko Boskovic, Heiko Koziolek, Jan Ploski, Abhishek Dhama, Henrik Lipskoch, Matthias Rohr, Daniel Winteler, et al. 2006. Trustworthy software systems: A discussion of basic concepts and terminology. 
ACM SIGSOFT Software Engineering Notes 31, 6 (2006), 1\u201318.","journal-title":"ACM SIGSOFT Software Engineering Notes"},{"issue":"6","key":"e_1_3_2_13_2","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1145\/3582083","article-title":"Taking flight with copilot: Early insights and opportunities of AI-powered pair-programming tools","volume":"20","author":"Bird Christian","year":"2023","unstructured":"Christian Bird, Denae Ford, Thomas Zimmermann, Nicole Forsgren, Eirini Kalliamvakou, Travis Lowdermilk, and Idan Gazit. 2023. Taking flight with copilot: Early insights and opportunities of AI-powered pair-programming tools. Queue 20, 6 (2023), 35\u201357.","journal-title":"Queue"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2023.3267446"},{"key":"e_1_3_2_15_2","unstructured":"ChatGPT. 2022. ChatGPT. Retrieved from https:\/\/chat.openai.com\/"},{"key":"e_1_3_2_16_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman et al. 2021. Evaluating large language models trained on code. arXiv:2107.03374. Retrieved from https:\/\/arxiv.org\/abs\/2107.03374"},{"issue":"9","key":"e_1_3_2_17_2","first-page":"1943","article-title":"SequenceR: Sequence-to-sequence learning for end-to-end program repair","volume":"47","author":"Chen Zimin","year":"2021","unstructured":"Zimin Chen, Steve Kommrusch, Michele Tufano, Louis-No\u00ebl Pouchet, Denys Poshyvanyk, and Martin Monperrus. 2021. SequenceR: Sequence-to-sequence learning for end-to-end program repair. 
IEEE Transactions on Software Engineering 47, 9 (2021), 1943\u20131959.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_18_2","first-page":"242","volume-title":"International Symposium on Software Testing and Analysis (ISSTA \u201918)","author":"Blasi Arianna","year":"2018","unstructured":"Arianna Blasi, Alberto Goffi, Konstantin Kuznetsov, Alessandra Gorla, Michael D. Ernst, Mauro Pezz\u00e8, and Sergio Delgado Castellanos. 2018. Translating code comments to procedure specifications. In International Symposium on Software Testing and Analysis (ISSTA \u201918), 242\u2013253."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/4235.996017"},{"key":"e_1_3_2_20_2","unstructured":"DeepSeek. 2023. Deepseek coder: Let the code write itself. Retrieved from https:\/\/github.com\/deepseek-ai\/DeepSeek-Coder"},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","unstructured":"Paul Denny Viraj Kumar and Nasser Giacaman. 2022. Conversing with Copilot: Exploring prompt engineering for solving CS1 problems using natural language. arXiv:2210.15157. Retrieved from http:\/\/arxiv.org\/abs\/2210.15157","DOI":"10.1145\/3545945.3569823"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","unstructured":"Yihong Dong Xue Jiang Zhi Jin and Ge Li. 2023. Self-collaboration code generation via ChatGPT. arXiv:2304.07590. Retrieved from 10.48550\/arXiv.2304.07590","DOI":"10.48550\/arXiv.2304.07590"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","unstructured":"Xueying Du Mingwei Liu Kaixin Wang Hanlin Wang Junwei Liu Yixuan Chen Jiayi Feng Chaofeng Sha Xin Peng and Yiling Lou. 2023. ClassEval: A manually-crafted benchmark for evaluating LLMs on class-level code generation. arXiv:2308.01861. Retrieved from 10.48550\/arXiv.2308.01861","DOI":"10.48550\/arXiv.2308.01861"},{"key":"e_1_3_2_24_2","unstructured":"Larry Ellison. 2023. Oracle\u2019s vision for the future. Keynote at Oracle CloudWorld. 
Retrieved from https:\/\/www.youtube.com\/watch?v=63DmgBN1rSI"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00128"},{"key":"e_1_3_2_26_2","doi-asserted-by":"crossref","first-page":"1536","DOI":"10.18653\/v1\/2020.findings-emnlp.139","article-title":"CodeBERT: A pre-trained model for programming and natural languages","volume":"2020","author":"Feng Zhangyin","year":"2020","unstructured":"Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A pre-trained model for programming and natural languages. In Findings of the Association for Computational Linguistics (EMNLP \u201920), Findings of ACL, Vol. EMNLP 2020, Association for Computational Linguistics, 1536\u20131547.","journal-title":"Findings of the Association for Computational Linguistics (EMNLP \u201920)"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(71)90010-5"},{"key":"e_1_3_2_28_2","first-page":"1229","volume-title":"Proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201923)","author":"First Emily","year":"2023","unstructured":"Emily First, Markus Rabe, Talia Ringer, and Yuriy Brun. 2023. Baldur: Whole-proof generation and repair with large language models. In Proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201923), 1229\u20131241."},{"key":"e_1_3_2_29_2","first-page":"947","volume-title":"Proceedings of the Genetic and Evolutionary Computation Conference (GECCO \u201909)","author":"Forrest Stephanie","year":"2009","unstructured":"Stephanie Forrest, ThanhVu Nguyen, Westley Weimer, and Claire Le Goues. 2009. A genetic programming approach to automated software repair. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO \u201909). 
ACM, 947\u2013954."},{"key":"e_1_3_2_30_2","unstructured":"Daniel Fried Armen Aghajanyan Jessy Lin Sida Wang Eric Wallace Freda Shi Ruiqi Zhong Wen-tau Yih Luke Zettlemoyer and Mike Lewis. 2022. InCoder: A generative model for code infilling and synthesis. arXiv:2204.05999."},{"key":"e_1_3_2_31_2","first-page":"177","volume-title":"Proceedings of the International Symposium on Software Testing and Analysis (ISSTA \u201912)","author":"Fry Zachary P.","year":"2012","unstructured":"Zachary P. Fry, Bryan Landau, and Westley Weimer. 2012. A human study of patch maintainability. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA \u201912). ACM, 177\u2013187."},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3639477.3639746"},{"key":"e_1_3_2_33_2","first-page":"1","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924)","author":"Gao Shuzheng","year":"2024","unstructured":"Shuzheng Gao, Wenxin Mao, Cuiyun Gao, Li Li, Xing Hu, Xin Xia, and Michael R. Lyu. 2024. Learning in the wild: Towards leveraging unlabeled data for effectively tuning pre-trained code models. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924). ACM, 1\u201313."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE56229.2023.00109"},{"key":"e_1_3_2_35_2","first-page":"8","volume-title":"Proceedings of the ACM International Symposium on Software Testing and Analysis (ISSTA \u201919)","author":"Gao Xiang","year":"2019","unstructured":"Xiang Gao, Sergey Mechtaev, and Abhik Roychoudhury. 2019. Crash-avoiding program repair. In Proceedings of the ACM International Symposium on Software Testing and Analysis (ISSTA \u201919), 8\u201318."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","unstructured":"Xiang Gao Yannic Noller and Abhik Roychoudhury. 2023. Program repair. arXiv:2211.12787. 
Retrieved from 10.48550\/arXiv.2211.12787","DOI":"10.48550\/arXiv.2211.12787"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3418461"},{"key":"e_1_3_2_38_2","first-page":"167","volume-title":"Future of Software Engineering (FOSE \u201914), co-located with International Conference on Software Engineering (ICSE \u201914)","author":"Gordon A. D.","year":"2014","unstructured":"A. D. Gordon, T. A. Henzinger, A. V. Nori, and S. K. Rajamani. 2014. Probabilistic programming. In Future of Software Engineering (FOSE \u201914), co-located with International Conference on Software Engineering (ICSE \u201914), 167\u2013181."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2011.104"},{"key":"e_1_3_2_40_2","first-page":"183","article-title":"Theorem proving by resolution as a basis for question-answering systems","volume":"4","author":"Cordell Green","year":"1969","unstructured":"Cordell Green. 1969. Theorem proving by resolution as a basis for question-answering systems. Machine Intelligence 4 (1969), 183\u2013205.","journal-title":"Machine Intelligence"},{"key":"e_1_3_2_41_2","unstructured":"Kai Greshake Sahar Abdelnabi Shailesh Mishra Christoph Endres Thorsten Holz and Mario Fritz. 2023. More than you\u2019ve asked for: A comprehensive analysis of novel prompt injection threats to application-integrated large language models. arXiv:2302.12173. Retrieved from http:\/\/dx.doi.org\/10.48550\/arXiv.2302.12173"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/SANER53432.2022.00112"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/1926385.1926423"},{"key":"e_1_3_2_44_2","doi-asserted-by":"crossref","unstructured":"Sumit Gulwani Oleksandr Polozov and Rishabh Singh. 2017. Program synthesis. 
Foundations and Trends\u00ae in Programming Languages 4 1\u20132 (2017) 1\u2013119.","DOI":"10.1561\/2500000010"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","unstructured":"Suriya Gunasekar Yi Zhang Jyoti Aneja Caio C\u00e9sar Teodoro Mendes Allie Del Giorno Sivakanth Gopi Mojan Javaheripi Piero Kauffmann Gustavo de Rosa Olli Saarikivi et al. 2023. Textbooks are all you need. arXiv:2306.11644. Retrieved from 10.48550\/arXiv.2306.11644","DOI":"10.48550\/arXiv.2306.11644"},{"key":"e_1_3_2_46_2","series-title":"Association for Computational Linguistics","first-page":"7212","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL \u201922)","volume":"1","author":"Guo Daya","year":"2022","unstructured":"Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, and Jian Yin. 2022. UniXcoder: Unified cross-modal pre-training for code representation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL \u201922), Long Papers, Vol. 1. Association for Computational Linguistics, 7212\u20137225."},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","unstructured":"Qi Guo Junming Cao Xiaofei Xie Shangqing Liu Xiaohong Li Bihuan Chen and Xin Peng. 2024. Exploring the potential of ChatGPT in automated code refinement: An empirical study. ACM 1\u201313. DOI: 10.1145\/3597503.3623306","DOI":"10.1145\/3597503.3623306"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10742"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","unstructured":"Yiyang Hao Ge Li Yongqiang Liu Xiaowei Miao He Zong Siyuan Jiang Yang Liu and He Wei. 2022. Aixbench: A code generation benchmark dataset. arXiv:2206.13179. 
Retrieved from 10.48550\/arXiv.2206.13179","DOI":"10.48550\/arXiv.2206.13179"},{"key":"e_1_3_2_50_2","series-title":"Association for Computational Linguistics","doi-asserted-by":"crossref","first-page":"925","DOI":"10.18653\/v1\/D18-1111","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Hayati Shirley Anugrah","year":"2018","unstructured":"Shirley Anugrah Hayati, Rapha\u00ebl Olivier, Pravalika Avvaru, Pengcheng Yin, Anthony Tomasic, and Graham Neubig. 2018. Retrieval-based neural code generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 925\u2013930."},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICWS.2017.13"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","unstructured":"Dan Hendrycks Steven Basart Saurav Kadavath Mantas Mazeika Akul Arora Ethan Guo Collin Burns Samir Puranik Horace He Dawn Song et al. 2021. Measuring coding challenge competence with Apps. arXiv:2105.09938. Retrieved from 10.48550\/arXiv.2105.09938","DOI":"10.48550\/arXiv.2105.09938"},{"issue":"2","key":"e_1_3_2_53_2","first-page":"9:1","article-title":"Using formal specifications to support testing","volume":"41","author":"Hierons Robert M.","year":"2009","unstructured":"Robert M. Hierons, Kirill Bogdanov, Jonathan P. Bowen, Rance Cleaveland, John Derrick, Jeremy Dick, Marian Gheorghe, Mark Harman, Kalpesh Kapoor, Paul J. Krause, et al. 2009. Using formal specifications to support testing. ACM Computing Surveys 41, 2 (2009), 9:1\u20139:76.","journal-title":"ACM Computing Surveys"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE56229.2023.00181"},{"key":"e_1_3_2_55_2","first-page":"539","volume-title":"IEEE Symposium on Security and Privacy (S&P 19)","author":"Huang Zhen","year":"2019","unstructured":"Zhen Huang, David Lie, Gang Tan, and Trent Jaeger. 2019. 
Using safety properties to generate vulnerability patches. In IEEE Symposium on Security and Privacy (S&P 19), 539\u2013554."},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00082"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSME55016.2022.00058"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1192"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","unstructured":"Naman Jain King Han Alex Gu Wen-Ding Li Fanjia Yan Tianjun Zhang Sida Wang Armando Solar-Lezama Koushik Sen and Ion Stoica. 2024. LiveCodeBench: Holistic and contamination free evaluation of large language models for code. arXiv:2403.07974. Retrieved from 10.48550\/arXiv.2403.07974","DOI":"10.48550\/arXiv.2403.07974"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","unstructured":"Kevin Jesse Toufique Ahmed Premkumar T. Devanbu and Emily Morgan. 2023. Large language models and simple stupid bugs. arXiv:2303.11455. Retrieved from 10.48550\/arXiv.2303.11455","DOI":"10.48550\/arXiv.2303.11455"},{"key":"e_1_3_2_61_2","first-page":"215","volume-title":"Proceedings of the International Conference on Software Engineering (ICSE \u201910)","author":"Jha Susmit","year":"2010","unstructured":"Susmit Jha, Sumit Gulwani, Sanjit Seshia, and Ashish Tiwari. 2010. Oracle-guided component-based program synthesis. In Proceedings of the International Conference on Software Engineering (ICSE \u201910), 215\u2013224."},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/3491102.3501870"},{"key":"e_1_3_2_63_2","first-page":"1161","volume-title":"Proceedings of the 43rd IEEE\/ACM International Conference on Software Engineering (ICSE \u201921)","author":"Jiang Nan","year":"2021","unstructured":"Nan Jiang, Thibaud Lutellier, and Lin Tan. 2021. CURE: Code-aware neural machine translation for automatic program repair. In Proceedings of the 43rd IEEE\/ACM International Conference on Software Engineering (ICSE \u201921). 
IEEE, 1161\u20131173."},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","unstructured":"Zhihan Jiang Jinyang Liu Zhuangbin Chen Yichen Li Junjie Huang Yintong Huo Pinjia He Jiazhen Gu and Michael R. Lyu. 2023. LLMParser: A LLM-based log parsing framework. arXiv:2310.01796. Retrieved from 10.48550\/arXiv.2310.01796","DOI":"10.48550\/arXiv.2310.01796"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","unstructured":"Carlos E. Jimenez John Yang Alexander Wettig Shunyu Yao Kexin Pei Ofir Press and Karthik Narasimhan. 2023. SWE-bench: Can language models resolve real-world GitHub issues? arXiv:2310.06770. Retrieved from 10.48550\/arXiv.2310.06770","DOI":"10.48550\/arXiv.2310.06770"},{"key":"e_1_3_2_66_2","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1007\/11513988_23","volume-title":"Proceedings of the 17th International Conference on Computer Aided Verification (CAV \u201905)","volume":"3576","author":"Jobstmann Barbara","year":"2005","unstructured":"Barbara Jobstmann, Andreas Griesmayer, and Roderick Bloem. 2005. Program repair as a game. In Proceedings of the 17th International Conference on Computer Aided Verification (CAV \u201905). Lecture Notes in Computer Science, Vol. 3576, Springer, 226\u2013238."},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","unstructured":"Tae-Hwan Jung. 2021. Commitbert: Commit message generation using pre-trained programming language model. arXiv:2105.14242. Retrieved from 10.48550\/arXiv.2105.14242","DOI":"10.48550\/arXiv.2105.14242"},{"key":"e_1_3_2_68_2","unstructured":"Jared Kaplan Sam McCandlish Tom Henighan Tom B. Brown Benjamin Chess Rewon Child Scott Gray Alec Radford Jeffrey Wu and Dario Amodei. 2020. Scaling laws for neural language models. 
arXiv:2001.08361."},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3544548.3580919"},{"key":"e_1_3_2_70_2","first-page":"802","volume-title":"Proceedings of the 35th International Conference on Software Engineering (ICSE \u201913)","author":"Kim Dongsun","year":"2013","unstructured":"Dongsun Kim, Jaechang Nam, Jaewoo Song, and Sunghun Kim. 2013. Automatic patch generation learned from human-written patches. In Proceedings of the 35th International Conference on Software Engineering (ICSE \u201913). IEEE Computer Society, 802\u2013811."},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00175355"},{"key":"e_1_3_2_72_2","unstructured":"Cognition Labs. 2024. Devin AI software engineer. Retrieved from https:\/\/www.cognition-labs.com\/introducing-devin"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1145\/3318162"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","unstructured":"Jia Li Ge Li Yongmin Li and Zhi Jin. 2023. Enabling programming thinking in large language models toward code generation. arXiv:2305.06599. Retrieved from 10.48550\/arXiv.2305.06599","DOI":"10.48550\/arXiv.2305.06599"},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3549099"},{"key":"e_1_3_2_76_2","unstructured":"Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov Chenghao Mou Marc Marone Christopher Akiki Jia Li Jenny Chim et al. 2023. StarCoder: May the source be with you! arXiv:2305.06161."},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.abq1158"},{"key":"e_1_3_2_78_2","unstructured":"Yichen Li Yintong Huo Zhihan Jiang Renyi Zhong Pinjia He Yuxin Su and Michael R. Lyu. 2023. Exploring the effectiveness of LLMs in automated logging generation: An empirical study. 
arXiv:2307.05950."},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","unstructured":"Yichen Li Yintong Huo Renyi Zhong Zhihan Jiang Jinyang Liu Junjie Huang Jiazhen Gu Pinjia He and Michael R Lyu. 2024. Go Static: Contextualized logging statement generation. arXiv:2402.12958. Retrieved from 10.48550\/arXiv.2402.12958","DOI":"10.48550\/arXiv.2402.12958"},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380345"},{"key":"e_1_3_2_81_2","first-page":"1461","volume-title":"Proceedings of the 43rd IEEE\/ACM International Conference on Software Engineering (ICSE \u201921)","author":"Li Zhenhao","year":"2021","unstructured":"Zhenhao Li, Heng Li, Tse-Hsun Peter Chen, and Weiyi Shang. 2021. DeepLV: Suggesting log levels using ordinal based neural networks. In Proceedings of the 43rd IEEE\/ACM International Conference on Software Engineering (ICSE \u201921). IEEE, 1461\u20131472."},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3549081"},{"key":"e_1_3_2_83_2","unstructured":"Dianshu Liao Shidong Pan Qing Huang Xiaoxue Ren Zhenchang Xing Huan Jin and Qinying Li. 2023. Context-aware code generation framework for code repositories: Local global and third-party library awareness. arXiv:2312.05772."},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1603.06744"},{"key":"e_1_3_2_85_2","first-page":"398","article-title":"Automated code editing with search-generate-modify","author":"Liu Changshu","year":"2023","unstructured":"Changshu Liu, Pelin Cetin, Yogesh Patodia, Saikat Chakraborty, Yangruibo Ding, and Baishakhi Ray. 2023. Automated code editing with search-generate-modify. 
IEEE Transaction of Software Engineering (2023), 398\u2013399.","journal-title":"IEEE Transaction of Software Engineering"},{"key":"e_1_3_2_86_2","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00638"},{"key":"e_1_3_2_87_2","article-title":"Refining ChatGPT-generated code: Characterizing and mitigating code quality issues","author":"Liu Yue","year":"2023","unstructured":"Yue Liu, Thanh Le-Cong, Ratnadira Widyasari, Chakkrit Tantithamthavorn, Li Li, Xuan-Bach D. Le, and David Lo. 2023. Refining ChatGPT-generated code: Characterizing and mitigating code quality issues. ACM Transactions on Software Engineering and Methodology (2023).","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"key":"e_1_3_2_88_2","first-page":"176","volume-title":"2019 34th IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Liu Zhongxin","year":"2019","unstructured":"Zhongxin Liu, Xin Xia, Christoph Treude, David Lo, and Shanping Li. 2019. Automatic generation of pull request descriptions. In 2019 34th IEEE\/ACM International Conference on Automated Software Engineering (ASE), 176\u2013188."},{"key":"e_1_3_2_89_2","first-page":"166","volume-title":"Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201915)","author":"Long Fan","year":"2015","unstructured":"Fan Long and Martin Rinard. 2015. Staged program repair with condition synthesis. 
In Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201915), 166\u2013178."},{"key":"e_1_3_2_90_2","doi-asserted-by":"publisher","DOI":"10.1145\/2837614.2837617"},{"key":"e_1_3_2_91_2","first-page":"1","volume-title":"Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks \u201921)","author":"Lu Shuai","year":"2021","unstructured":"Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, et al. 2021. CodeXGLUE: A machine learning benchmark dataset for code understanding and generation. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks \u201921). Joaquin Vanschoren and Sai-Kit Yeung (Eds.), 1\u201316."},{"key":"e_1_3_2_92_2","unstructured":"Ziyang Luo Can Xu Pu Zhao Qingfeng Sun Xiubo Geng Wenxiang Hu Chongyang Tao Jing Ma Qingwei Lin and Daxin Jiang. 2023. WizardCoder: Empowering code large language models with evol-instruct. arXiv:2306.08568."},{"key":"e_1_3_2_93_2","doi-asserted-by":"publisher","DOI":"10.1145\/3395363.339736"},{"key":"e_1_3_2_94_2","first-page":"448","volume-title":"Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering (ICSE \u201915)","volume":"1","author":"Mechtaev Sergey","year":"2015","unstructured":"Sergey Mechtaev, Jooyong Yi, and Abhik Roychoudhury. 2015. DirectFix: Looking for simple program repairs. In Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering (ICSE \u201915), Vol. 
1, IEEE Computer Society, 448\u2013458."},{"key":"e_1_3_2_95_2","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884807"},{"key":"e_1_3_2_96_2","volume-title":"PACM-SE, Proceedings of International Conference on Foundations of Software Engineering (FSE \u201924)","author":"Misu Md Rakib Hossain","year":"2024","unstructured":"Md Rakib Hossain Misu, Cristina V. Lopes, Iris Ma, and James Noble. 2024. Towards AI assisted synthesis of verified Dafny methods. In PACM-SE, Proceedings of International Conference on Foundations of Software Engineering (FSE \u201924)."},{"key":"e_1_3_2_97_2","first-page":"772","volume-title":"Proceedings of the 35th International Conference on Software Engineering (ICSE \u201913)","author":"Nguyen Hoang Duong Thien","year":"2013","unstructured":"Hoang Duong Thien Nguyen, Dawei Qi, Abhik Roychoudhury, and Satish Chandra. 2013. SemFix: Program repair via semantic analysis. In Proceedings of the 35th International Conference on Software Engineering (ICSE \u201913). IEEE Computer Society, 772\u2013781. DOI: https:\/\/dl.acm.org\/doi\/abs\/10.5555\/2486788.2486890"},{"key":"e_1_3_2_98_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.05.039"},{"key":"e_1_3_2_99_2","unstructured":"Erik Nijkamp Bo Pang Hiroaki Hayashi Lifu Tu Huan Wang Yingbo Zhou Silvio Savarese and Caiming Xiong. 2022. Codegen: An open large language model for code with multi-turn program synthesis. arXiv:2203.13474."},{"key":"e_1_3_2_100_2","first-page":"2133","volume-title":"Proceedings of the 32nd USENIX Security Symposium (USENIX Security \u201923)","author":"Niu Liang","year":"2023","unstructured":"Liang Niu, Shujaat Mirza, Zayd Maradni, and Christina P\u00f6pper. 2023. CodexLeaks: Privacy leaks from code generation language models in GitHub copilot. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security \u201923), 2133\u20132150. DOI: https:\/\/dl.acm.org\/doi\/10.5555\/3620237.3620357"},{"key":"e_1_3_2_101_2","unstructured":"OpenAI. 2023. 
GPT-4 technical report. arXiv:2303.08774."},{"key":"e_1_3_2_102_2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2108.11601"},{"key":"e_1_3_2_103_2","first-page":"8026","article-title":"Pytorch: An imperative style, high-performance deep learning library","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems (NeurIPS \u201919), 8026\u20138037.","journal-title":"Advances in Neural Information Processing Systems (NeurIPS \u201919)"},{"key":"e_1_3_2_104_2","unstructured":"David A. Patterson Joseph Gonzalez Quoc V. Le Chen Liang Lluis-Miquel Munguia Daniel Rothchild David R. So Maud Texier and Jeff Dean. 2021. Carbon emissions and large neural network training. arXiv:2104.10350."},{"key":"e_1_3_2_105_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP46214.2022.9833571"},{"key":"e_1_3_2_106_2","first-page":"4:1","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924)","author":"Peng Yun","year":"2024","unstructured":"Yun Peng, Shuzheng Gao, Cuiyun Gao, Yintong Huo, and Michael R. Lyu. 2024. Domain knowledge matters: Improving prompts with fix templates for repairing Python type errors. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924). ACM, 4:1\u20134:13."},{"key":"e_1_3_2_107_2","doi-asserted-by":"publisher","DOI":"10.1145\/3576915.3623157"},{"key":"e_1_3_2_108_2","doi-asserted-by":"publisher","DOI":"10.1145\/75277.75293"},{"key":"e_1_3_2_109_2","unstructured":"Gabriel Poesia Oleksandr Polozov Vu Le Ashish Tiwari Gustavo Soares Christopher Meek and Sumit Gulwani. 2022. Synchromesh: Reliable code generation from pre-trained language models. 
arXiv:2201.11227."},{"key":"e_1_3_2_110_2","doi-asserted-by":"publisher","DOI":"10.1145\/2568225.25682"},{"key":"e_1_3_2_111_2","doi-asserted-by":"publisher","DOI":"10.1145\/2771783.2771791"},{"key":"e_1_3_2_112_2","series-title":"Association for Computational Linguistics","first-page":"1139","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL \u201917)","volume":"1","author":"Rabinovich Maxim","year":"2017","unstructured":"Maxim Rabinovich, Mitchell Stern, and Dan Klein. 2017. Abstract syntax networks for code generation and semantic parsing. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL \u201917), Long Papers, Vol. 1. Association for Computational Linguistics, 1139\u20131149."},{"key":"e_1_3_2_113_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581641.3584037"},{"key":"e_1_3_2_114_2","unstructured":"Baptiste Rozi\u00e8re Jonas Gehring Fabian Gloeckle Sten Sootla Itai Gat Xiaoqing Ellen Tan Yossi Adi Jingyu Liu Tal Remez J\u00e9r\u00e9my Rapin et al. 2023. Code llama: Open foundation models for code. arXiv:2308.12950."},{"key":"e_1_3_2_115_2","doi-asserted-by":"publisher","unstructured":"Gabriel Ryan Siddhartha Jain Mingyue Shang Shiqi Wang Xiaofei Ma Murali Krishna Ramanathan and Baishakhi Ray. 2024. Code-aware prompting: A study of coverage guided test generation in regression setting using LLM. arXiv:2402.00097. Retrieved from 10.1145\/364376","DOI":"10.1145\/364376"},{"key":"e_1_3_2_116_2","unstructured":"Fred B. Schneider National Research Council et al. 1999. Trust in cyberspace. National Academy Press Washington DC. DOI: https:\/\/dl.acm.org\/doi\/10.5555\/552385"},{"key":"e_1_3_2_117_2","first-page":"309","volume-title":"Proceedings of the USENIX Conference on Annual Technical Conference","author":"Serebryany Konstantin","year":"2012","unstructured":"Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitry Vyukov. 2012. 
AddressSanitizer: A fast address sanity checker. In Proceedings of the USENIX Conference on Annual Technical Conference, 309\u2013318."},{"key":"e_1_3_2_118_2","doi-asserted-by":"publisher","DOI":"10.1145\/3453483.3454051"},{"key":"e_1_3_2_119_2","series-title":"Proceedings of Machine Learning Research","first-page":"31693","volume-title":"Proceedings of the International Conference on Machine Learning (ICML \u201923)","volume":"202","author":"Shrivastava Disha","year":"2023","unstructured":"Disha Shrivastava, Hugo Larochelle, and Daniel Tarlow. 2023. Repository-level prompt generation for large language models of code. In Proceedings of the International Conference on Machine Learning (ICML \u201923), Proceedings of Machine Learning Research, Vol. 202, PMLR, 31693\u201331715."},{"key":"e_1_3_2_120_2","unstructured":"Mohammed Latif Siddiq Joanna C. S. Santos Ridwanul Hasan Tanvir Noshin Ulfat Fahmid Al Rifat and Vinicius Carvalho Lopes. 2023. Exploring the effectiveness of large language models in generating unit tests. arXiv:2305.00418."},{"key":"e_1_3_2_121_2","doi-asserted-by":"publisher","unstructured":"Manav Singhal Tushar Aggarwal Abhijeet Awasthi Nagarajan Natarajan and Aditya Kanade. 2024. NoFunEval: Funny how code LMs falter on requirements beyond functional correctness. arXiv:2401.15963. Retrieved from 10.48550\/arXiv.2401.15963","DOI":"10.48550\/arXiv.2401.15963"},{"key":"e_1_3_2_122_2","unstructured":"Giriprasad Sridhara Ranjani H. G. and Sourav Mazumdar. 2023. ChatGPT: A study on its utility for ubiquitous software engineering tasks. arXiv:2305.16837."},{"key":"e_1_3_2_123_2","doi-asserted-by":"publisher","unstructured":"Lichao Sun Yue Huang Haoran Wang Siyuan Wu Qihui Zhang Chujie Gao Yixin Huang Wenhan Lyu Yixuan Zhang Xiner Li et al. 2024. Trustllm: Trustworthiness in large language models. arXiv:2401.05561. 
Retrieved from 10.48550\/arXiv.2401.05561","DOI":"10.48550\/arXiv.2401.05561"},{"key":"e_1_3_2_124_2","first-page":"8984","volume-title":"Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI \u201920), The 32nd Innovative Applications of Artificial Intelligence Conference (IAAI \u201920), The 10th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI \u201920)","author":"Sun Zeyu","year":"2020","unstructured":"Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, and Lu Zhang. 2020. TreeGen: A tree-based transformer architecture for code generation. In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI \u201920), The 32nd Innovative Applications of Artificial Intelligence Conference (IAAI \u201920), The 10th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI \u201920). AAAI Press, 8984\u20138991."},{"key":"e_1_3_2_125_2","first-page":"471","volume-title":"Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering (ICSE \u201915)","volume":"1","author":"Tan Shin Hwei","year":"2015","unstructured":"Shin Hwei Tan and Abhik Roychoudhury. 2015. Relifix: Automated repair of software regressions. In Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering (ICSE \u201915), Vol. 1, IEEE Computer Society, 471\u2013482."},{"key":"e_1_3_2_126_2","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950295"},{"key":"e_1_3_2_127_2","first-page":"1","volume-title":"PLATEAU Workshop","author":"Tang Ningzhi","year":"2023","unstructured":"Ningzhi Tang, Meng Chen, Zheng Ning, Aakash Bansal, Yu Huang, Collin McMillan, and Toby Jia-Jun Li. 2023. An empirical study of developer behaviors for validating and repairing AI-generated code. 
In PLATEAU Workshop, 1\u201315."},{"key":"e_1_3_2_128_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510067"},{"key":"e_1_3_2_129_2","article-title":"Code review automation: Strengths and weaknesses of the state of the art","author":"Tufano Rosalia","year":"2023","unstructured":"Rosalia Tufano, Ozren Dabic, Antonio Mastropaolo, Matteo Ciniselli, and Gabriele Bavota. 2023. Code review automation: Strengths and weaknesses of the state of the art. IEEE Transactions on Software Engineering (Feb. 2023).","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_130_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510621"},{"key":"e_1_3_2_131_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00027"},{"key":"e_1_3_2_132_2","doi-asserted-by":"publisher","DOI":"10.1145\/3491101.3519665"},{"key":"e_1_3_2_133_2","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1007\/978-3-031-43835-6_23","volume-title":"Proceedings of the International Conference on Quantitative Evaluation of Systems (QEST \u201923)","author":"Voogd Erik","year":"2023","unstructured":"Erik Voogd, Einar Broch Johnsen, Alexandra Silva, Zachary J. Susag, and Andrzej W\u0105sowski. 2023. Symbolic semantics for probabilistic programs. In Proceedings of the International Conference on Quantitative Evaluation of Systems (QEST \u201923), 329\u2013345."},{"key":"e_1_3_2_134_2","first-page":"241","volume-title":"Proceedings of the 1st International Joint Conference on Artificial Intelligence","author":"Waldinger Richard J.","year":"1969","unstructured":"Richard J. Waldinger and Richard C. T. Lee. 1969. PROW: A step toward automatic program writing. In Proceedings of the 1st International Joint Conference on Artificial Intelligence. 
William Kaufmann, 241\u2013252."},{"key":"e_1_3_2_135_2","doi-asserted-by":"publisher","DOI":"10.1145\/3314221.3314588"},{"key":"e_1_3_2_136_2","doi-asserted-by":"crossref","first-page":"8696","DOI":"10.18653\/v1\/2021.emnlp-main.685","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP \u201921)","author":"Wang Yue","year":"2021","unstructured":"Yue Wang, Weishi Wang, Shafiq R. Joty, and Steven C. H. Hoi. 2021. CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP \u201921). Association for Computational Linguistics, 8696\u20138708."},{"key":"e_1_3_2_137_2","doi-asserted-by":"crossref","unstructured":"Zhiruo Wang Grace Cuenca Shuyan Zhou Frank F. Xu and Graham Neubig. 2022. Mconala: A benchmark for code generation from multiple natural languages. arXiv:2203.08388.","DOI":"10.18653\/v1\/2023.findings-eacl.20"},{"key":"e_1_3_2_138_2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2206.07682"},{"key":"e_1_3_2_139_2","unstructured":"Yuxiang Wei Zhe Wang Jiawei Liu Yifeng Ding and Lingming Zhang. 2023. Magicoder: Source code is all you need. arXiv:2312.02120."},{"key":"e_1_3_2_140_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2013.6693094"},{"key":"e_1_3_2_141_2","first-page":"479","volume-title":"Proceedings of the 26th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER \u201919)","author":"White Martin","year":"2019","unstructured":"Martin White, Michele Tufano, Matias Martinez, Martin Monperrus, and Denys Poshyvanyk. 2019. Sorting and transforming program repair ingredients via deep learning code similarities. In Proceedings of the 26th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER \u201919). 
IEEE, 479\u2013490."},{"key":"e_1_3_2_142_2","unstructured":"Hongqiu Wu Hai Zhao and Min Zhang. 2020. Code summarization with structure-induced transformer. arXiv:2012.14710. Retrieved from http:\/\/arxiv.org\/abs\/2012.14710"},{"key":"e_1_3_2_143_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00129"},{"key":"e_1_3_2_144_2","unstructured":"Chunqiu Steven Xia and Lingming Zhang. 2023. Keep the conversation going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT. arXiv:2304.00385."},{"key":"e_1_3_2_145_2","doi-asserted-by":"publisher","unstructured":"Tao Xiao Hideaki Hata Christoph Treude and Kenichi Matsumoto. 2024. Generative AI for pull request descriptions: Adoption impact and developer interventions. DOI: 10.1145\/3643773","DOI":"10.1145\/3643773"},{"key":"e_1_3_2_146_2","first-page":"572","volume-title":"Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering","author":"Xie Zhuokui","year":"2023","unstructured":"Zhuokui Xie, Yinghao Chen, Chen Zhi, Shuiguang Deng, and Jianwei Yin. 2023. ChatUniTest: A ChatGPT-based automated unit test generation tool. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 572\u2013576."},{"key":"e_1_3_2_147_2","doi-asserted-by":"publisher","DOI":"10.1145\/3520312.3534862"},{"key":"e_1_3_2_148_2","series-title":"Association for Computational Linguistics","first-page":"6045","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL \u201920)","author":"Xu Frank F.","year":"2020","unstructured":"Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, and Graham Neubig. 2020. Incorporating external knowledge through pre-training for natural language to code generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL \u201920). 
Association for Computational Linguistics, 6045\u20136052."},{"key":"e_1_3_2_149_2","unstructured":"Junjielong Xu Ruichun Yang Yintong Huo Chengyu Zhang and Pinjia He. 2023. Prompting for automatic log template extraction. arXiv:2307.09950."},{"key":"e_1_3_2_150_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2016.2560811"},{"key":"e_1_3_2_151_2","unstructured":"Weixiang Yan Haitian Liu Yunkun Wang Yunzhe Li Qian Chen Wen Wang Tingyu Lin Weishan Zhao Li Zhu Shuiguang Deng et al. 2023. CodeScope: An execution-based multilingual multitask multidimensional benchmark for evaluating LLMs on code understanding and generation. arXiv:2311.08588."},{"key":"e_1_3_2_152_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510222"},{"key":"e_1_3_2_153_2","series-title":"Association for Computational Linguistics","first-page":"7","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (EMNLP \u201918)","author":"Yin Pengcheng","year":"2018","unstructured":"Pengcheng Yin and Graham Neubig. 2018. TRANX: A transition-based neural abstract syntax parser for semantic parsing and code generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (EMNLP \u201918). Association for Computational Linguistics, 7\u201312."},{"key":"e_1_3_2_154_2","first-page":"206","volume-title":"Proceedings of the 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security (QRS \u201923)","author":"Yu Shengcheng","year":"2023","unstructured":"Shengcheng Yu, Chunrong Fang, Yuchen Ling, Chentian Wu, and Zhenyu Chen. 2023. LLM for Test script generation and migration: Challenges, capabilities, and opportunities. In Proceedings of the 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security (QRS \u201923). 
IEEE, 206\u2013217."},{"key":"e_1_3_2_155_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2018.2874648"},{"key":"e_1_3_2_156_2","first-page":"1592","volume-title":"Proceedings of the ACM International Symposium on Software Testing and Analysis (ISSTA \u201924)","author":"Zhang Yuntong","year":"2024","unstructured":"Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, and Abhik Roychoudhury. 2024. AutoCodeRover: Autonomous program improvement. In Proceedings of the ACM International Symposium on Software Testing and Analysis (ISSTA \u201924), 1592\u20131604."},{"key":"e_1_3_2_157_2","unstructured":"Daoguang Zan Bei Chen Dejian Yang Zeqi Lin Minsu Kim Bei Guan Yongji Wang Weizhu Chen and Jian-Guang Lou. 2022. CERT: Continual pre-training on sketches for library-oriented code generation. arXiv:2206.06888."},{"key":"e_1_3_2_158_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3556955"},{"key":"e_1_3_2_159_2","unstructured":"Li Zhong and Zilong Wang. 2023. A study on robustness and reliability of large language model code generation. arXiv:2308.10335."},{"key":"e_1_3_2_160_2","first-page":"415","volume-title":"Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering (ICSE \u201915)","volume":"1","author":"Zhu Jieming","year":"2015","unstructured":"Jieming Zhu, Pinjia He, Qiang Fu, Hongyu Zhang, Michael R. Lyu, and Dongmei Zhang. 2015. Learning to log: Helping developers make informed logging decisions. In Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering (ICSE \u201915), Vol. 1, IEEE Computer Society, 415\u2013425.
DOI: https:\/\/dl.acm.org\/doi\/10.5555\/2818754.2818807"},{"key":"e_1_3_2_161_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-SEIP.2019.00021"},{"key":"e_1_3_2_162_2","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3468544"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3708519","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3708519","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:17:45Z","timestamp":1750295865000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3708519"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,24]]},"references-count":161,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,6,30]]}},"alternative-id":["10.1145\/3708519"],"URL":"https:\/\/doi.org\/10.1145\/3708519","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,24]]},"assertion":[{"value":"2024-03-23","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-10","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-24","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}