{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T18:57:40Z","timestamp":1768676260982,"version":"3.49.0"},"reference-count":88,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2025,3,29]],"date-time":"2025-03-29T00:00:00Z","timestamp":1743206400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Commission through Horizon 2020","award":["812882"],"award-info":[{"award-number":["812882"]}]},{"name":"Research Council of Norway through the secureIT","award":["288787"],"award-info":[{"award-number":["288787"]}]},{"name":"Experimental Infrastructure for Exploration of Exascale Computing"},{"name":"Experimental Infrastructure for Exploration of Exascale Computing","award":["270053"],"award-info":[{"award-number":["270053"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Evol. Learn. Optim."],"published-print":{"date-parts":[[2025,3,31]]},"abstract":"<jats:p>Program synthesis with Large Language Models (LLMs) suffers from a \u201cnear-miss syndrome\u201d: The generated code closely resembles a correct solution but fails unit tests due to minor errors. We address this with a multi-agent framework called Synthesize, Execute, Instruct, Debug, and Repair (SEIDR). Effectively applying SEIDR to instruction-tuned LLMs requires determining (a) optimal prompts for LLMs, (b) what ranking algorithm selects the best programs in debugging rounds, and (c) balancing the repair of unsuccessful programs with the generation of new ones. We empirically explore these tradeoffs by comparing replace-focused, repair-focused, and hybrid debug strategies. We also evaluate lexicase and tournament selection to rank candidates in each generation. On Program Synthesis Benchmark 2 (PSB2), our framework outperforms both conventional use of OpenAI Codex without a repair phase and traditional genetic programming approaches. SEIDR outperforms the use of an LLM alone, solving 18 problems in C++ and 20 in Python on PSB2 at least once across experiments. To assess generalizability, we employ GPT-3.5 and Llama 3 on the PSB2 and HumanEval-X benchmarks. Although SEIDR with these models does not surpass current state-of-the-art methods on the Python benchmarks, the results on HumanEval-C++ are promising. SEIDR with Llama 3-8B achieves an average pass@100 of 84.2%. Across all SEIDR runs, 163 of 164 problems are solved at least once with GPT-3.5 in HumanEval-C++, and 162 of 164 with the smaller Llama 3-8B. We conclude that SEIDR effectively overcomes the near-miss syndrome in program synthesis with LLMs.<\/jats:p>","DOI":"10.1145\/3719351","type":"journal-article","created":{"date-parts":[[2025,2,26]],"date-time":"2025-02-26T14:37:51Z","timestamp":1740580671000},"page":"1-37","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Fully Autonomous Programming Using Iterative Multi-Agent Debugging with Large Language Models"],"prefix":"10.1145","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3139-0200","authenticated-orcid":false,"given":"Anastasiia","family":"Grishina","sequence":"first","affiliation":[{"name":"Simula, Oslo, Norway and University of Oslo, Oslo, Norway"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6670-6909","authenticated-orcid":false,"given":"Vadim","family":"Liventsev","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2966-3305","authenticated-orcid":false,"given":"Aki","family":"H\u00e4rm\u00e4","sequence":"additional","affiliation":[{"name":"Philips Research, Eindhoven, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1761-6771","authenticated-orcid":false,"given":"Leon","family":"Moonen","sequence":"additional","affiliation":[{"name":"Simula, Oslo, Norway"}]}],"member":"320","published-online":{"date-parts":[[2025,3,29]]},"reference":[{"issue":"4","key":"e_1_3_2_2_1","first-page":"1765","article-title":"A survey of genetic programming and its applications","volume":"13","author":"Ahvanooey Milad Taleby","year":"2019","unstructured":"Milad Taleby Ahvanooey, Qianmu Li, Ming Wu, and Shuo Wang. 2019. A survey of genetic programming and its applications. KSII Transactions on Internet and Information Systems 13, 4 (2019), 1765\u20131794.","journal-title":"KSII Transactions on Internet and Information Systems"},{"key":"e_1_3_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3212695"},{"key":"e_1_3_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2635868.2635901"},{"key":"e_1_3_2_5_1","doi-asserted-by":"publisher","DOI":"10.3233\/978-1-61499-495-4-1"},{"key":"e_1_3_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2018.2876563"},{"key":"e_1_3_2_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-04083-2_11"},{"key":"e_1_3_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3563327"},{"key":"e_1_3_2_9_1","volume-title":"Nine Worlds of Seid-Magic: Ecstasy and Neo-Shamanism in North European Paganism","author":"Blain Jenny","year":"2002","unstructured":"Jenny Blain. 2002. Nine Worlds of Seid-Magic: Ecstasy and Neo-Shamanism in North European Paganism. Routledge."},{"key":"e_1_3_2_10_1","first-page":"1877","volume-title":"International Conference on Neural Information Processing Systems (NIPS \u201920)","author":"Brown Tom B.","year":"2020","unstructured":"Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. In International Conference on Neural Information Processing Systems (NIPS \u201920). Curran Associates, Inc., Red Hook, NY, 1877\u20131901."},{"key":"e_1_3_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/tse.2021.3087402"},{"key":"e_1_3_2_12_1","doi-asserted-by":"publisher","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman et al. 2021. Evaluating large language models trained on code. arXiv:2107.03374. DOI: 10.48550\/arXiv.2107.03374","DOI":"10.48550\/arXiv.2107.03374"},{"key":"e_1_3_2_13_1","doi-asserted-by":"publisher","unstructured":"Xinyun Chen Maxwell Lin Nathanael Sch\u00e4rli and Denny Zhou. 2023. Teaching large language models to self-debug. arXiv:2304.05128. DOI: 10.48550\/arXiv.2304.05128","DOI":"10.48550\/arXiv.2304.05128"},{"key":"e_1_3_2_14_1","first-page":"22196","article-title":"Latent execution for neural program synthesis beyond domain-specific languages","volume":"34","author":"Chen Xinyun","year":"2021","unstructured":"Xinyun Chen, Dawn Song, and Yuandong Tian. 2021. Latent execution for neural program synthesis beyond domain-specific languages. In Advances in Neural Information Processing Systems, Vol. 34. Curran Associates, Inc., 22196\u201322208.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_15_1","first-page":"161","volume-title":"Diverse Perspectives and State-of-the-Art Approaches to the Utilization of Data-Driven Clinical Decision Support Systems","author":"Connolly Thomas M.","year":"2023","unstructured":"Thomas M. Connolly, Mario Soflano, and Petros Papadopoulos. 2023. Systematic literature review: XAI and clinical decision support. In Diverse Perspectives and State-of-the-Art Approaches to the Utilization of Data-Driven Clinical Decision Support Systems. IGI Global, 161\u2013188."},{"key":"e_1_3_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3638530.3664162"},{"key":"e_1_3_2_17_1","doi-asserted-by":"publisher","unstructured":"Sander de Bruin Vadim Liventsev and Milan Petkovi\u0107. 2021. Autoencoders as tools for program synthesis. arXiv:2108.07129. DOI: 10.48550\/arXiv.2108.07129","DOI":"10.48550\/arXiv.2108.07129"},{"key":"e_1_3_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3450494"},{"key":"e_1_3_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3565971"},{"key":"e_1_3_2_20_1","doi-asserted-by":"publisher","unstructured":"Yihong Dong Xue Jiang Zhi Jin and Ge Li. 2024. Self-collaboration code generation via ChatGPT. arXiv:2304.07590. DOI: 10.48550\/arXiv.2304.07590","DOI":"10.48550\/arXiv.2304.07590"},{"key":"e_1_3_2_21_1","doi-asserted-by":"publisher","unstructured":"Zhiyu Fan Xiang Gao Martin Mirchev Abhik Roychoudhury and Shin Hwei Tan. 2023. Automated repair of programs from large language models. arXiv:2205.10583. DOI: 10.48550\/arXiv.2205.10583","DOI":"10.48550\/arXiv.2205.10583"},{"key":"e_1_3_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2915970.2915984"},{"key":"e_1_3_2_23_1","doi-asserted-by":"publisher","DOI":"10.1002\/rob.21918"},{"key":"e_1_3_2_24_1","first-page":"22","volume-title":"Programming by Examples (and Its Applications in Data Wrangling)","author":"Gulwani Sumit","year":"2016","unstructured":"Sumit Gulwani. 2016. Programming by Examples (and Its Applications in Data Wrangling). Technical Report. Microsoft Corporation, Redmond, WA, 22 pages."},{"key":"e_1_3_2_25_1","doi-asserted-by":"publisher","DOI":"10.1561\/2500000010"},{"key":"e_1_3_2_26_1","first-page":"17685","article-title":"Synthesize, execute and debug: Learning to repair for neural program synthesis","volume":"33","author":"Gupta Kavi","year":"2020","unstructured":"Kavi Gupta, Peter Ebert Christensen, Xinyun Chen, and Dawn Song. 2020. Synthesize, execute and debug: Learning to repair for neural program synthesis. In Advances in Neural Information Processing Systems, Vol. 33. Curran Associates, Inc., 17685\u201317695.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/911909"},{"key":"e_1_3_2_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(88)90002-1"},{"key":"e_1_3_2_29_1","doi-asserted-by":"publisher","unstructured":"Thomas Helmuth and Peter Kelly. 2021. PSB2: The second program synthesis benchmark suite. arXiv:2106.06086. DOI: 10.48550\/arXiv.2106.06086","DOI":"10.48550\/arXiv.2106.06086"},{"key":"e_1_3_2_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10710-022-09434-y"},{"key":"e_1_3_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2739480.2754769"},{"key":"e_1_3_2_32_1","doi-asserted-by":"publisher","DOI":"10.1162\/artl_a_00341"},{"key":"e_1_3_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/tevc.2014.2362729"},{"key":"e_1_3_2_34_1","doi-asserted-by":"publisher","unstructured":"Srinivasan Iyer Ioannis Konstas Alvin Cheung and Luke Zettlemoyer. 2018. Mapping language to code in programmatic context. arXiv:1808.09588. DOI: 10.48550\/arXiv.1808.09588","DOI":"10.48550\/arXiv.1808.09588"},{"key":"e_1_3_2_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/s0925-5273(00)00012-8"},{"key":"e_1_3_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/tetc.2022.3171314"},{"key":"e_1_3_2_37_1","doi-asserted-by":"publisher","unstructured":"Shuyang Jiang Yuhao Wang and Yu Wang. 2023. SelfEvolve: A code evolution framework via large language models. arXiv:2306.02907. DOI: 10.48550\/arXiv.2306.02907","DOI":"10.48550\/arXiv.2306.02907"},{"key":"e_1_3_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3549149"},{"key":"e_1_3_2_39_1","doi-asserted-by":"publisher","unstructured":"Harshit Joshi Jos\u00e9 Cambronero Sumit Gulwani Vu Le Ivan Radicek and Gust Verbruggen. 2022. Repair is nearly generation: Multilingual program repair with LLMs. arXiv:2208.11640. DOI: 10.48550\/arXiv.2208.11640","DOI":"10.48550\/arXiv.2208.11640"},{"key":"e_1_3_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2610384.2628055"},{"key":"e_1_3_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3604905"},{"key":"e_1_3_2_42_1","volume":"17","author":"Koza John R.","year":"1994","unstructured":"John R. Koza. 1994. Genetic Programming II, Vol. 17. MIT.","journal-title":"Genetic Programming II"},{"key":"e_1_3_2_43_1","first-page":"1","article-title":"SPoC: Search-based pseudocode to code","volume":"32","author":"Kulal Sumith","year":"2019","unstructured":"Sumith Kulal, Panupong Pasupat, Kartik Chandra, Mina Lee, Oded Padon, Alex Aiken, and Percy S. Liang. 2019. SPoC: Search-based pseudocode to code. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc., 1\u201312.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/icse.2012.6227211"},{"key":"e_1_3_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318162"},{"key":"e_1_3_2_46_1","doi-asserted-by":"publisher","unstructured":"Joel Lehman Jonathan Gordon Shawn Jain Kamal Ndousse Cathy Yeh and Kenneth O. Stanley. 2022. Evolution through large models. arXiv:2206.08896. DOI: 10.48550\/arXiv.2206.08896","DOI":"10.48550\/arXiv.2206.08896"},{"key":"e_1_3_2_47_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.abq1158"},{"key":"e_1_3_2_48_1","doi-asserted-by":"publisher","unstructured":"Fei Liu Xialiang Tong Mingxuan Yuan and Qingfu Zhang. 2023. Algorithm evolution using large language model. arXiv:2311.15249. DOI: 10.48550\/arXiv.2311.15249","DOI":"10.48550\/arXiv.2311.15249"},{"key":"e_1_3_2_49_1","first-page":"21558","article-title":"Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation","volume":"36","author":"Liu Jiawei","year":"2023","unstructured":"Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2023. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In Advances in Neural Information Processing Systems, Vol. 36. Curran Associates, Inc., 21558\u201321572.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3583131.3590481"},{"key":"e_1_3_2_51_1","doi-asserted-by":"publisher","unstructured":"Vadim Liventsev Aki H\u00e4rm\u00e4 and Milan Petkovi\u0107. 2021. BF++: A language for general-purpose program synthesis. arXiv:2101.09571. DOI: 10.48550\/arXiv.2101.09571","DOI":"10.48550\/arXiv.2101.09571"},{"key":"e_1_3_2_52_1","first-page":"1","volume-title":"Neural Information Processing Systems Track on Datasets and Benchmarks","author":"Lu Shuai","year":"2021","unstructured":"Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, et al. 2021. CodeXGLUE: A machine learning benchmark dataset for code understanding and generation. In Neural Information Processing Systems Track on Datasets and Benchmarks. Curran Associates, Inc., 1\u201316."},{"key":"e_1_3_2_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.153379"},{"key":"e_1_3_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/362566.362568"},{"key":"e_1_3_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/thms.2020.3017748"},{"key":"e_1_3_2_56_1","doi-asserted-by":"publisher","unstructured":"Changan Niu Chuanyi Li Vincent Ng and Bin Luo. 2023. CrossCodeBench: Benchmarking cross-task generalization of source code models. arXiv:2302.04030. DOI: 10.48550\/arXiv.2302.04030","DOI":"10.48550\/arXiv.2302.04030"},{"key":"e_1_3_2_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-24855-2_70"},{"key":"e_1_3_2_58_1","doi-asserted-by":"publisher","unstructured":"OpenAI. 2023. GPT-4 Technical Report. DOI: 10.48550\/arXiv.2303.08774","DOI":"10.48550\/arXiv.2303.08774"},{"key":"e_1_3_2_59_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-61470-6_4"},{"key":"e_1_3_2_60_1","doi-asserted-by":"publisher","unstructured":"Shuyin Ouyang Jie M. Zhang Mark Harman and Meng Wang. 2023. LLM is like a box of chocolates: The non-determinism of ChatGPT in code generation. arXiv:2308.02828. DOI: 10.48550\/arXiv.2308.02828","DOI":"10.48550\/arXiv.2308.02828"},{"key":"e_1_3_2_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/sp46214.2022.9833571"},{"key":"e_1_3_2_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/tevc.2017.2693219"},{"key":"e_1_3_2_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/2814270.2814310"},{"key":"e_1_3_2_64_1","doi-asserted-by":"publisher","unstructured":"Shuo Ren Daya Guo Shuai Lu Long Zhou Shujie Liu Duyu Tang Neel Sundaresan Ming Zhou Ambrosio Blanco and Shuai Ma. 2020. CodeBLEU: A method for automatic evaluation of code synthesis. arXiv:2009.10297. DOI: 10.48550\/arXiv.2009.10297","DOI":"10.48550\/arXiv.2009.10297"},{"key":"e_1_3_2_65_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-29573-7_3"},{"key":"e_1_3_2_66_1","doi-asserted-by":"publisher","unstructured":"Baptiste Rozi\u00e8re Jonas Gehring Fabian Gloeckle Sten Sootla Itai Gat Xiaoqing Ellen Tan Yossi Adi Jingyu Liu Tal Remez J\u00e9r\u00e9my Rapin et al. 2023. Code Llama: Open foundation models for code. arXiv:2308.12950. DOI: 10.48550\/arXiv.2308.12950","DOI":"10.48550\/arXiv.2308.12950"},{"key":"e_1_3_2_67_1","volume-title":"Artificial Intelligence: A Modern Approach","author":"Russell Stuart J.","year":"2010","unstructured":"Stuart J. Russell. 2010. Artificial Intelligence: A Modern Approach. Pearson Education, Inc."},{"key":"e_1_3_2_68_1","first-page":"8634","article-title":"Reflexion: Language agents with verbal reinforcement learning","volume":"36","author":"Shinn Noah","year":"2023","unstructured":"Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. In Advances in Neural Information Processing Systems, Vol. 36, 8634\u20138652.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_69_1","doi-asserted-by":"publisher","unstructured":"Atsushi Shirafuji Yutaka Watanobe Takumi Ito Makoto Morishita Yuki Nakamura Yusuke Oda and Jun Suzuki. 2023. Exploring the robustness of large language models for solving programming problems. arXiv:2306.14583. DOI: 10.48550\/arXiv.2306.14583","DOI":"10.48550\/arXiv.2306.14583"},{"key":"e_1_3_2_70_1","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(93)90034-9"},{"key":"e_1_3_2_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3512290.3528700"},{"key":"e_1_3_2_72_1","doi-asserted-by":"publisher","unstructured":"Dominik Sobania Dirk Schweim and Franz Rothlauf. 2021. Recent developments in program synthesis with evolutionary algorithms. arXiv:2108.12227. DOI: 10.48550\/arXiv.2108.12227","DOI":"10.48550\/arXiv.2108.12227"},{"key":"e_1_3_2_73_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-15524-1"},{"key":"e_1_3_2_74_1","doi-asserted-by":"publisher","unstructured":"Hugo Touvron Thibaut Lavril Gautier Izacard Xavier Martinet Marie-Anne Lachaux Timoth\u00e9e Lacroix Baptiste Rozi\u00e8re Naman Goyal Eric Hambro Faisal Azhar et al. 2023. LLaMA: Open and efficient foundation language models. arXiv:2302.13971. DOI: 10.48550\/arXiv.2302.13971","DOI":"10.48550\/arXiv.2302.13971"},{"key":"e_1_3_2_75_1","first-page":"5998","volume-title":"International Conference on Neural Information Processing Systems (NeurIPS)","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In International Conference on Neural Information Processing Systems (NeurIPS). I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), Curran Associates, Inc., 5998\u20136008."},{"key":"e_1_3_2_76_1","first-page":"230","article-title":"Search bias, language bias, and genetic programming","volume":"1996","author":"Whigham Peter A.","year":"1996","unstructured":"Peter A. Whigham. 1996. Search bias, language bias, and genetic programming. Genetic Programming 1996 (1996), 230\u2013237.","journal-title":"Genetic Programming"},{"key":"e_1_3_2_77_1","doi-asserted-by":"publisher","unstructured":"Chunqiu Steven Xia and Lingming Zhang. 2023. Conversational automated program repair. arXiv:2301.13246. DOI: 10.48550\/arXiv.2301.13246","DOI":"10.48550\/arXiv.2301.13246"},{"key":"e_1_3_2_78_1","doi-asserted-by":"publisher","DOI":"10.1109\/TETCI.2019.2952908"},{"key":"e_1_3_2_79_1","doi-asserted-by":"publisher","DOI":"10.1145\/3520312.3534862"},{"key":"e_1_3_2_80_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.337"},{"key":"e_1_3_2_81_1","doi-asserted-by":"publisher","unstructured":"Zihan Yu Liang He Zhen Wu Xinyu Dai and Jiajun Chen. 2023. Towards better chain-of-thought prompting strategies: A survey. arXiv:2310.04959. DOI: 10.48550\/arXiv.2310.04959","DOI":"10.48550\/arXiv.2310.04959"},{"key":"e_1_3_2_82_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.acl-long.411"},{"key":"e_1_3_2_83_1","doi-asserted-by":"publisher","unstructured":"Maksym Zavershynskyi Alex Skidanov and Illia Polosukhin. 2018. NAPS: Natural program synthesis dataset. arXiv:1807.0316. DOI: 10.48550\/arXiv.1807.0316","DOI":"10.48550\/arXiv.1807.0316"},{"key":"e_1_3_2_84_1","first-page":"1","volume-title":"Annual Workshop on Optimization for Machine Learning (OPT)","author":"Zelikman Eric","year":"2023","unstructured":"Eric Zelikman, Eliana Lorch, Lester Mackey, and Adam Tauman Kalai. 2023. Self-taught optimizer (STOP): Recursively self-improving code generation. In Annual Workshop on Optimization for Machine Learning (OPT). OpenReview.net, 1\u201344."},{"key":"e_1_3_2_85_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.acl-long.45"},{"key":"e_1_3_2_86_1","doi-asserted-by":"publisher","unstructured":"Quanjun Zhang Chunrong Fang Yang Xie YuXiang Ma Weisong Sun Yun Yang and Zhenyu Chen. 2024. A systematic literature review on large language models for automated program repair. arXiv:2405.01466. DOI: 10.48550\/arXiv.2405.01466","DOI":"10.48550\/arXiv.2405.01466"},{"key":"e_1_3_2_87_1","unstructured":"Shengyu Zhang Linfeng Dong Xiaoya Li Sen Zhang Xiaofei Sun Shuhe Wang Jiwei Li Runyi Hu Tianwei Zhang Fei Wu et al. 2024. Instruction tuning for large language models: A survey. arXiv:2308.10792. Retrieved from https:\/\/arxiv.org\/abs\/2308.10792"},{"key":"e_1_3_2_88_1","doi-asserted-by":"publisher","unstructured":"Yuwei Zhang Zhi Jin Ying Xing and Ge Li. 2023. STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for automatic bug fixing. arXiv:2308.14460. DOI: 10.48550\/arXiv.2308.14460","DOI":"10.48550\/arXiv.2308.14460"},{"key":"e_1_3_2_89_1","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599790"}],"container-title":["ACM Transactions on Evolutionary Learning and Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719351","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3719351","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T18:43:21Z","timestamp":1750272201000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719351"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,29]]},"references-count":88,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,3,31]]}},"alternative-id":["10.1145\/3719351"],"URL":"https:\/\/doi.org\/10.1145\/3719351","relation":{},"ISSN":["2688-3007"],"issn-type":[{"value":"2688-3007","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,29]]},"assertion":[{"value":"2023-12-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-02-19","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}