{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T18:36:44Z","timestamp":1775932604952,"version":"3.50.1"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"8","license":[{"start":{"date-parts":[[2024,11,21]],"date-time":"2024-11-21T00:00:00Z","timestamp":1732147200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["Nos. 62192731, 62152730"],"award-info":[{"award-number":["Nos. 62192731, 62152730"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Key R & D Program","award":["No. 2023YFB4503801"],"award-info":[{"award-number":["No. 2023YFB4503801"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["Nos. 62072007, 62192733, 61832009, 62192730"],"award-info":[{"award-number":["Nos. 62072007, 62192733, 61832009, 62192730"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Major Program (JD) of Hubei Province","award":["No.2023BAA024"],"award-info":[{"award-number":["No.2023BAA024"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2024,11,30]]},"abstract":"<jats:p>\n            Large language models (LLMs) have shown great success in code generation. LLMs take as the input a prompt and output the code. How to make prompts (i.e.,\n            <jats:italic>Prompting Techniques<\/jats:italic>\n            ) is a key question. 
Existing prompting techniques are designed for natural language generation and have low accuracy in code generation.\n          <\/jats:p>\n          <jats:p>\n            In this article, we propose a new prompting technique named\n            <jats:sc>AceCoder<\/jats:sc>\n            . Our motivation is that code generation meets two unique challenges (i.e., requirement understanding and code implementation).\n            <jats:sc>AceCoder<\/jats:sc>\n            contains two novel mechanisms (i.e., guided code generation and example retrieval) to solve these challenges. \u2776 Guided code generation asks LLMs first to analyze requirements and output an intermediate preliminary (e.g., test cases). The preliminary clarifies requirements and tells LLMs\n            <jats:italic>\u201cwhat to write.\u201d<\/jats:italic>\n            \u2777 Example retrieval selects similar programs as examples in prompts, which provide lots of relevant content (e.g., algorithms, APIs) and teach LLMs\n            <jats:italic>\u201chow to write.\u201d<\/jats:italic>\n            We apply\n            <jats:sc>AceCoder<\/jats:sc>\n            to four LLMs (e.g., GPT-3.5, CodeGeeX) and evaluate it on three public benchmarks using the Pass@\n            <jats:inline-formula content-type=\"math\/tex\">\n              <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(k\\)<\/jats:tex-math>\n            <\/jats:inline-formula>\n            . 
Results show that\n            <jats:sc>AceCoder<\/jats:sc>\n            can significantly improve the performance of LLMs on code generation.\n            <jats:italic>\n              In terms of Pass@1,\n              <jats:sc>AceCoder<\/jats:sc>\n              outperforms the SOTA baseline by up to 56.4% in MBPP, 70.7% in MBJP, and 88.4% in MBJSP\n            <\/jats:italic>\n            .\n            <jats:sc>AceCoder<\/jats:sc>\n            is effective in LLMs with different sizes (i.e., 6B\u201313B) and different languages (i.e., Python, Java, and JavaScript). Human evaluation shows human developers prefer programs from\n            <jats:sc>AceCoder<\/jats:sc>\n            .\n          <\/jats:p>","DOI":"10.1145\/3675395","type":"journal-article","created":{"date-parts":[[2024,7,4]],"date-time":"2024-07-04T14:38:52Z","timestamp":1720103932000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["<scp>AceCoder<\/scp>\n            : An Effective Prompting Technique Specialized in Code Generation"],"prefix":"10.1145","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5579-8852","authenticated-orcid":false,"suffix":"(he\/him\/his)","given":"Jia","family":"Li","sequence":"first","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China and School of Computer Science, Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-7034-3191","authenticated-orcid":false,"given":"Yunfei","family":"Zhao","sequence":"additional","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China and School of Computer Science, Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-3702-0043","authenticated-orcid":false,"given":"Yongmin","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China and School of Computer Science, Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5828-0186","authenticated-orcid":false,"given":"Ge","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China and School of Computer Science, Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1087-226X","authenticated-orcid":false,"given":"Zhi","family":"Jin","sequence":"additional","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China and School of Computer Science, Peking University, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2024,11,21]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"2022. CodeParrot. Retrieved from https:\/\/huggingface.co\/codeparrot\/codeparrot"},{"key":"e_1_3_2_3_2","unstructured":"2022. GitHub. Retrieved from https:\/\/github.com\/"},{"key":"e_1_3_2_4_2","unstructured":"2022. Lucene. Retrieved from https:\/\/lucene.apache.org\/"},{"key":"e_1_3_2_5_2","unstructured":"2022. tree-sitter. 
Retrieved from https:\/\/tree-sitter.github.io\/tree-sitter\/"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.211"},{"key":"e_1_3_2_7_2","volume-title":"Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923)","author":"Athiwaratkun Ben","year":"2023","unstructured":"Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, and Ramesh Nallapati. 2023. Multi-lingual Evaluation of Code Generation Models. In Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923). OpenReview.net. Retrieved from https:\/\/openreview.net\/pdf?id=Bo7eeXm6An8"},{"key":"e_1_3_2_8_2","unstructured":"Jacob Austin Augustus Odena Maxwell Nye Maarten Bosma Henryk Michalewski David Dohan Ellen Jiang Carrie Cai Michael Terry Quoc Le and Charles Sutton. 2021. Program Synthesis with Large Language Models. arXiv:2108.07732. Retrieved from https:\/\/arxiv.org\/abs\/2108.07732"},{"key":"e_1_3_2_9_2","volume-title":"Test-Driven Development - By Example","author":"Beck Kent L.","year":"2003","unstructured":"Kent L. Beck. 2003. Test-Driven Development - By Example. 
Addison-Wesley."},{"key":"e_1_3_2_10_2","first-page":"1877","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 33. 1877\u20131901."},{"key":"e_1_3_2_11_2","unstructured":"Sahil Chaudhary. 2023. Code Alpaca: An Instruction-Following LLaMA Model for Code Generation. Retrieved from https:\/\/github.com\/sahil280114\/codealpaca"},{"key":"e_1_3_2_12_2","volume-title":"Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923)","author":"Chen Bei","year":"2023","unstructured":"Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, and Weizhu Chen. 2023. CodeT: Code Generation with Generated Tests. In Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923). OpenReview.net. 
Retrieved from https:\/\/openreview.net\/pdf?id=ktrw68Cmu9c"},{"key":"e_1_3_2_13_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman Alex Ray Raul Puri Gretchen Krueger Michael Petrov Heidy Khlaaf Girish Sastry Pamela Mishkin Brooke Chan Scott Gray Nick Ryder Mikhail Pavlov Alethea Power Lukasz Kaiser Mohammad Bavarian Clemens Winter Philippe Tillet Felipe Petroski Such Dave Cummings Matthias Plappert Fotios Chantzis Elizabeth Barnes Ariel Herbert-Voss William Hebgen Guss Alex Nichol Alex Paino Nikolas Tezak Jie Tang Igor Babuschkin Suchir Balaji Shantanu Jain William Saunders Christopher Hesse Andrew N. Carr Jan Leike Josh Achiam Vedant Misra Evan Morikawa Alec Radford Matthew Knight Miles Brundage Mira Murati Katie Mayer Peter Welinder Bob McGrew Dario Amodei Sam McCandlish Ilya Sutskever and Wojciech Zaremba. 2021. Evaluating Large Language Models Trained on Code. arXiv:2107.03374. Retrieved from https:\/\/arxiv.org\/abs\/2107.03374"},{"key":"e_1_3_2_14_2","volume-title":"Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923)","author":"Fried Daniel","year":"2023","unstructured":"Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Scott Yih, Luke Zettlemoyer, and Mike Lewis. 2023. InCoder: A Generative Model for Code Infilling and Synthesis. In Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923). OpenReview.net. Retrieved from https:\/\/openreview.net\/pdf?id=hQwb-lbM6EL"},{"key":"e_1_3_2_15_2","unstructured":"Yiyang Hao Ge Li Yongqiang Liu Xiaowei Miao He Zong Siyuan Jiang Yang Liu and He Wei. 2022. AixBench: A Code Generation Benchmark Dataset. arXiv:2206.13179. 
Retrieved from https:\/\/arxiv.org\/pdf\/2206.13179"},{"key":"e_1_3_2_16_2","first-page":"10073","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018 (NeurIPS \u201918)","author":"Hashimoto Tatsunori B.","year":"2018","unstructured":"Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, and Percy Liang. 2018. A Retrieve-and-Edit Framework for Predicting Structured Outputs. In Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018 (NeurIPS \u201918). Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolo Cesa-Bianchi, and Roman Garnett (Eds.). 10073\u201310083. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2018\/hash\/cd17d3ce3b64f227987cd92cd701cc58-Abstract.html"},{"key":"e_1_3_2_17_2","volume-title":"Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021","author":"Hendrycks Dan","year":"2021","unstructured":"Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, and Jacob Steinhardt. 2021. Measuring Coding Challenge Competence with APPS. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021. Joaquin Vanschoren and Sai-Kit Yeung (Eds.). Retrieved from https:\/\/datasets-benchmarks-proceedings.neurips.cc\/paper\/2021\/hash\/c24cd76e1ce41366a4bbe8a49b02a028-Abstract-round2.html"},{"key":"e_1_3_2_18_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations (ICLR \u201920)","author":"Holtzman Ari","year":"2020","unstructured":"Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, and Yejin Choi. 2020. The Curious Case of Neural Text Degeneration. 
In Proceedings of the 8th International Conference on Learning Representations (ICLR \u201920). OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=rygGQyrFvH"},{"key":"e_1_3_2_19_2","first-page":"13419","volume-title":"Proceedings of the Advances in Neural Information Processing Systems (NeurIPS)","author":"Inala Jeevana Priya","year":"2022","unstructured":"Jeevana Priya Inala, Chenglong Wang, Mei Yang, Andr\u00e9s Codas, Mark Encarnaci\u00f3n, Shuvendu K. Lahiri, Madanlal Musuvathi, and Jianfeng Gao. 2022. Fault-Aware Neural Code Rankers. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS). 13419\u201313432. Retrieved from http:\/\/papers.nips.cc\/paper_files\/paper\/2022\/hash\/5762c579d09811b7639be2389b3d07be-Abstract-Conference.html"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510203"},{"key":"e_1_3_2_21_2","unstructured":"Jia Li Ge Li Xuanming Zhang Yihong Dong and Zhi Jin. 2024a. EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories. arXiv:2404.00599. Retrieved from https:\/\/arxiv.org\/pdf\/2404.00599"},{"key":"e_1_3_2_22_2","doi-asserted-by":"crossref","unstructured":"Jia Li Ge Li Yunfei Zhao Yongmin Li Huanyu Liu Hao Zhu Lecheng Wang Kaibo Liu Zheng Fang Lanshen Wang Jiazheng Ding Xuanming Zhang Yuqi Zhu Yihong Dong Zhi Jin Binhua Li Fei Huang and Yongbin Li. 2024b. DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories. arXiv:2405.19856. 
Retrieved from https:\/\/arxiv.org\/pdf\/2405.19856","DOI":"10.18653\/v1\/2024.findings-acl.214"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE51524.2021.9678724"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00179"},{"key":"e_1_3_2_25_2","unstructured":"Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov Chenghao Mou Marc Marone Christopher Akiki Jia Li Jenny Chim Qian Liu Evgenii Zheltonozhskii Terry Yue Zhuo Thomas Wang Olivier Dehaene Mishig Davaadorj Joel Lamy-Poirier Jo\u00e3o Monteiro Oleh Shliazhko Nicolas Gontier Nicholas Meade Armel Zebaze Ming-Ho Yee Logesh Kumar Umapathi Jian Zhu Benjamin Lipkin Muhtasham Oblokulov Zhiruo Wang Rudra Murthy Jason Stillerman Siva Sankalp Patel Dmitry Abulkhanov Marco Zocca Manan Dey Zhihan Zhang Nour Fahmy Urvashi Bhattacharyya Wenhao Yu Swayam Singh Sasha Luccioni Paulo Villegas Maxim Kunakov Fedor Zhdanov Manuel Romero Tony Lee Nadav Timor Jennifer Ding Claire Schlesinger Hailey Schoelkopf Jan Ebert Tri Dao Mayank Mishra Alex Gu Jennifer Robinson Carolyn Jane Anderson Brendan Dolan-Gavitt Danish Contractor Siva Reddy Daniel Fried Dzmitry Bahdanau Yacine Jernite Carlos Mu\u00f1oz Ferrandis Sean Hughes Thomas Wolf Arjun Guha Leandro von Werra and Harm de Vries. 2023a. StarCoder: May the Source be With You! arXiv:2305.06161. Retrieved from https:\/\/arxiv.org\/pdf\/2305.06161"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.abq1158"},{"key":"e_1_3_2_27_2","first-page":"74","volume-title":"Text Summarization Branches Out: Proceedings of the ACL-04 Workshop","author":"Lin C. Y.","year":"2004","unstructured":"C. Y. Lin. 2004. Rouge: A Package for Automatic Evaluation of Summaries. In Text Summarization Branches Out: Proceedings of the ACL-04 Workshop. 
74\u201381."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.431"},{"key":"e_1_3_2_29_2","unstructured":"Ziyang Luo Can Xu Pu Zhao Qingfeng Sun Xiubo Geng Wenxiang Hu Chongyang Tao Jing Ma Qingwei Lin and Daxin Jiang. 2023. WizardCoder: Empowering Code Large Language Models with Evol-Instruct. arXiv:2306.08568. Retrieved from https:\/\/arxiv.org\/pdf\/2306.08568"},{"key":"e_1_3_2_30_2","first-page":"7","volume-title":"Proceedings of the 1st International Workshop on Emerging Trends in FLOSS Research and Development (FLOSS \u201907: ICSE Workshops 2007)","author":"Mockus Audris","year":"2007","unstructured":"Audris Mockus. 2007. Large-Scale Code Reuse in Open Source Software. In Proceedings of the 1st International Workshop on Emerging Trends in FLOSS Research and Development (FLOSS \u201907: ICSE Workshops 2007). IEEE, 7\u20137."},{"key":"e_1_3_2_31_2","volume-title":"Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923)","author":"Nijkamp Erik","year":"2023","unstructured":"Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, and Caiming Xiong. 2023. CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. In Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923). OpenReview.net. Retrieved from https:\/\/openreview.net\/pdf?id=iaYcJKpY2B_"},{"key":"e_1_3_2_32_2","unstructured":"OpenAI. 2022. ChatGPT. Retrieved from https:\/\/openai.com\/blog\/chatgpt"},{"key":"e_1_3_2_33_2","unstructured":"OpenAI. 2023. gpt-3.5-turbo. 
Retrieved from https:\/\/platform.openai.com\/docs\/models\/gpt-3-5"},{"key":"e_1_3_2_34_2","first-page":"27730","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"35","author":"Ouyang Long","year":"2022","unstructured":"Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, and Ryan Lowe. 2022. Training Language Models to Follow Instructions with Human Feedback. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 35. 27730\u201327744."},{"key":"e_1_3_2_35_2","first-page":"311","volume-title":"Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 311\u2013318."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.232"},{"key":"e_1_3_2_37_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-Training. Retrieved from https:\/\/www.mikecaptain.com\/resources\/pdf\/GPT-1.pdf"},{"issue":"8","key":"e_1_3_2_38_2","first-page":"9","article-title":"Language Models Are Unsupervised Multitask Learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language Models Are Unsupervised Multitask Learners. 
OpenAI blog 1, 8 (2019), 9.","journal-title":"OpenAI blog"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1561\/1500000019"},{"key":"e_1_3_2_41_2","unstructured":"Rohan Taori Ishaan Gulrajani Tianyi Zhang Yann Dubois Xuechen Li Carlos Guestrin Percy Liang and Tatsunori B. Hashimoto. 2023. Stanford Alpaca: An Instruction-Following LLaMA Model. Retrieved from https:\/\/github.com\/tatsu-lab\/stanford_alpaca"},{"key":"e_1_3_2_42_2","unstructured":"Hugo Touvron Thibaut Lavril Gautier Izacard Xavier Martinet Marie-Anne Lachaux Timoth\u00e9e Lacroix Baptiste Rozi\u00e9re Naman Goyal Eric Hambro Faisal Azhar Aurelien Rodriguez Armand Joulin Edouard Grave and Guillaume Lample. 2023. LLaMA: Open and Efficient Foundation Language Models. arXiv:2302.13971. Retrieved from https:\/\/arxiv.org\/pdf\/2302.13971"},{"key":"e_1_3_2_43_2","volume-title":"Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923)","author":"Wang Xuezhi","year":"2023","unstructured":"Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V. Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, and Denny Zhou. 2023c. Self-Consistency Improves Chain of Thought Reasoning in Language Models. In Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923). OpenReview.net. 
Retrieved from https:\/\/openreview.net\/pdf?id=1PL1NIMMrw"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.acl-long.754"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2023.EMNLP-MAIN.68"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.685"},{"key":"e_1_3_2_47_2","first-page":"349","volume-title":"Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering (ASE \u201920)","author":"Wei Bolin","year":"2020","unstructured":"Bolin Wei, Yongmin Li, Ge Li, Xin Xia, and Zhi Jin. 2020. Retrieve and Refine: Exemplar-Based Neural Comment Generation. In Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering (ASE \u201920). IEEE, 349\u2013360."},{"key":"e_1_3_2_48_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Maarten Bosma, Vincent Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, and Quoc V. Le. 2022a. Finetuned Language Models are Zero-Shot Learners. In Proceedings of the International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=gEZrGCozdqR"},{"key":"e_1_3_2_49_2","first-page":"24824","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed H. Chi, Quoc V. Le, and Denny Zhou. 2022b. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. In Proceedings of the Advances in Neural Information Processing Systems. 24824\u201324837."},{"key":"e_1_3_2_50_2","unstructured":"Can Xu Qingfeng Sun Kai Zheng Xiubo Geng Pu Zhao Jiazhan Feng Chongyang Tao and Daxin Jiang. 2023. WizardLM: Empowering Large Language Models to Follow Complex Instructions. arXiv:2304.12244. 
Retrieved from https:\/\/arxiv.org\/pdf\/2304.12244"},{"key":"e_1_3_2_51_2","first-page":"12697","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Zhao Zihao","year":"2021","unstructured":"Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh. 2021. Calibrate Before Use: Improving Few-Shot Performance of Language Models. In Proceedings of the International Conference on Machine Learning. PMLR, 12697\u201312706."},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599790"},{"key":"e_1_3_2_53_2","volume-title":"Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923)","author":"Zhou Denny","year":"2023","unstructured":"Denny Zhou, Nathanael Sch\u00e4rli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc V. Le, and Ed H. Chi. 2023. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. In Proceedings of the 11th International Conference on Learning Representations (ICLR \u201923). OpenReview.net. 
Retrieved from https:\/\/openreview.net\/pdf?id=WZH7099tgfM"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3675395","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3675395","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:04:23Z","timestamp":1750291463000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3675395"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,21]]},"references-count":52,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2024,11,30]]}},"alternative-id":["10.1145\/3675395"],"URL":"https:\/\/doi.org\/10.1145\/3675395","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,21]]},"assertion":[{"value":"2023-10-22","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-17","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-11-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}