{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,30]],"date-time":"2026-06-30T05:13:36Z","timestamp":1782796416365,"version":"3.54.5"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2025,1,21]],"date-time":"2025-01-21T00:00:00Z","timestamp":1737417600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62192731, 62152730"],"award-info":[{"award-number":["62192731, 62152730"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Key R&D Program","award":["2023YFB4503801"],"award-info":[{"award-number":["2023YFB4503801"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62072007, 62192733, 61832009, 62192730"],"award-info":[{"award-number":["62072007, 62192733, 61832009, 62192730"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Major Program (JD) of Hubei Province","award":["2023BAA024"],"award-info":[{"award-number":["2023BAA024"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,2,28]]},"abstract":"<jats:p>\n            Large Language Models (LLMs) have shown impressive abilities in code generation. Chain-of-Thought (CoT) prompting is the state-of-the-art approach to utilizing LLMs. CoT prompting asks LLMs first to generate CoTs (i.e., intermediate natural language reasoning steps) and then output the code. However, the accuracy of CoT prompting still cannot satisfy practical applications. 
For example, gpt-3.5-turbo with CoT prompting only achieves 53.29% Pass@1 in HumanEval. In this article, we propose Structured CoTs (SCoTs) and present a novel prompting technique for code generation named SCoT prompting. Our motivation is that human developers follow structured programming. Developers use three programming structures (i.e., sequential, branch, and loop) to design and implement structured programs. Thus, we ask LLMs to use three programming structures to generate SCoTs (structured reasoning steps) before outputting the final code. Compared to CoT prompting, SCoT prompting explicitly introduces programming structures and unlocks the structured programming thinking of LLMs. We apply SCoT prompting to three LLMs (i.e., gpt-4-turbo, gpt-3.5-turbo, and DeepSeek Coder-Instruct-\n            <jats:inline-formula content-type=\"math\/tex\">\n              <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(\\{\\)<\/jats:tex-math>\n            <\/jats:inline-formula>\n            1.3B, 6.7B, 33B\n            <jats:inline-formula content-type=\"math\/tex\">\n              <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(\\}\\)<\/jats:tex-math>\n            <\/jats:inline-formula>\n            ) and evaluate it on three benchmarks (i.e., HumanEval, MBPP, and MBCPP). SCoT prompting outperforms CoT prompting by up to 13.79% in Pass@1. SCoT prompting is robust to examples and achieves substantial improvements. 
The human evaluation also shows human developers prefer programs from SCoT prompting.\n          <\/jats:p>","DOI":"10.1145\/3690635","type":"journal-article","created":{"date-parts":[[2024,8,29]],"date-time":"2024-08-29T15:10:38Z","timestamp":1724944238000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":110,"title":["Structured Chain-of-Thought Prompting for Code Generation"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5579-8852","authenticated-orcid":false,"suffix":"(he\/him\/hi","given":"Jia","family":"Li","sequence":"first","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Haidian, China and School of Computer Science, Peking University, Haidian, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5828-0186","authenticated-orcid":false,"given":"Ge","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Haidian, China and School of Computer Science, Peking University, Haidian, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-3702-0043","authenticated-orcid":false,"given":"Yongmin","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Haidian, China and School of Computer Science, Peking University, Haidian, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1087-226X","authenticated-orcid":false,"given":"Zhi","family":"Jin","sequence":"additional","affiliation":[{"name":"Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Haidian, China and School of Computer Science, 
Peking University, Haidian, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,1,21]]},"reference":[{"key":"e_1_3_2_2_2","volume-title":"the 11th International Conference on Learning Representations, ICLR 2023","author":"Athiwaratkun Ben","year":"2023","unstructured":"Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, and Ramesh Nallapati. 2023. Multi-lingual evaluation of code generation models. In the 11th International Conference on Learning Representations, ICLR 2023. OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=Bo7eeXm6An8"},{"key":"e_1_3_2_3_2","unstructured":"Jacob Austin Augustus Odena Maxwell I. Nye Maarten Bosma Henryk Michalewski David Dohan Ellen Jiang Carrie J. Cai Michael Terry Quoc V. Le and Charles Sutton. 2021. Program synthesis with large language models. arXiv: 2108.07732. Retrieved from https:\/\/arxiv.org\/abs\/2108.07732"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/355592.365646"},{"key":"e_1_3_2_5_2","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020.","author":"Brown Tom B.","year":"2020","unstructured":"Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. 
Language models are few-shot learners. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020. Hugo Larochelle, Marc\u2019Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html"},{"key":"e_1_3_2_6_2","unstructured":"Sahil Chaudhary. 2023. Code Alpaca: An Instruction-Following LLaMA Model for Code Generation. Retrieved from https:\/\/github.com\/sahil280114\/codealpaca"},{"key":"e_1_3_2_7_2","volume-title":"the 11th International Conference on Learning Representations, ICLR 2023","author":"Chen Bei","year":"2023","unstructured":"Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, and Weizhu Chen. 2023. CodeT: Code generation with generated tests. In the 11th International Conference on Learning Representations, ICLR 2023. OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=ktrw68Cmu9c"},{"key":"e_1_3_2_8_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Pond\u00e9 de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman Alex Ray Raul Puri Gretchen Krueger Michael Petrov Heidy Khlaaf Girish Sastry Pamela Mishkin Brooke Chan Scott Gray Nick Ryder Mikhail Pavlov Alethea Power Lukasz Kaiser Mohammad Bavarian Clemens Winter Philippe Tillet Felipe Petroski Such Dave Cummings Matthias Plappert Fotios Chantzis Elizabeth Barnes Ariel Herbert-Voss William Hebgen Guss Alex Nichol Alex Paino Nikolas Tezak Jie Tang Igor Babuschkin Suchir Balaji Shantanu Jain William Saunders Christopher Hesse Andrew N. Carr Jan Leike Joshua Achiam Vedant Misra Evan Morikawa Alec Radford Matthew Knight Miles Brundage Mira Murati Katie Mayer Peter Welinder Bob McGrew Dario Amodei Sam McCandlish Ilya Sutskever and Wojciech Zaremba. 2021. Evaluating large language models trained on code. 
arXiv:2107.03374. Retrieved from https:\/\/arxiv.org\/abs\/2107.03374"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/1089786.1089799"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45672-4_31"},{"key":"e_1_3_2_11_2","volume-title":"the 11th International Conference on Learning Representations, ICLR 2023","author":"Fried Daniel","year":"2023","unstructured":"Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Scott Yih, Luke Zettlemoyer, and Mike Lewis. 2023. InCoder: A generative model for code infilling and synthesis. In the 11th International Conference on Learning Representations, ICLR 2023. OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=hQwb-lbM6EL"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE56229.2023.00109"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","unstructured":"Daya Guo Qihao Zhu Dejian Yang Zhenda Xie Kai Dong Wentao Zhang Guanting Chen Xiao Bi Y. Wu Y. K. Li Fuli Luo Yingfei Xiong and Wenfeng Liang. 2024. DeepSeek-Coder: When the large language model meets programming. The rise of code intelligence. arXiv:2401.14196. Retrieved from 10.48550\/ARXIV.2401.14196","DOI":"10.48550\/ARXIV.2401.14196"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","unstructured":"Yiyang Hao Ge Li Yongqiang Liu Xiaowei Miao He Zong Siyuan Jiang Yang Liu and He Wei. 2022. AixBench: A code generation benchmark dataset. arXiv:2206.13179. Retrieved from 10.48550\/arXiv.2206.13179","DOI":"10.48550\/arXiv.2206.13179"},{"key":"e_1_3_2_15_2","volume-title":"8th International Conference on Learning Representations, ICLR 2020","author":"Holtzman Ari","year":"2020","unstructured":"Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, and Yejin Choi. 2020. The curious case of neural text degeneration. In 8th International Conference on Learning Representations, ICLR 2020. OpenReview.net. 
Retrieved from https:\/\/openreview.net\/forum?id=rygGQyrFvH"},{"key":"e_1_3_2_16_2","first-page":"13419","article-title":"Fault-aware neural code rankers","volume":"35","author":"Inala Jeevana Priya","year":"2022","unstructured":"Jeevana Priya Inala, Chenglong Wang, Mei Yang, Andres Codas, Mark Encarnaci\u00f3n, Shuvendu Lahiri, Madanlal Musuvathi, and Jianfeng Gao. 2022. Fault-aware neural code rankers. Advances in Neural Information Processing Systems 35 (2022), 13419\u201313432.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597207"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","unstructured":"Jia Li Ge Li Xuanming Zhang Yihong Dong and Zhi Jin. 2024a. EvoCodeBench: An evolving code generation benchmark aligned with real-world code repositories. arXiv:2404.00599. Retrieved from https:\/\/doi.org\/10.48550\/ARXIV.2404.00599","DOI":"10.48550\/ARXIV.2404.00599"},{"key":"e_1_3_2_19_2","first-page":"3603","volume-title":"Findings of the Association for Computational Linguistics ACL 2024","author":"Li Jia","year":"2024","unstructured":"Jia Li, Ge Li, Yunfei Zhao, Yongmin Li, Huanyu Liu, Hao Zhu, Lecheng Wang, Kaibo Liu, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yuqi Zhu, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li, Bin Gu, and Mengfei Yang. 2024b. DevEval: A manually-annotated code generation benchmark aligned with real-world code repositories. In Findings of the Association for Computational Linguistics ACL 2024. Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.), Association for Computational Linguistics, Bangkok, Thailand, 3603\u20133614. 
DOI: https:\/\/aclanthology.org\/2024.findings-acl.214"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00179"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3675395"},{"key":"e_1_3_2_22_2","unstructured":"Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov Chenghao Mou Marc Marone Christopher Akiki Jia Li Jenny Chim Qian Liu Evgenii Zheltonozhskii Terry Yue Zhuo Thomas Wang Olivier Dehaene Mishig Davaadorj Joel Lamy-Poirier Jo\u00e3o Monteiro Oleh Shliazhko Nicolas Gontier Nicholas Meade Armel Zebaze Ming-Ho Yee Logesh Kumar Umapathi Jian Zhu Benjamin Lipkin Muhtasham Oblokulov Zhiruo Wang Rudra Murthy V. Jason T. Stillerman Siva Sankalp Patel Dmitry Abulkhanov Marco Zocca Manan Dey Zhihan Zhang Nour Fahmy Urvashi Bhattacharyya Wenhao Yu Swayam Singh Sasha Luccioni Paulo Villegas Maxim Kunakov Fedor Zhdanov Manuel Romero Tony Lee Nadav Timor Jennifer Ding Claire Schlesinger Hailey Schoelkopf Jan Ebert Tri Dao Mayank Mishra Alex Gu Jennifer Robinson Carolyn Jane Anderson Brendan Dolan-Gavitt Danish Contractor Siva Reddy Daniel Fried Dzmitry Bahdanau Yacine Jernite Carlos Mu\u00f1oz Ferrandis Sean Hughes Thomas Wolf Arjun Guha Leandro von Werra and Harm de Vries. 2023a. StarCoder: May the source be with you! Transactions on Machine Learning Research (2023). Retrieved from https:\/\/openreview.net\/forum?id=KoFOg41haE"},{"key":"e_1_3_2_23_2","volume-title":"the 12th International Conference on Learning Representations, ICLR 2024","author":"Luo Ziyang","year":"2024","unstructured":"Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, and Daxin Jiang. 2024. WizardCoder: Empowering code large language models with Evol-Instruct. In the 12th International Conference on Learning Representations, ICLR 2024. OpenReview.net. 
Retrieved from https:\/\/openreview.net\/forum?id=UnUwSIgK5W"},{"key":"e_1_3_2_24_2","volume-title":"the 11th International Conference on Learning Representations, ICLR 2023","author":"Nijkamp Erik","year":"2023","unstructured":"Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, and Caiming Xiong. 2023. CodeGen: An open large language model for code with multi-turn program synthesis. In the 11th International Conference on Learning Representations, ICLR 2023. OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=iaYcJKpY2B_"},{"key":"e_1_3_2_25_2","unstructured":"OpenAI. 2023a. gpt-3.5-turbo. Retrieved from https:\/\/platform.openai.com\/docs\/models\/gpt-3-5"},{"key":"e_1_3_2_26_2","unstructured":"OpenAI. 2023b. gpt-3.5-turbo. Retrieved from https:\/\/platform.openai.com\/docs\/models\/gpt-3-5-turbo"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","unstructured":"OpenAI. 2023c. GPT-4 Technical Report. arXiv:2303.08774. Retrieved from https:\/\/doi.org\/10.48550\/ARXIV.2303.08774","DOI":"10.48550\/ARXIV.2303.08774"},{"key":"e_1_3_2_28_2","first-page":"27730","article-title":"Training language models to follow instructions with human feedback","volume":"35","author":"Ouyang Long","year":"2022","unstructured":"Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human feedback. 
Advances in Neural Information Processing Systems 35 (2022), 27730\u201327744.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_29_2","first-page":"311","volume-title":"40th Annual Meeting of the Association for Computational Linguistics","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In 40th Annual Meeting of the Association for Computational Linguistics, 311\u2013318."},{"key":"e_1_3_2_30_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-Training. Retrieved from https:\/\/s3-us-west-2.amazonaws.com\/openai-assets\/research-covers\/language-unsupervised\/language_understanding_paper.pdf"},{"key":"e_1_3_2_31_2","unstructured":"Alec Radford Jeff Wu Rewon Child David Luan Dario Amodei and Ilya Sutskever. 2019. Language Models Are Unsupervised Multitask Learners. Retrieved from https:\/\/insightcivic.s3.us-east-1.amazonaws.com\/language-models.pdf"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/n19-1421"},{"key":"e_1_3_2_33_2","unstructured":"Rohan Taori Ishaan Gulrajani Tianyi Zhang Yann Dubois Xuechen Li Carlos Guestrin Percy Liang and Tatsunori B. Hashimoto. 2023. Stanford Alpaca: An Instruction-Following LLaMA Model. Retrieved from https:\/\/github.com\/tatsu-lab\/stanford_alpaca"},{"key":"e_1_3_2_34_2","unstructured":"CodeParrot Team. 2022. CodeParrot. Retrieved from https:\/\/huggingface.co\/codeparrot\/codeparrot"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","unstructured":"Hugo Touvron Thibaut Lavril Gautier Izacard Xavier Martinet Marie-Anne Lachaux Timoth\u00e9e Lacroix Baptiste Rozi\u00e8re Naman Goyal Eric Hambro Faisal Azhar Aur\u00e9lien Rodriguez Armand Joulin Edouard Grave and Guillaume Lample. 2023. LLaMA: Open and efficient foundation language models. 
arXiv:2302.13971. Retrieved from https:\/\/doi.org\/10.48550\/ARXIV.2302.13971","DOI":"10.48550\/ARXIV.2302.13971"},{"key":"e_1_3_2_36_2","volume-title":"the 11th International Conference on Learning Representations, ICLR 2023","author":"Wang Xuezhi","year":"2023","unstructured":"Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V. Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, and Denny Zhou. 2023c. Self-consistency improves chain of thought reasoning in language models. In the 11th International Conference on Learning Representations, ICLR 2023. OpenReview.net. Retrieved from https:\/\/openreview.net\/pdf?id=1PL1NIMMrw"},{"key":"e_1_3_2_37_2","first-page":"13484","volume-title":"the 61st Annual Meeting of the Association for Computational Linguistics","volume":"1","author":"Wang Yizhong","year":"2023","unstructured":"Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, and Hannaneh Hajishirzi. 2023a. Self-instruct: Aligning language models with self-generated instructions. In the 61st Annual Meeting of the Association for Computational Linguistics, Vol. 1, Long Papers, Association for Computational Linguistics, Toronto, Canada, 13484\u201313508. DOI: https:\/\/aclanthology.org\/2023.acl-long.754"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2023.EMNLP-MAIN.68"},{"key":"e_1_3_2_39_2","volume-title":"the 10th International Conference on Learning Representations, ICLR 2022","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, and Quoc V. Le. 2022a. Finetuned language models are zero-shot learners. In the 10th International Conference on Learning Representations, ICLR 2022. OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=gEZrGCozdqR"},{"key":"e_1_3_2_40_2","unstructured":"Jason Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter Fei Xia Ed H. Chi Quoc V. Le and Denny Zhou. 
2022. Chain of thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems. Alice H. Oh Alekh Agarwal Danielle Belgrave and Kyunghyun Cho (Eds.). Retrieved from https:\/\/openreview.net\/forum?id=_VjQlMeSB_J"},{"key":"e_1_3_2_41_2","volume-title":"the 12th International Conference on Learning Representations, ICLR 2024","author":"Xu Can","year":"2024","unstructured":"Can Xu, Qingfeng Sun, Kai Zheng, Xiubo Geng, Pu Zhao, Jiazhan Feng, Chongyang Tao, Qingwei Lin, and Daxin Jiang. 2024. WizardLM: Empowering large pre-trained language models to follow complex instructions. In the 12th International Conference on Learning Representations, ICLR 2024. OpenReview.net. DOI: https:\/\/openreview.net\/forum?id=CfXh93NDgH"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.acl-long.45"},{"key":"e_1_3_2_43_2","first-page":"12697","volume-title":"38th International Conference on Machine Learning, ICML 2021","volume":"139","author":"Zhao Zihao","year":"2021","unstructured":"Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh. 2021. Calibrate before use: Improving few-shot performance of language models. In 38th International Conference on Machine Learning, ICML 2021. Marina Meila and Tong Zhang (Eds.), Virtual Event (Proceedings of Machine Learning Research, Vol. 139), PMLR, 12697\u201312706. Retrieved from http:\/\/proceedings.mlr.press\/v139\/zhao21c.html"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","unstructured":"Qinkai Zheng Xiao Xia Xu Zou Yuxiao Dong Shan Wang Yufei Xue Zihan Wang Lei Shen Andi Wang Yang Li Teng Su Zhilin Yang and Jie Tang. 2023. CodeGeeX: A pre-trained model for code generation with multilingual evaluations on HumanEval-X. arXiv:2303.17568. 
Retrieved from https:\/\/doi.org\/10.48550\/ARXIV.2303.17568","DOI":"10.48550\/ARXIV.2303.17568"},{"key":"e_1_3_2_45_2","volume-title":"the 11th International Conference on Learning Representations, ICLR 2023","author":"Zhou Denny","year":"2023","unstructured":"Denny Zhou, Nathanael Sch\u00e4rli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc V. Le, and Ed H. Chi. 2023. Least-to-most prompting enables complex reasoning in large language models. In the 11th International Conference on Learning Representations, ICLR 2023. OpenReview.net. Retrieved from https:\/\/openreview.net\/pdf?id=WZH7099tgfM"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3690635","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3690635","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:19:10Z","timestamp":1750295950000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3690635"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,21]]},"references-count":44,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,2,28]]}},"alternative-id":["10.1145\/3690635"],"URL":"https:\/\/doi.org\/10.1145\/3690635","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,21]]},"assertion":[{"value":"2024-02-27","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-08-10","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2025-01-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}