{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T18:38:38Z","timestamp":1771267118332,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":23,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2026,2,22]]},"DOI":"10.1145\/3773966.3778008","type":"proceedings-article","created":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T17:50:01Z","timestamp":1771264201000},"page":"79-88","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Code LLMs Still Fall Short of Top Programmers: Evaluating Algorithmic Code Generation Through Computational Thinking"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-0368-6866","authenticated-orcid":false,"given":"Shisong","family":"Chen","sequence":"first","affiliation":[{"name":"Shanghai Institute of Artificial Intelligence for Education, East China Normal University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-7419-3098","authenticated-orcid":false,"given":"Ziyu","family":"Zhou","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-1738-8335","authenticated-orcid":false,"given":"Yicong","family":"Zhao","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7023-7543","authenticated-orcid":false,"given":"Chengyi","family":"Yang","sequence":"additional","affiliation":[{"name":"Shanghai Institute of Artificial Intelligence for Education, East China Normal University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2355-288X","authenticated-orcid":false,"given":"Zhixu","family":"Li","sequence":"additional","affiliation":[{"name":"School of Information, Renmin University of China, Beijing, China and School of Smart Governance, Renmin University of China, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8403-9591","authenticated-orcid":false,"given":"Yanghua","family":"Xiao","sequence":"additional","affiliation":[{"name":"Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4110-8989","authenticated-orcid":false,"given":"Xin","family":"Lin","sequence":"additional","affiliation":[{"name":"Shanghai Institute of Artificial Intelligence for Education, East China Normal University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2425-7217","authenticated-orcid":false,"given":"Xiaojun","family":"Meng","sequence":"additional","affiliation":[{"name":"Huawei Noah\u2019s Ark Lab, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8518-4088","authenticated-orcid":false,"given":"Jiansheng","family":"Wei","sequence":"additional","affiliation":[{"name":"Huawei Noah\u2019s Ark Lab, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3674-7555","authenticated-orcid":false,"given":"Kuien","family":"Liu","sequence":"additional","affiliation":[{"name":"Institute of Software, Chinese Academy of Sciences, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2026,2,21]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, et al.","author":"Athiwaratkun Ben","year":"2022","unstructured":"Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, et al., 2022. Multi-lingual evaluation of code generation models. arXiv preprint arXiv:2210.14868 (2022)."},{"key":"e_1_3_2_1_2_1","unstructured":"Jacob Austin Augustus Odena Maxwell Nye Maarten Bosma Henryk Michalewski David Dohan Ellen Jiang Carrie Cai Michael Terry Quoc Le et al. 2021. Program synthesis with large language models. arXiv preprint arXiv:2108.07732 (2021)."},{"key":"e_1_3_2_1_3_1","unstructured":"Jinze Bai Shuai Bai Yunfei Chu Zeyu Cui Kai Dang Xiaodong Deng Yang Fan Wenbin Ge Yu Han Fei Huang et al. 2023. Qwen technical report. arXiv preprint arXiv:2309.16609 (2023)."},{"key":"e_1_3_2_1_4_1","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman Alex Ray Raul Puri Gretchen Krueger Michael Petrov Heidy Khlaaf Girish Sastry Pamela Mishkin Brooke Chan Scott Gray Nick Ryder Mikhail Pavlov Alethea Power Lukasz Kaiser Mohammad Bavarian Clemens Winter Philippe Tillet Felipe Petroski Such Dave Cummings Matthias Plappert Fotios Chantzis Elizabeth Barnes Ariel Herbert-Voss William Hebgen Guss Alex Nichol Alex Paino Nikolas Tezak Jie Tang Igor Babuschkin Suchir Balaji Shantanu Jain William Saunders Christopher Hesse Andrew N. Carr Jan Leike Josh Achiam Vedant Misra Evan Morikawa Alec Radford Matthew Knight Miles Brundage Mira Murati Katie Mayer Peter Welinder Bob McGrew Dario Amodei Sam McCandlish Ilya Sutskever and Wojciech Zaremba. 2021. Evaluating Large Language Models Trained on Code. (2021). arXiv:2107.03374 [cs.LG]"},{"key":"e_1_3_2_1_5_1","volume-title":"A survey on in-context learning. arXiv preprint arXiv:2301.00234","author":"Dong Qingxiu","year":"2022","unstructured":"Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, and Zhifang Sui. 2022. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2022)."},{"key":"e_1_3_2_1_6_1","volume-title":"Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research","volume":"10799","author":"Gao Luyu","year":"2023","unstructured":"Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, and Graham Neubig. 2023. PAL: Program-aided Language Models. In Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 202), Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett (Eds.). PMLR, 10764-10799. https:\/\/proceedings.mlr.press\/v202\/gao23f.html"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.3102\/0013189X12463051"},{"key":"e_1_3_2_1_8_1","unstructured":"Daya Guo Dejian Yang Haowei Zhang Junxiao Song Ruoyu Zhang Runxin Xu Qihao Zhu Shirong Ma Peiyi Wang Xiao Bi et al. 2025. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948 (2025)."},{"key":"e_1_3_2_1_9_1","volume-title":"Measuring Coding Challenge Competence With APPS. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2).","author":"Hendrycks Dan","year":"2021","unstructured":"Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, et al., 2021. Measuring Coding Challenge Competence With APPS. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)."},{"key":"e_1_3_2_1_10_1","unstructured":"Aaron Jaech Adam Kalai Adam Lerer Adam Richardson Ahmed El-Kishky Aiden Low Alec Helyar Aleksander Madry Alex Beutel Alex Carney et al. 2024. Openai o1 system card. arXiv preprint arXiv:2412.16720 (2024)."},{"key":"e_1_3_2_1_11_1","volume-title":"International Conference on Machine Learning. PMLR","author":"Lai Yuhang","year":"2023","unstructured":"Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Wen-tau Yih, Daniel Fried, Sida Wang, and Tao Yu. 2023. DS-1000: A natural and reliable benchmark for data science code generation. In International Conference on Machine Learning. PMLR, 18319-18345."},{"key":"e_1_3_2_1_12_1","volume-title":"Taco: Topics in algorithmic code generation dataset. arXiv preprint arXiv:2312.14852","author":"Li Rongao","year":"2023","unstructured":"Rongao Li, Jie Fu, Bo-Wen Zhang, Tao Huang, Zhihong Sun, Chen Lyu, Guang Liu, Zhi Jin, and Ge Li. 2023. Taco: Topics in algorithmic code generation dataset. arXiv preprint arXiv:2312.14852 (2023)."},{"key":"e_1_3_2_1_13_1","volume-title":"Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74-81.","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74-81."},{"key":"e_1_3_2_1_14_1","volume-title":"Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, et al.","author":"Lozhkov Anton","year":"2024","unstructured":"Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, et al., 2024. Starcoder 2 and the stack v2: The next generation. arXiv preprint arXiv:2402.19173 (2024)."},{"key":"e_1_3_2_1_15_1","volume-title":"Wizardcoder: Empowering code large language models with evol-instruct. arXiv preprint arXiv:2306.08568","author":"Luo Ziyang","year":"2023","unstructured":"Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, and Daxin Jiang. 2023. Wizardcoder: Empowering code large language models with evol-instruct. arXiv preprint arXiv:2306.08568 (2023)."},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311-318","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311-318."},{"key":"e_1_3_2_1_17_1","volume-title":"HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization. arXiv preprint arXiv:2402.16694","author":"Peng Qiwei","year":"2024","unstructured":"Qiwei Peng, Yekun Chai, and Xuhong Li. 2024. HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization. arXiv preprint arXiv:2402.16694 (2024)."},{"key":"e_1_3_2_1_18_1","volume-title":"Computational thinking assessment: Literature review. Research on E-Learning and ICT in Education: Technological, Pedagogical and Instructional Perspectives","author":"Poulakis Emmanouil","year":"2021","unstructured":"Emmanouil Poulakis and Panagiotis Politis. 2021. Computational thinking assessment: Literature review. Research on E-Learning and ICT in Education: Technological, Pedagogical and Instructional Perspectives (2021), 111-128."},{"key":"e_1_3_2_1_19_1","unstructured":"Machel Reid Nikolay Savinov Denis Teplyashin Dmitry Lepikhin Timothy Lillicrap Jean-baptiste Alayrac Radu Soricut Angeliki Lazaridou Orhan Firat Julian Schrittwieser et al. 2024. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv preprint arXiv:2403.05530 (2024)."},{"key":"e_1_3_2_1_20_1","volume-title":"Yossi Adi, Jingyu Liu, Romain Sauvestre, Tal Remez, et al.","author":"Roziere Baptiste","year":"2023","unstructured":"Baptiste Roziere, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Romain Sauvestre, Tal Remez, et al., 2023. Code llama: Open foundation models for code. arXiv preprint arXiv:2308.12950 (2023)."},{"key":"e_1_3_2_1_21_1","volume-title":"Can Language Models Solve Olympiad Programming? arXiv preprint arXiv:2404.10952","author":"Shi Quan","year":"2024","unstructured":"Quan Shi, Michael Tang, Karthik Narasimhan, and Shunyu Yao. 2024. Can Language Models Solve Olympiad Programming? arXiv preprint arXiv:2404.10952 (2024)."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599790"},{"key":"e_1_3_2_1_23_1","unstructured":"Qihao Zhu Daya Guo Zhihong Shao Dejian Yang Peiyi Wang Runxin Xu Y Wu Yukun Li Huazuo Gao Shirong Ma et al. 2024. DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence. arXiv preprint arXiv:2406.11931 (2024)."}],"event":{"name":"WSDM '26:The Nineteenth ACM International Conference on Web Search and Data Mining","location":"Boise ID USA","sponsor":["SIGKDD ACM Special Interest Group on Knowledge Discovery in Data","SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval","SIGMOD ACM Special Interest Group on Management of Data"]},"container-title":["Proceedings of the Nineteenth ACM International Conference on Web Search and Data Mining"],"original-title":[],"deposited":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T17:51:27Z","timestamp":1771264287000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3773966.3778008"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,21]]},"references-count":23,"alternative-id":["10.1145\/3773966.3778008","10.1145\/3773966"],"URL":"https:\/\/doi.org\/10.1145\/3773966.3778008","relation":{},"subject":[],"published":{"date-parts":[[2026,2,21]]},"assertion":[{"value":"2026-02-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}