{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T09:30:29Z","timestamp":1773739829530,"version":"3.50.1"},"reference-count":87,"publisher":"Association for Computing Machinery (ACM)","issue":"1","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62372220"],"award-info":[{"award-number":["62372220"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Research Grants Council of the Hong Kong Special Administrative Region, China","award":["PolyU\/25200821"],"award-info":[{"award-number":["PolyU\/25200821"]}]},{"DOI":"10.13039\/501100010428","name":"Innovation and Technology Fund","doi-asserted-by":"crossref","award":["PRP\/047\/22FX"],"award-info":[{"award-number":["PRP\/047\/22FX"]}],"id":[{"id":"10.13039\/501100010428","id-type":"DOI","asserted-by":"crossref"}]},{"name":"PolyU Internal Fund from RC-DSAI","award":["1-CE1E"],"award-info":[{"award-number":["1-CE1E"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2026,1,31]]},"abstract":"<jats:p>\n                    Automated code completion, aiming at generating subsequent tokens from unfinished code, has significantly benefited from recent progress in pre-trained Large Language Models (LLMs). However, these models often suffer from coherence issues and hallucinations when dealing with complex code logic or extrapolating beyond their training data. Existing Retrieval Augmented Generation (RAG) techniques partially address these issues by retrieving relevant code with a separate encoding model where the retrieved snippet serves as contextual reference for code completion. However, their retrieval scope is subject to a singular perspective defined by the encoding model, which largely overlooks the complexity and diversity inherent in code semantics. To address this limitation, we propose ProCC, a code completion framework leveraging prompt engineering and the contextual multi-armed bandits algorithm to flexibly incorporate and adapt to multiple perspectives of code. ProCC first employs a\n                    <jats:italic toggle=\"yes\">prompt-based multi-retriever system<\/jats:italic>\n                    which crafts prompt templates to elicit LLM knowledge to understand code semantics with multiple retrieval perspectives. Then, it adopts the\n                    <jats:italic toggle=\"yes\">adaptive retrieval selection algorithm<\/jats:italic>\n                    to incorporate code similarity into the decision-making process to determine the most suitable retrieval perspective for the LLM to complete the code. Experimental results demonstrate that ProCC outperforms a widely studied code completion technique RepoCoder by 7.92% on the public benchmark CCEval, 3.19% in HumanEval-Infilling, 2.80% on our collected open-source benchmark suite, and 4.48% on the private-domain benchmark suite collected from Kuaishou Technology in terms of Exact Match. 
ProCC also allows augmenting fine-tuned techniques in a plug-and-play manner, yielding an average 6.5% improvement over the fine-tuned model.\n                  <\/jats:p>","DOI":"10.1145\/3725812","type":"journal-article","created":{"date-parts":[[2025,3,26]],"date-time":"2025-03-26T11:56:19Z","timestamp":1742990179000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Prompt-Based Code Completion via Multi-Retrieval Augmented Generation"],"prefix":"10.1145","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5392-5435","authenticated-orcid":false,"given":"Hanzhuo","family":"Tan","sequence":"first","affiliation":[{"name":"Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China and Computing, Hong Kong Polytechnic University, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-0931-8314","authenticated-orcid":false,"given":"Qi","family":"Luo","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-3134-1811","authenticated-orcid":false,"given":"Ling","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-3450-8929","authenticated-orcid":false,"given":"Zizheng","family":"Zhan","sequence":"additional","affiliation":[{"name":"Beijing Kuaishou Technology Co Ltd, Haidian, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8044-2284","authenticated-orcid":false,"given":"Jing","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Computing and the Research Centre on Data Science and Artificial Intelligence (RC-DSAI), The Hong Kong Polytechnic University, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-3649-8531","authenticated-orcid":false,"given":"Haotian","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beijing Kuaishou Technology Co Ltd, Haidian, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1499-5729","authenticated-orcid":false,"given":"Yuqun","family":"Zhang","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China"}]}],"member":"320","published-online":{"date-parts":[[2025,12,11]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"GitHub. 2024. Prompt-based Code Completion via Multi-Retrieval Augmented Generation. Retrieved from https:\/\/github.com\/anonepo\/procc. GitHub repository."},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.emnlp-main.1107"},{"key":"e_1_3_1_4_2","unstructured":"Loubna Ben Allal Raymond Li Denis Kocetkov Chenghao Mou Christopher Akiki Carlos Mu\u00f1oz Ferrandis Niklas Muennighoff Mayank Mishra Alexander Gu Manan Dey et al. 2023. SantaCoder: Don\u2019t reach for the stars! arXiv:2301.03988. Retrieved from https:\/\/arxiv.org\/abs\/2301.03988"},{"key":"e_1_3_1_5_2","unstructured":"Mohammad Bavarian Heewoo Jun Nikolas Tezak John Schulman Christine McLeavey Jerry Tworek and Mark Chen. 2022. Efficient training of language models to fill in the middle. arXiv:2207.14255. 
Retrieved from https:\/\/arxiv.org\/abs\/2207.14255"},{"key":"e_1_3_1_6_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman et al. 2021. Evaluating large language models trained on code. arXiv:2107.03374. Retrieved from https:\/\/arxiv.org\/abs\/2107.03374"},{"key":"e_1_3_1_7_2","unstructured":"Wei Chu Lihong Li Lev Reyzin and Robert Schapire. 2011. Contextual bandits with linear payoff functions. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings 208\u2013214."},{"key":"e_1_3_1_8_2","unstructured":"Yangruibo Ding Zijian Wang Wasi Uddin Ahmad Hantian Ding Ming Tan Nihal Jain Murali Krishna Ramanathan Ramesh Nallapati Parminder Bhatia Dan Roth et al. 2023. CrossCodeEval: A diverse and multilingual benchmark for cross-file code completion. arXiv:2310.11248. Retrieved from https:\/\/arxiv.org\/abs\/2310.11248"},{"key":"e_1_3_1_9_2","unstructured":"Yangruibo Ding Zijian Wang Wasi Uddin Ahmad Murali Krishna Ramanathan Ramesh Nallapati Parminder Bhatia Dan Roth and Bing Xiang. 2022. CoCoMIC: Code completion by jointly modeling in-file and cross-file context. arXiv:2212.10007. Retrieved from https:\/\/arxiv.org\/abs\/2212.10007"},{"key":"e_1_3_1_10_2","unstructured":"Guanting Dong Hongyi Yuan Keming Lu Chengpeng Li Mingfeng Xue Dayiheng Liu Wei Wang Zheng Yuan Chang Zhou and Jingren Zhou. 2023. How abilities in large language models are affected by supervised fine-tuning data composition. arXiv:2310.05492. Retrieved from https:\/\/arxiv.org\/abs\/2310.05492"},{"key":"e_1_3_1_11_2","unstructured":"Hugging Face. 2025. Large Language Model Text Generation Inference. Retrieved from https:\/\/github.com\/huggingface\/text-generation-inference. GitHub repository."},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","unstructured":"Zhangyin Feng Daya Guo Duyu Tang Nan Duan Xiaocheng Feng Ming Gong Linjun Shou Bing Qin Ting Liu Daxin Jiang and Ming Zhou. 2020. CodeBERT: A pre-trained model for programming and natural languages. arXiv:2002.08155. Retrieved from https:\/\/arxiv.org\/abs\/2002.08155","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_3_1_13_2","unstructured":"Daniel Fried Armen Aghajanyan Jessy Lin Sida I. Wang Eric Wallace Freda Shi Ruiqi Zhong Wen-tau Yih Luke Zettlemoyer and Mike Lewis. 2022. InCoder: A generative model for code infilling and synthesis. arXiv:2204.05999. Retrieved from https:\/\/arxiv.org\/abs\/2204.05999"},{"key":"e_1_3_1_14_2","unstructured":"Luyu Gao Xueguang Ma Jimmy Lin and Jamie Callan. 2022. Precise zero-shot dense retrieval without relevance labels. arXiv:2212.10496. Retrieved from https:\/\/arxiv.org\/abs\/2212.10496"},{"key":"e_1_3_1_15_2","unstructured":"Tianyu Gao Xingcheng Yao and Danqi Chen. 2021. SimCSE: Simple contrastive learning of sentence embeddings. arXiv:2104.08821. Retrieved from https:\/\/arxiv.org\/abs\/2104.08821"},{"key":"e_1_3_1_16_2","article-title":"User feedback-based online learning for intent classification","author":"G\u00f6n\u00e7 Kaan","year":"2023","unstructured":"Kaan G\u00f6n\u00e7, Baturay Sa\u011flam, Onat Dalmaz, Tolga \u00c7ukur, Serdar Kozat, and Hamdi Dibeklio\u011flu. 2023. User feedback-based online learning for intent classification. In Proceedings of the 25th International Conference on Multimodal Interaction. 
Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:263742809","journal-title":"Proceedings of the 25th International Conference on Multimodal Interaction"},{"key":"e_1_3_1_17_2","doi-asserted-by":"crossref","unstructured":"Daya Guo Shuai Lu Nan Duan Yanlin Wang Ming Zhou and Jian Yin. 2022. UniXcoder: Unified cross-modal pre-training for code representation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:247315559","DOI":"10.18653\/v1\/2022.acl-long.499"},{"key":"e_1_3_1_18_2","unstructured":"Daya Guo Qihao Zhu Dejian Yang Zhenda Xie Kai Dong Wentao Zhang Guanting Chen Xiao Bi Y. Wu Y. K. Li et al. 2024. DeepSeek-Coder: When the large language model meets programming\u2013The rise of code intelligence. arXiv:2401.14196. Retrieved from https:\/\/arxiv.org\/abs\/2401.14196"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3650212.3652130"},{"key":"e_1_3_1_20_2","doi-asserted-by":"crossref","unstructured":"Tihomir Gvero Viktor Kuncak Ivan Kuraj and Ruzica Piskac. 2013. Complete completion using types and weights. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation 27\u201338.","DOI":"10.1145\/2491956.2462192"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_1_22_2","unstructured":"Mohanna Hoveyda Arjen P. de Vries Maarten de Rijke Harrie Oosterhuis and Faegheh Hasibi. 2024. AQA: Adaptive question answering in a society of LLMs via contextual multi-armed bandit. arXiv:2409.13447. Retrieved from https:\/\/arxiv.org\/abs\/2409.13447"},{"key":"e_1_3_1_23_2","unstructured":"Binyuan Hui Jian Yang Zeyu Cui Jiaxi Yang Dayiheng Liu Lei Zhang Tianyu Liu Jiajun Zhang Bowen Yu Keming Lu et al. 2024. Qwen2.5-Coder technical report. arXiv:2409.12186. Retrieved from https:\/\/arxiv.org\/abs\/2409.12186"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1469-8137.1912.tb05611.x"},{"key":"e_1_3_1_25_2","unstructured":"Rolf Jagerman Honglei Zhuang Zhen Qin Xuanhui Wang and Michael Bendersky. 2023. Query expansion by prompting large language models. arXiv:2305.03653. Retrieved from https:\/\/arxiv.org\/abs\/2305.03653"},{"key":"e_1_3_1_26_2","doi-asserted-by":"crossref","unstructured":"Paras Jain Ajay Jain Tianjun Zhang P. Abbeel Joseph Gonzalez and Ion Stoica. 2020. Contrastive code representation learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:220425360","DOI":"10.18653\/v1\/2021.emnlp-main.482"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597503.3639100"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.emnlp-main.603"},{"key":"e_1_3_1_29_2","unstructured":"Jai Kannan Scott Barnett Anj Simmons Taylan Selvi and Lu\u00eds Cruz. 2023. Green Runner: A tool for efficient model selection from model repositories. arXiv:2305.16849. Retrieved from https:\/\/arxiv.org\/abs\/2305.16849"},{"key":"e_1_3_1_30_2","doi-asserted-by":"crossref","unstructured":"Vladimir Karpukhin Barlas O\u011fuz Sewon Min Patrick Lewis Ledell Yu Wu Sergey Edunov Danqi Chen and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. arXiv:2004.04906. 
Retrieved from https:\/\/arxiv.org\/abs\/2004.04906","DOI":"10.18653\/v1\/2020.emnlp-main.550"},{"key":"e_1_3_1_31_2","unstructured":"Urvashi Khandelwal Omer Levy Dan Jurafsky Luke Zettlemoyer and Mike Lewis. 2020. Generalization through memorization: Nearest neighbor language models. In Proceedings of the International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=HklBjCEKvH"},{"key":"e_1_3_1_32_2","doi-asserted-by":"crossref","unstructured":"Woosuk Kwon Zhuohan Li Siyuan Zhuang Ying Sheng Lianmin Zheng Cody Hao Yu Joseph E. Gonzalez Hao Zhang and Ion Stoica. 2023. Efficient memory management for large language model serving with PagedAttention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles.","DOI":"10.1145\/3600006.3613165"},{"key":"e_1_3_1_33_2","unstructured":"LangChain. 2023. How to Stream Results from Your RAG Application. Retrieved from https:\/\/python.langchain.com\/v0.2\/docs\/how_to\/qa_streaming\/#streaming-final-outputs"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"issue":"8","key":"e_1_3_1_35_2","first-page":"707","article-title":"Binary codes capable of correcting deletions, insertions, and reversals","volume":"10","author":"Levenshtein Vladimir I.","year":"1965","unstructured":"Vladimir I. Levenshtein. 1965. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady 10, 8 (1965), 707\u2013710.","journal-title":"Soviet Physics Doklady"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/3495724.3496517"},{"key":"e_1_3_1_37_2","doi-asserted-by":"crossref","unstructured":"Jingxuan Li Rui Huang Wei Li Kai Yao and Weiguo Tan. 2021. Toward less hidden cost of code completion with acceptance and ranking models. In Proceedings of the IEEE International Conference on Software Maintenance and Evolution (ICSME \u201921). IEEE 195\u2013205.","DOI":"10.1109\/ICSME52107.2021.00024"},{"key":"e_1_3_1_38_2","unstructured":"Jian Li Yue Wang Michael R. Lyu and Irwin King. 2017. Code completion with neural attention and pointer networks. arXiv:1711.09573. Retrieved from https:\/\/arxiv.org\/abs\/1711.09573"},{"key":"e_1_3_1_39_2","unstructured":"Lihong Li Wei Chu John Langford and Robert E. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International Conference on World Wide Web 661\u2013670."},{"key":"e_1_3_1_40_2","unstructured":"Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov Chenghao Mou Marc Marone Christopher Akiki Jia Li Jenny Chim et al. 2023. StarCoder: May the source be with you! arXiv:2305.06161. Retrieved from https:\/\/arxiv.org\/abs\/2305.06161"},{"key":"e_1_3_1_41_2","unstructured":"Zehan Li Xin Zhang Yanzhao Zhang Dingkun Long Pengjun Xie and Meishan Zhang. 2023. Towards general text embeddings with multi-stage contrastive learning. arXiv:2308.03281. Retrieved from https:\/\/arxiv.org\/abs\/2308.03281"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.229"},{"key":"e_1_3_1_43_2","unstructured":"Tianyang Liu Canwen Xu and Julian McAuley. 2023. RepoBench: Benchmarking repository-level code auto-completion systems. arXiv:2306.03091. Retrieved from https:\/\/arxiv.org\/abs\/2306.03091"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3691620.3695054"},{"key":"e_1_3_1_45_2","unstructured":"Ilya Loshchilov and Frank Hutter. 2019. 
Decoupled weight decay regularization. arXiv:1711.05101. Retrieved from https:\/\/arxiv.org\/abs\/1711.05101"},{"key":"e_1_3_1_46_2","unstructured":"Shuai Lu Nan Duan Hojae Han Daya Guo Seung-won Hwang and Alexey Svyatkovskiy. 2022. ReACC: A retrieval-augmented code completion framework. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 6227\u20136240."},{"key":"e_1_3_1_47_2","unstructured":"Shuai Lu Daya Guo Shuo Ren Junjie Huang Alexey Svyatkovskiy Ambrosio Blanco Colin Clement Dawn Drain Daxin Jiang Duyu Tang et al. 2021. CodeXGLUE: A machine learning benchmark dataset for code understanding and generation. arXiv:2102.04664. Retrieved from https:\/\/arxiv.org\/abs\/2102.04664"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/1064978.1065018"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.173"},{"key":"e_1_3_1_50_2","unstructured":"Microsoft. 2023. Pyright: Static Type Checker for Python. Retrieved from https:\/\/github.com\/microsoft\/pyright"},{"key":"e_1_3_1_51_2","unstructured":"Akshay Uttama Nambi Vaibhav Balloli Mercy Prasanna Ranjit Tanuja Ganu Kabir Ahuja Sunayana Sitaram and Kalika Bali. 2023. Breaking language barriers with a LEAP: Learning strategies for polyglot LLMs. arXiv:2305.17740. Retrieved from https:\/\/arxiv.org\/abs\/2305.17740"},{"key":"e_1_3_1_52_2","unstructured":"Duy Nguyen Archiki Prasad Elias Stengel-Eskin and Mohit Bansal. 2024. LASeR: Learning to adaptively select reward models with multi-armed bandits. arXiv:2410.01735. Retrieved from https:\/\/arxiv.org\/abs\/2410.01735"},{"key":"e_1_3_1_53_2","unstructured":"Erik Nijkamp Bo Pang Hiroaki Hayashi Lifu Tu Huan Wang Yingbo Zhou Silvio Savarese and Caiming Xiong. 2023. CodeGen: An open large language model for code with multi-turn program synthesis. arXiv:2203.13474. Retrieved from https:\/\/arxiv.org\/abs\/2203.13474"},{"key":"e_1_3_1_54_2","unstructured":"OpenAI. 2023. GPT-4 Technical Report. Technical Report. OpenAI. Retrieved from https:\/\/arxiv.org\/pdf\/2303.08774.pdf"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3586183.3606763"},{"key":"e_1_3_1_56_2","doi-asserted-by":"crossref","unstructured":"Md Rizwan Parvez Wasi Uddin Ahmad Saikat Chakraborty Baishakhi Ray and Kai-Wei Chang. 2021. Retrieval augmented code generation and summarization. arXiv:2108.11601. Retrieved from https:\/\/arxiv.org\/abs\/2108.11601","DOI":"10.18653\/v1\/2021.findings-emnlp.232"},{"key":"e_1_3_1_57_2","doi-asserted-by":"crossref","unstructured":"Daniel Perelman Sumit Gulwani Thomas Ball and Dan Grossman. 2012. Type-directed completion of partial expressions. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation 275\u2013286.","DOI":"10.1145\/2254064.2254098"},{"key":"e_1_3_1_58_2","unstructured":"Alec Radford Jeff Wu Rewon Child David Luan Dario Amodei and Ilya Sutskever. 2019. Language Models Are Unsupervised Multitask Learners. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:160025533"},{"key":"e_1_3_1_59_2","first-page":"7426461:1","article-title":"A neural network based intelligent support model for program code completion","volume":"2020","author":"Rahman Md. Mostafizer","year":"2020","unstructured":"Md. Mostafizer Rahman, Yutaka Watanobe, and Keita Nakamura. 2020. A neural network based intelligent support model for program code completion. Scientific Programming 2020 (2020), 7426461:1\u20137426461:18. 
Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:225584181","journal-title":"Scientific Programming"},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1145\/2594291.2594321"},{"key":"e_1_3_1_61_2","unstructured":"Facebook Research. 2023. FAISS: A Library for Efficient Similarity Search and Clustering of Dense Vectors. Retrieved from https:\/\/github.com\/facebookresearch\/faiss. GitHub repository."},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","DOI":"10.1561\/1500000019"},{"key":"e_1_3_1_63_2","unstructured":"Baptiste Roziere Jonas Gehring Fabian Gloeckle Sten Sootla Itai Gat Xiaoqing Ellen Tan Yossi Adi Jingyu Liu Tal Remez J\u00e9r\u00e9my Rapin et al. 2023. Code Llama: Open foundation models for code. arXiv:2308.12950. Retrieved from https:\/\/arxiv.org\/abs\/2308.12950"},{"key":"e_1_3_1_64_2","unstructured":"Sebastian Ruder. 2016. An overview of gradient descent optimization algorithms. arXiv:1609.04747. Retrieved from https:\/\/arxiv.org\/abs\/1609.04747"},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1038\/323533a0"},{"key":"e_1_3_1_66_2","first-page":"2198","volume-title":"2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE \u201923)","author":"Shi Ensheng","year":"2022","unstructured":"Ensheng Shi, Yanlin Wang, Wenchao Gu, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, and Hongbin Sun. 2022. CoCoSoDa: Effective contrastive learning for code search. In 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE \u201923), 2198\u20132210. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:256827724"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.emnlp-main.203"},{"key":"e_1_3_1_68_2","doi-asserted-by":"crossref","unstructured":"Ze Tang Jidong Ge Shangqing Liu Tingwei Zhu Tongtong Xu Liguo Huang and Bin Luo. 2023. Domain adaptive code completion via language models and decoupled domain databases. arXiv:2308.09313. Retrieved from https:\/\/arxiv.org\/abs\/2308.09313","DOI":"10.1109\/ASE56229.2023.00076"},{"key":"e_1_3_1_69_2","unstructured":"Rohan Taori Ishaan Gulrajani Tianyi Zhang Yann Dubois Xuechen Li Carlos Guestrin Percy Liang and Tatsunori B. Hashimoto. 2023. Stanford Alpaca: An Instruction-Following LLaMA Model. Retrieved from https:\/\/github.com\/tatsu-lab\/stanford_alpaca"},{"key":"e_1_3_1_70_2","first-page":"109","volume-title":"Proceedings of the IEEE 11th International Workshop on Computational Intelligence and Applications (IWCIA \u201919)","author":"Terada Kenta","year":"2019","unstructured":"Kenta Terada and Yutaka Watanobe. 2019. Code completion for programming education based on recurrent neural network. In Proceedings of the IEEE 11th International Workshop on Computational Intelligence and Applications (IWCIA \u201919), 109\u2013114. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:210694727"},{"key":"e_1_3_1_71_2","unstructured":"Zhao Tian and Junjie Chen. 2023. Test-case-driven programming understanding in large language models for better code generation. arXiv:2309.16120. Retrieved from https:\/\/arxiv.org\/abs\/2309.16120"},{"key":"e_1_3_1_72_2","unstructured":"Hugo Touvron Louis Martin Kevin Stone Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra Prajjwal Bhargava Shruti Bhosale et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv:2307.09288. 
Retrieved from https:\/\/arxiv.org\/abs\/2307.09288"},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_1_74_2","doi-asserted-by":"publisher","DOI":"10.1145\/3611643.3616280"},{"key":"e_1_3_1_75_2","unstructured":"Chong Wang Kaifeng Huang Jian Zhang Yebo Feng Lyuye Zhang Yang Liu and Xin Peng. 2024. How and why LLMs use deprecated APIs in code completion? An empirical study. arXiv:2406.09834. Retrieved from https:\/\/arxiv.org\/abs\/2406.09834"},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","unstructured":"Chong Wang Jian Zhang Yebo Feng Tianlin Li Weisong Sun Yang Liu and Xin Peng. 2025. Teaching code LLMs to use autocompletion tools in repository-level code generation. ACM Transactions on Software Engineering and Methodology (Jan. 2025). DOI: 10.1145\/3714462","DOI":"10.1145\/3714462"},{"key":"e_1_3_1_77_2","doi-asserted-by":"crossref","unstructured":"Liang Wang Nan Yang and Furu Wei. 2023. Query2doc: Query expansion with large language models. arXiv:2303.07678. Retrieved from https:\/\/arxiv.org\/abs\/2303.07678","DOI":"10.18653\/v1\/2023.emnlp-main.585"},{"key":"e_1_3_1_78_2","doi-asserted-by":"crossref","unstructured":"Yue Wang Hung Le Akhilesh Deepak Gotmare Nghi D. Q. Bui Junnan Li and Steven C. H. Hoi. 2023. CodeT5+: Open code large language models for code understanding and generation. arXiv:2305.07922. Retrieved from https:\/\/arxiv.org\/abs\/2305.07922","DOI":"10.18653\/v1\/2023.emnlp-main.68"},{"key":"e_1_3_1_79_2","volume-title":"Code Generation as a Dual Task of Code Summarization","author":"Wei Bolin","year":"2019","unstructured":"Bolin Wei, Ge Li, Xin Xia, Zhiyi Fu, and Zhi Jin. 2019. Code Generation as a Dual Task of Code Summarization. Curran Associates Inc., Red Hook, NY."},{"key":"e_1_3_1_80_2","unstructured":"Thomas Wolf Lysandre Debut Victor Sanh Julien Chaumond Clement Delangue Anthony Moi Pierric Cistac Tim Rault R\u00e9mi Louf Morgan Funtowicz et al. 2019. HuggingFace\u2019s Transformers: State-of-the-art natural language processing. arXiv:1910.03771. Retrieved from https:\/\/arxiv.org\/abs\/1910.03771"},{"key":"e_1_3_1_81_2","unstructured":"Chaoyi Wu Xiaoman Zhang Ya Zhang Yanfeng Wang and Weidi Xie. 2023. PMC-LLaMA: Towards Building Open-Source Language Models for Medicine. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:258417843"},{"key":"e_1_3_1_82_2","unstructured":"Jiahong Xiang Xiaoyang Xu Fanchu Kong Mingyuan Wu Zizheng Zhang Haotian Zhang and Yuqun Zhang. 2024. How far can we go with practical function-level program repair? arXiv:2404.12833. Retrieved from https:\/\/arxiv.org\/abs\/2404.12833"},{"key":"e_1_3_1_83_2","unstructured":"Can Xu Qingfeng Sun Kai Zheng Xiubo Geng Pu Zhao Jiazhan Feng Chongyang Tao and Daxin Jiang. 2023. WizardLM: Empowering large language models to follow complex instructions. arXiv:2304.12244. Retrieved from https:\/\/arxiv.org\/abs\/2304.12244"},{"key":"e_1_3_1_84_2","unstructured":"Daoguang Zan Bei Chen Fengji Zhang Dianjie Lu Bingchao Wu Bei Guan Yongji Wang and Jian-Guang Lou. 2023. Large language models meet NL2Code: A survey. arXiv:2212.09420. Retrieved from https:\/\/arxiv.org\/abs\/2212.09420"},{"key":"e_1_3_1_85_2","doi-asserted-by":"publisher","DOI":"10.1145\/3533767.3534390"},{"key":"e_1_3_1_86_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.emnlp-main.151"},{"key":"e_1_3_1_87_2","unstructured":"Lianmin Zheng Wei-Lin Chiang Ying Sheng Siyuan Zhuang Zhanghao Wu Yonghao Zhuang Zi Lin Zhuohan Li Dacheng Li Eric P. Xing Hao Zhang Joseph E. 
Gonzalez and Ion Stoica. 2023. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. arXiv:2306.05685. Retrieved from https:\/\/arxiv.org\/abs\/2306.05685"},{"key":"e_1_3_1_88_2","unstructured":"Lianmin Zheng Liangsheng Yin Zhiqiang Xie Chuyue Sun Jeff Huang Cody Hao Yu Shiyi Cao Christos Kozyrakis Ion Stoica Joseph E. Gonzalez Clark Barrett and Ying Sheng. 2024. SGLang: Efficient execution of structured language model programs. arXiv:2312.07104. Retrieved from https:\/\/arxiv.org\/abs\/2312.07104"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3725812","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,11]],"date-time":"2025-12-11T15:56:01Z","timestamp":1765468561000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3725812"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,11]]},"references-count":87,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,31]]}},"alternative-id":["10.1145\/3725812"],"URL":"https:\/\/doi.org\/10.1145\/3725812","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,11]]},"assertion":[{"value":"2024-10-28","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-17","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}