{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T15:50:31Z","timestamp":1772121031345,"version":"3.50.1"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2024,3,15]],"date-time":"2024-03-15T00:00:00Z","timestamp":1710460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61802167"],"award-info":[{"award-number":["61802167"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Natural Science Foundation of Jiangsu Province, China","award":["BK20201250"],"award-info":[{"award-number":["BK20201250"]}]},{"name":"NSF award","award":["2034508"],"award-info":[{"award-number":["2034508"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2024,3,31]]},"abstract":"<jats:p>\n            Recommending APIs is a practical and essential feature of IDEs. Improving the accuracy of API recommendations is an effective way to improve coding efficiency. With the success of deep learning in software engineering, the state-of-the-art (SOTA) performance of API recommendation is also achieved by deep-learning-based approaches. However, existing SOTAs either only consider the API sequences in the code snippets or rely on complex operations for extracting hand-crafted features, all of which have potential risks in under-encoding the input code snippets and further resulting in sub-optimal recommendation performance. To this end, this article proposes to utilize the code understanding ability of existing general code\n            <jats:underline>P<\/jats:underline>\n            re-\n            <jats:underline>T<\/jats:underline>\n            raining\n            <jats:underline>M<\/jats:underline>\n            odels to fully encode the input code snippet to improve the accuracy of\n            <jats:underline>API<\/jats:underline>\n            <jats:underline>Rec<\/jats:underline>\n            ommendation, namely,\n            <jats:bold>PTM-APIRec<\/jats:bold>\n            . To ensure that the code semantics of the input are fully understood and the API recommended actually exists, we use separate vocabularies for the input code snippet and the APIs to be predicted. The experimental results on the JDK and Android datasets show that PTM-APIRec surpasses existing approaches. Besides, an effective way to improve the performance of PTM-APIRec is to enhance the pre-trained model with more pre-training data (which is easier to obtain than API recommendation datasets).\n          <\/jats:p>","DOI":"10.1145\/3632745","type":"journal-article","created":{"date-parts":[[2023,11,14]],"date-time":"2023-11-14T11:38:05Z","timestamp":1699961885000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["PTM-APIRec: Leveraging Pre-trained Models of Source Code in API Recommendation"],"prefix":"10.1145","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-6357-201X","authenticated-orcid":false,"given":"Zhihao","family":"Li","sequence":"first","affiliation":[{"name":"State Key Laboratory for Novel Software Technology at Nanjing University, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9270-5072","authenticated-orcid":false,"given":"Chuanyi","family":"Li","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology at Nanjing University, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-8062-9986","authenticated-orcid":false,"given":"Ze","family":"Tang","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology at Nanjing University, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-4505-3665","authenticated-orcid":false,"given":"Wanhong","family":"Huang","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology at Nanjing University, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1773-0942","authenticated-orcid":false,"given":"Jidong","family":"Ge","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology at Nanjing University, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9036-0063","authenticated-orcid":false,"given":"Bin","family":"Luo","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology at Nanjing University, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8237-429X","authenticated-orcid":false,"given":"Vincent","family":"Ng","sequence":"additional","affiliation":[{"name":"University of Texas at Dallas, Texas, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-0071-1127","authenticated-orcid":false,"given":"Ting","family":"Wang","sequence":"additional","affiliation":[{"name":"Tencent, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8856-4657","authenticated-orcid":false,"given":"Yucheng","family":"Hu","sequence":"additional","affiliation":[{"name":"Tencent, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-7622-7972","authenticated-orcid":false,"given":"Xiaopeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Tencent, Guangdong, China"}]}],"member":"320","published-online":{"date-parts":[[2024,3,15]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.211"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSR.2013.6624029"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSM.2015.7332481"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/1595696.1595728"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-10040-2"},{"key":"e_1_3_2_7_2","article-title":"Evaluating large language models trained on code","volume":"2107","author":"Chen Mark","year":"2021","unstructured":"Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pond\u00e9 de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. 2021. Evaluating large language models trained on code. CoRR abs\/2107.03374 (2021). arXiv:2107.03374https:\/\/arxiv.org\/abs\/2107.03374","journal-title":"CoRR"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2021.3128234"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/n19-1423"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950319"},{"issue":"2","key":"e_1_3_2_12_2","first-page":"23","article-title":"A new algorithm for data compression","volume":"12","author":"Gage Philip","year":"1994","unstructured":"Philip Gage. 1994. A new algorithm for data compression. C Users J. 12, 2 (Feb.1994), 23\u201338.","journal-title":"C Users J."},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950334"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950334"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.499"},{"key":"e_1_3_2_16_2","volume-title":"9th International Conference on Learning Representations (ICLR \u201921)","author":"Guo Daya","year":"2021","unstructured":"Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin B. Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, and Ming Zhou. 2021. GraphCodeBERT: Pre-training code representations with data flow. In 9th International Conference on Learning Representations (ICLR \u201921). OpenReview.net. https:\/\/openreview.net\/forum?id=jLoC4ez43PZ"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227135"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238191"},{"key":"e_1_3_2_19_2","article-title":"CodeSearchNet challenge: Evaluating the state of semantic code search","volume":"1909","author":"Husain Hamel","year":"2019","unstructured":"Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, and Marc Brockschmidt. 2019. CodeSearchNet challenge: Evaluating the state of semantic code search. CoRR abs\/1909.09436 (2019). arXiv:1909.09436http:\/\/arxiv.org\/abs\/1909.09436","journal-title":"CoRR"},{"key":"e_1_3_2_20_2","first-page":"54","volume-title":"Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI \u201921) (Proceedings of Machine Learning Research)","volume":"161","author":"Jiang Xue","year":"2021","unstructured":"Xue Jiang, Zhuoran Zheng, Chen Lyu, Liang Li, and Lei Lyu. 2021. TreeBERT: A tree-based pre-trained model for programming language. In Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI \u201921) (Proceedings of Machine Learning Research), Cassio P. de Campos, Marloes H. Maathuis, and Erik Quaeghebeur (Eds.), Vol. 161. AUAI Press, 54\u201363. https:\/\/proceedings.mlr.press\/v161\/jiang21a.html"},{"key":"e_1_3_2_21_2","first-page":"5110","volume-title":"Proceedings of the 37th International Conference on Machine Learning (ICML \u201920) (Proceedings of Machine Learning Research)","volume":"119","author":"Kanade Aditya","year":"2020","unstructured":"Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, and Kensen Shi. 2020. Learning and evaluating contextual embedding of source code. In Proceedings of the 37th International Conference on Machine Learning (ICML \u201920) (Proceedings of Machine Learning Research), Vol. 119. PMLR, 5110\u20135121. http:\/\/proceedings.mlr.press\/v119\/kanade20a.html"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.275"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2203.07814"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3506696"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3324884.3416591"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238216"},{"key":"e_1_3_2_27_2","article-title":"RoBERTa: A robustly optimized BERT pretraining approach","volume":"1907","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs\/1907.11692 (2019). arXiv:1907.11692http:\/\/arxiv.org\/abs\/1907.11692","journal-title":"CoRR"},{"key":"e_1_3_2_28_2","volume-title":"Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks \u201921)","author":"Lu Shuai","year":"2021","unstructured":"Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin B. Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, and Shujie Liu. 2021. CodeXGLUE: A machine learning benchmark dataset for code understanding and generation. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks \u201921), Joaquin Vanschoren and Sai-Kit Yeung (Eds.). https:\/\/datasets-benchmarks-proceedings.neurips.cc\/paper\/2021\/hash\/c16a5320fa475530d9583c34fd356ef5-Abstract-round1.html"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2019.00099"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950333"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2015.336"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227205"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00109"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2306.06620"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2015.109"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2022\/775"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2022\/775"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510096"},{"key":"e_1_3_2_39_2","article-title":"Revisiting, benchmarking and exploring API recommendation: How far are we?","volume":"2112","author":"Peng Yun","year":"2021","unstructured":"Yun Peng, Shuqing Li, Wenwei Gu, Yichen Li, Wenxuan Wang, Cuiyun Gao, and Michael R. Lyu. 2021. Revisiting, benchmarking and exploring API recommendation: How far are we? CoRR abs\/2112.12653 (2021). arXiv:2112.12653https:\/\/arxiv.org\/abs\/2112.12653","journal-title":"CoRR"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/2744200"},{"issue":"8","key":"e_1_3_2_41_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI blog 1, 8 (2019), 9.","journal-title":"OpenAI blog"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/SANER.2016.80"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/2594291.2594321"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-NIER.2019.00016"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3417058"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330699"},{"key":"e_1_3_2_47_2","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5998\u20136008. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSR.2013.6624045"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.685"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510159"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.18293\/SEKE2018-193"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227136"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-03013-0_15"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2021.3053111"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3632745","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3632745","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:51:04Z","timestamp":1750287064000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3632745"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,15]]},"references-count":53,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,3,31]]}},"alternative-id":["10.1145\/3632745"],"URL":"https:\/\/doi.org\/10.1145\/3632745","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,15]]},"assertion":[{"value":"2023-05-23","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-10-26","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-03-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}