{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,13]],"date-time":"2026-06-13T16:10:00Z","timestamp":1781367000213,"version":"3.54.1"},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,12,22]],"date-time":"2023-12-22T00:00:00Z","timestamp":1703203200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Natural Sciences and Engineering Research Council of Canada (NSERC) and the National Natural Science Foundation of China","award":["62272445 and 62332001"],"award-info":[{"award-number":["62272445 and 62332001"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2024,2,29]]},"abstract":"<jats:p>\n            Machine learning (ML) has been increasingly used in a variety of domains, while solving ML programming tasks poses unique challenges due to the fundamental difference in the nature and the construct of general programming tasks, especially for developers who do not have ML backgrounds. Automatic code generation that produces a code snippet from a natural language description can be a promising technique to accelerate ML programming tasks. In recent years, although many deep learning-based neural code generation models have been proposed with high accuracy, the fact that most of them are mainly evaluated on general programming tasks calls into question their effectiveness and usefulness in ML programming tasks. In this article, we set out to investigate the effectiveness of existing neural code generation models on ML programming tasks. 
For our analysis, we select six state-of-the-art neural code generation models and evaluate their performance on four widely used ML libraries, with newly created 83K pairs of natural-language described ML programming tasks. Our empirical study reveals some good, bad, and missing aspects of neural code generation models on ML tasks, with a few major ones listed below. (\n            <jats:bold>Good<\/jats:bold>\n            ) Neural code generation models perform significantly better on ML tasks than on non-ML tasks with an average difference of 10.6 points in BLEU-4 scores. (\n            <jats:bold>Bad<\/jats:bold>\n            ) More than 80% of the generated code is semantically incorrect. (\n            <jats:bold>Bad<\/jats:bold>\n            ) Code generation models do not have significance in improving developers\u2019 completion time. (\n            <jats:bold>Good<\/jats:bold>\n            ) The generated code can help developers write correct code by providing developers with clues for using correct APIs. 
(\n            <jats:bold>Missing<\/jats:bold>\n            ) The observation from our user study reveals the missing aspects of code generation for ML tasks, e.g., decomposing code generation for divide-and-conquer into API sequence identification and API usage generation.\n          <\/jats:p>","DOI":"10.1145\/3630009","type":"journal-article","created":{"date-parts":[[2023,10,23]],"date-time":"2023-10-23T21:21:10Z","timestamp":1698096070000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["The Good, the Bad, and the Missing: Neural Code Generation for Machine Learning Tasks"],"prefix":"10.1145","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8829-3773","authenticated-orcid":false,"given":"Jiho","family":"Shin","sequence":"first","affiliation":[{"name":"York University, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1659-1960","authenticated-orcid":false,"given":"Moshi","family":"Wei","sequence":"additional","affiliation":[{"name":"York University, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9941-6713","authenticated-orcid":false,"given":"Junjie","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Software, Chinese Academy of Sciences, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1476-7213","authenticated-orcid":false,"given":"Lin","family":"Shi","sequence":"additional","affiliation":[{"name":"School of Software, Beihang University, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0617-2877","authenticated-orcid":false,"given":"Song","family":"Wang","sequence":"additional","affiliation":[{"name":"York University, 
Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,12,22]]},"reference":[{"key":"e_1_3_2_2_2","article-title":"Juice: A large scale distantly supervised dataset for open domain context-based code generation","author":"Agashe Rajas","year":"2019","unstructured":"Rajas Agashe, Srinivasan Iyer, and Luke Zettlemoyer. 2019. Juice: A large scale distantly supervised dataset for open domain context-based code generation. Retrieved from https:\/\/arXiv:1910.02216","journal-title":"Retrieved from https:\/\/arXiv:1910.02216"},{"key":"e_1_3_2_3_2","first-page":"2655","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Ahmad Wasi","year":"2021","unstructured":"Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang. 2021. Unified pre-training for program understanding and generation. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2655\u20132668."},{"key":"e_1_3_2_4_2","article-title":"AVATAR: A parallel corpus for Java-Python program translation","author":"Ahmad Wasi Uddin","year":"2021","unstructured":"Wasi Uddin Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, and Kai-Wei Chang. 2021. AVATAR: A parallel corpus for Java-Python program translation. Retrieved from https:\/\/arXiv:2108.11590","journal-title":"Retrieved from https:\/\/arXiv:2108.11590"},{"key":"e_1_3_2_5_2","first-page":"143","volume-title":"Proceedings of the ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software","author":"Allamanis Miltiadis","year":"2019","unstructured":"Miltiadis Allamanis. 2019. The adverse effects of code duplication in machine learning models of code. 
In Proceedings of the ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software. 143\u2013153."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.5555\/2487085.2487127"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-SEIP.2019.00042"},{"key":"e_1_3_2_8_2","first-page":"65","volume-title":"Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization","author":"Banerjee Satanjeev","year":"2005","unstructured":"Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization. 65\u201372."},{"key":"e_1_3_2_9_2","unstructured":"Sid Black Leo Gao Phil Wang Connor Leahy and Stella Rose Biderman. 2021. GPT-Neo: Large scale autoregressive language modeling with Mesh-tensorflow. https:\/\/api.semanticscholar.org\/CorpusID:245758737"},{"key":"e_1_3_2_10_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et\u00a0al. 2020. Language models are few-shot learners. 
Advances in Neural Information Processing Systems 33 (2020), 1877\u20131901.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462840"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/2393596.2393606"},{"issue":"1","key":"e_1_3_2_13_2","first-page":"1","article-title":"\u201cMore Than Deep Learning\u201d: Post-processing for API sequence recommendation","volume":"27","author":"Chen Chi","year":"2022","unstructured":"Chi Chen, Xin Peng, Bihuan Chen, Jun Sun, Zhenchang Xing, Xin Wang, and Wenyun Zhao. 2022. \u201cMore Than Deep Learning\u201d: Post-processing for API sequence recommendation. Empir. Softw. Eng. 27, 1 (2022), 1\u201332.","journal-title":"Empir. Softw. Eng."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2021.3074309"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3468611"},{"key":"e_1_3_2_16_2","first-page":"495","article-title":"Statistical power analysis for the behavioral sciences","volume":"2","author":"Cohen J.","year":"1988","unstructured":"J. Cohen. 1988. Statistical power analysis for the behavioral sciences. Stat. Power Anal. Behav. Sci. 2 (1988), 495.","journal-title":"Stat. Power Anal. Behav. Sci."},{"key":"e_1_3_2_17_2","unstructured":"Anthony E. Cozzie and Samuel King. 2012. Macho: Writing programs with natural language and examples. (2012). http:\/\/hdl.handle.net\/2142\/33791"},{"key":"e_1_3_2_18_2","first-page":"4382","volume-title":"Proceedings of the Association for Computational Linguistics (ACL-IJCNLP\u201921)","author":"Dahal Samip","year":"2021","unstructured":"Samip Dahal, Adyasha Maharana, and Mohit Bansal. 2021. Analysis of tree-structured architectures for code generation. In Proceedings of the Association for Computational Linguistics (ACL-IJCNLP\u201921). 
4382\u20134391."},{"key":"e_1_3_2_19_2","article-title":"Bert: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. Retrieved from https:\/\/arXiv:1810.04805","journal-title":"Retrieved from https:\/\/arXiv:1810.04805"},{"key":"e_1_3_2_20_2","article-title":"Language to logical form with neural attention","author":"Dong Li","year":"2016","unstructured":"Li Dong and Mirella Lapata. 2016. Language to logical form with neural attention. Retrieved from https:\/\/arXiv:1601.01280","journal-title":"Retrieved from https:\/\/arXiv:1601.01280"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1068"},{"key":"e_1_3_2_22_2","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1613\/jair.5477","article-title":"Survey of the state of the art in natural language generation: Core tasks, applications and evaluation","volume":"61","author":"Gatt Albert","year":"2018","unstructured":"Albert Gatt and Emiel Krahmer. 2018. Survey of the state of the art in natural language generation: Core tasks, applications and evaluation. J. Artific. Intell. Res. 61 (2018), 65\u2013170.","journal-title":"J. Artific. Intell. Res."},{"key":"e_1_3_2_23_2","first-page":"933","volume-title":"Proceedings of the IEEE\/ACM 40th International Conference on Software Engineering (ICSE\u201918)","author":"Gu Xiaodong","year":"2018","unstructured":"Xiaodong Gu, Hongyu Zhang, and Sunghun Kim. 2018. Deep code search. In Proceedings of the IEEE\/ACM 40th International Conference on Software Engineering (ICSE\u201918). 
IEEE, 933\u2013944."},{"key":"e_1_3_2_24_2","article-title":"Graphcodebert: Pre-training code representations with data flow","author":"Guo Daya","year":"2020","unstructured":"Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, et\u00a0al. 2020. Graphcodebert: Pre-training code representations with data flow. Retrieved from https:\/\/arXiv:2009.08366","journal-title":"Retrieved from https:\/\/arXiv:2009.08366"},{"key":"e_1_3_2_25_2","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Hayati Shirley Anugrah","year":"2018","unstructured":"Shirley Anugrah Hayati, Raphael Olivier, Pravalika Avvaru, Pengcheng Yin, Anthony Tomasic, and Graham Neubig. 2018. Retrieval-based neural code generation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2005.50"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238191"},{"key":"e_1_3_2_28_2","first-page":"5076","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Jiang Hui","year":"2021","unstructured":"Hui Jiang, Chulun Zhou, Fandong Meng, Biao Zhang, Jie Zhou, Degen Huang, Qingqiang Wu, and Jinsong Su. 2021. Exploring dynamic selection of branch expansion orders for code generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 5076\u20135085."},{"key":"e_1_3_2_29_2","article-title":"Exploring the limits of language modeling","author":"Jozefowicz Rafal","year":"2016","unstructured":"Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. 2016. 
Exploring the limits of language modeling. Retrieved from https:\/\/arXiv:1602.02410","journal-title":"Retrieved from https:\/\/arXiv:1602.02410"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/SAI.2014.6918213"},{"key":"e_1_3_2_31_2","first-page":"3931","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT\u201919)","author":"LeClair Alex","year":"2019","unstructured":"Alex LeClair and Collin McMillan. 2019. Recommendations for datasets for source code summarization. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT\u201919). 3931\u20133937."},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"e_1_3_2_33_2","first-page":"74","volume-title":"Text Summarization Branches Out","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text Summarization Branches Out. 74\u201381."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1057"},{"key":"e_1_3_2_35_2","unstructured":"Chang Liu Xin Wang Richard Shin Joseph E. Gonzalez and Dawn Song. 2017. Neural Code Completion. https:\/\/openreview.net\/forum?id=rJbPBt9lg"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2020.3018481"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238216"},{"key":"e_1_3_2_38_2","article-title":"Codexglue: A machine learning benchmark dataset for code understanding and generation","author":"Lu Shuai","year":"2021","unstructured":"Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, et\u00a0al. 2021. Codexglue: A machine learning benchmark dataset for code understanding and generation. 
Retrieved from https:\/\/arXiv:2102.04664","journal-title":"Retrieved from https:\/\/arXiv:2102.04664"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-09968-2"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00041"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/1985793.1985809"},{"key":"e_1_3_2_42_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Murali Vijayaraghavan","year":"2018","unstructured":"Vijayaraghavan Murali, Letao Qi, Swarat Chaudhuri, and Chris Jermaine. 2018. Neural sketch learning for conditional program generation. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1108\/LHT-02-2022-0103"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-018-09679-z"},{"key":"e_1_3_2_45_2","first-page":"776","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)","author":"Norouzi Sajad","year":"2021","unstructured":"Sajad Norouzi, Keyi Tang, and Yanshuai Cao. 2021. Code generation from natural language with less prior knowledge and more monolingual data. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 776\u2013785."},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2015.36"},{"key":"e_1_3_2_47_2","first-page":"311","volume-title":"Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. 
Bleu: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 311\u2013318."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2022.3197063"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1105"},{"issue":"140","key":"e_1_3_2_50_2","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer.","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu, et\u00a0al. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21, 140 (2020), 1\u201367.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_51_2","first-page":"357","volume-title":"Proceedings of the IEEE\/ACM 38th International Conference on Software Engineering (ICSE \u201916)","author":"Raghothaman Mukund","year":"2016","unstructured":"Mukund Raghothaman, Yi Wei, and Youssef Hamadi. 2016. Swim: Synthesizing what i mean-code search and idiomatic snippet synthesis. In Proceedings of the IEEE\/ACM 38th International Conference on Software Engineering (ICSE \u201916). IEEE, 357\u2013367."},{"key":"e_1_3_2_52_2","first-page":"349","volume-title":"Proceedings of the IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER\u201916)","volume":"1","author":"Rahman Mohammad Masudur","year":"2016","unstructured":"Mohammad Masudur Rahman, Chanchal K. Roy, and David Lo. 2016. Rack: Automatic api recommendation using crowdsourced knowledge. In Proceedings of the IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER\u201916), Vol. 1. 
IEEE, 349\u2013359."},{"key":"e_1_3_2_53_2","article-title":"Codebleu: A method for automatic evaluation of code synthesis","author":"Ren Shuo","year":"2020","unstructured":"Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundaresan, Ming Zhou, Ambrosio Blanco, and Shuai Ma. 2020. Codebleu: A method for automatic evaluation of code synthesis. Retrieved from https:\/\/arXiv:2009.10297","journal-title":"Retrieved from https:\/\/arXiv:2009.10297"},{"key":"e_1_3_2_54_2","first-page":"7055","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"33","author":"Sun Zeyu","year":"2019","unstructured":"Zeyu Sun, Qihao Zhu, Lili Mou, Yingfei Xiong, Ge Li, and Lu Zhang. 2019. A grammar-based structural cnn decoder for code generation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 7055\u20137062."},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6430"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2020.106277"},{"key":"e_1_3_2_57_2","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems. 5998\u20136008."},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00138"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3549124"},{"key":"e_1_3_2_60_2","doi-asserted-by":"crossref","unstructured":"F. Wilcoxon. 1945. Individual Comparisons by Ranking Methods. Biom. Bull. 
1 (1945) 80\u201383.","DOI":"10.2307\/3001968"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409731"},{"key":"e_1_3_2_62_2","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Xu Frank F.","year":"2020","unstructured":"Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, and Graham Neubig. 2020. Incorporating external knowledge through pre-training for natural language to code generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics."},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186081"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3196398.3196408"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1041"},{"key":"e_1_3_2_66_2","first-page":"7","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Yin Pengcheng","year":"2018","unstructured":"Pengcheng Yin and Graham Neubig. 2018. TRANX: A transition-based neural abstract syntax parser for semantic parsing and code generation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 7\u201312."},{"key":"e_1_3_2_67_2","article-title":"CERT: Continual pre-training on sketches for library-oriented code generation","author":"Zan Daoguang","year":"2022","unstructured":"Daoguang Zan, Bei Chen, Dejian Yang, Zeqi Lin, Minsu Kim, Bei Guan, Yongji Wang, Weizhu Chen, and Jian-Guang Lou. 2022. CERT: Continual pre-training on sketches for library-oriented code generation. 
Retrieved from https:\/\/arXiv:2206.06888","journal-title":"Retrieved from https:\/\/arXiv:2206.06888"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2019.2962027"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3630009","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3630009","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T23:57:00Z","timestamp":1750291020000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3630009"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,22]]},"references-count":67,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,2,29]]}},"alternative-id":["10.1145\/3630009"],"URL":"https:\/\/doi.org\/10.1145\/3630009","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,22]]},"assertion":[{"value":"2022-11-22","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-10-06","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}