{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T21:44:35Z","timestamp":1774129475417,"version":"3.50.1"},"reference-count":145,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T00:00:00Z","timestamp":1748390400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"SNF project A-Test Autonomic Software Testing","award":["200021_215487"],"award-info":[{"award-number":["200021_215487"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,6,30]]},"abstract":"<jats:p>Artificial intelligence and recent advances in deep learning architectures, including transformer networks and large language models, change the way people think and act to solve problems. Software engineering, as an increasingly complex process to design, develop, test, deploy, and maintain large-scale software systems for solving real-world challenges, is profoundly affected by many revolutionary artificial intelligence tools in general and machine learning in particular. In this roadmap for artificial intelligence in software engineering, we highlight the recent deep impact of artificial intelligence on software engineering by discussing successful stories of applications of artificial intelligence to classic and new software development challenges. 
We identify the new challenges that the software engineering community has to address in the coming years to successfully apply artificial intelligence in software engineering, and we share our research roadmap toward the effective use of artificial intelligence in the software engineering profession, while still protecting fundamental human values.<\/jats:p>\n          <jats:p>We spotlight three main areas that challenge the research in software engineering: the use of generative artificial intelligence and large language models for engineering large software systems, the need of large and unbiased datasets and benchmarks for training and evaluating deep learning and large language models for software engineering, and the need of a new code of digital ethics to apply artificial intelligence in software engineering.<\/jats:p>","DOI":"10.1145\/3719006","type":"journal-article","created":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T14:41:21Z","timestamp":1744987281000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Artificial Intelligence for Software Engineering: The Journey So Far and the Road Ahead"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8221-5352","authenticated-orcid":false,"given":"Iftekhar","family":"Ahmed","sequence":"first","affiliation":[{"name":"University of California, Irvine, Irvine, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1716-690X","authenticated-orcid":false,"given":"Aldeida","family":"Aleti","sequence":"additional","affiliation":[{"name":"Monash University, Clayton, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5224-9970","authenticated-orcid":false,"given":"Haipeng","family":"Cai","sequence":"additional","affiliation":[{"name":"University at Buffalo, The State University of New York, New York, New York, 
USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5381-8418","authenticated-orcid":false,"given":"Alexander","family":"Chatzigeorgiou","sequence":"additional","affiliation":[{"name":"University of Macedonia, Thessaloniki, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3377-8129","authenticated-orcid":false,"given":"Pinjia","family":"He","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0093-3292","authenticated-orcid":false,"given":"Xing","family":"Hu","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5193-7379","authenticated-orcid":false,"given":"Mauro","family":"Pezz\u00e8","sequence":"additional","affiliation":[{"name":"Universit\u00e0 della Svizzera Italiana, Lugano, Switzerland, Universit\u00e0 degli Studi di Milano-Bicocca, Milan, Italy, and Constructor University, Schaffhausen, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5626-7586","authenticated-orcid":false,"given":"Denys","family":"Poshyvanyk","sequence":"additional","affiliation":[{"name":"William and Mary, Williamsburg, Virginia, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6302-3256","authenticated-orcid":false,"given":"Xin","family":"Xia","sequence":"additional","affiliation":[{"name":"Huawei Technologies Co., Ltd, Shenzhen, China"}]}],"member":"320","published-online":{"date-parts":[[2025,5,28]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"EhsanLab - Your Software Testing Partner. 2023. 7-reasons-why-software-testing-has-brighter-future-than-development. Retrieved May 26 2023 from https:\/\/www.linkedin.com\/pulse\/7-reasons-why-software-testing-has-brighter-future-than-development-mz12f?trk=article-ssr-frontend-pulse_more-articles_related-content-card"},{"key":"e_1_3_2_3_2","unstructured":"Tom Dotan and Deepa Seetharaman. 2023. 
Big tech struggles to turn AI hype into profits Microsoft Google and others experiment with how to produce market and charge for new tools. The Wall Street Journal."},{"key":"e_1_3_2_4_2","unstructured":"Kasper Groes Albin Ludvigsen. 2023. The carbon footprint of GPT-4. Retrieved April 30 2015 from https:\/\/towardsdatascience.com\/the-carbon-footprint-of-gpt-4-d6c676eb21ae"},{"key":"e_1_3_2_5_2","unstructured":"Amazon Web Services. 2023. Helping developers around the world improve productivity with AI. Retrieved May 26 2023 from https:\/\/aws.amazon.com\/careers\/life-at-aws-impactful-work-helping-developers-around-the-world-improve-productivity\/"},{"key":"e_1_3_2_6_2","unstructured":"PEGA. 2023. What consumers really think about AI: A global study. Retrieved April 30 2025 from https:\/\/www.pega.com\/ai-survey"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","first-page":"1737","DOI":"10.1109\/ICSE48619.2023.00149","volume-title":"Proceedings of the 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE)","author":"Ahmed Toufique","year":"2023","unstructured":"Toufique Ahmed, Supriyo Ghosh, Chetan Bansal, Thomas Zimmermann, Xuchao Zhang, and Saravan Rajmohan. 2023. Recommending root-cause and mitigation steps for cloud incidents using large language models. In Proceedings of the 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). IEEE, 1737\u20131749."},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","unstructured":"Saranya Alagarsamy Chakkrit Tantithamthavorn and Aldeida Aleti. 2023. A3Test: Assertion-augmented automated test case generation. arXiv:2302.10352. 
Retrieved from https:\/\/arxiv.org\/abs\/2302.10352","DOI":"10.2139\/ssrn.4724885"},{"key":"e_1_3_2_9_2","doi-asserted-by":"crossref","first-page":"107565","DOI":"10.1016\/j.infsof.2024.107565","article-title":"A3test: Assertion-augmented automated test case generation","volume":"176","author":"Alagarsamy Saranya","year":"2024","unstructured":"Saranya Alagarsamy, Chakkrit Tantithamthavorn, and Aldeida Aleti. 2024. A3test: Assertion-augmented automated test case generation. Information and Software Technology 176 (2024), 107565.","journal-title":"Information and Software Technology"},{"key":"e_1_3_2_10_2","doi-asserted-by":"crossref","first-page":"1536","DOI":"10.1007\/s10664-019-09759-w","article-title":"Code localization in programming screencasts","volume":"25","author":"Alahmadi Mohammad","year":"2020","unstructured":"Mohammad Alahmadi, Abdulkarim Khormi, Biswas Parajuli, Jonathan Hassel, Sonia Haiduc, and Piyush Kumar. 2020. Code localization in programming screencasts. Empirical Software Engineering 25 (2020), 1536\u20131572.","journal-title":"Empirical Software Engineering"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2019.106214"},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","unstructured":"Chetan Arora John Grundy and Mohamed Abdelrazek. 2023. Advancing requirements engineering through generative AI: Assessing the role of LLMS. arXiv:2310.13976. Retrieved from https:\/\/arxiv.org\/abs\/2310.13976","DOI":"10.1007\/978-3-031-55642-5_6"},{"key":"e_1_3_2_13_2","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1109\/MODELS-C59198.2023.00096","volume-title":"Proceedings of the 2023 ACM\/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C)","author":"Arulmohan Sathurshan","year":"2023","unstructured":"Sathurshan Arulmohan, Marie-Jean Meurs, and S\u00e9bastien Mosser. 2023. Extracting domain models from textual requirements in the era of large language models. 
In Proceedings of the 2023 ACM\/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C). IEEE, 580\u2013587."},{"key":"e_1_3_2_14_2","unstructured":"Jacob Austin Augustus Odena Maxwell Nye Maarten Bosma Henryk Michalewski David Dohan Ellen Jiang Carrie Cai Michael Terry Quoc Le et al. 2021. Program synthesis with large language models. arXiv:2108.07732. Retrieved from https:\/\/arxiv.org\/abs\/2108.07732"},{"issue":"5","key":"e_1_3_2_15_2","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1109\/TSE.2014.2372785","article-title":"The oracle problem in software testing: A survey","volume":"41","author":"Barr Earl T.","year":"2014","unstructured":"Earl T. Barr, Mark Harman, Phil McMinn, Muzammil Shahbaz, and Shin Yoo. 2014. The oracle problem in software testing: A survey. IEEE Transactions on Software Engineering 41, 5 (2014), 507\u2013525.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_16_2","first-page":"500","volume-title":"Proceedings of the 54th ACM Technical Symposium on Computer Science Education (SIGCSE \u201923)","author":"Becker Brett A.","year":"2023","unstructured":"Brett A. Becker, Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather, and Eddie Antonio Santos. 2023. Programming is hard - Or at least it used to be: Educational opportunities and challenges of AI code generation. In Proceedings of the 54th ACM Technical Symposium on Computer Science Education (SIGCSE \u201923). ACM, New York, NY, 500\u2013506. DOI: 10.1145\/3545945.3569759"},{"issue":"100","key":"e_1_3_2_17_2","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/epolic\/eiaa001","article-title":"Automation and jobs: When technology boosts employment","volume":"34","author":"Bessen James","year":"2019","unstructured":"James Bessen. 2019. Automation and jobs: When technology boosts employment. 
Economic Policy 34, 100 (2019), 589\u2013626.","journal-title":"Economic Policy"},{"issue":"3","key":"e_1_3_2_18_2","doi-asserted-by":"crossref","first-page":"781","DOI":"10.1007\/s10270-023-01105-5","article-title":"On the assessment of generative AI in modeling tasks: An experience report with ChatGPT and UML","volume":"22","author":"C\u00e1mara Javier","year":"2023","unstructured":"Javier C\u00e1mara, Javier Troya, Lola Burgue\u00f1o, and Antonio Vallecillo. 2023. On the assessment of generative AI in modeling tasks: An experience report with ChatGPT and UML. Software and Systems Modeling 22, 3 (2023), 781\u2013793.","journal-title":"Software and Systems Modeling"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-018-9669-7"},{"key":"e_1_3_2_20_2","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1145\/3180155.3180240","volume-title":"Proceedings of the 40th International Conference on Software Engineering","author":"Chen Chunyang","year":"2018","unstructured":"Chunyang Chen, Ting Su, Guozhu Meng, Zhenchang Xing, and Yang Liu. 2018. From UI design image to GUI skeleton: A neural machine translator to bootstrap mobile GUI implementation. In Proceedings of the 40th International Conference on Software Engineering, 665\u2013676."},{"key":"e_1_3_2_21_2","first-page":"364","volume-title":"Proceedings of the 2019 34th IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Chen Junjie","year":"2019","unstructured":"Junjie Chen, Xiaoting He, Qingwei Lin, Hongyu Zhang, Dan Hao, Feng Gao, Zhangwei Xu, Yingnong Dang, and Dongmei Zhang. 2019. Continuous incident triage for large-scale online service systems. In Proceedings of the 2019 34th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 
IEEE, 364\u2013375."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman et al. 2021. Evaluating large language models trained on code. arXiv:2107.03374. DOI: 10.48550\/arXiv.2107.03374","DOI":"10.48550\/arXiv.2107.03374"},{"key":"e_1_3_2_23_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Chen Xinyun","year":"2018","unstructured":"Xinyun Chen, Chang Liu, and Dawn Song. 2018. Execution-guided neural program synthesis. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_24_2","unstructured":"Yinfang Chen Huaibing Xie Minghua Ma Yu Kang Xin Gao Liu Shi Yunjie Cao Xuedong Gao Hao Fan Ming Wen et al. 2023. Empowering practical root cause analysis by large language models for cloud incidents. arXiv:2305.15778. Retrieved from https:\/\/arxiv.org\/abs\/2305.15778"},{"issue":"2","key":"e_1_3_2_25_2","doi-asserted-by":"crossref","first-page":"626","DOI":"10.1109\/TR.2021.3052510","article-title":"Generative adversarial networks-based imbalance learning in software aging-related bug prediction","volume":"70","author":"Singh Chouhan Satyendra","year":"2021","unstructured":"Satyendra Singh Chouhan and Santosh Singh Rathore. 2021. Generative adversarial networks-based imbalance learning in software aging-related bug prediction. IEEE Transactions on Reliability 70, 2 (2021), 626\u2013642.","journal-title":"IEEE Transactions on Reliability"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","unstructured":"Matteo Ciniselli Nathan Cooper Luca Pascarella Antonio Mastropaolo Emad Aghajani Denys Poshyvanyk Massimiliano Di Penta and Gabriele Bavota. 2021. An empirical study on the usage of transformer models for code completion. arXiv:2108.01585. 
DOI: 10.1109\/TSE.2021.3128234","DOI":"10.1109\/TSE.2021.3128234"},{"key":"e_1_3_2_27_2","unstructured":"J. Constantz. 2023. Nearly a third of white-collar workers have tried ChatGPT or other AI programs according to a new survey. Time Magazine. Retrieved from https:\/\/time.com\/6248707\/survey-chatgpt-ai-use-at-work\/"},{"key":"e_1_3_2_28_2","first-page":"121","volume-title":"Proceedings of the 45th International Conference on Software Engineering (ICSE \u201923)","author":"Croft Roland","year":"2023","unstructured":"Roland Croft, M. Ali Babar, and M. Mehdi Kholoosi. 2023. Data quality for software vulnerability datasets. In Proceedings of the 45th International Conference on Software Engineering (ICSE \u201923). IEEE Press, 121\u2013133. DOI: 10.1109\/ICSE48619.2023.00022"},{"key":"e_1_3_2_29_2","doi-asserted-by":"crossref","first-page":"111734","DOI":"10.1016\/j.jss.2023.111734","article-title":"Github copilot AI pair programmer: Asset or liability?","volume":"203","author":"Dakhel Arghavan Moradi","year":"2023","unstructured":"Arghavan Moradi Dakhel, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, and Zhen Ming Jack Jiang. 2023. Github copilot AI pair programmer: Asset or liability? Journal of Systems and Software 203 (2023), 111734.","journal-title":"Journal of Systems and Software"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2023.3327583"},{"key":"e_1_3_2_31_2","unstructured":"Ankur Desai and Atul Deo. 2022. Introducing Amazon CodeWhisperer the ML-powered coding companion. Retrieved from https:\/\/aws.amazon.com\/blogs\/machine-learning\/introducing-amazon-codewhisperer-the-ml-powered-coding-companion\/"},{"key":"e_1_3_2_32_2","unstructured":"Prem Devanbu Matthew Dwyer Sebastian Elbaum Michael Lowry Kevin Moran Denys Poshyvanyk Baishakhi Ray Rishabh Singh and Xiangyu Zhang. 2020. Deep learning & software engineering: State of research and future directions. arXiv:2009.08525. 
Retrieved from https:\/\/arxiv.org\/abs\/2009.08525"},{"key":"e_1_3_2_33_2","volume-title":"Proceedings of the 44th International Conference on Software Engineering (ICSE \u201922)","author":"Dinella Elizabeth","year":"2022","unstructured":"Elizabeth Dinella, Gabriel Ryan, Todd Mytkowicz, and Shuvendu Lahiri. 2022. TOGA: A neural method for test oracle generation. In Proceedings of the 44th International Conference on Software Engineering (ICSE \u201922). ACM. Retrieved from https:\/\/www.microsoft.com\/en-us\/research\/publication\/toga-a-neural-method-for-test-oracle-generation\/"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/MS.2023.3265877"},{"key":"e_1_3_2_35_2","unstructured":"European Commission. 2016. Regulation (EU) 2016\/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data and repealing Directive 95\/46\/EC (General Data Protection Regulation) (Text with EEA relevance). Retrieved from https:\/\/eur-lex.europa.eu\/eli\/reg\/2016\/679\/oj"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","unstructured":"Angela Fan Beliz Gokkaya Mark Harman Mitya Lyubarskiy Shubho Sengupta Shin Yoo and Jie M. Zhang. 2023. Large language models for software engineering: Survey and open problems. arXiv:2310.03533. Retrieved from 10.48550\/arXiv.2310.03533","DOI":"10.48550\/arXiv.2310.03533"},{"key":"e_1_3_2_37_2","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1016\/j.techfore.2016.08.019","article-title":"The future of employment: How susceptible are jobs to computerisation?","volume":"114","author":"Frey Carl Benedikt","year":"2017","unstructured":"Carl Benedikt Frey and Michael A. Osborne. 2017. The future of employment: How susceptible are jobs to computerisation? 
Technological Forecasting and Social Change 114 (2017), 254\u2013280.","journal-title":"Technological Forecasting and Social Change"},{"key":"e_1_3_2_38_2","first-page":"761","volume-title":"Proceedings of the 2023 38th IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Gao S.","year":"2023","unstructured":"S. Gao, X. Wen, C. Gao, W. Wang, H. Zhang, and M. R. Lyu. 2023. What makes good in-context demonstrations for code intelligence tasks with LLMs? In Proceedings of the 2023 38th IEEE\/ACM International Conference on Automated Software Engineering (ASE). IEEE Computer Society, Los Alamitos, CA, 761\u2013773. DOI: 10.1109\/ASE56229.2023.00109"},{"key":"e_1_3_2_39_2","first-page":"13","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924)","author":"Geng Mingyang","year":"2024","unstructured":"Mingyang Geng, Shangwen Wang, Dezun Dong, Haotian Wang, Ge Li, Zhi Jin, Xiaoguang Mao, and Xiangke Liao. 2024. Large language models are few-shot summarizers: Multi-intent comment generation via in-context learning. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924). ACM, New York, NY, Article 39, 13 pages. DOI: 10.1145\/3597503.3608134"},{"key":"e_1_3_2_40_2","doi-asserted-by":"crossref","first-page":"746","DOI":"10.1145\/3324884.3416546","volume-title":"Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering","author":"Gros David","year":"2020","unstructured":"David Gros, Hariharan Sezhiyan, Prem Devanbu, and Zhou Yu. 2020. Code to comment \u201ctranslation\u201d: Data, metrics, baselining & evaluation. 
In Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering, 746\u2013757."},{"key":"e_1_3_2_41_2","first-page":"1296","volume-title":"Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE)","author":"Gu Jiazhen","year":"2020","unstructured":"Jiazhen Gu, Jiaqi Wen, Zijian Wang, Pu Zhao, Chuan Luo, Yu Kang, Yangfan Zhou, Li Yang, Jeffrey Sun, Zhangwei Xu, et al. 2020. Efficient customer incident triage via linking with system incidents. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE), 1296\u20131307."},{"key":"e_1_3_2_42_2","unstructured":"Guoxiang Guo Aldeida Aleti Neelofar Neelofar and Chakkrit Tantithamthavorn. 2024. MORTAR: Metamorphic multi-turn testing for LLM-based dialogue systems. arXiv:2412.15557. Retrieved from https:\/\/arxiv.org\/abs\/2412.15557"},{"key":"e_1_3_2_43_2","first-page":"13","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924)","author":"Guo Qi","year":"2024","unstructured":"Qi Guo, Junming Cao, Xiaofei Xie, Shangqing Liu, Xiaohong Li, Bihuan Chen, and Xin Peng. 2024. Exploring the potential of ChatGPT in automated code refinement: An empirical study. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE \u201924). ACM, New York, NY, Article 34, 13 pages. DOI: 10.1145\/3597503.3623306"},{"key":"e_1_3_2_44_2","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"31","author":"Gupta Rahul","year":"2017","unstructured":"Rahul Gupta, Soham Pal, Aditya Kanade, and Shirish Shevade. 2017. Deepfix: Fixing common c language errors by deep learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 
31."},{"key":"e_1_3_2_45_2","first-page":"1345","volume-title":"Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence","author":"Gupta Rahul","year":"2017","unstructured":"Rahul Gupta, Soham Pal, Aditya Kanade, and Shirish Shevade. 2017. DeepFix: Fixing common C language errors by deep learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. AAAI Press, San Francisco, California, 1345\u20131351."},{"key":"e_1_3_2_46_2","first-page":"1465","volume-title":"Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering","author":"He Shilin","year":"2022","unstructured":"Shilin He, Xu Zhang, Pinjia He, Yong Xu, Liqun Li, Yu Kang, Minghua Ma, Yining Wei, Yingnong Dang, Saravanakumar Rajmohan, et al. 2022. An empirical study of log analysis at Microsoft. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1465\u20131476."},{"key":"e_1_3_2_47_2","unstructured":"Dan Hendrycks Steven Basart Saurav Kadavath Mantas Mazeika Akul Arora Ethan Guo Collin Burns Samir Puranik Horace He Dawn Song et al. 2021. Measuring coding challenge competence with apps. arXiv:2105.09938. Retrieved from https:\/\/arxiv.org\/abs\/2105.09938"},{"key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"837","DOI":"10.1109\/ICSE.2012.6227135","volume-title":"Proceedings of the 2012 34th International Conference on Software Engineering (ICSE)","author":"Hindle Abram","year":"2012","unstructured":"Abram Hindle, Earl T. Barr, Zhendong Su, Mark Gabel, and Premkumar Devanbu. 2012. On the naturalness of software. In Proceedings of the 2012 34th International Conference on Software Engineering (ICSE), 837\u2013847. 
DOI: 10.1109\/ICSE.2012.6227135"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","unstructured":"Xinyi Hou Yanjie Zhao Yue Liu Zhou Yang Kailong Wang Li Li Xiapu Luo David Lo John Grundy and Haoyu Wang. 2023. Large language models for software engineering: A systematic literature review. arXiv:2308.10620. Retrieved from 10.48550\/arXiv.2308.10620","DOI":"10.48550\/arXiv.2308.10620"},{"key":"e_1_3_2_50_2","unstructured":"Hui Huang Yingqi Qu Hongli Zhou Jing Liu Muyun Yang Bing Xu and Tiejun Zhao. 2024. On the limitations of fine-tuned judge models for LLM evaluation. arXiv:2403.02839v2. Retrieved from https:\/\/arxiv.org\/abs\/2403.02839v2"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2020.2982385"},{"key":"e_1_3_2_52_2","first-page":"1646","volume-title":"Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering","author":"Jin Matthew","year":"2023","unstructured":"Matthew Jin, Syed Shahriar, Michele Tufano, Xin Shi, Shuai Lu, Neel Sundaresan, and Alexey Svyatkovskiy. 2023. Inferfix: End-to-end program repair with LLMS. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1646\u20131656."},{"key":"e_1_3_2_53_2","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1145\/2610384.2628055","volume-title":"Proceedings of the 2014 International Symposium on Software Testing and Analysis (ISSTA \u201914)","author":"Just Ren\u00e9","year":"2014","unstructured":"Ren\u00e9 Just, Darioush Jalali, and Michael D. Ernst. 2014. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. In Proceedings of the 2014 International Symposium on Software Testing and Analysis (ISSTA \u201914). ACM, New York, NY, 437\u2013440. 
DOI: 10.1145\/2610384.2628055"},{"key":"e_1_3_2_54_2","first-page":"5110","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Kanade Aditya","year":"2020","unstructured":"Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, and Kensen Shi. 2020. Learning and evaluating contextual embedding of source code. In Proceedings of the International Conference on Machine Learning. PMLR, 5110\u20135121."},{"issue":"6","key":"e_1_3_2_55_2","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1007\/s10664-024-10533-w","article-title":"Impact of log parsing on deep learning-based anomaly detection","volume":"29","author":"Khan Zanis Ali","year":"2024","unstructured":"Zanis Ali Khan, Donghwan Shin, Domenico Bianculli, and Lionel C Briand. 2024. Impact of log parsing on deep learning-based anomaly detection. Empirical Software Engineering 29, 6 (2024), 139.","journal-title":"Empirical Software Engineering"},{"key":"e_1_3_2_56_2","unstructured":"Fatemeh Khayashi Behnaz Jamasb Reza Akbari and Pirooz Shamsinejadbabaki. 2022. Deep learning methods for software requirement classification: A performance study on the PURE dataset. arXiv:2211.05286. Retrieved from https:\/\/arxiv.org\/abs\/2211.05286"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","unstructured":"Nicholas Kroeger Dan Ley Satyapriya Krishna Chirag Agarwal and Himabindu Lakkaraju. 2023. Are large language models post hoc explainers? arXiv:2310.05797. Retrieved from 10.48550\/arXiv.2310.05797","DOI":"10.48550\/arXiv.2310.05797"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/3383458"},{"key":"e_1_3_2_59_2","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1145\/3387904.3389268","volume-title":"Proceedings of the 28th International Conference on Program Comprehension","author":"LeClair Alexander","year":"2020","unstructured":"Alexander LeClair, Sakib Haque, Lingfei Wu, and Collin McMillan. 2020. 
Improved code summarization via a graph neural network. In Proceedings of the 28th International Conference on Program Comprehension, 184\u2013195."},{"key":"e_1_3_2_60_2","unstructured":"Jian Li Yue Wang Michael R. Lyu and Irwin King. 2017. Code completion with neural attention and pointer networks. arXiv:1711.09573. Retrieved from https:\/\/arxiv.org\/abs\/1711.09573"},{"key":"e_1_3_2_61_2","first-page":"208","volume-title":"Proceedings of the 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME)","author":"Li Mingyang","year":"2020","unstructured":"Mingyang Li, Ye Yang, Lin Shi, Qing Wang, Jun Hu, Xinhua Peng, Weimin Liao, and Guizhen Pi. 2020. Automated extraction of requirement entities by leveraging LSTM-CRF and transfer learning. In Proceedings of the 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 208\u2013219."},{"key":"e_1_3_2_62_2","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1109\/ICSME46990.2020.00021","volume-title":"Proceedings of the 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME)","author":"Li Wei","year":"2020","unstructured":"Wei Li, Haozhe Qin, Shuhan Yan, Beijun Shen, and Yuting Chen. 2020. Learning code-query interaction for enhancing code searches. In Proceedings of the 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 115\u2013126."},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3236386.3241340"},{"key":"e_1_3_2_64_2","article-title":"Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation","volume":"36","author":"Liu Jiawei","year":"2024","unstructured":"Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2024. Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation. In Advances in Neural Information Processing Systems, Vol. 
36.","journal-title":"Advances in Neural Information Processing Systems, Vol"},{"key":"e_1_3_2_65_2","unstructured":"Matthew Lodge. 2021. Software testing is tedious. AI can help. Harvard Business Review (Feb. 2021). Retrieved from https:\/\/hbr.org\/2021\/02\/software-testing-is-tedious-ai-can-help"},{"key":"e_1_3_2_66_2","unstructured":"Shuai Lu Daya Guo Shuo Ren Junjie Huang Alexey Svyatkovskiy Ambrosio Blanco Colin Clement Dawn Drain Daxin Jiang Duyu Tang et al. 2021. Codexglue: A machine learning benchmark dataset for code understanding and generation. arXiv:2102.04664. Retrieved from https:\/\/arxiv.org\/abs\/2102.04664"},{"issue":"253","key":"e_1_3_2_67_2","first-page":"1","article-title":"Estimating the carbon footprint of bloom, a 176b parameter language model","volume":"24","author":"Luccioni Alexandra Sasha","year":"2023","unstructured":"Alexandra Sasha Luccioni, Sylvain Viguier, and Anne-Laure Ligozat. 2023. Estimating the carbon footprint of bloom, a 176b parameter language model. Journal of Machine Learning Research 24, 253 (2023), 1\u201315.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_68_2","unstructured":"Dipeeka Luitel Shabnam Hassani and Mehrdad Sabetzadeh. 2023. Improving requirements completeness: Automated assistance through large language models. arXiv:2308.03784. Retrieved from https:\/\/arxiv.org\/abs\/2308.03784"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2014.11.023"},{"key":"e_1_3_2_70_2","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1109\/ICSE.2019.00045","volume-title":"Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE)","author":"Malik Rabee Sohail","year":"2019","unstructured":"Rabee Sohail Malik, Jibesh Patra, and Michael Pradel. 2019. NL2Type: Inferring JavaScript function types from natural language information. In Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE). 
IEEE, 304\u2013315."},{"key":"e_1_3_2_71_2","unstructured":"Farhad Manjoo. 2023. It\u2019s the end of computer programming as we know it. (And I feel fine.). The New York Times. Retrieved from https:\/\/www.nytimes.com\/2023\/06\/02\/opinion\/ai-coding.html"},{"key":"e_1_3_2_72_2","unstructured":"Antonio Mastropaolo Nathan Cooper David Nader Palacio Simone Scalabrino Denys Poshyvanyk Rocco Oliveto and Gabriele Bavota. 2022. Using transfer learning for code-related tasks. arXiv:2206.08574. Retrieved from https:\/\/arxiv.org\/abs\/2206.08574"},{"key":"e_1_3_2_73_2","doi-asserted-by":"crossref","first-page":"98754","DOI":"10.1109\/ACCESS.2021.3095559","article-title":"Software defect prediction using ensemble learning: A systematic literature review","volume":"9","author":"Matloob Faseeha","year":"2021","unstructured":"Faseeha Matloob, Taher M. Ghazal, Nasser Taleb, Shabib Aftab, Munir Ahmad, Muhammad Adnan Khan, Sagheer Abbas, and Tariq Rahim Soomro. 2021. Software defect prediction using ensemble learning: A systematic literature review. IEEE Access 9 (2021), 98754\u201398771.","journal-title":"IEEE Access"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2018.07.007"},{"key":"e_1_3_2_75_2","first-page":"18","volume-title":"Proceedings of the 32nd USENIX Conference on Security Symposium (SEC \u201923)","author":"Mirsky Yisroel","year":"2023","unstructured":"Yisroel Mirsky, George Macon, Michael Brown, Carter Yagemann, Matthew Pruett, Evan Downing, Sukarno Mertoguno, and Wenke Lee. 2023. VulChecker: Graph-based vulnerability localization in source code. In Proceedings of the 32nd USENIX Conference on Security Symposium (SEC \u201923). 
USENIX Association, Article 367, 18 pages."},{"key":"e_1_3_2_76_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dsp.2017.10.011"},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2018.2844788"},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-022-10246-w"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2024.3379943"},{"key":"e_1_3_2_80_2","doi-asserted-by":"crossref","first-page":"1372","DOI":"10.1145\/3377811.3380926","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering","author":"Nguyen Son","year":"2020","unstructured":"Son Nguyen, Hung Phan, Trinh Le, and Tien N Nguyen. 2020. Suggesting natural method names to check name consistencies. In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering, 1372\u20131384."},{"key":"e_1_3_2_81_2","doi-asserted-by":"crossref","first-page":"110538","DOI":"10.1016\/j.jss.2020.110538","article-title":"Analyzing bug fix for automatic bug cause classification","volume":"163","author":"Ni Zhen","year":"2020","unstructured":"Zhen Ni, Bin Li, Xiaobing Sun, Tianhao Chen, Ben Tang, and Xinchen Shi. 2020. Analyzing bug fix for automatic bug cause classification. Journal of Systems and Software 163 (2020), 110538.","journal-title":"Journal of Systems and Software"},{"key":"e_1_3_2_82_2","volume-title":"Proceedings of the 2024 IEEE\/ACM 46th International Conference on Software Engineering (ICSE)","author":"Nong Yu","year":"2024","unstructured":"Yu Nong, Richard Fang, Guangbei Yi, Kunsong Zhao, Xiapu Luo, Feng Chen, and Haipeng Cai. 2024. VGX: Large-scale sample generation for boosting learning-based software vulnerability analyses. 
In Proceedings of the 2024 IEEE\/ACM 46th International Conference on Software Engineering (ICSE)."},{"key":"e_1_3_2_83_2","doi-asserted-by":"crossref","first-page":"2527","DOI":"10.1109\/ICSE48619.2023.00211","volume-title":"Proceedings of the 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE)","author":"Nong Yu","year":"2023","unstructured":"Yu Nong, Yuzhe Ou, Michael Pradel, Feng Chen, and Haipeng Cai. 2023. VULGEN: Realistic vulnerability generation via pattern mining and deep learning. In Proceedings of the 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE), 2527\u20132539. DOI: 10.1109\/ICSE48619.2023.00211"},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2022.3207149"},{"key":"e_1_3_2_85_2","unstructured":"Curtis G. Northcutt, Anish Athalye, and Jonas Mueller. 2021. Pervasive label errors in test sets destabilize machine learning benchmarks. arXiv:2103.14749. Retrieved from https:\/\/arxiv.org\/abs\/2103.14749"},{"key":"e_1_3_2_86_2","unstructured":"United States Copyright Office. 2023. Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence. 16190 Federal Register Vol. 88 No. 51."},{"key":"e_1_3_2_87_2","first-page":"575","volume-title":"Proceedings of the 37th Annual IEEE\/IFIP International Conference on Dependable Systems and Networks","author":"Oliner Adam","year":"2007","unstructured":"Adam Oliner and Jon Stearley. 2007. What supercomputers say: A study of five system logs. In Proceedings of the 37th Annual IEEE\/IFIP International Conference on Dependable Systems and Networks, 575\u2013594."},{"key":"e_1_3_2_88_2","doi-asserted-by":"publisher","unstructured":"David N. Palacio, Daniel Rodriguez-Cardenas, Alejandro Velasco, Dipin Khati, Kevin Moran, and Denys Poshyvanyk. 2024. Towards more trustworthy and interpretable LLMs for code through syntax-grounded explanations. arXiv:2407.08983. 
Retrieved from 10.48550\/arXiv.2407.08983","DOI":"10.48550\/arXiv.2407.08983"},{"key":"e_1_3_2_89_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-10066-6"},{"key":"e_1_3_2_90_2","first-page":"1","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering","author":"Pan Rangeet","year":"2024","unstructured":"Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha, and Reyhaneh Jabbarvand. 2024. Lost in translation: A study of bugs introduced by large language models while translating code. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering, 1\u201313."},{"key":"e_1_3_2_91_2","doi-asserted-by":"crossref","first-page":"754","DOI":"10.1109\/SP46214.2022.9833571","volume-title":"Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP)","author":"Pearce Hammond","year":"2022","unstructured":"Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, and Ramesh Karri. 2022. Asleep at the keyboard? Assessing the security of GitHub copilot\u2019s code contributions. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 754\u2013768. DOI: 10.1109\/SP46214.2022.9833571"},{"issue":"3","key":"e_1_3_2_92_2","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1007\/s10664-022-10116-7","article-title":"Search-based fairness testing for regression-based machine learning systems","volume":"27","author":"Perera Anjana","year":"2022","unstructured":"Anjana Perera, Aldeida Aleti, Chakkrit Tantithamthavorn, Jirayus Jiarpakdee, Burak Turhan, Lisa Kuhn, and Katie Walker. 2022. Search-based fairness testing for regression-based machine learning systems. 
Empirical Software Engineering 27, 3 (2022), 79.","journal-title":"Empirical Software Engineering"},{"key":"e_1_3_2_93_2","first-page":"211","volume-title":"Proceedings of the 2019 IEEE 27th International Requirements Engineering Conference (RE)","author":"Pudlitz Florian","year":"2019","unstructured":"Florian Pudlitz, Florian Brokhausen, and Andreas Vogelsang. 2019. Extraction of system states from natural language requirements. In Proceedings of the 2019 IEEE 27th International Requirements Engineering Conference (RE). IEEE, 211\u2013222."},{"issue":"1","key":"e_1_3_2_94_2","doi-asserted-by":"crossref","first-page":"3923","DOI":"10.1038\/s41467-020-17419-7","article-title":"Improving the accuracy of medical diagnosis with causal machine learning","volume":"11","author":"Richens Jonathan G.","year":"2020","unstructured":"Jonathan G. Richens, Ciar\u00e1n M. Lee, and Saurabh Johri. 2020. Improving the accuracy of medical diagnosis with causal machine learning. Nature Communications 11, 1 (2020), 3923.","journal-title":"Nature Communications"},{"key":"e_1_3_2_95_2","first-page":"432","volume-title":"An Integrated Approach to Communication Theory and Research","author":"Rogers Everett M.","year":"2014","unstructured":"Everett M. Rogers, Arvind Singhal, and Margaret M. Quinlan. 2014. Diffusion of innovations. In An Integrated Approach to Communication Theory and Research. Routledge, 432\u2013448."},{"key":"e_1_3_2_96_2","unstructured":"K. Roose. 2022. The brilliance and weirdness of ChatGPT. The New York Times. Retrieved from https:\/\/www.nytimes.com\/2022\/12\/05\/technology\/chatgpt-ai-twitter.htm"},{"key":"e_1_3_2_97_2","first-page":"287","volume-title":"Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering","author":"Roy Devjeet","year":"2020","unstructured":"Devjeet Roy, Ziyi Zhang, Maggie Ma, Venera Arnaoudova, Annibale Panichella, Sebastiano Panichella, Danielle Gonzalez, and Mehdi Mirakhorli. 2020. 
DeepTC-Enhancer: Improving the readability of automatically generated tests. In Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering, 287\u2013298."},{"key":"e_1_3_2_98_2","first-page":"107","volume-title":"Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201922)","author":"Shi Lin","year":"2022","unstructured":"Lin Shi, Fangwen Mu, Xiao Chen, Song Wang, Junjie Wang, Ye Yang, Ge Li, Xin Xia, and Qing Wang. 2022. Are we building on the rock? On the importance of data preprocessing for code summarization. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201922). ACM, New York, NY, 107\u2013119. DOI: 10.1145\/3540250.3549145"},{"key":"e_1_3_2_99_2","first-page":"641","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering","author":"Shi Lin","year":"2020","unstructured":"Lin Shi, Mingzhe Xing, Mingyang Li, Yawen Wang, Shoubin Li, and Qing Wang. 2020. Detection of hidden feature requests from massive chat messages via deep siamese network. In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering, 641\u2013653."},{"key":"e_1_3_2_100_2","unstructured":"Trevor Stalnaker, Nathan Wintersgill, Oscar Chaparro, Laura A. Heymann, Massimiliano Di Penta, Daniel M. German, and Denys Poshyvanyk. 2024. Developer perspectives on licensing and copyright issues arising from generative AI for coding. arXiv:2411.10877. Retrieved from https:\/\/arxiv.org\/abs\/2411.10877"},{"key":"e_1_3_2_101_2","unstructured":"Trevor Stalnaker, Nathan Wintersgill, Oscar Chaparro, Laura A. Heymann, Massimiliano Di Penta, Daniel M. German, and Denys Poshyvanyk. 2025. The ML supply chain in the era of software 2.0: Lessons learned from Hugging Face. arXiv:2502.04484. 
Retrieved from https:\/\/arxiv.org\/abs\/2502.04484"},{"key":"e_1_3_2_102_2","first-page":"1609","volume-title":"Proceedings of the 44th International Conference on Software Engineering (ICSE \u201922)","author":"Sun Zhensu","year":"2022","unstructured":"Zhensu Sun, Li Li, Yan Liu, Xiaoning Du, and Li Li. 2022. On the importance of building high-quality training datasets for neural code search. In Proceedings of the 44th International Conference on Software Engineering (ICSE \u201922). ACM, New York, NY, 1609\u20131620. DOI: 10.1145\/3510003.3510160"},{"key":"e_1_3_2_103_2","first-page":"207","volume-title":"Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER)","author":"Thaller Hannes","year":"2019","unstructured":"Hannes Thaller, Lukas Linsbauer, and Alexander Egyed. 2019. Feature maps: A comprehensible software representation for design pattern detection. In Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 207\u2013217."},{"key":"e_1_3_2_104_2","unstructured":"Haoye Tian, Weiqi Lu, Tsz On Li, Xunzhu Tang, Shing-Chi Cheung, Jacques Klein, and Tegawend\u00e9 F Bissyand\u00e9. 2023. Is ChatGPT the ultimate programming assistant\u2013how far is it? arXiv:2304.11938. Retrieved from https:\/\/arxiv.org\/abs\/2304.11938"},{"key":"e_1_3_2_105_2","doi-asserted-by":"crossref","first-page":"106289","DOI":"10.1016\/j.infsof.2020.106289","article-title":"BVDetector: A program slice-based binary code vulnerability intelligent detection system","volume":"123","author":"Tian Junfeng","year":"2020","unstructured":"Junfeng Tian, Wenjing Xing, and Zhen Li. 2020. BVDetector: A program slice-based binary code vulnerability intelligent detection system. 
Information and Software Technology 123 (2020), 106289.","journal-title":"Information and Software Technology"},{"issue":"7","key":"e_1_3_2_106_2","doi-asserted-by":"crossref","first-page":"352","DOI":"10.3390\/systems11070352","article-title":"Agile methodology for the standardization of engineering requirements using large language models","volume":"11","author":"Ray Archana Tikayat","year":"2023","unstructured":"Archana Tikayat Ray, Bjorn F. Cole, Olivia J. Pinon Fischer, Anirudh Prabhakara Bhat, Ryan T. White, and Dimitri N. Mavris. 2023. Agile methodology for the standardization of engineering requirements using large language models. Systems 11, 7 (2023), 352.","journal-title":"Systems"},{"key":"e_1_3_2_107_2","doi-asserted-by":"publisher","DOI":"10.5465\/amp.2019.0062"},{"key":"e_1_3_2_108_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2024.3422427"},{"key":"e_1_3_2_109_2","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1145\/3524842.3528009","volume-title":"Proceedings of the 19th International Conference on Mining Software Repositories","author":"Tufano Michele","year":"2022","unstructured":"Michele Tufano, Shao Kun Deng, Neel Sundaresan, and Alexey Svyatkovskiy. 2022. Methods2Test: A dataset of focal methods mapped to test cases. In Proceedings of the 19th International Conference on Mining Software Repositories, 299\u2013303."},{"key":"e_1_3_2_110_2","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1145\/3524481.3527220","volume-title":"Proceedings of the 3rd ACM\/IEEE International Conference on Automation of Software Test (AST \u201922)","author":"Tufano Michele","year":"2022","unstructured":"Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, and Neel Sundaresan. 2022. Generating accurate assert statements for unit test cases using pretrained transformers. In Proceedings of the 3rd ACM\/IEEE International Conference on Automation of Software Test (AST \u201922). ACM, New York, NY, 54\u201364. 
DOI: 10.1145\/3524481.3527220"},{"key":"e_1_3_2_111_2","unstructured":"Lam Nguyen Tung, Steven Cho, Xiaoning Du, Neelofar Neelofar, Valerio Terragni, Stefano Ruberto, and Aldeida Aleti. 2024. Automated trustworthiness oracle generation for machine learning text classifiers. arXiv:2410.22663. Retrieved from https:\/\/arxiv.org\/abs\/2410.22663"},{"key":"e_1_3_2_112_2","doi-asserted-by":"publisher","unstructured":"Alejandro Velasco, Aya Garryyeva, David N. Palacio, Antonio Mastropaolo, and Denys Poshyvanyk. 2025. Toward neurosymbolic program comprehension. arXiv:2502.01806. Retrieved from 10.48550\/arXiv.2502.01806","DOI":"10.48550\/arXiv.2502.01806"},{"key":"e_1_3_2_113_2","first-page":"72","volume-title":"Proceedings of the 2024 ACM\/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER \u201924)","author":"Velasco Alejandro","year":"2024","unstructured":"Alejandro Velasco, David N. Palacio, Daniel Rodriguez-Cardenas, and Denys Poshyvanyk. 2024. Which syntactic capabilities are statistically learned by masked language models for code? In Proceedings of the 2024 ACM\/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER \u201924). ACM, New York, NY, 72\u201376. DOI: 10.1145\/3639476.3639768"},{"key":"e_1_3_2_114_2","doi-asserted-by":"publisher","unstructured":"Alejandro Velasco, Daniel Rodriguez-Cardenas, Luftar Rahman Alif, David N. Palacio, and Denys Poshyvanyk. 2025. How propense are large language models at producing code smells? A benchmarking study. arXiv:2412.18989. DOI: 10.48550\/arXiv.2412.18989","DOI":"10.48550\/arXiv.2412.18989"},{"key":"e_1_3_2_115_2","first-page":"895","volume-title":"Proceedings of the 2022 17th Conference on Computer Science and Intelligence Systems (FedCSIS)","author":"Vijayvargiya Sanidhya","year":"2022","unstructured":"Sanidhya Vijayvargiya, Lov Kumar, Lalita Bhanu Murthy, and Sanjay Misra. 2022. 
Software requirements classification using deep-learning approach with various hidden layers. In Proceedings of the 2022 17th Conference on Computer Science and Intelligence Systems (FedCSIS). IEEE, 895\u2013904."},{"key":"e_1_3_2_116_2","first-page":"163","volume-title":"Proceedings of the 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","author":"Wan Xiaohui","year":"2019","unstructured":"Xiaohui Wan, Zheng Zheng, Fangyun Qin, Yu Qiao, and Kishor S. Trivedi. 2019. Supervised representation learning approach for cross-project aging-related bug prediction. In Proceedings of the 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 163\u2013172."},{"key":"e_1_3_2_117_2","doi-asserted-by":"publisher","unstructured":"Chaozheng Wang, Junhao Hu, Cuiyun Gao, Yu Jin, Tao Xie, Hailiang Huang, Zhenyu Lei, and Yuetang Deng. 2023. Practitioners\u2019 expectations on code completion. arXiv:2301.03846. Retrieved from 10.48550\/arXiv.2301.03846","DOI":"10.48550\/arXiv.2301.03846"},{"key":"e_1_3_2_118_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2024.3368208"},{"issue":"1","key":"e_1_3_2_119_2","first-page":"8867757","article-title":"Safety of autonomous vehicles","volume":"2020","author":"Wang Jun","year":"2020","unstructured":"Jun Wang, Li Zhang, Yanjun Huang, and Jian Zhao. 2020. Safety of autonomous vehicles. Journal of Advanced Transportation 2020, 1 (2020), 8867757.","journal-title":"Journal of Advanced Transportation"},{"key":"e_1_3_2_120_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2022.3173346"},{"key":"e_1_3_2_121_2","first-page":"40","volume-title":"Proceedings of the 2018 IEEE 26th International Requirements Engineering Conference (RE)","author":"Wang Wentao","year":"2018","unstructured":"Wentao Wang, Nan Niu, Hui Liu, and Zhendong Niu. 2018. Enhancing automated requirements traceability by resolving polysemy. 
In Proceedings of the 2018 IEEE 26th International Requirements Engineering Conference (RE). IEEE, 40\u201351."},{"key":"e_1_3_2_122_2","unstructured":"Cody Watson, Nathan Cooper, David Nader Palacio, Kevin Moran, and Denys Poshyvanyk. 2021. A systematic literature review on the use of deep learning in software engineering research. arXiv:2009.06520. Retrieved from https:\/\/arxiv.org\/abs\/2009.06520"},{"key":"e_1_3_2_123_2","doi-asserted-by":"crossref","first-page":"1398","DOI":"10.1145\/3377811.3380429","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering (ICSE \u201920)","author":"Watson Cody","year":"2020","unstructured":"Cody Watson, Michele Tufano, Kevin Moran, Gabriele Bavota, and Denys Poshyvanyk. 2020. On learning meaningful assert statements for unit test cases. In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering (ICSE \u201920). ACM, New York, NY, 1398\u20131409. DOI: 10.1145\/3377811.3380429"},{"key":"e_1_3_2_124_2","first-page":"87","volume-title":"Proceedings of the 31st IEEE\/ACM International Conference on Automated Software Engineering (ASE \u201916)","author":"White Martin","year":"2016","unstructured":"Martin White, Michele Tufano, Christopher Vendome, and Denys Poshyvanyk. 2016. Deep learning code fragments for code clone detection. In Proceedings of the 31st IEEE\/ACM International Conference on Automated Software Engineering (ASE \u201916). ACM, New York, NY, 87\u201398. DOI: 10.1145\/2970276.2970326"},{"key":"e_1_3_2_125_2","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1109\/MSR.2015.38","volume-title":"Proceedings of the 2015 IEEE\/ACM 12th Working Conference on Mining Software Repositories","author":"White Martin","year":"2015","unstructured":"Martin White, Christopher Vendome, Mario Linares-Vasquez, and Denys Poshyvanyk. 2015. Toward deep learning software repositories. 
In Proceedings of the 2015 IEEE\/ACM 12th Working Conference on Mining Software Repositories, 334\u2013345. DOI: 10.1109\/MSR.2015.38"},{"key":"e_1_3_2_126_2","first-page":"120","volume-title":"Proceedings of the 2019 IEEE 27th International Requirements Engineering Conference (RE)","author":"Winkler Jonas Paul","year":"2019","unstructured":"Jonas Paul Winkler, Jannis Gr\u00f6nberg, and Andreas Vogelsang. 2019. Predicting how to test requirements: An automated approach. In Proceedings of the 2019 IEEE 27th International Requirements Engineering Conference (RE). IEEE, 120\u2013130."},{"key":"e_1_3_2_127_2","doi-asserted-by":"publisher","unstructured":"Skyler Wu, Eric Meng Shen, Charumathi Badrinath, Jiaqi Ma, and Himabindu Lakkaraju. 2023. Analyzing chain-of-thought prompting in large language models via gradient-based feature attributions. arXiv:2307.13339. Retrieved from 10.48550\/arXiv.2307.13339","DOI":"10.48550\/arXiv.2307.13339"},{"key":"e_1_3_2_128_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2022.05.014"},{"key":"e_1_3_2_129_2","first-page":"1482","volume-title":"Proceedings of the 45th International Conference on Software Engineering (ICSE \u201923)","author":"Xia Chunqiu Steven","year":"2023","unstructured":"Chunqiu Steven Xia, Yuxiang Wei, and Lingming Zhang. 2023. Automated program repair in the era of large pre-trained language models. In Proceedings of the 45th International Conference on Software Engineering (ICSE \u201923). IEEE Press, 1482\u20131494. DOI: 10.1109\/ICSE48619.2023.00129"},{"key":"e_1_3_2_130_2","doi-asserted-by":"publisher","unstructured":"Chunqiu Steven Xia and Lingming Zhang. 2023. Conversational automated program repair. arXiv:2301.13246. Retrieved from 10.48550\/arXiv.2301.13246","DOI":"10.48550\/arXiv.2301.13246"},{"key":"e_1_3_2_131_2","unstructured":"Mingxuan Xiao, Yan Xiao, Shunhui Ji, Yunhe Li, Lei Xue, and Pengcheng Zhang. 2025. ABFS: Natural robustness testing for LLM-based NLP software. arXiv:2503.01319. 
Retrieved from https:\/\/arxiv.org\/abs\/2503.01319"},{"key":"e_1_3_2_132_2","first-page":"1","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering","author":"Xu Junjielong","year":"2024","unstructured":"Junjielong Xu, Ziang Cui, Yuan Zhao, Xu Zhang, Shilin He, Pinjia He, Liqun Li, Yu Kang, Qingwei Lin, Yingnong Dang, et al. 2024. UniLog: Automatic logging via LLM and in-context learning. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering, 1\u201312."},{"key":"e_1_3_2_133_2","first-page":"1","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering","author":"Xu Junjielong","year":"2024","unstructured":"Junjielong Xu, Ruichun Yang, Yintong Huo, Chengyu Zhang, and Pinjia He. 2024. DivLog: Log parsing with prompt enhanced in-context learning. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering, 1\u201312."},{"key":"e_1_3_2_134_2","volume-title":"Proceedings of the 13th International Conference on Learning Representations (ICLR)","author":"Xu Junjielong","year":"2025","unstructured":"Junjielong Xu, Qinan Zhang, Zhiqing Zhong, Shilin He, Chaoyun Zhang, Qingwei Lin, Dan Pei, Pinjia He, Dongmei Zhang, and Qi Zhang. 2025. OpenRCA: Can large language models locate the root cause of software failures? In Proceedings of the 13th International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_135_2","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1145\/1629575.1629587","volume-title":"Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles","author":"Xu Wei","year":"2009","unstructured":"Wei Xu, Ling Huang, Armando Fox, David Patterson, and Michael I. Jordan. 2009. Detecting large-scale system problems by mining console logs. 
In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, 117\u2013132."},{"issue":"3","key":"e_1_3_2_136_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3503509","article-title":"Predictive models in software engineering: Challenges and opportunities","volume":"31","author":"Yang Yanming","year":"2022","unstructured":"Yanming Yang, Xin Xia, David Lo, Tingting Bi, John Grundy, and Xiaohu Yang. 2022. Predictive models in software engineering: Challenges and opportunities. ACM Transactions on Software Engineering and Methodology 31, 3 (2022), 1\u201372.","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"issue":"10","key":"e_1_3_2_137_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3505243","article-title":"A survey on deep learning for software engineering","volume":"54","author":"Yang Yanming","year":"2022","unstructured":"Yanming Yang, Xin Xia, David Lo, and John Grundy. 2022. A survey on deep learning for software engineering. ACM Computing Surveys 54, 10s (2022), 1\u201373.","journal-title":"ACM Computing Surveys"},{"key":"e_1_3_2_138_2","volume-title":"Proceedings of the 12th International Conference on Learning Representations","author":"Yasunaga Michihiro","year":"2024","unstructured":"Michihiro Yasunaga, Xinyun Chen, Yujia Li, Panupong Pasupat, Jure Leskovec, Percy Liang, Ed H. Chi, and Denny Zhou. 2024. Large language models as analogical reasoners. In Proceedings of the 12th International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=AgDICX1h50"},{"key":"e_1_3_2_139_2","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1145\/3558489.3559072","volume-title":"Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering","author":"Yetistiren Burak","year":"2022","unstructured":"Burak Yetistiren, Isik Ozsoy, and Eray Tuzun. 2022. 
Assessing the quality of GitHub copilot\u2019s code generation. In Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering, 62\u201371."},{"key":"e_1_3_2_140_2","first-page":"1","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE)","author":"Yu Boxi","year":"2024","unstructured":"Boxi Yu, Jiayi Yao, Qiuai Fu, Zhiqing Zhong, Haotian Xie, Yaoliang Wu, Yuchi Ma, and Pinjia He. 2024. Deep learning or classical machine learning? an empirical study on log-based anomaly detection. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering (ICSE), 1\u201313."},{"key":"e_1_3_2_141_2","first-page":"12","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering (ICSE \u201924)","author":"Yu Hao","year":"2024","unstructured":"Hao Yu, Bo Shen, Dezhi Ran, Jiaxin Zhang, Qi Zhang, Yuchi Ma, Guangtai Liang, Ying Li, Qianxiang Wang, and Tao Xie. 2024. CoderEval: A benchmark of pragmatic code generation with generative pre-trained models. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering (ICSE \u201924). ACM, New York, NY, Article 37, 12 pages. DOI: 10.1145\/3597503.3623316"},{"key":"e_1_3_2_142_2","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1145\/3387904.3389281","volume-title":"Proceedings of the 28th International Conference on Program Comprehension","author":"Zhang Jinglei","year":"2020","unstructured":"Jinglei Zhang, Rui Xie, Wei Ye, Yuhan Zhang, and Shikun Zhang. 2020. Exploiting code knowledge graph for bug localization via bi-directional attention. 
In Proceedings of the 28th International Conference on Program Comprehension, 219\u2013229."},{"key":"e_1_3_2_143_2","doi-asserted-by":"publisher","DOI":"10.1145\/3639372"},{"key":"e_1_3_2_144_2","first-page":"46595","article-title":"Judging LLM-as-a-judge with MT-Bench and Chatbot Arena","volume":"36","author":"Zheng Lianmin","year":"2023","unstructured":"Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, et al. 2023. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In Advances in Neural Information Processing Systems, Vol. 36, 46595\u201346623.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"2","key":"e_1_3_2_145_2","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1109\/TSE.2018.2887384","article-title":"Fault analysis and debugging of microservice systems: Industrial survey, benchmark system, and empirical study","volume":"47","author":"Zhou Xiang","year":"2018","unstructured":"Xiang Zhou, Xin Peng, Tao Xie, Jun Sun, Chao Ji, Wenhai Li, and Dan Ding. 2018. Fault analysis and debugging of microservice systems: Industrial survey, benchmark system, and empirical study. IEEE Transactions on Software Engineering 47, 2 (2018), 243\u2013260.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_146_2","unstructured":"Ming Zhu, Aneesh Jain, Karthik Suresh, Roshan Ravindran, Sindhu Tipirneni, and Chandan K. Reddy. 2022. XLCoST: A benchmark dataset for cross-lingual code intelligence. arXiv:2206.08474. 
Retrieved from https:\/\/arxiv.org\/abs\/2206.08474"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719006","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3719006","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:19:08Z","timestamp":1750295948000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719006"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,28]]},"references-count":145,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,6,30]]}},"alternative-id":["10.1145\/3719006"],"URL":"https:\/\/doi.org\/10.1145\/3719006","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,28]]},"assertion":[{"value":"2024-12-17","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-02-17","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-28","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}