{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T19:53:41Z","timestamp":1780775621144,"version":"3.54.1"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","funder":[{"name":"National Natural Science Foundation of China","award":["No.62072227;No.62202219;No.62302210"],"award-info":[{"award-number":["No.62072227;No.62202219;No.62302210"]}]},{"name":"Jiangsu Provincial Key Research and Development Program","award":["No.BE2021002-2"],"award-info":[{"award-number":["No.BE2021002-2"]}]},{"name":"Natural Science Foundation of Jiangsu Province","award":["No.BK20241195"],"award-info":[{"award-number":["No.BK20241195"]}]},{"name":"Innovation Project and Overseas Open Project of State Key Laboratory for Novel Software Technology","award":["ZZKT2024A18; ZZKT2024B07; KFKT2023A09; KFKT2023A10; KFKT2024A02; KFKT2024A13; KFKT2024A14"],"award-info":[{"award-number":["ZZKT2024A18; ZZKT2024B07; KFKT2023A09; KFKT2023A10; KFKT2024A02; KFKT2024A13; KFKT2024A14"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,19]]},"abstract":"<jats:p>A significant number of bug reports are generated every day as software systems continue to develop. Large Language Models (LLMs) have been used to correlate bug reports with source code to locate bugs automatically. The existing research has shown that LLMs are effective for bug localization and can increase software development efficiency. However, these studies still have two limitations. First, these models fail to capture context information about bug reports and source code. 
Second, these models are unable to understand the domain-specific expertise inherent to particular projects, such as version information in projects that are composed of alphanumeric characters without any semantic meaning.<\/jats:p>\n          <jats:p>\n            To address these challenges, we propose a\n            <jats:bold>K<\/jats:bold>\n            nowledge\n            <jats:bold>E<\/jats:bold>\n            nhanced\n            <jats:bold>P<\/jats:bold>\n            re-\n            <jats:bold>T<\/jats:bold>\n            rained model using project documents and historical code, called\n            <jats:bold>KEPT<\/jats:bold>\n            , for bug localization. Project documents record, revise, and restate project information that provides rich semantic information about those projects. Historical code contains rich code semantic information that can enhance the reasoning ability of LLMs. Specifically, we construct knowledge graphs from project documents and source code. Then, we introduce knowledge graphs to the LLM through soft-position embedding and visible matrices, enhancing its contextual and professional reasoning ability. To validate our model, we conducted a series of experiments on seven open-source software projects with over 6,000 bug reports. Compared with the traditional model (Locus), KEPT performs better by 33.2% to 59.5% in terms of mean reciprocal rank, mean average precision, and Top@N. Compared with the best-performing non-commercial LLM (CodeT5), KEPT achieves an improvement of 36.6% to 63.7%. Compared to the state-of-the-art commercial LLM developed by OpenAI, called\n            <jats:italic toggle=\"yes\">text-embedding-ada-002<\/jats:italic>\n            , KEPT achieves an average improvement of 7.8% to 17.4%. 
The results indicate that introducing knowledge graphs contributes to enhancing the effectiveness of the LLM in bug localization.\n          <\/jats:p>","DOI":"10.1145\/3729356","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:15:34Z","timestamp":1750346134000},"page":"1914-1936","source":"Crossref","is-referenced-by-count":6,"title":["A Knowledge Enhanced Large Language Model for Bug Localization"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1856-7182","authenticated-orcid":false,"given":"Yue","family":"Li","sequence":"first","affiliation":[{"name":"Nanjing University, Nanjing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0146-5411","authenticated-orcid":false,"given":"Bohan","family":"Liu","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6001-1372","authenticated-orcid":false,"given":"Ting","family":"Zhang","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-8029-4607","authenticated-orcid":false,"given":"Zhiqi","family":"Wang","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4367-7201","authenticated-orcid":false,"given":"David","family":"Lo","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0406-2263","authenticated-orcid":false,"given":"Lanxin","family":"Yang","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9070-7269","authenticated-orcid":false,"given":"Jun","family":"Lyu","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9159-5331","authenticated-orcid":false,"given":"He","family":"Zhang","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the IEEE\/ACM 45th International Conference on Software Engineering (ICSE\u201923)","author":"An Gabin","year":"2023","unstructured":"Gabin An, Jingun Hong, Naryeong Kim, and Shin Yoo. 2023. Fonte: Finding Bug Inducing Commits from Failures. In Proceedings of the IEEE\/ACM 45th International Conference on Software Engineering (ICSE\u201923). IEEE, Melbourne, Australia. 589\u2013601."},{"key":"e_1_2_1_2_1","doi-asserted-by":"crossref","first-page":"10345","DOI":"10.1007\/s10462-023-10419-1","article-title":"Impact of word embedding models on text analytics in deep learning environment: a review","volume":"56","author":"Asudani Deepak Suresh","year":"2023","unstructured":"Deepak Suresh Asudani, Naresh Kumar Nagwani, and Pradeep Singh. 2023. Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 56, 9 (2023), 10345\u201310425.","journal-title":"Artificial Intelligence Review"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u201923)","author":"Callaghan Dylan","year":"2023","unstructured":"Dylan Callaghan and Bernd Fischer. 2023. Improving Spectrum-Based Localization of Multiple Faults by Iterative Test Suite Reduction. 
In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u201923). ACM, Seattle, WA, USA. 1445\u20131457."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 44th International Conference on Software Engineering (ICSE\u201922)","author":"Ciborowska Agnieszka","year":"2022","unstructured":"Agnieszka Ciborowska and Kostadin Damevski. 2022. Fast Changeset-Based Bug Localization with BERT. In Proceedings of the 44th International Conference on Software Engineering (ICSE\u201922). ACM, Pittsburgh, Pennsylvania. 946\u2013957."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE\u201923)","author":"Du Yali","year":"2023","unstructured":"Yali Du and Zhongxing Yu. 2023. Pre-training Code Representation with Semantic Flow Graph for Effective Bug Localization. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE\u201923). ACM, San Francisco, CA, USA. 579\u2013591."},{"key":"e_1_2_1_6_1","volume-title":"CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In Findings of the Association for Computational Linguistics: EMNLP","author":"Feng Zhangyin","year":"2020","unstructured":"Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, Trevor Cohn, Yulan He, and Yang Liu (Eds.). ACM, Online. 1536\u20131547."},{"key":"e_1_2_1_7_1","volume-title":"MinIE: Minimizing facts in open information extraction","author":"Gashteovski K","unstructured":"K Gashteovski, R Gemulla, and L Del Corro. 2017. MinIE: Minimizing facts in open information extraction. 
Association for Computational Linguistics, 1\u201311."},{"key":"e_1_2_1_8_1","volume-title":"Siddharth Dutt Choubey, and Kopal Gangrade","author":"Gore Alpa","year":"2016","unstructured":"Alpa Gore, Siddharth Dutt Choubey, and Kopal Gangrade. 2016. Improved Bug Localization Technique Using Hybrid Information Retrieval Model. In Proceedings of the 12th Distributed Computing and Internet Technology (ICDCIT\u201916), Nikolaj Bj\u00f8rner, Sanjiva Prasad, and Laxmi Parida (Eds.). Springer, Bhubaneswar, India. 127\u2013131."},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1162\/tacl_a_00302","article-title":"A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation","volume":"8","author":"Guan Jian","year":"2020","unstructured":"Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, and Minlie Huang. 2020. A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation. Transactions of the Association for Computational Linguistics, 8 (2020), 93\u2013108.","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 7212\u20137225","author":"Guo Daya","year":"2022","unstructured":"Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, and Jian Yin. 2022. UniXcoder: Unified Cross-Modal Pre-training for Code Representation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 7212\u20137225."},{"key":"e_1_2_1_11_1","volume-title":"GraphCodeBERT: Pre-training Code Representations with Data Flow. 
In International Conference on Learning Representations (ICLR\u201921)","author":"Guo Daya","year":"2021","unstructured":"Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie LIU, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, and Ming Zhou. 2021. GraphCodeBERT: Pre-training Code Representations with Data Flow. In International Conference on Learning Representations (ICLR\u201921). 1\u201318."},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1016\/j.aiopen.2021.08.002","article-title":"Pre-trained models: Past, present and future","volume":"2","author":"Han Xu","year":"2021","unstructured":"Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan Yao, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, and Jun Zhu. 2021. Pre-trained models: Past, present and future. AI Open, 2 (2021), 225\u2013250.","journal-title":"AI Open"},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1109\/TKDE.2023.3310002","article-title":"A Survey of Knowledge Enhanced Pre-Trained Language Models","volume":"36","author":"Hu Linmei","year":"2024","unstructured":"Linmei Hu, Zeyi Liu, Ziwang Zhao, Lei Hou, Liqiang Nie, and Juanzi Li. 2024. A Survey of Knowledge Enhanced Pre-Trained Language Models. IEEE Transactions on Knowledge and Data Engineering, 36, 4 (2024), 1413\u20131430.","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI\u201916)","author":"Huo Xuan","year":"2016","unstructured":"Xuan Huo, Ming Li, and Zhi-Hua Zhou. 2016. 
Learning Unified Features from Natural and Programming Languages for Locating Buggy Source Code. In Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI\u201916). AAAI Press, New York, USA. 1606\u20131612."},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","first-page":"1368","DOI":"10.1109\/TSE.2019.2920771","article-title":"Deep Transfer Bug Localization","volume":"47","author":"Huo Xuan","year":"2021","unstructured":"Xuan Huo, Ferdian Thung, Ming Li, David Lo, and Shu-Ting Shi. 2021. Deep Transfer Bug Localization. IEEE Transactions on Software Engineering, 47, 7 (2021), 1368\u20131380.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_16_1","doi-asserted-by":"crossref","first-page":"3010","DOI":"10.1109\/TSE.2021.3075215","article-title":"Legion: Massively Composing Rankers for Improved Bug Localization at Adobe","volume":"48","author":"Jarman Darryl","year":"2022","unstructured":"Darryl Jarman, Jeffrey Berry, Riley Smith, Ferdian Thung, and David Lo. 2022. Legion: Massively Composing Rankers for Improved Bug Localization at Adobe. IEEE Transactions on Software Engineering, 48, 8 (2022), 3010\u20133024.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.specom.2017.05.001","article-title":"A knowledge graph based speech interface for question answering systems","volume":"92","author":"Kumar Ashwini Jaya","year":"2017","unstructured":"Ashwini Jaya Kumar, Christoph Schmidt, and Joachim K\u00f6hler. 2017. A knowledge graph based speech interface for question answering systems. Speech Communication, 92 (2017), 1\u201312.","journal-title":"Speech Communication"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 25th International Conference on Program Comprehension (ICPC\u201917)","author":"Lam An Ngoc","unstructured":"An Ngoc Lam, Anh Tuan Nguyen, Hoan Anh Nguyen, and Tien N. Nguyen. 2017. 
Bug Localization with Combination of Deep Learning and Information Retrieval. In Proceedings of the 25th International Conference on Program Comprehension (ICPC\u201917). IEEE, Buenos Aires, Argentina. 218\u2013229."},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u20192018)","author":"Lee Jaekwon","year":"2018","unstructured":"Jaekwon Lee, Dongsun Kim, Tegawend\u00e9 F. Bissyand\u00e9, Woosung Jung, and Yves Le Traon. 2018. Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u20192018). ACM, Amsterdam, Netherlands. 61\u201372."},{"key":"e_1_2_1_20_1","volume-title":"Aligning Language Models for Versatile Text-based Item Retrieval. In Companion Proceedings of the ACM Web Conference 2024 (WWW \u201924)","author":"Lei Yuxuan","year":"2024","unstructured":"Yuxuan Lei, Jianxun Lian, Jing Yao, Mingqi Wu, Defu Lian, and Xing Xie. 2024. Aligning Language Models for Versatile Text-based Item Retrieval. In Companion Proceedings of the ACM Web Conference 2024 (WWW \u201924). Association for Computing Machinery, New York, NY, USA. 935\u2013938."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering (ICSE\u201924)","author":"Li Yue","year":"2024","unstructured":"Yue Li, Zhong Ren, Zhiqi Wang, Lanxin Yang, Liming Dong, Chenxing Zhong, and He Zhang. 2024. Fine-SE: Integrating Semantic Features and Expert Features for Software Effort Estimation. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering (ICSE\u201924). 
ACM, Article 27, 12 pages."},{"key":"e_1_2_1_22_1","first-page":"1","article-title":"Modeling function-level interactions for file-level bug localization","volume":"27","author":"Liang Hongliang","year":"2022","unstructured":"Hongliang Liang, Dengji Hang, and Xiangyu Li. 2022. Modeling function-level interactions for file-level bug localization. Empirical Software Engineering, 27, 7 (2022), 1\u201326.","journal-title":"Empirical Software Engineering"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 43rd International Conference on Software Engineering (ICSE\u201921)","author":"Lin Jinfeng","year":"2021","unstructured":"Jinfeng Lin, Yalin Liu, Qingkai Zeng, Meng Jiang, and Jane Cleland-Huang. 2021. Traceability Transformed: Generating More Accurate Links with Pre-Trained BERT Models. In Proceedings of the 43rd International Conference on Software Engineering (ICSE\u201921). IEEE, Madrid, ES. 324\u2013335."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics (ETMTNLP \u201902)","author":"Loper Edward","year":"2002","unstructured":"Edward Loper and Steven Bird. 2002. NLTK: the Natural Language Toolkit. In Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics (ETMTNLP \u201902) (ETMTNLP \u201902). ACM, Philadelphia, Pennsylvania. 63\u201370."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering (ICSE\u201924)","author":"Ma Lipeng","year":"2024","unstructured":"Lipeng Ma, Weidong Yang, Bo Xu, Sihang Jiang, Ben Fei, Jiaqing Liang, Mingjie Zhou, and Yanghua Xiao. 2024. KnowLog: Knowledge Enhanced Pre-trained Language Model for Log Understanding. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering (ICSE\u201924). ACM, Lisbon, Portugal. 
Article 32, 13 pages."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP\u201921)","author":"Murali Vijayaraghavan","year":"2021","unstructured":"Vijayaraghavan Murali, Lee Gross, Rebecca Qian, and Satish Chandra. 2021. Industry-Scale IR-Based Bug Localization: A Perspective from Facebook. In Proceedings of the 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP\u201921). IEEE, Madrid, ES. 188\u2013197."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201922)","author":"Ni Chao","year":"2022","unstructured":"Chao Ni, Wei Wang, Kaiwen Yang, Xin Xia, Kui Liu, and David Lo. 2022. The Best of Both Worlds: Integrating Semantic Features with Expert Features for Defect Prediction and Localization. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE \u201922). ACM, Singapore, Singapore. 672\u2013683."},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 45th International Conference on Software Engineering (ICSE\u201923)","author":"Niu Feifei","year":"2023","unstructured":"Feifei Niu, Wesley K. G. Assun\u00e7\u00e3o, LiGuo Huang, Christoph Mayr-Dorn, Jidong Ge, Bin Luo, and Alexander Egyed. 2023. RAT: A Refactoring-Aware Traceability Model for Bug Localization. In Proceedings of the 45th International Conference on Software Engineering (ICSE\u201923). IEEE, Melbourne, Australia. 196\u2013207."},{"key":"e_1_2_1_29_1","unstructured":"OpenAI. 2022. text-embedding-ada-002. 
https:\/\/platform.openai.com\/docs\/guides\/embeddings"},{"key":"e_1_2_1_30_1","volume-title":"PyTorch: An Imperative Style","author":"Paszke Adam","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas K\u00f6pf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Curran Associates Inc., Vancouver Canada. 1\u201312."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u201920)","author":"Pradel Michael","year":"2020","unstructured":"Michael Pradel, Vijayaraghavan Murali, Rebecca Qian, Mateusz Machalica, Erik Meijer, and Satish Chandra. 2020. Scaffle: Bug Localization on Millions of Files. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u201920). ACM, Virtual Event, USA. 225\u2013236."},{"key":"e_1_2_1_32_1","unstructured":"Alec Radford Jeff Wu Rewon Child David Luan Dario Amodei and Ilya Sutskever. 2019. Language Models are Unsupervised Multitask Learners. arXiv 1\u201324."},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 15th International Conference on Mining Software Repositories (MSR\u201918)","author":"Rath Michael","year":"2018","unstructured":"Michael Rath, David Lo, and Patrick M\u00e4der. 2018. Analyzing Requirements and Traceability Information to Improve Bug Localization. In Proceedings of the 15th International Conference on Mining Software Repositories (MSR\u201918). ACM, Gothenburg, Sweden. 442\u2013453."},{"key":"e_1_2_1_34_1","volume-title":"proven approaches to text retrieval","author":"Robertson E.","unstructured":"Stephen.E. Robertson and Karen. Sp\u00e4rck Jones. 1994. 
Simple, proven approaches to text retrieval. University of Cambridge, Computer Laboratory."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 43rd International Conference on Software Engineering (ICSE\u201921)","author":"Rosa Giovanni","year":"2021","unstructured":"Giovanni Rosa, Luca Pascarella, Simone Scalabrino, Rosalia Tufano, Gabriele Bavota, Michele Lanza, and Rocco Oliveto. 2021. Evaluating SZZ Implementations Through a Developer-Informed Oracle. In Proceedings of the 43rd International Conference on Software Engineering (ICSE\u201921). IEEE, Madrid, Spain. 436\u2013447."},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 28th International Conference on Automated Software Engineering (ASE\u201913)","author":"Saha Ripon K.","unstructured":"Ripon K. Saha, Matthew Lease, Sarfraz Khurshid, and Dewayne E. Perry. 2013. Improving bug localization using structured information retrieval. In Proceedings of the 28th International Conference on Automated Software Engineering (ASE\u201913). IEEE, Silicon Valley, CA, USA. 345\u2013355."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 9th IEEE Working Conference on Mining Software Repositories (MSR\u201912)","author":"Sisman Bunyamin","unstructured":"Bunyamin Sisman and Avinash C. Kak. 2012. Incorporating version histories in Information Retrieval based bug localization. In Proceedings of the 9th IEEE Working Conference on Mining Software Repositories (MSR\u201912). IEEE, Zurich, Switzerland. 50\u201359."},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics (COLING\u201920)","author":"Sun Tianxiang","year":"2020","unstructured":"Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, and Zheng Zhang. 2020. CoLAKE: Contextualized Language and Knowledge Embedding. In Proceedings of the 28th International Conference on Computational Linguistics (COLING\u201920). ICCL, Barcelona, Spain (Online). 
3660\u20133670."},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL\u201920)","author":"Tabassum Jeniya","year":"2020","unstructured":"Jeniya Tabassum, Mounica Maddela, Wei Xu, and Alan Ritter. 2020. Code and Named Entity Recognition in StackOverflow. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL\u201920). ACL, Online. 4913\u20134926."},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","first-page":"1649","DOI":"10.1109\/TSC.2020.3006214","article-title":"Multi-Dimension Convolutional Neural Network for Bug Localization","volume":"15","author":"Wang Bei","year":"2022","unstructured":"Bei Wang, Ling Xu, Meng Yan, Chao Liu, and Ling Liu. 2022. Multi-Dimension Convolutional Neural Network for Bug Localization. IEEE Transactions on Services Computing, 15, 3 (2022), 1649\u20131663.","journal-title":"IEEE Transactions on Services Computing"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. ACL, Abu Dhabi, United Arab Emirates. 3152\u20133163","author":"Wang Jianing","year":"2022","unstructured":"Jianing Wang, Chengyu Wang, Minghui Qiu, Qiuhui Shi, Hongbin Wang, Jun Huang, and Ming Gao. 2022. KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. ACL, Abu Dhabi, United Arab Emirates. 3152\u20133163."},{"key":"e_1_2_1_42_1","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1002\/spip.366","article-title":"Estimating fixing effort and schedule based on defect injection distribution","volume":"13","author":"Wang Qing","year":"2008","unstructured":"Qing Wang, Lang Gou, Nan Jiang, Meiru Che, Ronghui Zhang, Yun Yang, and Mingshu Li. 2008. Estimating fixing effort and schedule based on defect injection distribution. 
Software Process: Improvement and Practice, 13, 1 (2008), 35\u201350.","journal-title":"Software Process: Improvement and Practice"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 22nd International Conference on Program Comprehension (ICPC\u201914)","author":"Wang Shaowei","year":"2014","unstructured":"Shaowei Wang and David Lo. 2014. Version History, Similar Report, and Structure: Putting Them Together for Improved Bug Localization. In Proceedings of the 22nd International Conference on Program Comprehension (ICPC\u201914). ACM, Hyderabad, India. 53\u201363."},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the 30th IEEE International Conference on Software Maintenance and Evolution (ICSME\u201914)","author":"Wang Shaowei","year":"2014","unstructured":"Shaowei Wang, David Lo, and Julia Lawall. 2014. Compositional Vector Space Models for Improved Bug Localization. In Proceedings of the 30th IEEE International Conference on Software Maintenance and Evolution (ICSME\u201914). IEEE, Victoria, BC, Canada. 171\u2013180."},{"key":"e_1_2_1_45_1","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1162\/tacl_a_00360","article-title":"KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation","volume":"9","author":"Wang Xiaozhi","year":"2021","unstructured":"Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhengyan Zhang, Zhiyuan Liu, Juanzi Li, and Jian Tang. 2021. KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation. Transactions of the Association for Computational Linguistics, 9 (2021), 176\u2013194.","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"e_1_2_1_46_1","volume-title":"Hoi","author":"Wang Yue","year":"2021","unstructured":"Yue Wang, Weishi Wang, Shafiq Joty, and Steven C. H. Hoi. 2021. CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation. 
arXiv, 1\u201313."},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the 31st International Conference on Automated Software Engineering (ASE\u201916)","author":"Wen Ming","year":"2016","unstructured":"Ming Wen, Rongxin Wu, and Shing-Chi Cheung. 2016. Locus: Locating Bugs from Software Changes. In Proceedings of the 31st International Conference on Automated Software Engineering (ASE\u201916). ACM, Singapore, Singapore. 262\u2013273."},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (EMNLP\u201920)","author":"Wolf Thomas","unstructured":"Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, R\u00e9mi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. 2019. HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (EMNLP\u201920). ACL, online. 38\u201345."},{"key":"e_1_2_1_49_1","volume-title":"Proceedings of the 30th IEEE International Conference on Software Maintenance and Evolution (ICSME\u201914)","author":"Wong Chu-Pan","year":"2014","unstructured":"Chu-Pan Wong, Yingfei Xiong, Hongyu Zhang, Dan Hao, Lu Zhang, and Hong Mei. 2014. Boosting Bug-Report-Oriented Fault Localization with Segmentation and Stack-Trace Analysis. In Proceedings of the 30th IEEE International Conference on Software Maintenance and Evolution (ICSME\u201914). IEEE, Victoria, BC, Canada. 
181\u2013190."},{"key":"e_1_2_1_50_1","first-page":"1","article-title":"BugRadar: Bug localization by knowledge graph link prediction","volume":"162","author":"Xiao Xi","year":"2023","unstructured":"Xi Xiao, Renjie Xiao, Qing Li, Jianhui Lv, Shunyan Cui, and Qixu Liu. 2023. BugRadar: Bug localization by knowledge graph link prediction. Information and Software Technology, 162 (2023), 1\u201313.","journal-title":"Information and Software Technology"},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI\u201922)","author":"Xu Yichong","year":"2022","unstructured":"Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, and Xuedong Huang. 2022. Human parity on commonsenseqa: Augmenting self-attention with external attention. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI\u201922). 2762\u20132768."},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of the 29th International Conference on Program Comprehension (ICPC\u201921)","author":"Yang Shouliang","year":"2021","unstructured":"Shouliang Yang, Junming Cao, Hushuang Zeng, Beijun Shen, and Hao Zhong. 2021. Locating Faulty Methods with a Mixed RNN and Attention Model. In Proceedings of the 29th International Conference on Program Comprehension (ICPC\u201921). IEEE, Madrid, Spain. 207\u2013218."},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the 36th International Conference on Automated Software Engineering (ASE\u201921)","author":"Yang Zhou","year":"2021","unstructured":"Zhou Yang, Jieke Shi, Shaowei Wang, and David Lo. 2021. IncBL: Incremental Bug Localization. In Proceedings of the 36th International Conference on Automated Software Engineering (ASE\u201921). IEEE, Melbourne, Australia. 
1223\u20131226."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL\u201924)","author":"Yoon Jinsung","year":"2024","unstructured":"Jinsung Yoon, Yanfei Chen, Sercan Arik, and Tomas Pfister. 2024. Search-Adaptor: Embedding Customization for Information Retrieval. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL\u201924), Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.). ACM, Bangkok, Thailand. 12230\u201312247."},{"key":"e_1_2_1_55_1","doi-asserted-by":"crossref","first-page":"3939","DOI":"10.1109\/TSE.2023.3279125","article-title":"Context-Aware Neural Fault Localization","volume":"49","author":"Zhang Zhuo","year":"2023","unstructured":"Zhuo Zhang, Yan Lei, Xiaoguang Mao, Meng Yan, Xin Xia, and David Lo. 2023. Context-Aware Neural Fault Localization. IEEE Transactions on Software Engineering, 49, 7 (2023), 3939\u20133954.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_56_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence (AAAI\u201923)","author":"Zhang Zhuosheng","year":"2023","unstructured":"Zhuosheng Zhang, Hai Zhao, Masao Utiyama, and Eiichiro Sumita. 2023. Language Model Pre-training on True Negatives. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI\u201923). 37, Washington DC, USA. 14002\u201314010."},{"key":"e_1_2_1_57_1","volume-title":"Proceedings of the 34th International Conference on Software Engineering (ICSE\u201912)","author":"Zhou Jian","year":"2012","unstructured":"Jian Zhou, Hongyu Zhang, and David Lo. 2012. Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports. In Proceedings of the 34th International Conference on Software Engineering (ICSE\u201912). IEEE, Zurich, Switzerland. 
14\u201324."},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI\u201920)","author":"Zhu Ziye","year":"2020","unstructured":"Ziye Zhu, Yun Li, Hanghang Tong, and Yu Wang. 2020. CooBa: Cross-project Bug Localization via Adversarial Transfer Learning. In Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI\u201920). ACM, Vienna, Austria. 3565\u20133571."},{"key":"e_1_2_1_59_1","doi-asserted-by":"crossref","first-page":"836","DOI":"10.1109\/TSE.2018.2870414","article-title":"How Practitioners Perceive Automated Bug Report Management Techniques","volume":"46","author":"Zou Weiqin","year":"2020","unstructured":"Weiqin Zou, David Lo, Zhenyu Chen, Xin Xia, Yang Feng, and Baowen Xu. 2020. How Practitioners Perceive Automated Bug Report Management Techniques. IEEE Transactions on Software Engineering, 46, 8 (2020), 836\u2013862.","journal-title":"IEEE Transactions on Software Engineering"}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3729356","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:28:52Z","timestamp":1750346932000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3729356"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":59,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2025,6,19]]}},"alternative-id":["10.1145\/3729356"],"URL":"https:\/\/doi.org\/10.1145\/3729356","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,19]]}}}