{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T04:48:55Z","timestamp":1777610935203,"version":"3.51.4"},"reference-count":95,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","license":[{"start":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T00:00:00Z","timestamp":1720742400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2024,7,12]]},"abstract":"<jats:p>Commit messages play a vital role in software development and maintenance. While previous research has introduced various Commit Message Generation (CMG) approaches, they often suffer from a lack of consideration for the broader software context associated with code changes. This limitation resulted in generated commit messages that contained insufficient information and were poorly readable. To address these shortcomings, we approached CMG as a knowledge-intensive reasoning task. We employed ReAct prompting with a cutting-edge Large Language Model (LLM) to generate high-quality commit messages. Our tool retrieves a wide range of software context information, enabling the LLM to create commit messages that are factually grounded and comprehensive. Additionally, we gathered commit message quality expectations from software practitioners, incorporating them into our approach to further enhance message quality. Human evaluation demonstrates the overall effectiveness of our CMG approach, which we named Omniscient Message Generator (OMG). 
It achieved an average improvement of 30.2% over human-written messages and a 71.6% improvement over state-of-the-art CMG methods.<\/jats:p>","DOI":"10.1145\/3643760","type":"journal-article","created":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T10:22:09Z","timestamp":1720779729000},"page":"745-766","source":"Crossref","is-referenced-by-count":19,"title":["Only diff Is Not Enough: Generating Commit Messages Leveraging Reasoning and Action of Large Language Model"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4434-4812","authenticated-orcid":false,"given":"Jiawei","family":"Li","sequence":"first","affiliation":[{"name":"University of California, Irvine, Irvine, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-2380-6076","authenticated-orcid":false,"given":"David","family":"Farag\u00f3","sequence":"additional","affiliation":[{"name":"Innoopract, Karlsruhe, Germany"},{"name":"QPR Technologies, Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8776-4289","authenticated-orcid":false,"given":"Christian","family":"Petrov","sequence":"additional","affiliation":[{"name":"Innoopract, Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8221-5352","authenticated-orcid":false,"given":"Iftekhar","family":"Ahmed","sequence":"additional","affiliation":[{"name":"University of California, Irvine, Irvine, USA"}]}],"member":"320","published-online":{"date-parts":[[2024,7,12]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"2006. Code Change Example 2. https:\/\/github.com\/apache\/maven\/commit\/40aacad4f0d2b0b33f3a70b971030c5d42afa167. 2006."},{"key":"e_1_3_1_3_2","unstructured":"2013. Code Change Example 1. https:\/\/github.com\/apache\/karaf\/commit\/5ea93654cf709383c1d59012e749e0fa20e70ffb. 2013."},{"key":"e_1_3_1_4_2","unstructured":"2022. ChatGPT: Optimizing Language Models for Dialogue. https:\/\/openai.com\/blog\/chatgpt. 2022."},{"key":"e_1_3_1_5_2","unstructured":"2023. Agents in Langchain. 
https:\/\/python.langchain.com\/docs\/modules\/agents\/. 2023."},{"key":"e_1_3_1_6_2","unstructured":"2023. Apache Jira. https:\/\/issues.apache.org\/jira. 2023."},{"key":"e_1_3_1_7_2","unstructured":"2023. Apache Software Foundation Contributor Guide. https:\/\/community.apache.org\/contributors\/. 2023."},{"key":"e_1_3_1_8_2","unstructured":"2023. beautifulsoup4. https:\/\/pypi.org\/project\/beautifulsoup4\/. 2023."},{"key":"e_1_3_1_9_2","unstructured":"2023. Github. https:\/\/github.com\/. 2023."},{"key":"e_1_3_1_10_2","unstructured":"2023. GPT-4. https:\/\/openai.com\/research\/gpt-4. 2023."},{"key":"e_1_3_1_11_2","unstructured":"2023. Jira Issue tracking system. https:\/\/www.atlassian.com\/software\/jira. 2023."},{"key":"e_1_3_1_12_2","unstructured":"2023. LangChain. https:\/\/www.langchain.com\/. 2023."},{"key":"e_1_3_1_13_2","unstructured":"2023. LangChain\u2019s code understanding agent. https:\/\/python.langchain.com\/docs\/use_cases\/code_understanding. 2023"},{"key":"e_1_3_1_14_2","unstructured":"2023. Pygithub: A python library to access the github api v3. https:\/\/github.com\/PyGithub\/PyGithub. 2023."},{"key":"e_1_3_1_15_2","unstructured":"2023. SciTools Understand. https:\/\/scitools.com\/. 2023."},{"key":"e_1_3_1_16_2","doi-asserted-by":"crossref","unstructured":"Iftekhar Ahmed Umme Ayda Mannan Rahul Gopinath and Carlos Jensen 2015. An empirical study of design degradation: How software projects get worse over time. In 2015 ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). IEEE 1\u201310.","DOI":"10.1109\/ESEM.2015.7321186"},{"key":"e_1_3_1_17_2","doi-asserted-by":"crossref","unstructured":"Ahmed Anwar Haider Ilyas Ussama Yaqub and Salma Zaman 2021. Analyzing qanon on twitter in context of us elections 2020: Analysis of user messages and profiles using vader and bert topic modeling. In DG. O2021: The 22nd Annual International Conference on Digital Government Research. 
82\u201388.","DOI":"10.1145\/3463677.3463718"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF02960514"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944937"},{"key":"e_1_3_1_20_2","doi-asserted-by":"crossref","unstructured":"Ali Borji 2023. A categorical archive of chatgpt failures. arXiv preprint arXiv:2302.03494 (2023).","DOI":"10.21203\/rs.3.rs-2895792\/v1"},{"key":"e_1_3_1_21_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020), 1877\u20131901.","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_1_22_2","doi-asserted-by":"crossref","unstructured":"Raymond PL Buse and Westley R Weimer 2010. Automatically documenting program changes. In Proceedings of the 25th IEEE\/ACM international conference on automated software engineering. 33\u201342.","DOI":"10.1145\/1858996.1859005"},{"key":"e_1_3_1_23_2","doi-asserted-by":"crossref","unstructured":"Kuljit Kaur Chahal and Munish Saini 2018. Developer dynamics and syntactic quality of commit messages in oss projects. In Open Source Systems: Enterprise Software and Solutions: 14th IFIP WG 2.13 International Conference OSS 2018 Athens Greece June 8-10 2018 Proceedings 14. Springer 61\u201376.","DOI":"10.1007\/978-3-319-92375-8_6"},{"issue":"4","key":"e_1_3_1_24_2","article-title":"Inequalities in open source software development: Analysis of contributor\u2019s commits in apache software foundation projects.","volume":"11","author":"Che\u0142kowski Tadeusz","year":"2016","unstructured":"Tadeusz Che\u0142kowski, Peter Gloor, and Dariusz Jemielniak 2016. 
Inequalities in open source software development: Analysis of contributor\u2019s commits in apache software foundation projects. PLoS One 11, 4 (2016), e0152976.","journal-title":"PLoS One"},{"key":"e_1_3_1_25_2","doi-asserted-by":"crossref","unstructured":"Dan Chen and Sally E Goldin 2020. A project-level investigation of software commit comments and code quality. In 2020 3rd International Conference on Information and Communications Technology (ICOIACT). IEEE 240\u2013245.","DOI":"10.1109\/ICOIACT50329.2020.9332086"},{"key":"e_1_3_1_26_2","doi-asserted-by":"crossref","unstructured":"Luis Fernando Cort\u00e9s-Coy Mario Linares-V\u00e1squez Jairo Aponte and Denys Poshyvanyk 2014. On automatically generating commit messages via summarization of source code changes. In 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation. IEEE 275\u2013284.","DOI":"10.1109\/SCAM.2014.14"},{"key":"e_1_3_1_27_2","unstructured":"Andrew M Dai and Quoc V Le 2015. Semi-supervised sequence learning. Advances in neural information processing systems 28 (2015)."},{"key":"e_1_3_1_28_2","doi-asserted-by":"crossref","unstructured":"Brian De Alwis and Jonathan Sillito 2009. Why are software projects moving from centralized to decentralized version control systems?. In 2009 ICSE Workshop on Cooperative and Human Aspects on Software Engineering. IEEE 36\u201339.","DOI":"10.1109\/CHASE.2009.5071408"},{"key":"e_1_3_1_29_2","doi-asserted-by":"crossref","unstructured":"Themistoklis Diamantopoulos Dimitrios-Nikitas Nastos and Andreas Symeonidis 2023. Semantically-enriched Jira issue tracking data. In 2023 IEEE\/ACM 20th International Conference on Mining Software Repositories (MSR). IEEE 218\u2013222.","DOI":"10.1109\/MSR59073.2023.00039"},{"key":"e_1_3_1_30_2","doi-asserted-by":"crossref","unstructured":"Jinhao Dong Yiling Lou Dan Hao and Lin Tan 2023. Revisiting Learning-based Commit Message Generation. 
In 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). IEEE 794\u2013805.","DOI":"10.1109\/ICSE48619.2023.00075"},{"key":"e_1_3_1_31_2","doi-asserted-by":"crossref","unstructured":"Jinhao Dong Yiling Lou Qihao Zhu Zeyu Sun Zhilin Li Wenjie Zhang and Dan Hao 2022. FIRA: fine-grained graph-based code change representation for automated commit message generation. In Proceedings of the 44th International Conference on Software Engineering. 970\u2013981.","DOI":"10.1145\/3510003.3510069"},{"key":"e_1_3_1_32_2","doi-asserted-by":"crossref","unstructured":"Robert Dyer Hoan Anh Nguyen Hridesh Rajan and Tien N Nguyen 2013. Boa: A language and infrastructure for analyzing ultra-large-scale software repositories. In 2013 35th International Conference on Software Engineering (ICSE). IEEE 422\u2013431.","DOI":"10.1109\/ICSE.2013.6606588"},{"key":"e_1_3_1_33_2","doi-asserted-by":"crossref","unstructured":"Zixuan Feng Amreeta Chatterjee Anita Sarma and Iftekhar Ahmed 2022. A case study of implicit mentoring its prevalence and impact in apache. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 797\u2013809.","DOI":"10.1145\/3540250.3549167"},{"key":"e_1_3_1_34_2","doi-asserted-by":"crossref","unstructured":"Zhangyin Feng Daya Guo Duyu Tang Nan Duan Xiaocheng Feng Ming Gong Linjun Shou Bing Qin Ting Liu Daxin Jiang et al. 2020. Codebert: A pre-trained model for programming and natural languages. arXiv preprint arXiv:2002.08155 (2020).","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_3_1_35_2","doi-asserted-by":"crossref","unstructured":"Mingyang Geng Shangwen Wang Dezun Dong Haotian Wang Ge Li Zhi Jin Xiaoguang Mao and Xiangke Liao 2024. Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning. 
(2024).","DOI":"10.1145\/3597503.3608134"},{"key":"e_1_3_1_36_2","doi-asserted-by":"crossref","unstructured":"Jiri Gesi Jiawei Li and Iftekhar Ahmed 2021. An empirical examination of the impact of bias on just-in-time defect prediction. In Proceedings of the 15th ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). 1\u201312.","DOI":"10.1145\/3475716.3475791"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-014-9332-x"},{"key":"e_1_3_1_38_2","doi-asserted-by":"crossref","unstructured":"Leo A Goodman 1961. Snowball sampling. The annals of mathematical statistics (1961) 148\u2013170.","DOI":"10.1214\/aoms\/1177705148"},{"key":"e_1_3_1_39_2","unstructured":"Maarten Grootendorst 2022. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv preprint arXiv:2203.05794 (2022)."},{"key":"e_1_3_1_40_2","doi-asserted-by":"crossref","unstructured":"Yichen He Liran Wang Kaiyi Wang Yupeng Zhang Hang Zhang and Zhoujun Li 2023. COME: Commit Message Generation with Modification Embedding. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. 792\u2013803.","DOI":"10.1145\/3597926.3598096"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-020-0496-0"},{"key":"e_1_3_1_42_2","doi-asserted-by":"crossref","unstructured":"Yuan Huang Qiaoyang Zheng Xiangping Chen Yingfei Xiong Zhiyong Liu and Xiaonan Luo 2017. Mining version control system for automatically generating commit comment. In 2017 ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). IEEE 414\u2013423.","DOI":"10.1109\/ESEM.2017.56"},{"key":"e_1_3_1_43_2","doi-asserted-by":"crossref","unstructured":"Nan Jiang Thibaud Lutellier and Lin Tan 2021. Cure: Code-aware neural machine translation for automatic program repair. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). 
IEEE 1161\u20131173.","DOI":"10.1109\/ICSE43902.2021.00107"},{"key":"e_1_3_1_44_2","doi-asserted-by":"crossref","unstructured":"Siyuan Jiang Ameer Armaly and Collin McMillan 2017. Automatically generating commit messages from diffs using neural machine translation. In 2017 32nd IEEE\/ACM International Conference on Automated Software Engineering (ASE). IEEE 135\u2013146.","DOI":"10.1109\/ASE.2017.8115626"},{"key":"e_1_3_1_45_2","doi-asserted-by":"crossref","unstructured":"Suhas Kabinna Cor-Paul Bezemer Weiyi Shang and Ahmed E Hassan 2016. Logging library migrations: A case study for the apache software foundation projects. In Proceedings of the 13th International Conference on Mining Software Repositories. 154\u2013164.","DOI":"10.1145\/2901739.2901769"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1023\/B:LIDA.0000048322.42751.ca"},{"key":"e_1_3_1_47_2","doi-asserted-by":"crossref","unstructured":"Katja Kevic Braden M Walters Timothy R Shaffer Bonita Sharif David C Shepherd and Thomas Fritz 2015. Tracing software developers\u2019 eyes and interactions for change tasks. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering. 202\u2013213.","DOI":"10.1145\/2786805.2786864"},{"key":"e_1_3_1_48_2","unstructured":"Denis Kocetkov Raymond Li Loubna Ben Allal Jia Li Chenghao Mou Carlos Mu\u00f1oz Ferrandis Yacine Jernite Margaret Mitchell Sean Hughes Thomas Wolf et al. 2022. The stack: 3 tb of permissively licensed source code. arXiv preprint arXiv:2211.15533 (2022)."},{"key":"e_1_3_1_49_2","doi-asserted-by":"crossref","unstructured":"Stanislav Levin and Amiram Yehudai 2017. Boosting automatic commit classification into maintenance activities by utilizing source code changes. In Proceedings of the 13th International Conference on Predictive Models and Data Analytics in Software Engineering. 
97\u2013106.","DOI":"10.1145\/3127005.3127016"},{"key":"e_1_3_1_50_2","doi-asserted-by":"crossref","unstructured":"Jiawei Li and Iftekhar Ahmed 2023. Commit message matters: Investigating impact and evolution of commit message quality. In 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). IEEE 806\u2013817.","DOI":"10.1109\/ICSE48619.2023.00076"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-022-07877-z"},{"key":"e_1_3_1_52_2","unstructured":"Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov Chenghao Mou Marc Marone Christopher Akiki Jia Li Jenny Chim et al. 2023. StarCoder: may the source be with you! arXiv preprint arXiv:2305.06161 (2023)."},{"key":"e_1_3_1_53_2","first-page":"709","article-title":"Changescribe: A tool for automatically generating commit messages","volume":"2","author":"Linares-V\u00e1squez Mario","year":"2015","unstructured":"Mario Linares-V\u00e1squez, Luis Fernando Cort\u00e9s-Coy, Jairo Aponte, and Denys Poshyvanyk 2015. Changescribe: A tool for automatically generating commit messages. In 2015 IEEE\/ACM 37th IEEE International Conference on Software Engineering, Vol. 2. IEEE, 709\u2013712.","journal-title":"2015 IEEE\/ACM 37th IEEE International Conference on Software Engineering"},{"key":"e_1_3_1_54_2","unstructured":"Jiawei Liu Chunqiu Steven Xia Yuyao Wang and Lingming Zhang 2023. Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation. arXiv preprint arXiv:2305.01210 (2023)."},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3560815"},{"key":"e_1_3_1_56_2","doi-asserted-by":"crossref","unstructured":"Qin Liu Zihe Liu Hongming Zhu Hongfei Fan Bowen Du and Yu Qian 2019. Generating commit messages from diffs using pointer-generator network. In 2019 IEEE\/ACM 16th International Conference on Mining Software Repositories (MSR). 
IEEE 299\u2013309.","DOI":"10.1109\/MSR.2019.00056"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2020.3038681"},{"key":"e_1_3_1_58_2","doi-asserted-by":"crossref","unstructured":"Zhongxin Liu Xin Xia Ahmed E Hassan David Lo Zhenchang Xing and Xinyu Wang 2018. Neural-machine-translation-based commit message generation: how far are we?. In Proceedings of the 33rd ACM\/IEEE International Conference on Automated Software Engineering. 373\u2013384.","DOI":"10.1145\/3238147.3238190"},{"key":"e_1_3_1_59_2","doi-asserted-by":"crossref","unstructured":"Pablo Loyola Edison Marrese-Taylor and Yutaka Matsuo 2017. A neural architecture for generating natural language descriptions from source code changes. arXiv preprint arXiv:1704.04856 (2017).","DOI":"10.18653\/v1\/P17-2045"},{"key":"e_1_3_1_60_2","doi-asserted-by":"crossref","unstructured":"Umme Ayda Mannan Iftekhar Ahmed Carlos Jensen and Anita Sarma 2020. On the relationship between design discussions and design quality: a case study of Apache projects. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 543\u2013555.","DOI":"10.1145\/3368089.3409707"},{"key":"e_1_3_1_61_2","doi-asserted-by":"crossref","unstructured":"Leland McInnes John Healy and Steve Astels 2017. hdbscan: Hierarchical density based clustering. J. Open Source Softw. 2 11 (2017) 205.","DOI":"10.21105\/joss.00205"},{"key":"e_1_3_1_62_2","doi-asserted-by":"crossref","unstructured":"Leland McInnes John Healy and James Melville 2018. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018).","DOI":"10.21105\/joss.00861"},{"key":"e_1_3_1_63_2","unstructured":"Diener MJ 2010. Cohen\u2019s d. The Corsini encyclopedia of psychology (2010)."},{"key":"e_1_3_1_64_2","doi-asserted-by":"crossref","unstructured":"Mockus and Votta 2000. 
Identifying reasons for software changes using historic databases. In Proceedings 2000 International Conference on Software Maintenance. IEEE 120\u2013130.","DOI":"10.1109\/ICSM.2000.883028"},{"key":"e_1_3_1_65_2","unstructured":"Tha\u00eds Mombach and Marco Tulio Valente 2018. GitHub REST API vs GHTorrent vs GitHub Archive: A comparative study."},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.05.039"},{"key":"e_1_3_1_67_2","unstructured":"OpenAI. 2023. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL]"},{"key":"e_1_3_1_68_2","doi-asserted-by":"crossref","unstructured":"Keqin Peng Liang Ding Qihuang Zhong Li Shen Xuebo Liu Min Zhang Yuanxin Ouyang and Dacheng Tao 2023. Towards making the most of chatgpt for machine translation. arXiv preprint arXiv:2303.13780 (2023).","DOI":"10.2139\/ssrn.4390455"},{"key":"e_1_3_1_69_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans Ilya Sutskever et al. 2018. Improving language understanding by generative pre-training. (2018)."},{"key":"e_1_3_1_70_2","doi-asserted-by":"crossref","unstructured":"Nils Reimers and Iryna Gurevych 2019. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084 (2019).","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_1_71_2","doi-asserted-by":"crossref","unstructured":"Samantha Robertson Zijie J Wang Dominik Moritz Mary Beth Kery and Fred Hohman 2023. Angler: Helping Machine Translation Practitioners Prioritize Model Improvements. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 
1\u201320.","DOI":"10.1145\/3544548.3580790"},{"key":"e_1_3_1_72_2","doi-asserted-by":"publisher","DOI":"10.1093\/beheco\/ark016"},{"key":"e_1_3_1_73_2","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1109\/COMPSAC.2016.162","article-title":"On automatic summarization of what and why information in source code changes","volume":"1","author":"Shen Jinfeng","year":"2016","unstructured":"Jinfeng Shen, Xiaobing Sun, Bin Li, Hui Yang, and Jiajun Hu 2016. On automatic summarization of what and why information in source code changes. In 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), Vol. 1. IEEE, 103\u2013112.","journal-title":"2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC)"},{"key":"e_1_3_1_74_2","unstructured":"Ensheng Shi Yanlin Wang Wei Tao Lun Du Hongyu Zhang Shi Han Dongmei Zhang and Hongbin Sun 2022. RACE: Retrieval-Augmented Commit Message Generation. arXiv preprint arXiv:2203.02700 (2022)."},{"key":"e_1_3_1_75_2","doi-asserted-by":"crossref","unstructured":"Edward Smith Robert Loftin Emerson Murphy-Hill Christian Bird and Thomas Zimmermann 2013. Improving developer participation rates in surveys. In 2013 6th International workshop on cooperative and human aspects of software engineering (CHASE). IEEE 89\u201392.","DOI":"10.1109\/CHASE.2013.6614738"},{"key":"e_1_3_1_76_2","first-page":"29","article-title":"Javaparser: visited","volume":"10","author":"Smith Nicholas","year":"2017","unstructured":"Nicholas Smith, Danny Van Bruggen, and Federico Tomassetti 2017. Javaparser: visited. Leanpub, oct. de 10 (2017), 29\u201340.","journal-title":"Leanpub, oct. de"},{"key":"e_1_3_1_77_2","unstructured":"Supplementary. 2023. Replication Package. 
https:\/\/figshare.com\/s\/d0d7375a2d19edf62cd4"},{"issue":"06","key":"e_1_3_1_78_2","first-page":"4864","article-title":"A survey on text pre-processing & feature extraction techniques in natural language processing","volume":"7","author":"Tabassum Ayisha","year":"2020","unstructured":"Ayisha Tabassum and Rajendra R Patil 2020. A survey on text pre-processing & feature extraction techniques in natural language processing. International Research Journal of Engineering and Technology (IRJET) 7, 06 (2020), 4864\u20134867.","journal-title":"International Research Journal of Engineering and Technology (IRJET)"},{"key":"e_1_3_1_79_2","doi-asserted-by":"crossref","unstructured":"Wei Tao Yanlin Wang Ensheng Shi Lun Du Shi Han Hongyu Zhang Dongmei Zhang and Wenqiang Zhang 2021. On the evaluation of commit message generation models: An experimental study. In 2021 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE 126\u2013136.","DOI":"10.1109\/ICSME52107.2021.00018"},{"key":"e_1_3_1_80_2","doi-asserted-by":"crossref","unstructured":"James Thorne Andreas Vlachos Christos Christodoulopoulos and Arpit Mittal 2018. FEVER: a large-scale dataset for fact extraction and VERification. arXiv preprint arXiv:1803.05355 (2018).","DOI":"10.18653\/v1\/N18-1074"},{"key":"e_1_3_1_81_2","doi-asserted-by":"crossref","unstructured":"Yingchen Tian Yuxia Zhang Klaas-Jan Stol Lin Jiang and Hui Liu 2022. What makes a good commit message?. In Proceedings of the 44th International Conference on Software Engineering. 2389\u20132401.","DOI":"10.1145\/3510003.3510205"},{"key":"e_1_3_1_82_2","doi-asserted-by":"crossref","first-page":"854","DOI":"10.1609\/icwsm.v17i1.22194","article-title":"A Multi-task Model for Sentiment Aided Stance Detection of Climate Change Tweets","volume":"17","author":"Upadhyaya Apoorva","year":"2023","unstructured":"Apoorva Upadhyaya, Marco Fisichella, and Wolfgang Nejdl 2023. 
A Multi-task Model for Sentiment Aided Stance Detection of Climate Change Tweets. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 17. 854\u2013865.","journal-title":"Proceedings of the International AAAI Conference on Web and Social Media"},{"key":"e_1_3_1_83_2","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin 2017. Attention is all you need. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","DOI":"10.1145\/3464689"},{"key":"e_1_3_1_85_2","doi-asserted-by":"crossref","unstructured":"Liran Wang Xunzhu Tang Yichen He Changyu Ren Shuhua Shi Chaoran Yan and Zhoujun Li 2023. Delving into Commit-Issue Correlation to Enhance Commit Message Generation Models. arXiv preprint arXiv:2308.00147 (2023).","DOI":"10.1109\/ASE56229.2023.00050"},{"key":"e_1_3_1_86_2","doi-asserted-by":"crossref","unstructured":"Ying Wang Bihuan Chen Kaifeng Huang Bowen Shi Congying Xu Xin Peng Yijian Wu and Yang Liu 2020. An empirical study of usages updates and risks of third-party libraries in java projects. In 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE 35\u201345.","DOI":"10.1109\/ICSME46990.2020.00014"},{"key":"e_1_3_1_87_2","doi-asserted-by":"crossref","unstructured":"Yue Wang Weishi Wang Shafiq Joty and Steven CH Hoi 2021. Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. arXiv preprint arXiv:2109.00859 (2021).","DOI":"10.18653\/v1\/2021.emnlp-main.685"},{"key":"e_1_3_1_88_2","unstructured":"Shengbin Xu Yuan Yao Feng Xu Tianxiao Gu Hanghang Tong and Jian Lu 2019. Commit message generation for source code changes. In IJCAI."},{"key":"e_1_3_1_89_2","doi-asserted-by":"crossref","unstructured":"Zhilin Yang Peng Qi Saizheng Zhang Yoshua Bengio William W Cohen Ruslan Salakhutdinov and Christopher D Manning 2018. 
HotpotQA: A dataset for diverse explainable multi-hop question answering. arXiv preprint arXiv:1809.09600 (2018).","DOI":"10.18653\/v1\/D18-1259"},{"key":"e_1_3_1_90_2","unstructured":"Shunyu Yao Jeffrey Zhao Dian Yu Nan Du Izhak Shafran Karthik Narasimhan and Yuan Cao 2022. React: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629 (2022)."},{"key":"e_1_3_1_91_2","doi-asserted-by":"crossref","unstructured":"Bereket A Yilma and Luis A Leiva 2023. The Elements of Visual Art Recommendation: Learning Latent Semantic Representations of Paintings. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1\u201317.","DOI":"10.1145\/3544548.3581477"},{"key":"e_1_3_1_92_2","doi-asserted-by":"crossref","unstructured":"Ting Zhang Ivana Clairine Irsan Ferdian Thung DongGyun Han David Lo and Lingxiao Jiang 2022. Automatic pull request title generation. In 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE 71\u201381.","DOI":"10.1109\/ICSME55016.2022.00015"},{"key":"e_1_3_1_93_2","doi-asserted-by":"crossref","unstructured":"Ting Zhang Ivana Clairine Irsan Ferdian Thung DongGyun Han David Lo and Lingxiao Jiang 2022. iTiger: an automatic issue title generation tool. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1637\u20131641.","DOI":"10.1145\/3540250.3558934"},{"key":"e_1_3_1_94_2","unstructured":"Daniel M Ziegler Nisan Stiennon Jeffrey Wu Tom B Brown Alec Radford Dario Amodei Paul Christiano and Geoffrey Irving 2019. Fine-tuning language models from human preferences. arXiv preprint arXiv:1909.08593 (2019)."},{"key":"e_1_3_1_95_2","doi-asserted-by":"crossref","unstructured":"Thomas Zimmermann 2016. Card-sorting: From text to themes. In Perspectives on data science for software engineering. 
Elsevier 137\u2013141.","DOI":"10.1016\/B978-0-12-804206-9.00027-1"},{"key":"e_1_3_1_96_2","unstructured":"Zulip. 2021. Zulip Commit Guideline. https:\/\/zulip.readthedocs.io\/en\/latest\/contributing\/commit-discipline.html#commit-messages"}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3643760","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3643760","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T07:58:24Z","timestamp":1770191904000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3643760"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,12]]},"references-count":95,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2024,7,12]]}},"alternative-id":["10.1145\/3643760"],"URL":"https:\/\/doi.org\/10.1145\/3643760","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,12]]}}}