{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T09:24:46Z","timestamp":1776158686622,"version":"3.50.1"},"reference-count":106,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2015,12,2]],"date-time":"2015-12-02T00:00:00Z","timestamp":1449014400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Louisiana Board of Regents Research Competitiveness Subprogram","award":["LEQSF(2015-18)-RD-A-07"],"award-info":[{"award-number":["LEQSF(2015-18)-RD-A-07"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2015,12,2]]},"abstract":"<jats:p>Contemporary software engineering tools exploit semantic relations between individual code terms to aid in code analysis and retrieval tasks. Such tools employ word similarity methods, often used in natural language processing (<jats:sc>nlp<\/jats:sc>), to analyze the textual content of source code. However, the notion of similarity in source code is different from natural language. Source code often includes unnatural domain-specific terms (e.g., abbreviations and acronyms), and such terms might be related due to their structural relations rather than linguistic aspects. Therefore, applying natural language similarity methods to source code without adjustment can produce low-quality and error-prone results. Motivated by these observations, we systematically investigate the performance of several semantic-relatedness methods in the context of software. Our main objective is to identify the most effective semantic schemes in capturing association relations between source code terms. 
To provide an unbiased comparison, different methods are compared against human-generated relatedness information using terms from three software systems. Results show that corpus-based methods tend to outperform methods that exploit external sources of semantic knowledge. However, due to inherent code limitations, the performance of such methods is still suboptimal. To address these limitations, we propose Normalized Software Distance (<jats:sc>nsd<\/jats:sc>), an information-theoretic method that captures semantic relatedness in source code by exploiting the distributional cues of code terms across the system. <jats:sc>nsd<\/jats:sc> overcomes data sparsity and lack of context problems often associated with source code, achieving higher levels of resemblance to the human perception of relatedness at the term and the text levels of code.<\/jats:p>","DOI":"10.1145\/2824251","type":"journal-article","created":{"date-parts":[[2015,12,4]],"date-time":"2015-12-04T13:43:07Z","timestamp":1449236587000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["Estimating Semantic Relatedness in Source Code"],"prefix":"10.1145","volume":"25","author":[{"given":"Anas","family":"Mahmoud","sequence":"first","affiliation":[{"name":"Louisiana State University, Baton Rouge, LA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gary","family":"Bradshaw","sequence":"additional","affiliation":[{"name":"Mississippi State University, MS"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2015,12,2]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Mining Text Data","author":"Aggarwal Charu"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/1620754.1620758"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/832306.837051"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the Conference of the Centre for 
Advanced Studies on Collaborative Research. 4--14","author":"Anquetil Nicolas","year":"1998"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1049\/ip-sen:20030581"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1076034.1076134"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2013.60"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-012-9225-9"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242675"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli.2006.32.1.13"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the International Conference on Large Scale Semantic Access to Content (Text, Image, Video, and Sound). 314--332","author":"Budiu Raluca","year":"2007"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.3758\/BF03193020"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1060745.1060811"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/850948.853439"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2009.07.016"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/89086.89095"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.48"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/RE.2005.78"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSM.2013.43"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/832307.837119"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2012.6240488"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-008-9090-8"},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Angela Dean and Daniel Voss. 1999. Design and Analysis of Experiments. Springer. Angela Dean and Daniel Voss. 1999. Design and Analysis of Experiments. 
Springer.","DOI":"10.1007\/b97673"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9"},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Serge Demeyer St\u00e9phane Ducasse and Oscar Nierstrasz. 2003. Object-Oriented Reengineering Patterns. Elsevier. Serge Demeyer St\u00e9phane Ducasse and Oscar Nierstrasz. 2003. Object-Oriented Reengineering Patterns. Elsevier.","DOI":"10.1016\/B978-155860639-5\/50006-7"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1137\/0911052"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0378-2166(00)00068-0"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2010.12"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"C. Fellbaum. 1998. WordNet: An Electronic Lexical Database. MIT Press. C. Fellbaum. 1998. WordNet: An Electronic Lexical Database. MIT Press.","DOI":"10.7551\/mitpress\/7287.001.0001"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/503104.503110"},{"key":"e_1_2_1_31_1","first-page":"1","article-title":"The intelligent essay assessor: Applications to educational technology","volume":"1","author":"Foltz Peter","year":"1999","journal-title":"Interact. Multimedia Educ. J. Comput. Enhanced Learn."},{"key":"e_1_2_1_32_1","volume-title":"Refactoring: Improving the Design of Existing Code. Addison--Wesley.","author":"Fowler Martin","year":"1999"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/1927229.1927242"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1882291.1882315"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the International Joint Conference on Artifical Intelligence. 
1606--1611","author":"Gabrilovich Evgeniy","year":"2007"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-85481-4_12"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1145581.1145630"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/SCAM.2010.22"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597008.2597138"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the Annual Meeting of the Association for Computational Linguistics. 239--249","author":"Guo Weiwei","year":"2013"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/WCRE.2010.13"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.5555\/2486788.2486898"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/243199.243216"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1370750.1370771"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.5555\/2337223.2337322"},{"key":"e_1_2_1_46_1","volume-title":"Quality Issues in the Management of Web Information","author":"Holzinger Andreas"},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the Working Conference on Mining Software Repositories. 377--386","author":"Howard Matthew"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2006.3"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376815.1376819"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the International Conference on Language Resources and Evaluation. 1033--1038"},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the International Conference on Research in Computational Linguistics. 
19--33","author":"Jiang Jay","year":"1997"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2002.1019480"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1987.1165125"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2006.10.017"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.104.2.211"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/SCAM.2007.9"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11334-007-0031-2"},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","unstructured":"Claudia Leacock and Martin Chodorow. 1998. Combining Local Context and WordNet Similarity for Word Sense Identification. MIT Press 265--283. Claudia Leacock and Martin Chodorow. 1998. Combining Local Context and WordNet Similarity for Word Sense Identification. MIT Press 265--283.","DOI":"10.7551\/mitpress\/7287.003.0018"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1016\/0164-1212(79)90022-0"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/318723.318728"},{"key":"e_1_2_1_61_1","volume-title":"Proceedings of the International Conference on Machine Learning. 
296--304","author":"Lin Dekang","year":"1998"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491411.2491432"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.3758\/BF03204766"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00766-013-0199-y"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.5555\/519308.786914"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2007.70732"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.5555\/872023.872542"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.5555\/776816.776832"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2007.70768"},{"key":"e_1_2_1_70_1","volume-title":"Proceedings of the National Conference on Artificial Intelligence. 775--780","author":"Mihalcea Rada","year":"2006"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1080\/01690969108406936"},{"key":"e_1_2_1_72_1","volume-title":"Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics. 100--108","author":"Newman David","year":"2010"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/RE.2012.6345842"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2010.20"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.5555\/2486788.2486857"},{"key":"e_1_2_1_76_1","volume-title":"Lecture Notes in Computer Science","volume":"7171","author":"Pollock Lori","year":"2013"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2006.17"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963455"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.3758\/BRM.41.3.647"},{"key":"e_1_2_1_80_1","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence. 
448--453","author":"Resnik Philip","year":"1995"},{"key":"e_1_2_1_81_1","unstructured":"B. Rosario. 2000. Latent Semantic Indexing: An Overview. INFOSYS 240 Spring Paper University of California Berkeley. B. Rosario. 2000. Latent Semantic Indexing: An Overview. INFOSYS 240 Spring Paper University of California Berkeley."},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1145\/361219.361220"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2011.13"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1145\/1218563.1218587"},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSM.2011.6080802"},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/16.1.30"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1145\/345508.345578"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.5555\/525595.836975"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1145\/1871985.1871996"},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb026526"},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1145\/1858996.1859006"},{"key":"e_1_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2008.18"},{"key":"e_1_2_1_93_1","volume-title":"Proceedings of the National Conference on Artificial Intelligence. 1419--1424","author":"Strube Michael","year":"2006"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.v59:1"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1109\/CSMR-WCRE.2014.6747213"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.5555\/645328.650004"},{"key":"e_1_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.5555\/832307.837118"},{"key":"e_1_2_1_98_1","unstructured":"C. J. van Rijsbergen. 1979. Information Retrieval. Butterworths. C. J. van Rijsbergen. 1979. Information Retrieval. 
Butterworths."},{"key":"e_1_2_1_99_1","volume-title":"Proceedings of the Annual Meeting of the Cognitive Science Society. 1282--1287","author":"Veksler Vladislav","year":"2008"},{"key":"e_1_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148204"},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.5555\/851042.857040"},{"key":"e_1_2_1_102_1","volume-title":"Proceedings of the International Workshop on Program Comprehension. 194--203","author":"Wen Zhihua","year":"2004"},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-009-0203-5"},{"key":"e_1_2_1_104_1","doi-asserted-by":"publisher","DOI":"10.3115\/981732.981751"},{"key":"e_1_2_1_105_1","doi-asserted-by":"publisher","DOI":"10.1177\/0047287508321193"},{"key":"e_1_2_1_106_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-013-9264-x"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2824251","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2824251","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T05:43:21Z","timestamp":1750225401000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2824251"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,12,2]]},"references-count":106,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2015,12,2]]}},"alternative-id":["10.1145\/2824251"],"URL":"https:\/\/doi.org\/10.1145\/2824251","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,12,2]]},"assertion":[{"value":"2014-11-01","order":0,"name":"received","label":"Received","gr
oup":{"name":"publication_history","label":"Publication History"}},{"value":"2015-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-12-02","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}