{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T16:02:04Z","timestamp":1761580924643,"version":"3.40.5"},"reference-count":24,"publisher":"IGI Global","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,4,1]]},"abstract":"<p>In this paper, the authors present an empirical evaluation of similarity coefficients for binary valued data. Similarity coefficients provide a means to measure the similarity or distance between two binary valued objects in a dataset such that the attributes qualifying each object have a 0-1 value. This is useful in several domains, such as similarity of feature vectors in sensor networks, document search, router network mining, and web mining. The authors survey 35 similarity coefficients used in various domains and present conclusions about the efficacy of the similarity computed in (1) labeled data to quantify the accuracy of the similarity coefficients, (2) varying density of the data to evaluate the effect of sparsity of the values, and (3) varying number of attributes to see the effect of high dimensionality in the data on the similarity computed.<\/p>","DOI":"10.4018\/jdwm.2011040103","type":"journal-article","created":{"date-parts":[[2011,10,19]],"date-time":"2011-10-19T16:11:25Z","timestamp":1319040685000},"page":"44-66","source":"Crossref","is-referenced-by-count":9,"title":["An Empirical Evaluation of Similarity Coefficients for Binary Valued Data"],"prefix":"10.4018","volume":"7","author":[{"given":"David M.","family":"Lewis","sequence":"first","affiliation":[{"name":"Carnegie Mellon University, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0130-6135","authenticated-orcid":true,"given":"Vandana P.","family":"Janeja","sequence":"additional","affiliation":[{"name":"University of Maryland, Baltimore County, USA"}]}],"member":"2432","reference":[{"doi-asserted-by":"crossref","unstructured":"Adam, N. R., Janeja, V. P., & Atluri, V. (2004). Neighborhood based detection of anomalies in high dimensional spatio-temporal sensor datasets. In Proceedings of the 2004 ACM Symposium on Applied Computing.","key":"jdwm.2011040103-0","DOI":"10.1145\/967900.968020"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-1","DOI":"10.1016\/j.datak.2007.03.016"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-2","DOI":"10.4238\/vol7-3gmr458"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-3","DOI":"10.2307\/2412493"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-4","DOI":"10.2307\/3236912"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-5","DOI":"10.1128\/JCM.43.11.5483-5490.2005"},{"doi-asserted-by":"crossref","unstructured":"Cha, S-H., Tappert, C., & Yoon, S. (2006). Enhancing binary feature vector similarity measures. Journal of Pattern Recognition Research, 63-77.","key":"jdwm.2011040103-6","DOI":"10.13176\/11.20"},{"issue":"1","key":"jdwm.2011040103-7","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1590\/S1415-47572004000100014","article-title":"Comparison of similarity coefficients used for cluster analysis with dominant markers in maize (zea mays l).","volume":"27","author":"A.da Silva Meyer","year":"2004","journal-title":"Genetics and Molecular Biology"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-8","DOI":"10.1673\/031.009.7101"},{"doi-asserted-by":"crossref","unstructured":"Guha, S., Rastogi, R., & Shim, K. (2000). Rock: A robust clustering algorithm for categorical attributes. In Proceedings of the 15th International Conference on Data Engineering.","key":"jdwm.2011040103-9","DOI":"10.1109\/ICDE.1999.754967"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-10","DOI":"10.1021\/ci700413a"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-11","DOI":"10.1086\/284927"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-12","DOI":"10.1007\/s10618-009-0147-0"},{"doi-asserted-by":"crossref","unstructured":"Karabatis, G., Chen, Z., Janeja, V. P., Lobo, T., Advani, M., Lindvall, M., & Feldmann, R. L. (2009). Using semantic networks and context in search for relevant software engineering artifacts. Journal of Data Semantics.","key":"jdwm.2011040103-13","DOI":"10.1007\/978-3-642-10562-3_3"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-14","DOI":"10.1111\/j.1365-294X.2005.02416.x"},{"unstructured":"Lewis, D. M. (2008). Using similarity coefficients to identify synonymous routers.","key":"jdwm.2011040103-15"},{"unstructured":"Lewis, D. M., & Janeja, V. P. (2009). An evaluative comparison of similarity coefficients for binary valued data. In Proceedings of ACM SIGMOD: Undergraduate Research Competition.","key":"jdwm.2011040103-16"},{"doi-asserted-by":"crossref","unstructured":"Lindvall, M., Feldmann, R. L., Karabatis, G., Chen, Z., & Janeja, V. P. (2009). Searching for relevant software change artifacts using semantic networks. In Proceedings of the Symposium on Applied Computing (pp. 496-500).","key":"jdwm.2011040103-17","DOI":"10.1145\/1529282.1529387"},{"issue":"3","key":"jdwm.2011040103-18","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1590\/S1415-47571999000300024","article-title":"Comparison of similarity coefficients based on rapd markers in the common bean.","volume":"22","author":"J.Moura Duarte","year":"1999","journal-title":"Genetics and Molecular Biology"},{"key":"jdwm.2011040103-19","first-page":"415","article-title":"Estimating the effect of the similarity coefficient and the cluster algorithm on biogeographic classifications.","volume":"40","author":"M.Murguia","year":"2003","journal-title":"Annales Botanici Fennici"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-20","DOI":"10.1007\/s00357-008-9024-6"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-21","DOI":"10.4018\/jdwm.2010040104"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-22","DOI":"10.1016\/j.cie.2003.01.001"},{"doi-asserted-by":"publisher","key":"jdwm.2011040103-23","DOI":"10.4018\/jdwm.2008010104"}],"container-title":["International Journal of Data Warehousing and Mining"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=53039","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,1]],"date-time":"2022-06-01T20:13:02Z","timestamp":1654114382000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jdwm.2011040103"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2011,4,1]]},"references-count":24,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2011,4]]}},"URL":"https:\/\/doi.org\/10.4018\/jdwm.2011040103","relation":{},"ISSN":["1548-3924","1548-3932"],"issn-type":[{"type":"print","value":"1548-3924"},{"type":"electronic","value":"1548-3932"}],"subject":[],"published":{"date-parts":[[2011,4,1]]}}}