{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:26:34Z","timestamp":1777854394746,"version":"3.51.4"},"reference-count":13,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2004,12,1]],"date-time":"2004-12-01T00:00:00Z","timestamp":1101859200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2004,12]]},"abstract":"<jats:p>Similarity between objects (documents, persons, answers to a questionnaire, etc.) is generally determined through relations between representations of these objects. In the case of binary representations the presence of a property (e.g. an index term) carries a weight of one, its absence a weight of zero. In many similarity studies common zeros are ignored. This situation is called the zero insensitive case. In this article, however, we study the zero sensitive case. Clearly, answers to binary questionnaires (yes-no, encoded as 1-0) are zero sensitive, as people who answer \u2018no\u2019 to the same questions are more similar than those who give different answers. We present a wish list for such a zero sensitive approach to similarity. Making a difference between common zeros and common ones leads to an \u2018identity-similarity\u2019 theory. Hence, we move beyond a pure similarity theory. Two approaches to the problem of similarity measurement of presence-absence data, where common zeros matter and have the same effect as common ones, are presented. For the case that there is a difference between common ones and common zeros a totally new approach is proposed. In each case a coding approach is used, leading to new representations, which then lead to a similarity ranking. 
Examples of functions respecting these rankings are given.<\/jats:p>\n<jats:p>When discussing similarity in general terms authors should clearly state which requirements they imply for the notion of \u2018similarity\u2019. It is only then that the problem of the best measure for a given study can be brought up for discussion in a meaningful way.<\/jats:p>","DOI":"10.1177\/0165551504047827","type":"journal-article","created":{"date-parts":[[2004,12,3]],"date-time":"2004-12-03T06:37:58Z","timestamp":1102055878000},"page":"509-519","source":"Crossref","is-referenced-by-count":4,"title":["An approach to similarity measurement of absence-presence data: the case that common zeros matter"],"prefix":"10.1177","volume":"30","author":[{"given":"Leo","family":"Egghe","sequence":"first","affiliation":[{"name":"LUC, Universitaire Campus, B-3590 Diepenbeek, Belgium, and UA, IBW, Universiteitsplein 1, B-2610 Wilrijk, Belgium"}]},{"given":"Ronald","family":"Rousseau","sequence":"additional","affiliation":[{"name":"KHBO, IWT, Zeedijk 101, B-8400 Oostende, Belgium, and UA, IBW, Universiteitsplein 1, B-2610 Wilrijk, Belgium"}]}],"member":"179","published-online":{"date-parts":[[2004,12,1]]},"reference":[{"key":"atypb1","volume-title":"Information Processing and Management","author":"L. Egghe","year":"2004"},{"key":"atypb2","volume-title":"Introduction to Modern Information Retrieval","author":"G. Salton","year":"1983"},{"key":"atypb3","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-010-0752-8"},{"key":"atypb4","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199312)44:10<579::AID-ASI3>3.0.CO;2-B"},{"key":"atypb5","doi-asserted-by":"publisher","DOI":"10.1108\/eb026932"},{"key":"atypb6","first-page":"122","volume-title":"Mycology in Sustainable Development: Expanding Concepts, Vanishing Borders","author":"R.E. 
Tulloss","year":"1997"},{"issue":"1","key":"atypb7","first-page":"33","volume":"13","author":"D. Nijssen","year":"1998","journal-title":"Coenoses"},{"key":"atypb8","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1950.tb00463.x"},{"key":"atypb9","volume-title":"Numerical Taxonomy","author":"P.H.A. Sneath","year":"1973"},{"key":"atypb10","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-015-7358-0"},{"key":"atypb11","first-page":"391","volume-title":"Frontiers of Population Ecology","author":"H.P. Possingham","year":"1996"},{"key":"atypb12","doi-asserted-by":"publisher","DOI":"10.1146\/annurev.soc.27.1.415"},{"key":"atypb13","doi-asserted-by":"publisher","DOI":"10.1038\/scientificamerican0603-76"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551504047827","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551504047827","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:06:54Z","timestamp":1777504014000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551504047827"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,12]]},"references-count":13,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2004,12]]}},"alternative-id":["10.1177\/0165551504047827"],"URL":"https:\/\/doi.org\/10.1177\/0165551504047827","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2004,12]]}}}