{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T00:11:56Z","timestamp":1770509516134,"version":"3.49.0"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2011,2,1]],"date-time":"2011-02-01T00:00:00Z","timestamp":1296518400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004602","name":"Program for New Century Excellent Talents in University","doi-asserted-by":"publisher","award":["NCET-07-0491"],"award-info":[{"award-number":["NCET-07-0491"]}],"id":[{"id":"10.13039\/501100004602","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002855","name":"Ministry of Science and Technology of the People's Republic of China","doi-asserted-by":"publisher","award":["2011CB302206"],"award-info":[{"award-number":["2011CB302206"]}],"id":[{"id":"10.13039\/501100002855","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["60833003"],"award-info":[{"award-number":["60833003"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Data and Information Quality"],"published-print":{"date-parts":[[2011,2]]},"abstract":"<jats:p>\n            Name ambiguity stems from the fact that many people or objects share identical names in the real world. Such name ambiguity decreases the performance of document retrieval, Web search, information integration, and may cause confusion in other applications. Due to the same name spellings and lack of information, it is a nontrivial task to distinguish them accurately. In this article, we focus on investigating the problem in digital libraries to distinguish publications written by authors with identical names. We present an effective framework named GHOST (abbreviation for GrapHical framewOrk for name diSambiguaTion), to solve the problem systematically. We devise a novel similarity metric, and utilize only one type of attribute (i.e., coauthorship) in GHOST. Given the similarity matrix, intermediate results are grouped into clusters with a recently introduced powerful clustering algorithm called\n            <jats:italic>Affinity Propagation<\/jats:italic>\n            . In addition, as a complementary technique, user feedback can be used to enhance the performance. We evaluated the framework on the real DBLP and PubMed datasets, and the experimental results show that GHOST can achieve both high\n            <jats:italic>precision<\/jats:italic>\n            and\n            <jats:italic>recall<\/jats:italic>\n            .\n          <\/jats:p>","DOI":"10.1145\/1891879.1891883","type":"journal-article","created":{"date-parts":[[2011,2,15]],"date-time":"2011-02-15T18:30:59Z","timestamp":1297794659000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":86,"title":["On Graph-Based Name Disambiguation"],"prefix":"10.1145","volume":"2","author":[{"given":"Xiaoming","family":"Fan","sequence":"first","affiliation":[{"name":"Tsinghua University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianyong","family":"Wang","sequence":"additional","affiliation":[{"name":"Tsinghua University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xu","family":"Pu","sequence":"additional","affiliation":[{"name":"Tsinghua University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lizhu","family":"Zhou","sequence":"additional","affiliation":[{"name":"Tsinghua University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bing","family":"Lv","sequence":"additional","affiliation":[{"name":"Tsinghua University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2011,2]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014062"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1060745.1060813"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1217299.1217304"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956759"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2003.1234765"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1150938"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872796"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1255175.1255215"},{"key":"e_1_2_1_9_1","unstructured":"DBLP. 2010. bibliography. http:\/\/www.informatik.uni-trier.de\/~ley\/db\/. DBLP . 2010. bibliography. http:\/\/www.informatik.uni-trier.de\/~ley\/db\/."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066168"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the IEEE International Conference on Computer Vision (ICCV\u201907)","author":"Dueck D.","unstructured":"Dueck , D. and Frey , B. J . 2007. Non-metric affinity propagation for unsupervised image categorization . In Proceedings of the IEEE International Conference on Computer Vision (ICCV\u201907) . 1--8. Dueck, D. and Frey, B. J. 2007. Non-metric affinity propagation for unsupervised image categorization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV\u201907). 1--8."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the Annual Conference on Research in Computational Molecular Biology (RECOMB\u201908)","author":"Dueck D.","unstructured":"Dueck , D. and Frey , B. J . 2008. Constructing treatment portfolios using affinity propagation . In Proceedings of the Annual Conference on Research in Computational Molecular Biology (RECOMB\u201908) . 360--371. Dueck, D. and Frey, B. J. 2008. Constructing treatment portfolios using affinity propagation. In Proceedings of the Annual Conference on Research in Computational Molecular Biology (RECOMB\u201908). 360--371."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458327"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1969.10501049"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1136800"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1151268"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/996350.996419"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065385.1065462"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/223784.223807"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2009.25"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1138394.1138401"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/301136.301255"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1077501.1077514"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btm414"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/347090.347123"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148179"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the SIAM International Conference on Data Mining (SDM\u201907)","author":"On B.-W.","unstructured":"On , B.-W. and Lee , D . 2007. Scalable name disambiguation using multi-level graph partition . In Proceedings of the SIAM International Conference on Data Mining (SDM\u201907) . 575--580. On, B.-W. and Lee, D. 2007. Scalable name disambiguation using multi-level graph partition. In Proceedings of the SIAM International Conference on Data Mining (SDM\u201907). 575--580."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.85"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS\u201902)","author":"Pasula H.","unstructured":"Pasula , H. , Marthi , B. , Milch , B. , Russell , S. , and Shpitser , I . 2002. Identity uncertainty and citation matching . In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS\u201902) . Pasula, H., Marthi, B., Milch, B., Russell, S., and Shpitser, I. 2002. Identity uncertainty and citation matching. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS\u201902)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1081870.1081898"},{"key":"e_1_2_1_31_1","unstructured":"PubMed. 2010. bibliography. http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/ PubMed . 2010. bibliography. http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775087"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.2307\/2786545"},{"key":"e_1_2_1_34_1","volume-title":"Statistical Research Division","author":"Winkler W.","unstructured":"Winkler , W. 1999. The state of record linkage and current research problems. Tech. rep ., Statistical Research Division , U.S. Bureau of the Census. Winkler, W. 1999. The state of record linkage and current research problems. Tech. rep., Statistical Research Division, U.S. Bureau of the Census."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the IEEE International Conference on Data Engineering (ICDE\u201907)","author":"Yin X.","unstructured":"Yin , X. , Han , J. , and Yu , P. S . 2007. Object distinction: Distinguishing objects with identical names . In Proceedings of the IEEE International Conference on Data Engineering (ICDE\u201907) . 1242--1246. Yin, X., Han, J., and Yu, P. S. 2007. Object distinction: Distinguishing objects with identical names. In Proceedings of the IEEE International Conference on Data Engineering (ICDE\u201907). 1242--1246."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1321440.1321600"}],"container-title":["Journal of Data and Information Quality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1891879.1891883","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1891879.1891883","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:59:40Z","timestamp":1750244380000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1891879.1891883"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,2]]},"references-count":36,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2011,2]]}},"alternative-id":["10.1145\/1891879.1891883"],"URL":"https:\/\/doi.org\/10.1145\/1891879.1891883","relation":{},"ISSN":["1936-1955","1936-1963"],"issn-type":[{"value":"1936-1955","type":"print"},{"value":"1936-1963","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,2]]},"assertion":[{"value":"2008-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-02-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}