{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,24]],"date-time":"2024-07-24T15:53:43Z","timestamp":1721836423997},"reference-count":27,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2014,8]]},"abstract":"<jats:p>Title matching refers roughly to the following problem. We are given two strings of text obtained from different data sources. The texts refer to some underlying physical entities and the problem is to report whether the two strings refer to the same physical entity or not. There are manifestations of this problem in a variety of domains, such as product or bibliography matching, and location or person disambiguation.<\/jats:p>\n          <jats:p>We propose a new approach to solving this problem, consisting of two main components. The first component uses Web searches to \"enrich\" the given pair of titles: making titles that refer to the same physical entity more similar, and those which do not, much less similar. A notion of similarity is then measured using the second component, where the tokens from the two titles are modelled as vertices of a \"social\" network graph. A \"strength of ties\" style of clustering algorithm is then applied on this to see whether they form one cohesive \"community\" (matching titles), or separately clustered communities (mismatching titles). Experimental results confirm the effectiveness of our approach over existing title matching methods across several input domains.<\/jats:p>","DOI":"10.14778\/2732977.2732990","type":"journal-article","created":{"date-parts":[[2015,5,12]],"date-time":"2015-05-12T15:37:52Z","timestamp":1431445072000},"page":"1167-1178","source":"Crossref","is-referenced-by-count":14,"title":["Matching titles with cross title web-search enrichment and community detection"],"prefix":"10.14778","volume":"7","author":[{"given":"Nikhil","family":"Londhe","sequence":"first","affiliation":[{"name":"State University of New York, Buffalo"}]},{"given":"Vishrawas","family":"Gopalakrishnan","sequence":"additional","affiliation":[{"name":"State University of New York, Buffalo"}]},{"given":"Aidong","family":"Zhang","sequence":"additional","affiliation":[{"name":"State University of New York, Buffalo"}]},{"given":"Hung Q.","family":"Ngo","sequence":"additional","affiliation":[{"name":"State University of New York, Buffalo"}]},{"given":"Rohini","family":"Srihari","sequence":"additional","affiliation":[{"name":"State University of New York, Buffalo"}]}],"member":"320","published-online":{"date-parts":[[2014,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242591"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-008-0098-x"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1217299.1217304"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1088\/1742-5468\/2008\/10\/P10008"},{"issue":"3","key":"e_1_2_1_5_1","first-page":"41","article-title":"A survey of entity resolution and record linkage methodologies","volume":"6","author":"Brizan D. G.","year":"2006","unstructured":"D. G. Brizan and A. U. Tansel . A survey of entity resolution and record linkage methodologies . Communications of the IIMA , 6 ( 3 ): 41 -- 50 , 2006 . D. G. Brizan and A. U. Tansel. A survey of entity resolution and record linkage methodologies. Communications of the IIMA, 6(3): 41--50, 2006.","journal-title":"Communications of the IIMA"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526778"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.9"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526731"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687673"},{"key":"e_1_2_1_10_1","volume-title":"SimilarWeb Blog","author":"Buchuk Daniel","year":"2013","unstructured":"Daniel Buchuk . Uk online porn ban: Web traffic analysis of britain's porn affair . SimilarWeb Blog , 2013 . Daniel Buchuk. Uk online porn ban: Web traffic analysis of britain's porn affair. SimilarWeb Blog, 2013."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.2105\/AJPH.36.12.1412"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.9"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1969.10501049"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0605965104"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.122653799"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2396839"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87696-0_29"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020474"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920904"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2247596.2247662"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2506179"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775087"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2484028.2484205"},{"key":"e_1_2_1_25_1","volume-title":"Mathematics and the internet: A source of enormous confusion and great potential","author":"Willinger W.","year":"2009","unstructured":"W. Willinger , D. Alderson , and J. C. Doyle . Mathematics and the internet: A source of enormous confusion and great potential . Defense Technical Information Center , 2009 . W. Willinger, D. Alderson, and J. C. Doyle. Mathematics and the internet: A source of enormous confusion and great potential. Defense Technical Information Center, 2009."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000824.2000825"},{"key":"e_1_2_1_27_1","volume-title":"WWW 2007 Workshop on Query Log Analysis: Social And Technological Challenges","author":"Zhang W. V.","year":"2007","unstructured":"W. V. Zhang and R. Jones . Comparing click logs and editorial labels for training query rewriting . In WWW 2007 Workshop on Query Log Analysis: Social And Technological Challenges , 2007 . W. V. Zhang and R. Jones. Comparing click logs and editorial labels for training query rewriting. In WWW 2007 Workshop on Query Log Analysis: Social And Technological Challenges, 2007."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2732977.2732990","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:21:45Z","timestamp":1672226505000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2732977.2732990"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8]]},"references-count":27,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2014,8]]}},"alternative-id":["10.14778\/2732977.2732990"],"URL":"https:\/\/doi.org\/10.14778\/2732977.2732990","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2014,8]]}}}