{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T07:14:35Z","timestamp":1761981275643,"version":"build-2065373602"},"reference-count":34,"publisher":"Association for Computing Machinery (ACM)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2013,5]]},"abstract":"<jats:p>Graphs have been widely used to model complex data in many real-world applications. Answering vertex join queries over large graphs is meaningful and interesting, which can benefit friend recommendation in social networks and link prediction, etc. In this paper, we adopt \"SimRank\" to evaluate the similarity of two vertices in a large graph because of its generality. Note that \"SimRank\" is purely structure dependent and it does not rely on the domain knowledge. Specifically, we define a SimRank-based join (SRJ) query to find all the vertex pairs satisfying the threshold in a data graph<jats:italic>G<\/jats:italic>. In order to reduce the search space, we propose an estimated shortest-path distance based upper bound for SimRank scores to prune unpromising vertex pairs. In the verification, we propose a novel index, called h-go cover, to efficiently compute the SimRank score of a single vertex pair. Given a graph<jats:italic>G<\/jats:italic>, we only materialize the SimRank scores of a small proportion of vertex pairs (called h-go covers), based on which, the SimRank score of any vertex pair can be computed easily. In order to handle large graphs, we extend our technique to the partition-based framework. Thorough theoretical analysis and extensive experiments over both real and synthetic datasets confirm the efficiency and effectiveness of our solution.<\/jats:p>","DOI":"10.14778\/2536349.2536350","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"493-504","source":"Crossref","is-referenced-by-count":43,"title":["Efficient simrank-based similarity join over large graphs"],"prefix":"10.14778","volume":"6","author":[{"given":"Weiguo","family":"Zheng","sequence":"first","affiliation":[{"name":"Peking University, China"}]},{"given":"Lei","family":"Zou","sequence":"additional","affiliation":[{"name":"Peking University, China"}]},{"given":"Yansong","family":"Feng","sequence":"additional","affiliation":[{"name":"Peking University, China"}]},{"given":"Lei","family":"Chen","sequence":"additional","affiliation":[{"name":"Hong Kong University of Science and Technology, China"}]},{"given":"Dongyan","family":"Zhao","sequence":"additional","affiliation":[{"name":"Peking University, China"}]}],"member":"320","published-online":{"date-parts":[[2013,5]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"139","volume-title":"WebKDD\/SNA-KDD","author":"Abbassi Z.","year":"2007","unstructured":"Z. Abbassi and V. S. Mirrokni . A recommender system based on local random walks and spectral methods . In WebKDD\/SNA-KDD , pages 139 - 153 , 2007 . Z. Abbassi and V. S. Mirrokni. A recommender system based on local random walks and spectral methods. In WebKDD\/SNA-KDD, pages 139-153, 2007."},{"volume-title":"nichomachean ethics. Rackman transl","year":"1934","key":"e_1_2_1_2_1","unstructured":"Aristotle. Rhetoric. nichomachean ethics. Rackman transl . Cambridge : Harvard Univ. Press , 23, 1934 . Aristotle. Rhetoric. nichomachean ethics. Rackman transl. Cambridge: Harvard Univ. Press, 23, 1934."},{"issue":"3","key":"e_1_2_1_3_1","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1007\/s00778-005-0177-1","article-title":"Optimization and evaluation of shortest path queries","volume":"16","author":"Chan E. P. F.","year":"2007","unstructured":"E. P. F. Chan and H. Lim . Optimization and evaluation of shortest path queries . VLDB J. , 16 ( 3 ): 343 - 369 , 2007 . E. P. F. Chan and H. Lim. Optimization and evaluation of shortest path queries. VLDB J., 16(3):343-369, 2007.","journal-title":"VLDB J."},{"key":"e_1_2_1_4_1","first-page":"913","volume-title":"ICDE","author":"Cheng J.","year":"2008","unstructured":"J. Cheng , J. X. Yu , B. Ding , P. S. Yu , and H. Wang . Fast graph pattern matching . In ICDE , pages 913 - 922 , 2008 . J. Cheng, J. X. Yu, B. Ding, P. S. Yu, and H. Wang. Fast graph pattern matching. In ICDE, pages 913-922, 2008."},{"volume-title":"Introduction to Algorithms","author":"Cormen T. H.","key":"e_1_2_1_5_1","unstructured":"T. H. Cormen , C. E. Leiserson , R. L. Rivest , and C. Stein . Introduction to Algorithms , Third Edition. The MIT Press . T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, Third Edition. The MIT Press."},{"issue":"1","key":"e_1_2_1_6_1","doi-asserted-by":"crossref","first-page":"439","DOI":"10.4007\/annals.2005.162.439","article-title":"On the hardness of approximating minimum vertex cover","volume":"162","author":"Dinur I.","year":"2005","unstructured":"I. Dinur and S. Safra . On the hardness of approximating minimum vertex cover . Annals of Mathematics , 162 ( 1 ): 439 - 485 , 2005 . I. Dinur and S. Safra. On the hardness of approximating minimum vertex cover. Annals of Mathematics, 162(1):439-485, 2005.","journal-title":"Annals of Mathematics"},{"key":"e_1_2_1_7_1","first-page":"313","volume-title":"ICDE","author":"Fan W.","year":"2011","unstructured":"W. Fan , J. Li , S. Ma , N. Tang , and Y. Wu . Adding regular expressions to graph reachability and pattern queries . In ICDE , pages 313 - 338 , 2011 . W. Fan, J. Li, S. Ma, N. Tang, and Y. Wu. Adding regular expressions to graph reachability and pattern queries. In ICDE, pages 313-338, 2011."},{"key":"e_1_2_1_8_1","first-page":"641","volume-title":"WWW","author":"Fogaras D.","year":"2005","unstructured":"D. Fogaras and B. R\u00e1cz . Scaling link-based similarity search . In WWW , pages 641 - 650 , 2005 . D. Fogaras and B. R\u00e1cz. Scaling link-based similarity search. In WWW, pages 641-650, 2005."},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1007\/978-1-4612-0623-1_3","volume-title":"Numerical Linear Algebra for Applications in Statistics","author":"Gentle J. E.","year":"1998","unstructured":"J. E. Gentle . Gaussian elimination . In Numerical Linear Algebra for Applications in Statistics , pages 87 - 91 , 1998 . J. E. Gentle. Gaussian elimination. In Numerical Linear Algebra for Applications in Statistics, pages 87-91, 1998."},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","first-page":"538","DOI":"10.1145\/775047.775126","volume-title":"KDD","author":"Jeh G.","year":"2002","unstructured":"G. Jeh and J. Widom . Simrank: a measure of structural-context similarity . In KDD , pages 538 - 543 , 2002 . G. Jeh and J. Widom. Simrank: a measure of structural-context similarity. In KDD, pages 538-543, 2002."},{"issue":"4","key":"e_1_2_1_12_1","doi-asserted-by":"crossref","DOI":"10.1145\/1597036.1597045","article-title":"A better approximation ratio for the vertex cover problem","volume":"5","author":"Karakostas G.","year":"2009","unstructured":"G. Karakostas . A better approximation ratio for the vertex cover problem . ACM Transactions on Algorithms , 5 ( 4 ), 2009 . G. Karakostas. A better approximation ratio for the vertex cover problem. ACM Transactions on Algorithms, 5(4), 2009.","journal-title":"ACM Transactions on Algorithms"},{"issue":"1","key":"e_1_2_1_13_1","doi-asserted-by":"crossref","DOI":"10.1006\/jpdc.1997.1404","article-title":"Multilevel k-way partitioning scheme for irregular graphs","volume":"48","author":"Karypis G.","year":"1998","unstructured":"G. Karypis and V. Kumar . Multilevel k-way partitioning scheme for irregular graphs . J. Parallel Distrib. Comput. , 48 ( 1 ), 1998 . G. Karypis and V. Kumar. Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput., 48(1), 1998.","journal-title":"J. Parallel Distrib. Comput."},{"issue":"4","key":"e_1_2_1_14_1","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1016\/0020-0271(63)90016-0","article-title":"Bibliographic coupling extended in time: Ten case histories","volume":"1","author":"Kessler M. M.","year":"1963","unstructured":"M. M. Kessler . Bibliographic coupling extended in time: Ten case histories . Information Storage and Retrieval , 1 ( 4 ): 169 - 187 , 1963 . M. M. Kessler. Bibliographic coupling extended in time: Ten case histories. Information Storage and Retrieval, 1(4):169-187, 1963.","journal-title":"Information Storage and Retrieval"},{"key":"e_1_2_1_15_1","first-page":"901","volume-title":"SIGMOD Conference","author":"Khan A.","year":"2011","unstructured":"A. Khan , N. Li , X. Yan , Z. Guan , S. Chakraborty , and S. Tao . Neighborhood based fast graph search in large networks . In SIGMOD Conference , pages 901 - 912 , 2011 . A. Khan, N. Li, X. Yan, Z. Guan, S. Chakraborty, and S. Tao. Neighborhood based fast graph search in large networks. In SIGMOD Conference, pages 901-912, 2011."},{"issue":"3","key":"e_1_2_1_16_1","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1016\/j.jcss.2007.06.019","article-title":"Vertex cover might be hard to approximate to within 2-epsilon","volume":"74","author":"Khot S.","year":"2008","unstructured":"S. Khot and O. Regev . Vertex cover might be hard to approximate to within 2-epsilon . J. Comput. Syst. Sci. , 74 ( 3 ): 335 - 349 , 2008 . S. Khot and O. Regev. Vertex cover might be hard to approximate to within 2-epsilon. J. Comput. Syst. Sci., 74(3):335-349, 2008.","journal-title":"J. Comput. Syst. Sci."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1145\/1739041.1739098","volume-title":"EDBT","author":"Li C.","year":"2010","unstructured":"C. Li , J. Han , G. He , X. Jin , Y. Sun , Y. Yu , and T. Wu . Fast computation of simrank for static and dynamic information networks . In EDBT , pages 465 - 476 , 2010 . C. Li, J. Han, G. He, X. Jin, Y. Sun, Y. Yu, and T. Wu. Fast computation of simrank for static and dynamic information networks. In EDBT, pages 465-476, 2010."},{"key":"e_1_2_1_18_1","first-page":"389","volume-title":"PAKDD","author":"Li P.","year":"2009","unstructured":"P. Li , Y. Cai , H. Liu , J. He , and X. Du . Exploiting the block structure of link graph for efficient similarity computation . In PAKDD , pages 389 - 400 , 2009 . P. Li, Y. Cai, H. Liu, J. He, and X. Du. Exploiting the block structure of link graph for efficient similarity computation. In PAKDD, pages 389-400, 2009."},{"key":"e_1_2_1_19_1","first-page":"571","volume-title":"SDM","author":"Li P.","year":"2010","unstructured":"P. Li , H. Liu , J. X. Yu , J. He , and X. Du . Fast single-pair simrank computation . In SDM , pages 571 - 582 , 2010 . P. Li, H. Liu, J. X. Yu, J. He, and X. Du. Fast single-pair simrank computation. In SDM, pages 571-582, 2010."},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","first-page":"1019","DOI":"10.1002\/asi.20591","article-title":"The link-prediction problem for social networks","volume":"58","author":"Liben-Nowell D.","year":"2007","unstructured":"D. Liben-Nowell and J. Kleinberg . The link-prediction problem for social networks . Journal of the American Society for Information Science and Technology , 58 : 1019 - 1031 , 2007 . D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58:1019-1031, 2007.","journal-title":"Journal of the American Society for Information Science and Technology"},{"issue":"1","key":"e_1_2_1_21_1","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1007\/s00778-009-0168-8","article-title":"Accuracy estimate and optimization techniques for simrank computation","volume":"19","author":"Lizorkin D.","year":"2010","unstructured":"D. Lizorkin , P. Velikhov , M. N. Grinev , and D. Turdakov . Accuracy estimate and optimization techniques for simrank computation . VLDB J. , 19 ( 1 ): 45 - 66 , 2010 . D. Lizorkin, P. Velikhov, M. N. Grinev, and D. Turdakov. Accuracy estimate and optimization techniques for simrank computation. VLDB J., 19(1):45-66, 2010.","journal-title":"VLDB J."},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1023\/A:1009953814988","article-title":"Automating the construction of internet portals with machine learning","volume":"3","author":"McCallum A.","year":"2000","unstructured":"A. McCallum , K. Nigam , J. Rennie , and K. Seymore . Automating the construction of internet portals with machine learning . Information Retrieval Journal , 3 : 127 - 163 , 2000 . A. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. Information Retrieval Journal, 3:127-163, 2000.","journal-title":"Information Retrieval Journal"},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1146\/annurev.soc.27.1.415","article-title":"Birds of a feather: Homophily in social networks","volume":"27","author":"McPherson M.","year":"2001","unstructured":"M. McPherson , L. S. Lovin , and J. M. Cook . Birds of a feather: Homophily in social networks . Annual Review of Sociology , 27 : 415 - 444 , 2001 . M. McPherson, L. S. Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415-444, 2001.","journal-title":"Annual Review of Sociology"},{"volume-title":"plato in twelve volumes. Bury translator","year":"1968","key":"e_1_2_1_24_1","unstructured":"Plato. Laws. plato in twelve volumes. Bury translator . Cambridge : Harvard Univ. Press , 11, 1968 . Plato. Laws. plato in twelve volumes. Bury translator. Cambridge: Harvard Univ. Press, 11, 1968."},{"key":"e_1_2_1_25_1","first-page":"867","volume-title":"CIKM","author":"Potamias M.","year":"2009","unstructured":"M. Potamias , F. Bonchi , C. Castillo , and A. Gionis . Fast shortest path distance estimation in large networks . In CIKM , pages 867 - 876 , 2009 . M. Potamias, F. Bonchi, C. Castillo, and A. Gionis. Fast shortest path distance estimation in large networks. In CIKM, pages 867-876, 2009."},{"key":"e_1_2_1_26_1","first-page":"47","article-title":"Statistical mechanics of complex networks","volume":"74","author":"Albert R\u00e9ka","year":"2002","unstructured":"R\u00e9ka Albert and Albert-L\u00e1szl\u00f3 Barab\u00e1si . Statistical mechanics of complex networks . Reviews of Mordern Physics , 74 : 47 - 97 , 2002 . R\u00e9ka Albert and Albert-L\u00e1szl\u00f3 Barab\u00e1si. Statistical mechanics of complex networks. Reviews of Mordern Physics, 74:47-97, 2002.","journal-title":"Reviews of Mordern Physics"},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1002\/asi.4630240406","article-title":"Co-citation in the scientific literature: A new measure of the relationship between two documents","volume":"24","author":"Small H.","year":"1973","unstructured":"H. Small . Co-citation in the scientific literature: A new measure of the relationship between two documents . Journal of the American Society for Information Science , 24 : 265 - 269 , 1973 . H. Small. Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24:265-269, 1973.","journal-title":"Journal of the American Society for Information Science"},{"issue":"11","key":"e_1_2_1_28_1","first-page":"714","article-title":"On link-based similarity join","volume":"4","author":"Sun L.","year":"2011","unstructured":"L. Sun , R. Cheng , X. Li , D. W. Cheung , and J. Han . On link-based similarity join . PVLDB , 4 ( 11 ): 714 - 725 , 2011 . L. Sun, R. Cheng, X. Li, D. W. Cheung, and J. Han. On link-based similarity join. PVLDB, 4(11):714-725, 2011.","journal-title":"PVLDB"},{"key":"e_1_2_1_29_1","first-page":"1785","volume-title":"CIKM","author":"Tretyakov K.","year":"2011","unstructured":"K. Tretyakov , A. Armas-Cervantes , L. Garc\u00eda-Ba\u00f1uelos , J. Vilo , and M. Dumas . Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs . In CIKM , pages 1785 - 1794 , 2011 . K. Tretyakov, A. Armas-Cervantes, L. Garc\u00eda-Ba\u00f1uelos, J. Vilo, and M. Dumas. Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs. In CIKM, pages 1785-1794, 2011."},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","first-page":"845","DOI":"10.1145\/1247480.1247573","volume-title":"SIGMOD","author":"Tri\u00dfl S.","year":"2007","unstructured":"S. Tri\u00dfl and U. Leser . Fast and practical indexing and querying of very large graphs . In SIGMOD , pages 845 - 856 , 2007 . S. Tri\u00dfl and U. Leser. Fast and practical indexing and querying of very large graphs. In SIGMOD, pages 845-856, 2007."},{"key":"e_1_2_1_31_1","first-page":"845","volume-title":"ICDE","author":"Wang H.","year":"2006","unstructured":"H. Wang , H. He , J. Yang , P. S. Yu , and J. X. Yu . Dual labeling: Answering graph reachability queries in constant time . In ICDE , pages 845 - 856 , 2006 . H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In ICDE, pages 845-856, 2006."},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1145\/1516360.1516384","volume-title":"EDBT","author":"Zhang S.","year":"2009","unstructured":"S. Zhang , S. Li , and J. Yang . Gaddi: distance index based subgraph matching in biological networks . In EDBT , pages 192 - 203 , 2009 . S. Zhang, S. Li, and J. Yang. Gaddi: distance index based subgraph matching in biological networks. In EDBT, pages 192-203, 2009."},{"issue":"1","key":"e_1_2_1_33_1","first-page":"1185","article-title":"Sapper: Subgraph indexing and approximate matching in large graphs","volume":"3","author":"Zhang S.","year":"2010","unstructured":"S. Zhang , J. Yang , and W. Jin . Sapper: Subgraph indexing and approximate matching in large graphs . PVLDB , 3 ( 1 ): 1185 - 1194 , 2010 . S. Zhang, J. Yang, and W. Jin. Sapper: Subgraph indexing and approximate matching in large graphs. PVLDB, 3(1):1185-1194, 2010.","journal-title":"PVLDB"},{"issue":"1","key":"e_1_2_1_34_1","first-page":"340","article-title":"On graph query optimization in large networks","volume":"3","author":"Zhao P.","year":"2010","unstructured":"P. Zhao and J. Han . On graph query optimization in large networks . PVLDB , 3 ( 1 ): 340 - 351 , 2010 . P. Zhao and J. Han. On graph query optimization in large networks. PVLDB, 3(1):340-351, 2010.","journal-title":"PVLDB"},{"issue":"1","key":"e_1_2_1_35_1","first-page":"886","article-title":"Pattern match query in a large graph database","volume":"2","author":"Zou L.","year":"2009","unstructured":"L. Zou , L. Chen , and M. T. \u00d6zsu . Distancejoin : Pattern match query in a large graph database . PVLDB , 2 ( 1 ): 886 - 897 , 2009 . L. Zou, L. Chen, and M. T. \u00d6zsu. Distancejoin: Pattern match query in a large graph database. PVLDB, 2(1):886-897, 2009.","journal-title":"PVLDB"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2536349.2536350","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,26]],"date-time":"2023-07-26T23:57:37Z","timestamp":1690415857000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2536349.2536350"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,5]]},"references-count":34,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2013,5]]}},"alternative-id":["10.14778\/2536349.2536350"],"URL":"https:\/\/doi.org\/10.14778\/2536349.2536350","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2013,5]]}}}