{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T05:45:20Z","timestamp":1772603120793,"version":"3.50.1"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2017,8,21]],"date-time":"2017-08-21T00:00:00Z","timestamp":1503273600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2018,4,30]]},"abstract":"<jats:p>\n            Similarity search is a fundamental problem in network analysis and can be applied in many applications, such as collaborator recommendation in coauthor networks, friend recommendation in social networks, and relation prediction in medical information networks. In this article, we propose a sampling-based method using random paths to estimate the similarities based on both common neighbors and structural contexts efficiently in very large homogeneous or heterogeneous information networks. We give a theoretical guarantee that the sampling size depends on the error-bound \u03b5, the confidence level (1-\u03b4), and the path length\n            <jats:italic>T<\/jats:italic>\n            of each random walk. We perform an extensive empirical study on a Tencent microblogging network of 1,000,000,000 edges. We show that our algorithm can return top-\n            <jats:italic>k<\/jats:italic>\n            similar vertices for any vertex in a network 300\u00d7 faster than the state-of-the-art methods. We develop a prototype system of recommending similar authors to demonstrate the effectiveness of our method.\n          <\/jats:p>","DOI":"10.1145\/3086695","type":"journal-article","created":{"date-parts":[[2017,8,24]],"date-time":"2017-08-24T11:49:04Z","timestamp":1503575344000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":20,"title":["Fast and Flexible Top-\n            <i>k<\/i>\n            Similarity Search on Large Networks"],"prefix":"10.1145","volume":"36","author":[{"given":"Jing","family":"Zhang","sequence":"first","affiliation":[{"name":"Tsinghua University, Renmin University of China"}]},{"given":"Jie","family":"Tang","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"Cong","family":"Ma","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"Hanghang","family":"Tong","sequence":"additional","affiliation":[{"name":"Arizona State University"}]},{"given":"Yu","family":"Jing","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"Juanzi","family":"Li","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"Walter","family":"Luyten","sequence":"additional","affiliation":[{"name":"KU Leuven, Leuven, Belgium"}]},{"given":"Marie-Francine","family":"Moens","sequence":"additional","affiliation":[{"name":"KU Leuven, Leuven, Belgium"}]}],"member":"320","published-online":{"date-parts":[[2017,8,21]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767885"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623757"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020576"},{"key":"e_1_2_2_4_1","volume-title":"Modern Information Retrieval","author":"Baeza-Yates Ricardo","unstructured":"Ricardo Baeza-Yates and Berthier Ribeiro-Neto . 1999. Modern Information Retrieval . Vol. 463 . ACM , New York, NY . Ricardo Baeza-Yates and Berthier Ribeiro-Neto. 1999. Modern Information Retrieval. Vol. 463. ACM, New York, NY."},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1137\/S0036144502415960"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1142351.1142388"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/0378-8733(90)90023-3"},{"key":"e_1_2_2_8_1","volume-title":"Structural Holes: The Social Structure of Competition","author":"Burt Ronald S.","year":"2009","unstructured":"Ronald S. Burt . 2009 . Structural Holes: The Social Structure of Competition . Harvard University Press , Cambridge, MA . Ronald S. Burt. 2009. Structural Holes: The Social Structure of Competition. Harvard University Press, Cambridge, MA."},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1002574"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065167.1065201"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623703"},{"key":"e_1_2_2_12_1","doi-asserted-by":"crossref","unstructured":"Nick Duffield Yunhong Xu Liangzhen Xia Nesreen Ahmed and Minlan Yu. 2017. Stream aggregation through order samplingarXiv:1703.02693.  Nick Duffield Yunhong Xu Liangzhen Xia Nesreen Ahmed and Minlan Yu. 2017. Stream aggregation through order samplingarXiv:1703.02693.","DOI":"10.1145\/3132847.3133042"},{"key":"e_1_2_2_13_1","volume-title":"An Introduction to Probability Theory and its Applications","author":"Feller William","unstructured":"William Feller . 2008. An Introduction to Probability Theory and its Applications . Vol. 2 . John Wiley 8 Sons. William Feller. 2008. An Introduction to Probability Theory and its Applications. Vol. 2. John Wiley 8 Sons."},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.2307\/3033543"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/0378-8733(78)90021-7"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2463717"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0701361104"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TST.2016.7399279"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2339530.2339723"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020512"},{"key":"e_1_2_2_21_1","article-title":"An exponential family of probability distributions for directed graphs","volume":"76","author":"Holland Paul W.","year":"1981","unstructured":"Paul W. Holland and Samuel Leinhardt . 1981 . An exponential family of probability distributions for directed graphs . Journal of the American Statistical Association 76 , 373, 33--50. Paul W. Holland and Samuel Leinhardt. 1981. An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association 76, 373, 33--50.","journal-title":"Journal of the American Statistical Association"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063576.2063740"},{"key":"e_1_2_2_23_1","first-page":"547","article-title":"\u00c9tude comparative de le distribution florale dans une portion de alpes et du jura","volume":"37","author":"Jaccard Paul","year":"1901","unstructured":"Paul Jaccard . 1901 . \u00c9tude comparative de le distribution florale dans une portion de alpes et du jura . Bulletin de la Soci\u00e9t\u00e9 Vaudoise des Sciences Naturelles 37 , 547 -- 579 . Paul Jaccard. 1901. \u00c9tude comparative de le distribution florale dans une portion de alpes et du jura. Bulletin de la Soci\u00e9t\u00e9 Vaudoise des Sciences Naturelles 37, 547--579.","journal-title":"Bulletin de la Soci\u00e9t\u00e9 Vaudoise des Sciences Naturelles"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775126"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2487678"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741101"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020561"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bth163"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02289026"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.5090140103"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11432-016-0074-9"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2610526"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2012.109"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.73.026120"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783285"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2488388.2488461"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu269"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.74.036104"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014135"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556569"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2013.6691744"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2398454"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2556195.2556224"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2014.2349913"},{"key":"e_1_2_2_45_1","volume-title":"Ahmed","author":"Rossi Ryan A.","year":"2017","unstructured":"Ryan A. Rossi , Rong Zhou , and Nesreen K . Ahmed . 2017 . Estimation of graphlet statisticsarXiv:1701.01772. Ryan A. Rossi, Rong Zhou, and Nesreen K. Ahmed. 2017. Estimation of graphlet statisticsarXiv:1701.01772."},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835804.1835871"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1970392.1970397"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2013.2297920"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630240406"},{"key":"e_1_2_2_50_1","volume-title":"Proceedings of the 2011 VLDB Conference (VLDB\u201911)","author":"Sun Yizhou","year":"2011","unstructured":"Yizhou Sun , Jiawei Han , Xifeng Yan , Philip S. Yu , and Tianyi Wu . 2011 . PathSim: Meta path-based top-k similarity search in heterogeneous information networks . In Proceedings of the 2011 VLDB Conference (VLDB\u201911) . 992--1003. Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, and Tianyi Wu. 2011. PathSim: Meta path-based top-k similarity search in heterogeneous information networks. In Proceedings of the 2011 VLDB Conference (VLDB\u201911). 992--1003."},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/1401890.1402008"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.70"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1080\/15427951.2013.836581"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1137\/1116025"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/RT.2006.280216"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCBB.2006.51"},{"key":"e_1_2_2_57_1","volume-title":"Proceedings of the 2014 AAAI Conference (AAAI\u201914)","author":"Yang Yang","year":"2014","unstructured":"Yang Yang , Jie Tang , Cane Wing-Ki Leung , Yizhou Sun , Qicong Chen , Juanzi Li , and Qiang Yang . 2014 . RAIN: Social role-aware information diffusion . In Proceedings of the 2014 AAAI Conference (AAAI\u201914) . 367--373. Yang Yang, Jie Tang, Cane Wing-Ki Leung, Yizhou Sun, Qicong Chen, Juanzi Li, and Qiang Yang. 2014. RAIN: Social role-aware information diffusion. In Proceedings of the 2014 AAAI Conference (AAAI\u201914). 367--373."},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2014.2373385"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783267"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3086695","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3086695","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:30:13Z","timestamp":1750217413000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3086695"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,8,21]]},"references-count":59,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2018,4,30]]}},"alternative-id":["10.1145\/3086695"],"URL":"https:\/\/doi.org\/10.1145\/3086695","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"value":"1046-8188","type":"print"},{"value":"1558-2868","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,8,21]]},"assertion":[{"value":"2016-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-08-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}