{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T19:49:01Z","timestamp":1774986541036,"version":"3.50.1"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"name":"AISG","award":["AISG2-TC-2021-002"],"award-info":[{"award-number":["AISG2-TC-2021-002"]}]},{"DOI":"10.13039\/501100001459","name":"the Ministry of Education, Singapore","doi-asserted-by":"crossref","award":["MOE2019-T2-2-065"],"award-info":[{"award-number":["MOE2019-T2-2-065"]}],"id":[{"id":"10.13039\/501100001459","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001459","name":"the Singapore Ministry of Education","doi-asserted-by":"crossref","award":["23-SIS-SMU-063"],"award-info":[{"award-number":["23-SIS-SMU-063"]}],"id":[{"id":"10.13039\/501100001459","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,6,17]]},"abstract":"<jats:p>Community detection in heterogeneous information networks (HINs) poses significant challenges due to the diversity of entity types and the complexity of their interrelations. While traditional algorithms may perform adequately in some scenarios, many struggle with the high memory usage and computational demands of large-scale HINs. To address these challenges, we introduce a novel framework, SCAR, which efficiently uncovers community structures in HINs without requiring network materialization. SCAR leverages insights from meta-paths to interpret multi-relational data through compact vertex-based sketches, significantly reducing computational overhead and materialization overhead. We propose a sketch-based technique for estimating changes in modularity, improving both the precision and speed in community detection. 
Our extensive evaluations on diverse real-world datasets provide detailed comparative metrics, demonstrating that SCAR outperforms several state-of-the-art methods, including Gdy, Louvain, Leiden, Infomap, Walktrap, and Networkit, in execution time and memory consumption while maintaining competitive accuracy. Overall, SCAR offers a robust and scalable solution for revealing community structures in large HINs, with applications across various domains, including social networks, academic collaboration networks, and e-commerce platforms.<\/jats:p>","DOI":"10.1145\/3725276","type":"journal-article","created":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T21:23:29Z","timestamp":1750281809000},"page":"1-27","source":"Crossref","is-referenced-by-count":1,"title":["Community Detection in Heterogeneous Information Networks Without Materialization"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8748-3225","authenticated-orcid":false,"given":"Jiaxin","family":"Jiang","sequence":"first","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8243-3947","authenticated-orcid":false,"given":"Siyuan","family":"Yao","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-2301-4244","authenticated-orcid":false,"given":"Yuhang","family":"Chen","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8618-4581","authenticated-orcid":false,"given":"Bingsheng","family":"He","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5434-8577","authenticated-orcid":false,"given":"Yudong","family":"Niu","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, 
Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9646-291X","authenticated-orcid":false,"given":"Yuchen","family":"Li","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4060-9438","authenticated-orcid":false,"given":"Shixuan","family":"Sun","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3440-9675","authenticated-orcid":false,"given":"Yongchao","family":"Liu","sequence":"additional","affiliation":[{"name":"Ant Group, Hangzhou, Zhejiang, China"}]}],"member":"320","published-online":{"date-parts":[[2025,6,18]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457256"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-25952-7_28"},{"key":"e_1_2_1_3_1","volume-title":"Fast unfolding of communities in large networks: 15 years later. arXiv preprint arXiv:2311.06047","author":"Blondel Vincent","year":"2023","unstructured":"Vincent Blondel, Jean-Loup Guillaume, and Renaud Lambiotte. 2023. Fast unfolding of communities in large networks: 15 years later. 
arXiv preprint arXiv:2311.06047 (2023)."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1088\/1742-5468\/2008\/10\/P10008"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-44436-X_10"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3543507.3583322"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-30952-7_50"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/3681954.3682028"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.70.066111"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.1997.1534"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1281100.1281133"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1088\/1742-5468\/2005\/09\/P09008"},{"key":"e_1_2_1_13_1","volume-title":"Order statistics","author":"David Herbert A","unstructured":"Herbert A David and Haikady N Nagaraja. 2004. Order statistics. John Wiley & Sons."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/3380750.3380756"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407797"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE60146.2024.00189"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939747"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2020.3027950"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2610495"},{"key":"e_1_2_1_20_1","volume-title":"PP-DBLP: Modeling and Generating Attributed Public-Private Networks with DBLP","author":"Huang Xin","unstructured":"Xin Huang, Jiaxin Jiang, Byron Choi, Jianliang Xu, Zhiwei Zhang, and Yunya Song. 2018. PP-DBLP: Modeling and Generating Attributed Public-Private Networks with DBLP. In ICDM. IEEE."},{"key":"e_1_2_1_21_1","volume-title":"Mucha","author":"Jeub Lucas G. 
S.","year":"2017","unstructured":"Lucas G. S. Jeub, Marya Bazzi, Inderjit S. Jutla, and Peter J. Mucha. 2017. A generalized Louvain method for community detection implemented in MATLAB. https:\/\/github.com\/GenLouvain\/GenLouvain. Accessed: insert-date-here."},{"key":"e_1_2_1_22_1","volume-title":"Spade: A Generic Real-Time Fraud Detection Framework on Dynamic Graphs","author":"Jiang Jiaxin","year":"2024","unstructured":"Jiaxin Jiang, Yuhang Chen, Bingsheng He, Min Chen, and Jia Chen. 2024. Spade: A Generic Real-Time Fraud Detection Framework on Dynamic Graphs. IEEE Transactions on Knowledge and Data Engineering (2024)."},{"key":"e_1_2_1_23_1","volume-title":"PPKWS: An Efficient Framework for Keyword Search on Public-Private Networks.","author":"Jiang Jiaxin","year":"2020","unstructured":"Jiaxin Jiang, Xin Huang, Byron Choi, Jianliang Xu, Sourav S. Bhowmick, and Lyu Xu. 2020. PPKWS: An Efficient Framework for Keyword Search on Public-Private Networks."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.14778\/3570690.3570696"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-020-00649-y"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850469.2850471"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the European Conference on Machine Learning (ECML).","author":"Klimat B","year":"2004","unstructured":"B Klimat. 2004. The Enron corpus: A new dataset for email classification research. In Proceedings of the European Conference on Machine Learning (ECML). 2004."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.physrep.2019.10.004"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132847.3132960"},{"key":"e_1_2_1_30_1","volume-title":"International Conference on Extending Database Technology","volume":"2015","author":"Kuck Jonathan","year":"2015","unstructured":"Jonathan Kuck, Honglei Zhuang, Xifeng Yan, Hasan Cam, and Jiawei Han. 2015. 
Query-based outlier detection in heterogeneous information networks. In Advances in database technology: proceedings. International Conference on Extending Database Technology, Vol. 2015. NIH Public Access, 325."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3523275"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407843"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00084"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529340"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3034214"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389697"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-44853-5_10"},{"key":"e_1_2_1_38_1","volume-title":"Multivariate chebyshev inequalities. The Annals of Mathematical Statistics","author":"Marshall Albert W","year":"1960","unstructured":"Albert W Marshall and Ingram Olkin. 1960. Multivariate chebyshev inequalities. The Annals of Mathematical Statistics (1960), 1001--1014."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/435"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0601602103"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180143"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings 20","author":"Pons Pascal","year":"2005","unstructured":"Pascal Pons and Matthieu Latapy. 2005. Computing communities in large networks using random walks. In Computer and Information Sciences-ISCIS 2005: 20th International Symposium, Istanbul, Turkey, October 26--28, 2005. Proceedings 20. 
Springer, 284--293."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0706851105"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSS.2020.2964197"},{"key":"e_1_2_1_45_1","volume-title":"A real-time detecting algorithm for tracking community structure of dynamic networks. arXiv preprint arXiv:1407.2683","author":"Shang Jiaxing","year":"2014","unstructured":"Jiaxing Shang, Lianchen Liu, Feng Xie, Zhen Chen, Jiajia Miao, Xuelin Fang, and Cheng Wu. 2014. A real-time detecting algorithm for tracking community structure of dynamic networks. arXiv preprint arXiv:1407.2683 (2014)."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2016.2598561"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806528"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098087"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.61"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1017\/nws.2016.20"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.14778\/3402707.3402736"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554901"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2500492"},{"key":"e_1_2_1_54_1","volume-title":"Community detection via heterogeneous interaction analysis. Data mining and knowledge discovery","author":"Tang Lei","year":"2012","unstructured":"Lei Tang, Xufei Wang, and Huan Liu. 2012. Community detection via heterogeneous interaction analysis. Data mining and knowledge discovery, Vol. 25 (2012), 1--33."},{"key":"e_1_2_1_55_1","volume-title":"From Louvain to","author":"Traag Vincent A","year":"2019","unstructured":"Vincent A Traag, Ludo Waltman, and Nees Jan Van Eck. 2019. From Louvain to Leiden: guaranteeing well-connected communities. Scientific reports, Vol. 
9, 1 (2019), 5233."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741098"},{"key":"e_1_2_1_57_1","volume-title":"Proceedings of the 17th International Conference on Database Theory (ICDT). OpenProceedings.org, 96--106","author":"Veldhuizen Todd L.","year":"2014","unstructured":"Todd L. Veldhuizen. 2014. Triejoin: A Simple, Worst-Case Optimal Join Algorithm. In Proceedings of the 17th International Conference on Database Theory (ICDT). OpenProceedings.org, 96--106."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE48307.2020.00083"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654982"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3452774"},{"key":"e_1_2_1_61_1","first-page":"601","article-title":"Modularity based community detection in heterogeneous networks","volume":"30","author":"Zhang Jingfei","year":"2020","unstructured":"Jingfei Zhang and Yuguo Chen. 2020. Modularity based community detection in heterogeneous networks. Statistica Sinica, Vol. 30, 2 (2020), 601--629.","journal-title":"Statistica Sinica"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2017.1700481"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.14778\/3594512.3594532"},{"key":"e_1_2_1_64_1","first-page":"1934","article-title":"DynaMo: Dynamic community detection by incrementally maximizing modularity","volume":"33","author":"Zhuang Di","year":"2019","unstructured":"Di Zhuang, J Morris Chang, and Mingchen Li. 2019. DynaMo: Dynamic community detection by incrementally maximizing modularity. IEEE Transactions on Knowledge and Data Engineering, Vol. 
33, 5 (2019), 1934--1945.","journal-title":"IEEE Transactions on Knowledge and Data Engineering"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3725276","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:54:48Z","timestamp":1774983288000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3725276"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,17]]},"references-count":64,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,6,17]]}},"alternative-id":["10.1145\/3725276"],"URL":"https:\/\/doi.org\/10.1145\/3725276","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,17]]}}}