{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T05:35:43Z","timestamp":1768109743765,"version":"3.49.0"},"reference-count":29,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,7]]},"abstract":"<jats:p>\n            Counting and enumerating all occurrences of\n            <jats:italic>k<\/jats:italic>\n            -cliques, i.e., complete subgraphs with\n            <jats:italic>k<\/jats:italic>\n            vertices, in a large graph\n            <jats:italic>G<\/jats:italic>\n            is a fundamental problem with many applications. However, exact solutions are often infeasible due to the exponential growth in the number of\n            <jats:italic>k<\/jats:italic>\n            -cliques when\n            <jats:italic>k<\/jats:italic>\n            increases. Thus, a more practical approach is approximately counting and uniformly sampling\n            <jats:italic>k<\/jats:italic>\n            -cliques. Tur\u00e1n-Shadow and DPColorPath are two state-of-the-art algorithms for approximately counting\n            <jats:italic>k<\/jats:italic>\n            -cliques. The general idea is first constructing a sample space that is a superset of all\n            <jats:italic>k<\/jats:italic>\n            -cliques in\n            <jats:italic>G<\/jats:italic>\n            , and then sampling\n            <jats:italic>t elements<\/jats:italic>\n            uniformly-at-random (u.a.r.) from the sample space for a pre-determined\n            <jats:italic>t<\/jats:italic>\n            ; the\n            <jats:italic>k<\/jats:italic>\n            -clique count is estimated as the sample space size multiplied by the ratio of\n            <jats:italic>k<\/jats:italic>\n            -cliques among the\n            <jats:italic>t<\/jats:italic>\n            samples. Although techniques have been proposed in Tur\u00e1n-Shadow for setting\n            <jats:italic>t<\/jats:italic>\n            to ensure the estimation accuracy, the theoretically chosen\n            <jats:italic>t<\/jats:italic>\n            is often too large to be practical. As a result, both of the existing algorithms used a fixed\n            <jats:italic>t<\/jats:italic>\n            in their implementations and thus do not offer accuracy guarantee. In this paper, we propose the first randomized algorithm that achieves the theoretical estimation accuracy and the practical efficiency at the same time. Different from the existing algorithms, we pre-determine the number\n            <jats:italic>s<\/jats:italic>\n            of\n            <jats:italic>k-clique samples<\/jats:italic>\n            that are required to achieve the estimation accuracy. Consequently, we can estimate the running time of the sampling stage (i.e., time taken to sample\n            <jats:italic>sk<\/jats:italic>\n            -cliques), for a given sample space. Then, we propose to balance the time of constructing\/refining the sample space and the time of the sampling stage, by stopping the refinement of the sample space once the elapsed time is comparable to the estimated time of the sampling stage. Extensive empirical studies on large real graphs show that our algorithm SR-kCCE provides an accurate\n            <jats:italic>k<\/jats:italic>\n            -clique count estimation and also runs efficiently. As a by-product, our algorithm can also be used for\n            <jats:italic>efficiently sampling<\/jats:italic>\n            a certain number of\n            <jats:italic>k<\/jats:italic>\n            -cliques u.a.r. from\n            <jats:italic>G.<\/jats:italic>\n          <\/jats:p>","DOI":"10.14778\/3681954.3682032","type":"journal-article","created":{"date-parts":[[2024,8,30]],"date-time":"2024-08-30T16:23:36Z","timestamp":1725035016000},"page":"3707-3719","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Efficient\n            <i>k<\/i>\n            -Clique Count Estimation with Accuracy Guarantee"],"prefix":"10.14778","volume":"17","author":[{"given":"Lijun","family":"Chang","sequence":"first","affiliation":[{"name":"The University of Sydney, Sydney, Australia"}]},{"given":"Rashmika","family":"Gamage","sequence":"additional","affiliation":[{"name":"The University of Sydney, Sydney, Australia"}]},{"given":"Jeffrey Xu","family":"Yu","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, China"}]}],"member":"320","published-online":{"date-parts":[[2024,8,30]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"1","article-title":"Parallel K-clique counting on GPUs. In Proc. of ICS'22","volume":"21","author":"Almasri Mohammad","year":"2022","unstructured":"Mohammad Almasri, Izzat El Hajj, Rakesh Nagi, Jinjun Xiong, and Wen-Mei Hwu. 2022. Parallel K-clique counting on GPUs. In Proc. of ICS'22. ACM, 21:1--21:14.","journal-title":"ACM"},{"key":"e_1_2_1_2_1","first-page":"4","article-title":"Motif Counting Beyond Five Nodes","volume":"12","author":"Bressan Marco","year":"2018","unstructured":"Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, and Alessandro Panconesi. 2018. Motif Counting Beyond Five Nodes. ACM Trans. Knowl. Discovery Data 12, 4 (April 2018), 1--25.","journal-title":"ACM Trans. Knowl. Discovery Data"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/362342.362367"},{"key":"e_1_2_1_4_1","series-title":"Springer Series in the Data Sciences","volume-title":"Cohesive Subgraph Computation over Large Sparse Graphs","author":"Chang Lijun","unstructured":"Lijun Chang and Lu Qin. 2018. Cohesive Subgraph Computation over Large Sparse Graphs. Springer Series in the Data Sciences."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1137\/0214017"},{"key":"e_1_2_1_6_1","volume-title":"Proc. of WWW'19 (Tutorial). 1317--1318","author":"Comandur Seshadhri","year":"2019","unstructured":"Seshadhri Comandur and Srikanta Tirthapura. 2019. Scalable Subgraph Counting: The Methods Behind The Madness. In Proc. of WWW'19 (Tutorial). 1317--1318."},{"key":"e_1_2_1_7_1","series-title":"SIAM Journal on computing 29, 5","volume-title":"An optimal algorithm for Monte Carlo estimation","author":"Dagum Paul","year":"2000","unstructured":"Paul Dagum, Richard Karp, Michael Luby, and Sheldon Ross. 2000. An optimal algorithm for Monte Carlo estimation. SIAM Journal on computing 29, 5 (2000), 1484--1496."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186125"},{"key":"e_1_2_1_9_1","volume-title":"Listing All Maximal Cliques in Large Sparse Real-World Graphs. ACM Journal of Experimental Algorithmics 18","author":"Eppstein David","year":"2013","unstructured":"David Eppstein, Maarten L\u00f6ffler, and Darren Strash. 2013. Listing All Maximal Cliques in Large Sparse Real-World Graphs. ACM Journal of Experimental Algorithmics 18 (2013)."},{"key":"e_1_2_1_10_1","volume-title":"On the number of complete subgraphs and circuits contained in graphs. \u010casopis pro p\u011bstov\u00e1n\u00ed matematiky 094, 3","author":"Erd\u00f6s P\u00e1l","year":"1969","unstructured":"P\u00e1l Erd\u00f6s. 1969. On the number of complete subgraphs and circuits contained in graphs. \u010casopis pro p\u011bstov\u00e1n\u00ed matematiky 094, 3 (1969), 290--296."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2794080"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588923"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052636"},{"key":"e_1_2_1_14_1","volume-title":"Proc. WSDM'20","author":"Jain Shweta","unstructured":"Shweta Jain and C. Seshadhri. 2020. The Power of Pivoting for Exact Clique Counting. In Proc. WSDM'20. ACM, 268--276."},{"key":"e_1_2_1_15_1","volume-title":"Proc. of SDM'22","author":"Jain Shweta","year":"2022","unstructured":"Shweta Jain and Hanghang Tong. 2022. YACC: A Framework Generalizing Tur\u00e1nShadow for Counting Large Cliques. In Proc. of SDM'22. 684--692."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741101"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2402.322385"},{"key":"e_1_2_1_18_1","volume-title":"Probability and Computing: Randomized Algorithms and Probabilistic Analysis","author":"Mitzenmacher Michael","unstructured":"Michael Mitzenmacher and Eli Upfal. 2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611973198.1"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature03607"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2013.2297929"},{"key":"e_1_2_1_22_1","volume-title":"Silva","author":"Ribeiro Pedro","year":"2022","unstructured":"Pedro Ribeiro, Pedro Paredes, Miguel E. P. Silva, David Apar\u00edcio, and Fernando M. A. Silva. 2022. A Survey on Subgraph Counting: Concepts, Algorithms, and Applications to Network Motifs and Graphlets. ACM Comput. Surv. 54, 2 (2022), 28:1--28:36."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741640"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611976830.13"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/3401960.3401962"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741098"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.92917"},{"key":"e_1_2_1_28_1","volume-title":"Efficient k-Clique Listing: An Edge-Oriented Branching Strategy. CoRR","author":"Wang Kaixin","year":"2023","unstructured":"Kaixin Wang, Kaiqiang Yu, and Cheng Long. 2023. Efficient k-Clique Listing: An Edge-Oriented Branching Strategy. CoRR (2023)."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3485447.3512167"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3681954.3682032","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,4]],"date-time":"2024-09-04T18:44:48Z","timestamp":1725475488000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3681954.3682032"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7]]},"references-count":29,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2024,7]]}},"alternative-id":["10.14778\/3681954.3682032"],"URL":"https:\/\/doi.org\/10.14778\/3681954.3682032","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2024,7]]},"assertion":[{"value":"2024-08-30","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}