{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,20]],"date-time":"2026-05-20T19:21:54Z","timestamp":1779304914801,"version":"3.51.4"},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2023,4,16]],"date-time":"2023-04-16T00:00:00Z","timestamp":1681603200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,4,16]],"date-time":"2023-04-16T00:00:00Z","timestamp":1681603200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Data Sci Anal"],"published-print":{"date-parts":[[2023,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Not all graphs are clusterable. Not all graphs have a clustered structure and can be meaningfully summarized through vertex clustering. Clusterable graphs are characterized by pockets of densely connected vertices that are only sparsely connected to the remaining graph. In this article, we re-introduce a very simple and intuitive, yet highly informative, statistical hypothesis test for graph clusterability that is based on vertex and neighborhood samples. The goal of this test is to determine if a graph meets the necessary structural conditions to be summarized meaningfully through vertex clusters. Our test is based on the hypothesis that a clusterable graph will display, on average, a local neighborhood induced subgraph density that is greater than the graph\u2019s overall density. The test is also applied to graph comparisons, to test whether one graph has a stronger clustered structure than another. Significance is assessed using the<jats:italic>t<\/jats:italic>-statistic. Since it is based on sampling, we provide a focused examination of our test\u2019s sensitivity to sample size. The main contribution of this article is a detailed examination of our test\u2019s accuracy, sensitivity to sample size, conclusion reproducibility and robustness. Our empirical results remain consistent with our earlier conclusions and demonstrate the almost perfect accuracy of our test, even with very small samples of the graph. They also reveal that our test remains robust even under severe departures from the null hypothesis.<\/jats:p>","DOI":"10.1007\/s41060-023-00389-6","type":"journal-article","created":{"date-parts":[[2023,4,16]],"date-time":"2023-04-16T03:44:58Z","timestamp":1681616698000},"page":"379-390","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Statistical power, accuracy, reproducibility and robustness of a graph clusterability test"],"prefix":"10.1007","volume":"15","author":[{"given":"Pierre","family":"Miasnikof","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander Y.","family":"Shestopaloff","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrei","family":"Raigorodskii","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,4,16]]},"reference":[{"key":"389_CR1","unstructured":"Autonomous systems (DIMACS10) network dataset\u2014KONECT. Available at http:\/\/konect.cc\/networks\/dimacs10-as-22july06 (2018)"},{"key":"389_CR2","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1016\/j.patcog.2018.10.026","volume":"88","author":"A Adolfsson","year":"2019","unstructured":"Adolfsson, A., Ackerman, M., Brownstein, N.C.: To cluster, or not to cluster: an analysis of clusterability methods. Pattern Recogn. 88, 13\u201326 (2019)","journal-title":"Pattern Recogn."},{"key":"389_CR3","unstructured":"Adriaens, F., Apers, S.: Testing properties of signed graphs. arXiv:2102.07587 (2021)"},{"issue":"S1","key":"389_CR4","doi-asserted-by":"publisher","first-page":"S134","DOI":"10.1017\/nws.2021.2","volume":"9","author":"N Antunes","year":"2021","unstructured":"Antunes, N., Guo, T., Pipiras, V.: Sampling methods and estimation of triangle count distributions in large networks. Netw. Sci. 9(S1), S134\u2013S156 (2021)","journal-title":"Netw. Sci."},{"issue":"2","key":"389_CR5","doi-asserted-by":"publisher","first-page":"1159","DOI":"10.3150\/20-BEJ1269","volume":"27","author":"K Bogerd","year":"2021","unstructured":"Bogerd, K., Castro, R.M., van der Hofstad, R., Verzelen, N.: Detecting a planted community in an inhomogeneous random graph. Bernoulli 27(2), 1159\u20131188 (2021)","journal-title":"Bernoulli"},{"key":"389_CR6","doi-asserted-by":"crossref","unstructured":"Chiplunkar, A., Kapralov, M., Khanna, S., Mousavifar, A., Peres, Y.: Testing graph clusterability: algorithms and lower bounds. In: 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pp. 497\u2013508 (2018)","DOI":"10.1109\/FOCS.2018.00054"},{"key":"389_CR7","doi-asserted-by":"crossref","unstructured":"Czumaj, A., Peng, P., Sohler, C.: Testing Cluster Structure of Graphs. In: Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing (2015)","DOI":"10.1145\/2746539.2746618"},{"key":"389_CR8","doi-asserted-by":"publisher","first-page":"290","DOI":"10.5486\/PMD.1959.6.3-4.12","volume":"6","author":"P Erd\u00f6s","year":"1959","unstructured":"Erd\u00f6s, P., R\u00e9nyi, A.: On random graphs I. Publ. Math. Debrecen 6, 290\u2013297 (1959)","journal-title":"Publ. Math. Debrecen"},{"key":"389_CR9","unstructured":"Filan, D., Casper, S., Hod, S., Wild, C., Critch, A., Russell, S.: Clusterability in neural networks. arXiv:2103.03386, archivePrefix=arXiv, primaryClass=cs.NE, (2021)"},{"key":"389_CR10","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1016\/j.physrep.2009.11.002","volume":"486","author":"S Fortunato","year":"2010","unstructured":"Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75\u2013174 (2010)","journal-title":"Phys. Rep."},{"key":"389_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.physrep.2016.09.002","volume":"659","author":"S Fortunato","year":"2016","unstructured":"Fortunato, S., Hric, D.: Community detection in networks: a user guide. Phys. Rep. 659, 1\u201344 (2016)","journal-title":"Phys. Rep."},{"key":"389_CR12","unstructured":"Gao, C., Lafferty, J.: Testing for global network structure using small subgraph statistics. arXiv, arXiv:1710.00862 (2017)"},{"key":"389_CR13","unstructured":"Gao, C., Lafferty, J.: Testing network structure using relations between small subgraph probabilities. arXiv:1704.06742 (2017)"},{"key":"389_CR14","unstructured":"Gao, C., Ma, Z.: Minimax rates in network analysis: graphon estimation, community detection and hypothesis testing. page arXiv:1811.06055 (November 2018)"},{"issue":"4","key":"389_CR15","doi-asserted-by":"publisher","first-page":"1141","DOI":"10.1214\/aoms\/1177706098","volume":"30","author":"EN Gilbert","year":"1959","unstructured":"Gilbert, E.N.: Random graphs. Ann. Math. Stat. 30(4), 1141\u20131144 (1959)","journal-title":"Ann. Math. Stat."},{"issue":"4","key":"389_CR16","doi-asserted-by":"publisher","first-page":"653","DOI":"10.1145\/285055.285060","volume":"45","author":"O Goldreich","year":"1998","unstructured":"Goldreich, O., Goldwasser, S., Ron, D.: Property testing and its connection to learning and approximation. J. ACM 45(4), 653\u2013750 (1998)","journal-title":"J. ACM"},{"key":"389_CR17","doi-asserted-by":"crossref","unstructured":"Hagberg, A.A., Schult, D.A., Swart, P.J.: Exploring network structure, dynamics, and function using network. In: Varoquaux, G., Vaught, T., Millman, J. (eds) Proceedings of the 7th Python in Science Conference, pp. 11\u201315, Pasadena, CA USA (2008)","DOI":"10.25080\/TCWV9851"},{"issue":"2","key":"389_CR18","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1016\/0378-8733(83)90021-7","volume":"5","author":"PW Holland","year":"1983","unstructured":"Holland, P.W., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: first steps. Soc. Netw. 5(2), 109\u2013137 (1983)","journal-title":"Soc. Netw."},{"key":"389_CR19","doi-asserted-by":"crossref","unstructured":"Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1), 2-es (2007)","DOI":"10.1145\/1217299.1217301"},{"key":"389_CR20","unstructured":"Klimt, B., Yang, Y.: Introducing the enron corpus. In: CEAS (2004)"},{"key":"389_CR21","doi-asserted-by":"crossref","unstructured":"Kunegis, J.: KONECT\u2014The Koblenz network collection. In: Proceedings of the 22nd international conference on world wide web, pp. 1343\u20131350 (2013)","DOI":"10.1145\/2487788.2488173"},{"issue":"4","key":"389_CR22","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.78.046110","volume":"78","author":"A Lancichinetti","year":"2008","unstructured":"Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78(4), 046110 (2008)","journal-title":"Phys. Rev. E"},{"key":"389_CR23","doi-asserted-by":"crossref","unstructured":"Leskovec, J., Lang, K.J., Dasgupta, A., Mahoney, M.W.: Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. arXiv e-prints, page arXiv:0810.1355 (October 2008)","DOI":"10.1080\/15427951.2009.10129177"},{"key":"389_CR24","doi-asserted-by":"crossref","unstructured":"Miasnikof, P., Prokhorenkova, L., Shestopaloff, A.Y., Raigorodskii, A.: A statistical test of heterogeneous subgraph densities to assess clusterability. In: Matsatsinis, Nikolaos F., Marinakis, Yannis, Pardalos, Panos (eds.) Learning and Intelligent Optimization, pp. 17\u201329. Springer International Publishing, Cham (2020)","DOI":"10.1007\/978-3-030-38629-0_2"},{"key":"389_CR25","volume-title":"A Statistical Performance Analysis of Graph Clustering Algorithms, chapter\u00a011. Lecture Notes in Computer Science","author":"P Miasnikof","year":"2018","unstructured":"Miasnikof, P., Shestopaloff, A.Y., Bonner, A.J., Lawryshyn, Y.: A Statistical Performance Analysis of Graph Clustering Algorithms, chapter\u00a011. Lecture Notes in Computer Science, vol. 6. Springer Nature, Berlin (2018)"},{"issue":"3","key":"389_CR26","doi-asserted-by":"publisher","first-page":"08","DOI":"10.1093\/comnet\/cnaa012","volume":"8","author":"P Miasnikof","year":"2020","unstructured":"Miasnikof, P., Shestopaloff, A.Y., Bonner, A.J., Lawryshyn, Y., Pardalos, P.M.: A density-based statistical analysis of graph clustering algorithm performance. J. Complex Networks 8(3), 08 (2020)","journal-title":"J. Complex Networks"},{"issue":"2","key":"389_CR27","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1137\/S003614450342480","volume":"45","author":"MEJ Newman","year":"2003","unstructured":"Newman, M.E.J.: The structure and function of complex networks. SIAM Rev. 45(2), 167\u2013256 (2003)","journal-title":"SIAM Rev."},{"key":"389_CR28","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1007\/978-3-319-49787-7_10","volume-title":"Algorithms and Models for the Web Graph","author":"L Ostroumova Prokhorenkova","year":"2016","unstructured":"Ostroumova Prokhorenkova, L., Pra\u0142at, P., Raigorodskii, A.: Modularity of complex networks models. In: Bonato, A., Graham, F.C., Pra\u0142at, P. (eds.) Algorithms and Models for the Web Graph, pp. 115\u2013126. Springer International Publishing, Cham (2016)"},{"key":"389_CR29","doi-asserted-by":"crossref","unstructured":"Ostroumova Prokhorenkova, L., Pra\u0142at, P., Raigorodskii, A.: Modularity in several random graph models. Electron. Notes Discrete Math., 61, 947\u2013953 (2017). The European Conference on Combinatorics, Graph Theory and Applications (EUROCOMB\u201917)","DOI":"10.1016\/j.endm.2017.07.058"},{"issue":"1","key":"389_CR30","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1016\/j.cosrev.2007.05.001","volume":"1","author":"SE Schaeffer","year":"2007","unstructured":"Schaeffer, S.E.: Survey: graph clustering. Comput. Sci. Rev. 1(1), 27\u201364 (2007)","journal-title":"Comput. Sci. Rev."},{"key":"389_CR31","unstructured":"Weisstein, E.W.: Caveman graph. https:\/\/mathworld.wolfram.com\/CavemanGraph.html"},{"key":"389_CR32","unstructured":"Weisstein, E.W.: Central limit theorem. https:\/\/mathworld.wolfram.com\/CentralLimitTheorem.html"},{"key":"389_CR33","doi-asserted-by":"crossref","unstructured":"Yang, J., Leskovec, J.: Defining and Evaluating Network Communities based on Ground-truth. CoRR, arXiv:1205.6233 (2012)","DOI":"10.1145\/2350190.2350193"}],"container-title":["International Journal of Data Science and Analytics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-023-00389-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41060-023-00389-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-023-00389-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,18]],"date-time":"2024-10-18T10:20:41Z","timestamp":1729246841000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41060-023-00389-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,16]]},"references-count":33,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,5]]}},"alternative-id":["389"],"URL":"https:\/\/doi.org\/10.1007\/s41060-023-00389-6","relation":{},"ISSN":["2364-415X","2364-4168"],"issn-type":[{"value":"2364-415X","type":"print"},{"value":"2364-4168","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,4,16]]},"assertion":[{"value":"28 June 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 March 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 April 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors state they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}