{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:27:52Z","timestamp":1750220872223,"version":"3.41.0"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","license":[{"start":{"date-parts":[[2019,10,11]],"date-time":"2019-10-11T00:00:00Z","timestamp":1570752000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["ACM J. Exp. Algorithmics"],"published-print":{"date-parts":[[2019,12,17]]},"abstract":"<jats:p>\n            Clustering is a fundamental problem in data science, yet the variety of clustering methods and their sensitivity to parameters make clustering hard. To analyze the stability of a given clustering algorithm while varying its parameters, and to compare clusters yielded by different algorithms, several comparison schemes based on matchings, information theory, and various indices (Rand, Jaccard) have been developed. We go beyond these by providing a novel class of methods computing meta-clusters within each clustering\u2014a\n            <jats:italic>meta-cluster<\/jats:italic>\n            is a group of clusters, together with a matching between these.\n          <\/jats:p>\n          <jats:p>\n            Let the intersection graph of two clusterings be the edge-weighted bipartite graph in which the nodes represent the clusters, the edges represent the nonempty intersection between two clusters, and the weight of an edge is the number of common items. We introduce the so-called\n            <jats:italic>D<\/jats:italic>\n            -family-matching\u00a0problem on intersection graphs, with\n            <jats:italic>D<\/jats:italic>\n            the upper bound on the diameter of the graph induced by the clusters of any meta-cluster. First we prove\n            <jats:italic>NP<\/jats:italic>\n            -completeness and\n            <jats:italic>APX<\/jats:italic>\n            -hardness results, and unbounded approximation ratio of simple strategies. Second, we design exact polynomial time dynamic programming algorithms for some classes of graphs (in particular trees). Then we prove spanning tree\u2013based efficient heuristic algorithms for general graphs.\n          <\/jats:p>\n          <jats:p>\n            Our experiments illustrate the role of\n            <jats:italic>D<\/jats:italic>\n            as a scale parameter providing information on the relationship between clusters within a clustering and in-between two clusterings. They also show the advantages of our built-in mapping over classical cluster comparison measures such as the variation of information.\n          <\/jats:p>","DOI":"10.1145\/3345951","type":"journal-article","created":{"date-parts":[[2019,10,15]],"date-time":"2019-10-15T16:35:58Z","timestamp":1571157358000},"page":"1-41","source":"Crossref","is-referenced-by-count":4,"title":["Comparing Two Clusterings Using Matchings between Clusters of Clusters"],"prefix":"10.1145","volume":"24","author":[{"given":"F.","family":"Cazals","sequence":"first","affiliation":[{"name":"Universit\u00e9 C\u00f4te d'Azur, Inria, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"D.","family":"Mazauric","sequence":"additional","affiliation":[{"name":"Universit\u00e9 C\u00f4te d'Azur, Inria, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"R.","family":"Tetley","sequence":"additional","affiliation":[{"name":"Universit\u00e9 C\u00f4te d'Azur, Inria, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"R.","family":"Watrigant","sequence":"additional","affiliation":[{"name":"University Lyon, CNRS, ENS de Lyon, Universit\u00e9 Claude Bernard Lyon 1, LIP UMR5668, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,10,11]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_2_1_1_1","DOI":"10.1109\/TNN.2005.845141"},{"unstructured":"R. O. Duda and P. E. Hart. 1973. Pattern Classification and Scene Analysis. Wiley.  R. O. Duda and P. E. Hart. 1973. Pattern Classification and Scene Analysis. Wiley.","key":"e_1_2_1_2_1"},{"volume-title":"Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201907)","author":"Arthur D.","unstructured":"D. Arthur and S. Vassilvitskii . 2007. k-means++: The advantages of careful seeding . In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201907) . 1027--1035. D. Arthur and S. Vassilvitskii. 2007. k-means++: The advantages of careful seeding. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201907). 1027--1035.","key":"e_1_2_1_3_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_4_1","DOI":"10.1109\/34.400568"},{"doi-asserted-by":"publisher","key":"e_1_2_1_5_1","DOI":"10.1145\/2535927"},{"doi-asserted-by":"crossref","unstructured":"A. Rodriguez and A. Laio. 2014. Clustering by fast search and find of density peaks. Science 344 6191 (2014) 1492--1496.  A. Rodriguez and A. Laio. 2014. Clustering by fast search and find of density peaks. Science 344 6191 (2014) 1492--1496.","key":"e_1_2_1_6_1","DOI":"10.1126\/science.1242072"},{"doi-asserted-by":"publisher","key":"e_1_2_1_7_1","DOI":"10.1007\/s11222-007-9033-z"},{"volume-title":"Proceedings of the 5th ACM International Conference on Knowledge Discovery and Data Mining (KDD\u201999)","author":"Larsen B.","unstructured":"B. Larsen and C. Aone . 1999. Fast and effective text mining using linear-time document clustering . In Proceedings of the 5th ACM International Conference on Knowledge Discovery and Data Mining (KDD\u201999) . ACM, New York, NY, 16--22. B. Larsen and C. Aone. 1999. Fast and effective text mining using linear-time document clustering. In Proceedings of the 5th ACM International Conference on Knowledge Discovery and Data Mining (KDD\u201999). ACM, New York, NY, 16--22.","key":"e_1_2_1_8_1"},{"volume-title":"Clustering Stability","author":"Luxburg U. Von","unstructured":"U. Von Luxburg . 2010. Clustering Stability . Now Publishers . U. Von Luxburg. 2010. Clustering Stability. Now Publishers.","key":"e_1_2_1_10_1"},{"key":"e_1_2_1_11_1","first-page":"2006","article-title":"Comparing Clusterings","author":"Wagner S.","year":"2007","unstructured":"S. Wagner and D. Wagner . 2007 . Comparing Clusterings : An Overview. Technical Report 2006 - 2004 . Universitat Karlsruhe. S. Wagner and D. Wagner. 2007. Comparing Clusterings: An Overview. Technical Report 2006-04. Universitat Karlsruhe.","journal-title":"An Overview. Technical Report"},{"doi-asserted-by":"publisher","key":"e_1_2_1_12_1","DOI":"10.1016\/j.jmva.2006.11.013"},{"doi-asserted-by":"crossref","unstructured":"T. M. Cover and J. A. Thomas. 2006. Elements of Information Theory. Wiley 8 Sons.  T. M. Cover and J. A. Thomas. 2006. Elements of Information Theory. Wiley 8 Sons.","key":"e_1_2_1_13_1","DOI":"10.1002\/047174882X"},{"volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201912)","author":"Xiang Q.","unstructured":"Q. Xiang , Q. Mao , K. Chai , H. Chieu , I. Tsang , and Z. Zhao . 2012. A split-merge framework for comparing clusterings . In Proceedings of the International Conference on Machine Learning (ICML\u201912) . Q. Xiang, Q. Mao, K. Chai, H. Chieu, I. Tsang, and Z. Zhao. 2012. A split-merge framework for comparing clusterings. In Proceedings of the International Conference on Machine Learning (ICML\u201912).","key":"e_1_2_1_14_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_15_1","DOI":"10.1007\/s10618-015-0426-x"},{"volume-title":"Proceedings of the International Conference on Machine Learning. 1143--1151","author":"Romano S.","unstructured":"S. Romano , J. Bailey , V. Nguyen , and K. Verspoor . 2014. Standardized mutual information for clustering comparisons: One step further in adjustment for chance . In Proceedings of the International Conference on Machine Learning. 1143--1151 . S. Romano, J. Bailey, V. Nguyen, and K. Verspoor. 2014. Standardized mutual information for clustering comparisons: One step further in adjustment for chance. In Proceedings of the International Conference on Machine Learning. 1143--1151.","key":"e_1_2_1_16_1"},{"volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201905)","author":"Zhou D.","unstructured":"D. Zhou , J. Li , and H. Zha . 2005. A new mallows distance based metric for comparing clusterings . In Proceedings of the International Conference on Machine Learning (ICML\u201905) . ACM, New York, NY, 1028--1035. D. Zhou, J. Li, and H. Zha. 2005. A new mallows distance based metric for comparing clusterings. In Proceedings of the International Conference on Machine Learning (ICML\u201905). ACM, New York, NY, 1028--1035.","key":"e_1_2_1_17_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_18_1","DOI":"10.1023\/A:1026543900054"},{"volume-title":"Proceedings of the SIAM International Conference on Data Mining. 1--9.","author":"Glantz R.","unstructured":"R. Glantz and H. Meyerhenke . 2018. Many-to-many correspondences between partitions: Introducing a cut-based approach . In Proceedings of the SIAM International Conference on Data Mining. 1--9. R. Glantz and H. Meyerhenke. 2018. Many-to-many correspondences between partitions: Introducing a cut-based approach. In Proceedings of the SIAM International Conference on Data Mining. 1--9.","key":"e_1_2_1_19_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_20_1","DOI":"10.1111\/1467-9868.00293"},{"key":"e_1_2_1_21_1","article-title":"Cluster ensembles\u2014A knowledge reuse framework for combining multiple partitions","author":"Strehl A.","year":"2002","unstructured":"A. Strehl and J. Ghosh . 2002 . Cluster ensembles\u2014A knowledge reuse framework for combining multiple partitions . Journal of Machine Learning Research 3 ( December 2002), 583--617. A. Strehl and J. Ghosh. 2002. Cluster ensembles\u2014A knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research 3 (December 2002), 583--617.","journal-title":"Journal of Machine Learning Research 3"},{"key":"e_1_2_1_22_1","first-page":"90","article-title":"Review on determining number of cluster in k-means clustering","volume":"1","author":"Kodinariya T.","year":"2013","unstructured":"T. Kodinariya and P. Prashant . 2013 . Review on determining number of cluster in k-means clustering . International Journal 1 , 6 (2013), 90 -- 95 . T. Kodinariya and P. Prashant. 2013. Review on determining number of cluster in k-means clustering. International Journal 1, 6 (2013), 90--95.","journal-title":"International Journal"},{"volume-title":"Proceedings of the 20th International Workshop on Database and Expert Systems Application (DEXA\u201909)","author":"Muhr M.","unstructured":"M. Muhr and M. Granitzer . 2009. Automatic cluster number selection using a split and merge k-means approach . In Proceedings of the 20th International Workshop on Database and Expert Systems Application (DEXA\u201909) . IEEE, Los Alamitos, CA, 363--367. M. Muhr and M. Granitzer. 2009. Automatic cluster number selection using a split and merge k-means approach. In Proceedings of the 20th International Workshop on Database and Expert Systems Application (DEXA\u201909). IEEE, Los Alamitos, CA, 363--367.","key":"e_1_2_1_23_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_24_1","DOI":"10.2478\/jaiscr-2014-0005"},{"doi-asserted-by":"publisher","key":"e_1_2_1_25_1","DOI":"10.1198\/jcgs.2010.08111"},{"doi-asserted-by":"publisher","key":"e_1_2_1_26_1","DOI":"10.1007\/s11634-010-0058-3"},{"key":"e_1_2_1_27_1","volume-title":"Data: An Introduction to Cluster Analysis","author":"Kaufman L.","year":"1990","unstructured":"L. Kaufman and P. Rousseeuw . 1990 . Finding Groups in Data: An Introduction to Cluster Analysis . Wiley . L. Kaufman and P. Rousseeuw. 1990. Finding Groups in Data: An Introduction to Cluster Analysis. Wiley."},{"doi-asserted-by":"publisher","key":"e_1_2_1_28_1","DOI":"10.2307\/3315985"},{"unstructured":"F. Cazals and R. Tetley. Multiscale analysis of structurally conserved motifs. Submitted.  F. Cazals and R. Tetley. Multiscale analysis of structurally conserved motifs. Submitted.","key":"e_1_2_1_29_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_30_1","DOI":"10.1142\/S0129054107004656"},{"doi-asserted-by":"publisher","key":"e_1_2_1_31_1","DOI":"10.3390\/a11010010"},{"unstructured":"M. R. Garey and D. S. Johnson. 2002. Computers and Intractability. Vol. 29. Freeman.  M. R. Garey and D. S. Johnson. 2002. Computers and Intractability. Vol. 29. Freeman.","key":"e_1_2_1_32_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_33_1","DOI":"10.5555\/2797333.2797336"},{"doi-asserted-by":"publisher","key":"e_1_2_1_34_1","DOI":"10.1137\/S0097539792251730"},{"doi-asserted-by":"crossref","unstructured":"L. Zhao H. Nagamochi and T. Ibaraki. 1999. Approximating the Minimum k-way Cut in a Graph via Minimum 3-way Cuts. Springer Berlin Germany.  L. Zhao H. Nagamochi and T. Ibaraki. 1999. Approximating the Minimum k-way Cut in a Graph via Minimum 3-way Cuts. Springer Berlin Germany.","key":"e_1_2_1_35_1","DOI":"10.1007\/3-540-46632-0_38"},{"volume-title":"Proceedings of the IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS\u201911)","author":"Kawarabayashi K-I.","unstructured":"K-I. Kawarabayashi and M. Thorup . 2011. The minimum k-way cut of bounded size is fixed-parameter tractable . In Proceedings of the IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS\u201911) . 160--169. K-I. Kawarabayashi and M. Thorup. 2011. The minimum k-way cut of bounded size is fixed-parameter tractable. In Proceedings of the IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS\u201911). 160--169.","key":"e_1_2_1_36_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_37_1","DOI":"10.1016\/S1571-0661(04)81014-4"},{"doi-asserted-by":"publisher","key":"e_1_2_1_38_1","DOI":"10.1137\/S0097539792225297"},{"doi-asserted-by":"publisher","key":"e_1_2_1_39_1","DOI":"10.1145\/28869.28874"},{"doi-asserted-by":"publisher","key":"e_1_2_1_40_1","DOI":"10.1016\/0022-0000(91)90023-X"},{"doi-asserted-by":"publisher","key":"e_1_2_1_41_1","DOI":"10.1007\/978-1-4684-2001-2_9"},{"doi-asserted-by":"publisher","key":"e_1_2_1_42_1","DOI":"10.1016\/0020-0190(91)90246-E"},{"doi-asserted-by":"publisher","key":"e_1_2_1_43_1","DOI":"10.1016\/j.tcs.2013.01.027"},{"doi-asserted-by":"publisher","key":"e_1_2_1_44_1","DOI":"10.1016\/j.dam.2008.10.017"},{"doi-asserted-by":"publisher","key":"e_1_2_1_45_1","DOI":"10.5555\/89657"},{"doi-asserted-by":"crossref","unstructured":"P. Flajolet and R. Sedgewick. 2009. Analytic Combinatorics. Cambridge University Press.  P. Flajolet and R. Sedgewick. 2009. Analytic Combinatorics. Cambridge University Press.","key":"e_1_2_1_46_1","DOI":"10.1017\/CBO9780511801655"},{"doi-asserted-by":"publisher","key":"e_1_2_1_47_1","DOI":"10.1017\/S0963548304006315"}],"container-title":["ACM Journal of Experimental Algorithmics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3345951","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3345951","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:23:58Z","timestamp":1750202638000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3345951"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,10,11]]},"references-count":46,"alternative-id":["10.1145\/3345951"],"URL":"https:\/\/doi.org\/10.1145\/3345951","relation":{},"ISSN":["1084-6654","1084-6654"],"issn-type":[{"type":"print","value":"1084-6654"},{"type":"electronic","value":"1084-6654"}],"subject":[],"published":{"date-parts":[[2019,10,11]]}}}