{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T22:25:50Z","timestamp":1740176750603,"version":"3.37.3"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2022,5,11]],"date-time":"2022-05-11T00:00:00Z","timestamp":1652227200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,5,11]],"date-time":"2022-05-11T00:00:00Z","timestamp":1652227200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Data Sci Anal"],"published-print":{"date-parts":[[2022,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Subspace clustering aims to discover clusters in projections of highly dimensional numerical data. In this paper, we focus on discovering small collections of highly interesting subspace clusters that do not try to cluster all data points, leaving noisy data points unclustered. To this end, we propose a randomised method that first converts the highly dimensional database to a binarised one using projected samples of the original database. Subsequently, this database is mined for frequent itemsets, which we show can be translated back to subspace clusters. In this way, we are able to explore multiple subspaces of different sizes at the same time. In our extensive experimental analysis, we show on synthetic as well as real-world data that our method is capable of discovering highly interesting subspace clusters efficiently.<\/jats:p>","DOI":"10.1007\/s41060-022-00327-y","type":"journal-article","created":{"date-parts":[[2022,5,11]],"date-time":"2022-05-11T09:02:44Z","timestamp":1652259764000},"page":"243-259","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["RASCL: a randomised approach to subspace clusters"],"prefix":"10.1007","volume":"14","author":[{"given":"Sandy","family":"Moens","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1539-0892","authenticated-orcid":false,"given":"Boris","family":"Cule","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bart","family":"Goethals","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,5,11]]},"reference":[{"issue":"3","key":"327_CR1","doi-asserted-by":"publisher","first-page":"264","DOI":"10.1145\/331499.331504","volume":"31","author":"AK Jain","year":"1999","unstructured":"Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. (CSUR) 31(3), 264\u2013323 (1999)","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"327_CR2","volume-title":"Adaptive Control Processes: A Guided Tour","author":"RE Bellman","year":"2015","unstructured":"Bellman, R.E.: Adaptive Control Processes: A Guided Tour. Princeton University Press, Princeton (2015)"},{"issue":"1","key":"327_CR3","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1145\/1007730.1007731","volume":"6","author":"L Parsons","year":"2004","unstructured":"Parsons, L., Haque, E., Liu, H.: Subspace clustering for high dimensional data: a review. ACM SIGKDD Explor. Newslett. 6(1), 90\u2013105 (2004)","journal-title":"ACM SIGKDD Explor. Newslett."},{"key":"327_CR4","doi-asserted-by":"crossref","unstructured":"Moise, G., Sander, J., Ester, M.: P3c: A robust projected clustering algorithm. In Sixth international conference on data mining (ICDM\u201906). IEEE, 2006, pp. 414\u2013425","DOI":"10.1109\/ICDM.2006.123"},{"key":"327_CR5","doi-asserted-by":"crossref","unstructured":"Aksehirli, E., Goethals, B., Muller, E., Vreeken, J.: Cartification: a neighborhood preserving transformation for mining high dimensional data, In 2013 IEEE 13th international conference on data mining (ICDM), IEEE, (2013), pp. 937\u2013942","DOI":"10.1109\/ICDM.2013.146"},{"key":"327_CR6","doi-asserted-by":"crossref","unstructured":"Moens, S., Cule, B., Goethals, B.: A sampling-based approach for discovering subspace clusters. In: Discovery science - 22nd international conference, DS. Springer 2019, 61\u201371 (2019)","DOI":"10.1007\/978-3-030-33778-0_6"},{"issue":"7","key":"327_CR7","doi-asserted-by":"publisher","first-page":"902","DOI":"10.1109\/TKDE.2006.106","volume":"18","author":"A Patrikainen","year":"2006","unstructured":"Patrikainen, A., Meila, M.: Comparing subspace clusterings. IEEE Trans. Knowl. Data Eng. 18(7), 902\u2013916 (2006)","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"327_CR8","doi-asserted-by":"crossref","unstructured":"G\u00fcnnemann, S., F\u00e4rber, I., M\u00fcller, E., Assent, I., Seidl, T.: External evaluation measures for subspace clustering, In: Proc of the 20th ACM international conference on information and knowledge management. ACM, 2011, pp. 1363\u20131372","DOI":"10.1145\/2063576.2063774"},{"key":"327_CR9","doi-asserted-by":"crossref","unstructured":"Agrawal, R., Gehrke, J., Gunopulos, D., Raghavan, P.: Automatic subspace clustering of high dimensional data for data mining applications. ACM, (1998), vol.\u00a027, no.\u00a02","DOI":"10.1145\/276305.276314"},{"key":"327_CR10","unstructured":"Kriegel, H.-P., Kroger, P., Renz, M., Wurst, S.: A generic framework for efficient subspace clustering of high-dimensional data, In: Fifth IEEE international conference on data mining (ICDM\u201905). IEEE, 2005, pp. 8\u2013pp"},{"key":"327_CR11","doi-asserted-by":"crossref","unstructured":"Aggarwal, C.C., Wolf, J.L., Yu, P.S., Procopiuc, C., Park: J.S.: Fast algorithms for projected clustering, In: ACM SIGMoD Record, vol.\u00a028, no.\u00a02. ACM, (1999), pp. 61\u201372","DOI":"10.1145\/304181.304188"},{"issue":"4","key":"327_CR12","first-page":"453","volume":"57","author":"D Freedman","year":"1981","unstructured":"Freedman, D., Diaconis, P.: On the histogram as a density estimator: L 2 theory. Probab. Theory Relat. Fields 57(4), 453\u2013476 (1981)","journal-title":"Probab. Theory Relat. Fields"},{"key":"327_CR13","doi-asserted-by":"crossref","unstructured":"Moens, S., Goethals, B.: Randomly sampling maximal itemsets. In: KDD workshop on interactive data exploration and analytics. ACM, 2013, pp. 79\u201386","DOI":"10.1145\/2501511.2501523"},{"key":"327_CR14","first-page":"2579","volume":"9","author":"LVD Maaten","year":"2008","unstructured":"Maaten, L.V.D., Hinton, G.: Visualizing data using t-sne. J. Mach. Learn. Res. 9, 2579\u20132605 (2008)","journal-title":"J. Mach. Learn. Res."},{"key":"327_CR15","doi-asserted-by":"crossref","unstructured":"Andrews, D.F.: Plots of high-dimensional data, Biometrics, pp. 125\u2013136, (1972)","DOI":"10.2307\/2528964"},{"key":"327_CR16","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-39351-3","volume-title":"Nonlinear Dimensionality Reduction","author":"JA Lee","year":"2007","unstructured":"Lee, J.A., Verleysen, M.: Nonlinear Dimensionality Reduction. Springer, Berlin (2007)"},{"key":"327_CR17","first-page":"845","volume":"5","author":"JG Dy","year":"2004","unstructured":"Dy, J.G., Brodley, C.E.: Feature selection for unsupervised learning. J. Mach. Learn. Res. 5, 845\u2013889 (2004)","journal-title":"J. Mach. Learn. Res."},{"key":"327_CR18","unstructured":"MacQueen, J., et\u00a0al.: Some methods for classification and analysis of multivariate observations, In: proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol.\u00a01, no.\u00a014. Oakland, CA, USA, 1967, pp. 281\u2013297"},{"key":"327_CR19","doi-asserted-by":"crossref","unstructured":"Procopiuc, C.M., Jones, M., Agarwal, P.K., Murali, T.: A monte carlo algorithm for fast projective clustering, In: proceedings of the 2002 ACM SIGMOD international conference on management of data. ACM, 2002, pp. 418\u2013427","DOI":"10.1145\/564691.564739"},{"key":"327_CR20","unstructured":"Yiu, M.L., Mamoulis, N., Frequent-pattern based iterative projected clustering. In: Third IEEE international conference on data mining, ICDM 2003, IEEE, pp. 689\u2013692 (2003)"},{"issue":"34","key":"327_CR21","first-page":"226","volume":"96","author":"M Ester","year":"1996","unstructured":"Ester, M., Kriegel, H.-P., Sander, J., Xu, X., et al.: A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd 96(34), 226\u2013231 (1996)","journal-title":"Kdd"},{"key":"327_CR22","doi-asserted-by":"crossref","unstructured":"Kailing, K., Kriegel, H.-P., Kr\u00f6ger, P.: Density-connected subspace clustering for high-dimensional data, In: proceedings of the 2004 SIAM international conference on data mining. SIAM, 2004, pp. 246\u2013256","DOI":"10.1137\/1.9781611972740.23"},{"key":"327_CR23","doi-asserted-by":"crossref","unstructured":"Cheng, C.-H., Fu, A.W., Zhang, Y.: Entropy-based subspace clustering for mining numerical data, In: proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, (1999), pp. 84\u201393","DOI":"10.1145\/312129.312199"},{"key":"327_CR24","doi-asserted-by":"crossref","unstructured":"Nguyen, H.V., M\u00fcller, E., Vreeken, J., Keller, F., B\u00f6hm, K.: Cmi: An information-theoretic contrast measure for enhancing subspace cluster and outlier detection, In: SIAM international conference on data mining. SIAM, (2013), pp. 198\u2013206","DOI":"10.1137\/1.9781611972832.22"},{"key":"327_CR25","doi-asserted-by":"crossref","unstructured":"Pourkamali-Anaraki, F., Folberth, J., Becker, S.: Efficient solvers for sparse subspace clustering. Signal Process. 172, 107548 (2020)","DOI":"10.1016\/j.sigpro.2020.107548"},{"key":"327_CR26","doi-asserted-by":"crossref","unstructured":"Madeira, S.C., Oliveira, A.L.: Biclustering algorithms for biological data analysis: a survey. IEEE\/ACM Trans. Comput. Biol. Bioinf. (TCBB) 1(1), 24\u201345 (2004)","DOI":"10.1109\/TCBB.2004.2"},{"issue":"10","key":"327_CR27","doi-asserted-by":"publisher","first-page":"5076","DOI":"10.1109\/TIP.2018.2848470","volume":"27","author":"X Peng","year":"2018","unstructured":"Peng, X., Feng, J., Xiao, S., Yau, W.-Y., Zhou, J.T., Yang, S.: Structured autoencoders for subspace clustering. IEEE Trans. Image Process. 27(10), 5076\u20135086 (2018)","journal-title":"IEEE Trans. Image Process."},{"key":"327_CR28","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1016\/j.neunet.2018.07.016","volume":"106","author":"A Majumdar","year":"2018","unstructured":"Majumdar, A.: Graph structured autoencoder. Neural Netw. 106, 271\u2013280 (2018)","journal-title":"Neural Netw."},{"key":"327_CR29","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1016\/j.neucom.2018.10.016","volume":"325","author":"Y Ren","year":"2019","unstructured":"Ren, Y., Hu, K., Dai, X., Pan, L., Hoi, S.C., Xu, Z.: Semi-supervised deep embedded clustering. Neurocomputing 325, 121\u2013130 (2019)","journal-title":"Neurocomputing"},{"key":"327_CR30","doi-asserted-by":"crossref","unstructured":"Kelkar, B.A., Rodd, S.F.: Subspace clustering\u2019a survey, In: Data Management, Analytics and Innovation. Springer, (2019), pp. 209\u2013220","DOI":"10.1007\/978-981-13-1402-5_16"},{"key":"327_CR31","unstructured":"Agrawal, R., Srikant, R., et\u00a0al.: Fast algorithms for mining association rules. In: proceedings 20th international conference very large data bases, VLDB, vol. 1215, (1994), pp. 487\u2013499"}],"container-title":["International Journal of Data Science and Analytics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-022-00327-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41060-022-00327-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-022-00327-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,13]],"date-time":"2022-08-13T04:21:11Z","timestamp":1660364471000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41060-022-00327-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,11]]},"references-count":31,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,9]]}},"alternative-id":["327"],"URL":"https:\/\/doi.org\/10.1007\/s41060-022-00327-y","relation":{},"ISSN":["2364-415X","2364-4168"],"issn-type":[{"type":"print","value":"2364-415X"},{"type":"electronic","value":"2364-4168"}],"subject":[],"published":{"date-parts":[[2022,5,11]]},"assertion":[{"value":"3 May 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 April 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 May 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"On behalf of all authors, the corresponding author states that there is no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}