{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T06:57:04Z","timestamp":1777100224782,"version":"3.51.4"},"reference-count":17,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2020,1,30]],"date-time":"2020-01-30T00:00:00Z","timestamp":1580342400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Landesoffensive zur Entwicklung wissenschaftlich - \u00f6konomischer Exzellenz (LOEWE)","award":["LOEWE-Zentrum f\u00fcr Translationale Medizin und Pharmakologie"],"award-info":[{"award-number":["LOEWE-Zentrum f\u00fcr Translationale Medizin und Pharmakologie"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>In the context of data science, data projection and clustering are common procedures. The chosen analysis method is crucial to avoid faulty pattern recognition. It is therefore necessary to know the properties and especially the limitations of projection and clustering algorithms. This report describes a collection of datasets that are grouped together in the Fundamental Clustering and Projection Suite (FCPS). The FCPS contains 10 datasets with the names \u201cAtom\u201d, \u201cChainlink\u201d, \u201cEngyTime\u201d, \u201cGolfball\u201d, \u201cHepta\u201d, \u201cLsun\u201d, \u201cTarget\u201d, \u201cTetra\u201d, \u201cTwoDiamonds\u201d, and \u201cWingNut\u201d. Common clustering methods occasionally identified non-existent clusters or assigned data points to the wrong clusters in the FCPS suite. Likewise, common data projection methods could only partially reproduce the data structure correctly on a two-dimensional plane. In conclusion, the FCPS dataset collection addresses general challenges for clustering and projection algorithms such as lack of linear separability, different or small inner class spacing, classes defined by data density rather than data spacing, no cluster structure at all, outliers, or classes that are in contact. This report describes a collection of datasets that are grouped together in the Fundamental Clustering and Projection Suite (FCPS). It is designed to address specific problems of structure discovery in high-dimensional spaces.<\/jats:p>","DOI":"10.3390\/data5010013","type":"journal-article","created":{"date-parts":[[2020,1,31]],"date-time":"2020-01-31T05:55:46Z","timestamp":1580450146000},"page":"13","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":23,"title":["The Fundamental Clustering and Projection Suite (FCPS): A Dataset Collection to Test the Performance of Clustering and Data Projection Algorithms"],"prefix":"10.3390","volume":"5","author":[{"given":"Alfred","family":"Ultsch","sequence":"first","affiliation":[{"name":"DataBionics Research Institute, University of Marburg, 35032 Marburg, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5818-6958","authenticated-orcid":false,"given":"J\u00f6rn","family":"L\u00f6tsch","sequence":"additional","affiliation":[{"name":"Institute of Clinical Pharmacology, Goethe - University, 60590 Frankfurt am Main, Germany"},{"name":"Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Project Group Translational Medicine and Pharmacology TMP, 60590 Frankfurt am Main, Germany"}]}],"member":"1968","published-online":{"date-parts":[[2020,1,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1002\/nav.3800080314","article-title":"Adaptive control processes\u2014a guided tour, by Richard Bellman, Princeton University Press, Princeton, New Jersey, 1961, 255 pp., $6.50","volume":"8","author":"Wilcox","year":"1961","journal-title":"Naval Res. Logist. Q."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"178","DOI":"10.4169\/college.math.j.46.3.178","article-title":"On the shrinking volume of the hypersphere","volume":"46","author":"Peters","year":"2015","journal-title":"College Math. J."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1016\/j.jbi.2016.12.011","article-title":"Machine-learned cluster identification in high-dimensional data","volume":"66","author":"Ultsch","year":"2017","journal-title":"J. Biomed. Inform."},{"key":"ref_4","unstructured":"Ultsch, A. (2005, January 10\u201312). U*c: Self-Organized Clustering with Emergent Feature Maps. Proceedings of the Lernen, Wissensentdeckung und Adaptivit\u00e4t (LWA) 2005, GI Workshops, Saarbr\u00fccken, Germany."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"L\u00f6tsch, J., and Ultsch, A. (2019). Current projection methods-induced biases at subgroup detection for machine-learning based data-analysis of biomedical data. Int. J. Mol. Sci., 21.","DOI":"10.3390\/ijms21010079"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1023\/A:1007662407062","article-title":"Large margin classification using the perceptron algorithm","volume":"37","author":"Freund","year":"1999","journal-title":"Machine Learn."},{"key":"ref_7","unstructured":"Baggenstoss, P.M. (2002). Statistical Modeling Using Gaussian Mixtures and HMMs with MATLAB, Naval Undersea Warfare Center. Technical Report."},{"key":"ref_8","unstructured":"Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., and Hornik, K. (2017). Cluster: Cluster analysis basics and extensions R package version 2.0. 1. 2015."},{"key":"ref_9","unstructured":"R Development Core Team (2018). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing. Available online: https:\/\/www.R-project.org\/."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v025.i01","article-title":"Factominer: A package for multivariate analysis","volume":"25","author":"Le","year":"2008","journal-title":"J. Stat. Softw."},{"key":"ref_11","unstructured":"Krijthe, J.H. (2019, December 26). Rtsne: T-distributed stochastic neighbor embedding using barnes-hut implementation. Available online: https:\/\/github.com\/jkrijthe\/Rtsne."},{"key":"ref_12","unstructured":"Lammers, B. (2019, December 26). Ann2: Artificial neural networks for anomaly detection. Available online: https:\/\/rdrr.io\/cran\/ANN2\/."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1080\/01621459.1963.10500845","article-title":"Hierarchical grouping to optimize an objective function","volume":"58","year":"1963","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"623","DOI":"10.2307\/2528417","article-title":"A comparison of some methods of cluster analysis","volume":"23","author":"Gower","year":"1967","journal-title":"Biometrics"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1080\/14786440109462720","article-title":"LIII. On lines and planes of closest fit to systems of points in space","volume":"2","author":"Pearson","year":"1901","journal-title":"London, Edinburgh&Dublin Philosoph. Mag. J. Sci."},{"key":"ref_16","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."},{"key":"ref_17","first-page":"3221","article-title":"Accelerating t-SNE using tree-based algorithms","volume":"15","year":"2014","journal-title":"J. Machine Learn. Res."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/5\/1\/13\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T08:53:20Z","timestamp":1760172800000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/5\/1\/13"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,30]]},"references-count":17,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2020,3]]}},"alternative-id":["data5010013"],"URL":"https:\/\/doi.org\/10.3390\/data5010013","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,1,30]]}}}