{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T07:33:21Z","timestamp":1767857601489,"version":"3.49.0"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,3,17]],"date-time":"2022-03-17T00:00:00Z","timestamp":1647475200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,17]],"date-time":"2022-03-17T00:00:00Z","timestamp":1647475200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["01IS18036A"],"award-info":[{"award-number":["01IS18036A"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Adv Data Anal Classif"],"published-print":{"date-parts":[[2023,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>When researchers publish new cluster algorithms, they usually demonstrate the strengths of their novel approaches by comparing the algorithms\u2019 performance with existing competitors. However, such studies are likely to be optimistically biased towards the new algorithms, as the authors have a vested interest in presenting their method as favorably as possible in order to increase their chances of getting published. Therefore, the superior performance of newly introduced cluster algorithms is over-optimistic and might not be confirmed in independent benchmark studies performed by neutral and unbiased authors. 
This problem is known among many researchers, but so far, the different mechanisms leading to over-optimism in cluster algorithm evaluation have never been systematically studied and discussed. Researchers are thus often not aware of the full extent of the problem. We present an illustrative study to illuminate the mechanisms by which authors\u2014consciously or unconsciously\u2014paint their cluster algorithm\u2019s performance in an over-optimistic light. Using the recently published cluster algorithm Rock as an example, we demonstrate how optimization of the used datasets or data characteristics, of the algorithm\u2019s parameters and of the choice of the competing cluster algorithms leads to Rock\u2019s performance appearing better than it actually is. Our study is thus a cautionary tale that illustrates how easy it can be for researchers to claim apparent \u201csuperiority\u201d of a new cluster algorithm. This illuminates the vital importance of strategies for avoiding the problems of over-optimism (such as, e.g., neutral benchmark studies), which we also discuss in the article.\n\n<\/jats:p>","DOI":"10.1007\/s11634-022-00496-5","type":"journal-article","created":{"date-parts":[[2022,3,17]],"date-time":"2022-03-17T08:04:41Z","timestamp":1647504281000},"page":"211-238","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Over-optimistic evaluation and reporting of novel cluster algorithms: an illustrative 
study"],"prefix":"10.1007","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1215-8561","authenticated-orcid":false,"given":"Theresa","family":"Ullmann","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6890-997X","authenticated-orcid":false,"given":"Anna","family":"Beer","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9848-3714","authenticated-orcid":false,"given":"Maximilian","family":"H\u00fcnem\u00f6rder","sequence":"additional","affiliation":[]},{"given":"Thomas","family":"Seidl","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2729-0947","authenticated-orcid":false,"given":"Anne-Laure","family":"Boulesteix","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,17]]},"reference":[{"key":"496_CR1","doi-asserted-by":"crossref","unstructured":"Akiba T, Sano S, Yanase T, Ohta T, Koyama M (2019) Optuna: A next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 2623\u20132631","DOI":"10.1145\/3292500.3330701"},{"issue":"2","key":"496_CR2","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1007\/s00357-006-0017-z","volume":"23","author":"AN Albatineh","year":"2006","unstructured":"Albatineh AN, Niewiadomska-Bugaj M, Mihalko D (2006) On similarity indices and correction for chance agreement. J Classif 23(2):301\u2013313","journal-title":"J Classif"},{"key":"496_CR3","unstructured":"Beer A, Kazempour D, Seidl T (2019) Rock-let the points roam to their clusters themselves. In: Proceedings of the 22nd International Conference on Extending Database Technology (EDBT), pp 630\u2013633"},{"key":"496_CR4","first-page":"2546","volume":"24","author":"J Bergstra","year":"2011","unstructured":"Bergstra J, Bardenet R, Bengio Y, K\u00e9gl B (2011) Algorithms for hyper-parameter optimization. 
Adv Neural Inf Process Syst NIPS 24:2546\u20132554","journal-title":"Adv Neural Inf Process Syst NIPS"},{"key":"496_CR5","unstructured":"Bischl B, Binder M, Lang M, Pielok T, Richter J, Coors S, Thomas J, Ullmann T, Becker M, Boulesteix AL, Deng D, Lindauer M (2021) Hyperparameter optimization: Foundations, algorithms, best practices and open challenges. arXiv preprint arXiv:2107.05847"},{"issue":"4","key":"496_CR6","doi-asserted-by":"publisher","first-page":"e1004191","DOI":"10.1371\/journal.pcbi.1004191","volume":"11","author":"AL Boulesteix","year":"2015","unstructured":"Boulesteix AL (2015) Ten simple rules for reducing overoptimistic reporting in methodological computational research. PLoS Comput Biol 11(4):e1004191","journal-title":"PLoS Comput Biol"},{"key":"496_CR7","first-page":"77","volume":"6","author":"AL Boulesteix","year":"2008","unstructured":"Boulesteix AL, Strobl C, Augustin T, Daumer M (2008) Evaluating microarray-based classifiers: an overview. Cancer Inf 6:77\u201397","journal-title":"Cancer Inf"},{"issue":"4","key":"496_CR8","doi-asserted-by":"publisher","first-page":"e61562","DOI":"10.1371\/journal.pone.0061562","volume":"8","author":"AL Boulesteix","year":"2013","unstructured":"Boulesteix AL, Lauer S, Eugster MJ (2013) A plea for neutral comparison studies in computational sciences. PLoS ONE 8(4):e61562","journal-title":"PLoS ONE"},{"key":"496_CR9","doi-asserted-by":"crossref","unstructured":"Boulesteix AL, Stierle V, Hapfelmeier A (2015) Publication bias in methodological computational research. Cancer Informatics 14(S5):11\u201319","DOI":"10.4137\/CIN.S30747"},{"key":"496_CR10","doi-asserted-by":"publisher","first-page":"138","DOI":"10.1186\/s12874-017-0417-2","volume":"17","author":"AL Boulesteix","year":"2017","unstructured":"Boulesteix AL, Wilson R, Hapfelmeier A (2017) Towards evidence-based computational statistics: lessons from clinical research on the role and design of real-data benchmark studies. 
BMC Med Res Methodol 17:138","journal-title":"BMC Med Res Methodol"},{"issue":"1","key":"496_CR11","doi-asserted-by":"publisher","first-page":"216","DOI":"10.1002\/bimj.201700129","volume":"60","author":"AL Boulesteix","year":"2018","unstructured":"Boulesteix AL, Binder H, Abrahamowicz M, Sauerbrei W (2018) On the necessity and design of studies comparing statistical methods. Biometr J 60(1):216\u2013218","journal-title":"Biometr J"},{"issue":"5","key":"496_CR12","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1111\/1740-9713.01444","volume":"17","author":"AL Boulesteix","year":"2020","unstructured":"Boulesteix AL, Hoffmann S, Charlton A, Seibold H (2020) A replication crisis in methodological research? Significance 17(5):18\u201321","journal-title":"Significance"},{"key":"496_CR13","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1186\/s13059-021-02365-4","volume":"22","author":"S Buchka","year":"2021","unstructured":"Buchka S, Hapfelmeier A, Gardner PP, Wilson R, Boulesteix AL (2021) On the optimistic performance evaluation of newly introduced bioinformatic methods. Genome Biol 22:152","journal-title":"Genome Biol"},{"issue":"1","key":"496_CR14","first-page":"1","volume":"3","author":"T Cali\u0144ski","year":"1974","unstructured":"Cali\u0144ski T, Harabasz J (1974) A dendrite method for cluster analysis. Commun Stat 3(1):1\u201327","journal-title":"Commun Stat"},{"issue":"2","key":"496_CR15","doi-asserted-by":"publisher","first-page":"404","DOI":"10.1080\/10618600.2017.1390469","volume":"27","author":"A Cerioli","year":"2018","unstructured":"Cerioli A, Garc\u00eda-Escudero LA, Mayo-Iscar A, Riani M (2018) Finding the number of normal groups in model-based clustering via constrained likelihoods. 
J Comput Graph Stat 27(2):404\u2013416","journal-title":"J Comput Graph Stat"},{"key":"496_CR16","first-page":"3625","volume":"34","author":"A Chhabra","year":"2020","unstructured":"Chhabra A, Roy A, Mohapatra P (2020) Suspicion-free adversarial attacks on clustering algorithms. Proc AAAI Conf Artif Intell 34:3625\u20133632","journal-title":"Proc AAAI Conf Artif Intell"},{"issue":"2","key":"496_CR17","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1109\/91.580801","volume":"5","author":"RN Dav\u00e9","year":"1997","unstructured":"Dav\u00e9 RN, Krishnapuram R (1997) Robust clustering methods: a unified view. IEEE Trans Fuzzy Syst 5(2):270\u2013293","journal-title":"IEEE Trans Fuzzy Syst"},{"key":"496_CR18","doi-asserted-by":"crossref","unstructured":"Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Anal Mach Intell. PAMI-1(2):224\u2013227","DOI":"10.1109\/TPAMI.1979.4766909"},{"key":"496_CR19","unstructured":"Dua D, Graff C (2017) UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml"},{"key":"496_CR20","unstructured":"Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD\u201996: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp 226\u2013231"},{"issue":"2","key":"496_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3434185","volume":"39","author":"M Ferrari Dacrema","year":"2021","unstructured":"Ferrari Dacrema M, Boglio S, Cremonesi P, Jannach D (2021) A troubling analysis of reproducibility and progress in recommender systems research. 
ACM Trans Inf Syst 39(2):1\u201349","journal-title":"ACM Trans Inf Syst"},{"issue":"1","key":"496_CR22","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1109\/TIT.1975.1055330","volume":"21","author":"K Fukunaga","year":"1975","unstructured":"Fukunaga K, Hostetler L (1975) The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans Inf Theory 21(1):32\u201340","journal-title":"IEEE Trans Inf Theory"},{"key":"496_CR23","doi-asserted-by":"crossref","unstructured":"Gan J, Tao Y (2015) DBSCAN revisited: mis-claim, un-fixability, and approximation. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp 519\u2013530","DOI":"10.1145\/2723372.2737792"},{"issue":"7","key":"496_CR24","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1145\/3134599","volume":"61","author":"I Goodfellow","year":"2018","unstructured":"Goodfellow I, McDaniel P, Papernot N (2018) Making machine learning robust against adversarial inputs. Commun ACM 61(7):56\u201366","journal-title":"Commun ACM"},{"key":"496_CR25","first-page":"616","volume-title":"Handbook of cluster analysis","author":"M Halkidi","year":"2015","unstructured":"Halkidi M, Vazirgiannis M, Hennig C (2015) Method-independent indices for cluster validation and estimating the number of clusters. In: Hennig C, Meila M, Murtagh F, Rocci R (eds) Handbook of cluster analysis. Chapman and Hall\/CRC, Boca Raton, pp 616\u2013639"},{"key":"496_CR26","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/j.patrec.2015.04.009","volume":"64","author":"C Hennig","year":"2015","unstructured":"Hennig C (2015) What are the true clusters? Pattern Recogn Lett 64:53\u201362","journal-title":"Pattern Recogn Lett"},{"key":"496_CR27","doi-asserted-by":"publisher","unstructured":"Hennig C (2021) An empirical comparison and characterisation of nine popular clustering methods. Adv Data Anal Classif. 
https:\/\/doi.org\/10.1007\/s11634-021-00478-z","DOI":"10.1007\/s11634-021-00478-z"},{"issue":"1","key":"496_CR28","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1007\/BF01908075","volume":"2","author":"L Hubert","year":"1985","unstructured":"Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193\u2013218","journal-title":"J Classif"},{"issue":"16","key":"496_CR29","doi-asserted-by":"publisher","first-page":"1990","DOI":"10.1093\/bioinformatics\/btq323","volume":"26","author":"M Jelizarow","year":"2010","unstructured":"Jelizarow M, Guillemot V, Tenenhaus A, Strimmer K, Boulesteix AL (2010) Over-optimism in bioinformatics: an illustration. Bioinformatics 26(16):1990\u20131998","journal-title":"Bioinformatics"},{"key":"496_CR30","volume-title":"Finding groups in data: an introduction to cluster analysis","author":"L Kaufman","year":"2009","unstructured":"Kaufman L, Rousseeuw PJ (2009) Finding groups in data: an introduction to cluster analysis. John Wiley & Sons, Hoboken, NJ"},{"issue":"3","key":"496_CR31","doi-asserted-by":"publisher","first-page":"517","DOI":"10.1109\/TSMC.1987.4309069","volume":"17","author":"TO Kvalseth","year":"1987","unstructured":"Kvalseth TO (1987) Entropy and correlation: some comments. IEEE Trans Syst Man Cybern 17(3):517\u2013519","journal-title":"IEEE Trans Syst Man Cybern"},{"issue":"2","key":"496_CR32","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","volume":"28","author":"S Lloyd","year":"1982","unstructured":"Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129\u2013137","journal-title":"IEEE Trans Inf Theory"},{"key":"496_CR33","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1146\/annurev-statistics-031017-100325","volume":"6","author":"GJ McLachlan","year":"2019","unstructured":"McLachlan GJ, Lee SX, Rathnayake SI (2019) Finite mixture models. 
Ann Rev Stat Appl 6:355\u2013378","journal-title":"Ann Rev Stat Appl"},{"key":"496_CR34","first-page":"640","volume-title":"Handbook of cluster analysis","author":"M Meila","year":"2015","unstructured":"Meila M (2015) Criteria for comparing clusterings. In: Hennig C, Meila M, Murtagh F, Rocci R (eds) Handbook of cluster analysis. Chapman and Hall\/CRC, London, pp 640\u2013657"},{"key":"496_CR35","unstructured":"Ng AY, Jordan MI, Weiss Y (2001) On spectral clustering: Analysis and an algorithm. In: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, pp 849\u2013856"},{"issue":"1","key":"496_CR36","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1038\/msb.2011.70","volume":"7","author":"R Norel","year":"2011","unstructured":"Norel R, Rice JJ, Stolovitzky G (2011) The self-assessment trap: can we all be better than average? Mol Syst Biol 7(1):537","journal-title":"Mol Syst Biol"},{"key":"496_CR37","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1038\/526182a","volume":"526","author":"R Nuzzo","year":"2015","unstructured":"Nuzzo R (2015) How scientists fool themselves-and how they can stop. Nat News 526:182\u2013185","journal-title":"Nat News"},{"key":"496_CR38","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"issue":"3","key":"496_CR39","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1007\/s10115-008-0150-6","volume":"19","author":"D Pfitzner","year":"2009","unstructured":"Pfitzner D, Leibbrandt R, Powers D (2009) Characterization and evaluation of similarity measures for pairs of clusterings. 
Knowl Inf Syst 19(3):361\u2013394","journal-title":"Knowl Inf Syst"},{"issue":"3","key":"496_CR40","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3068335","volume":"42","author":"E Schubert","year":"2017","unstructured":"Schubert E, Sander J, Ester M, Kriegel HP, Xu X (2017) DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans Database Syst 42(3):1\u201321","journal-title":"ACM Trans Database Syst"},{"issue":"1","key":"496_CR41","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1109\/JPROC.2015.2494218","volume":"104","author":"B Shahriari","year":"2016","unstructured":"Shahriari B, Swersky K, Wang Z, Adams RP, de Freitas N (2016) Taking the human out of the loop: a review of Bayesian optimization. Proc IEEE 104(1):148\u2013175","journal-title":"Proc IEEE"},{"key":"496_CR42","first-page":"583","volume":"3","author":"A Strehl","year":"2002","unstructured":"Strehl A, Ghosh J (2002) Cluster ensembles\u2013a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583\u2013617","journal-title":"J Mach Learn Res"},{"key":"496_CR43","volume-title":"The visual display of quantitative information","author":"E Tufte","year":"1983","unstructured":"Tufte E (1983) The visual display of quantitative information. Graphics Press, Cheshire, CT"},{"key":"496_CR44","doi-asserted-by":"crossref","unstructured":"Ullmann T, Hennig C, Boulesteix AL (2021) Validation of cluster analysis results on validation data: a systematic framework. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery e1444","DOI":"10.1002\/widm.1444"},{"key":"496_CR45","unstructured":"Van\u00a0Mechelen I, Boulesteix AL, Dangl R, Dean N, Guyon I, Hennig C, Leisch F, Steinley D (2018) Benchmarking in cluster analysis: a white paper. 
arXiv preprint arXiv:1809.10496"},{"key":"496_CR46","first-page":"2837","volume":"11","author":"NX Vinh","year":"2010","unstructured":"Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11:2837\u20132854","journal-title":"J Mach Learn Res"},{"key":"496_CR47","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1109\/RBME.2010.2083647","volume":"3","author":"R Xu","year":"2010","unstructured":"Xu R, Wunsch DC (2010) Clustering algorithms in biomedical research: a review. IEEE Rev Biomed Eng 3:120\u2013154","journal-title":"IEEE Rev Biomed Eng"},{"issue":"1","key":"496_CR48","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1093\/bioinformatics\/btp605","volume":"26","author":"MR Yousefi","year":"2010","unstructured":"Yousefi MR, Hua J, Sima C, Dougherty ER (2010) Reporting bias when using real data sets to analyze classification performance. Bioinformatics 26(1):68\u201376","journal-title":"Bioinformatics"}],"container-title":["Advances in Data Analysis and 
Classification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00496-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11634-022-00496-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00496-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,27]],"date-time":"2023-02-27T11:34:48Z","timestamp":1677497688000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11634-022-00496-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,17]]},"references-count":48,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,3]]}},"alternative-id":["496"],"URL":"https:\/\/doi.org\/10.1007\/s11634-022-00496-5","relation":{},"ISSN":["1862-5347","1862-5355"],"issn-type":[{"value":"1862-5347","type":"print"},{"value":"1862-5355","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,17]]},"assertion":[{"value":"12 August 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 December 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 February 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Our fully reproducible code is available at: .","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}}]}}