{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,27]],"date-time":"2025-05-27T02:47:08Z","timestamp":1748314028044,"version":"3.37.3"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,9,3]],"date-time":"2022-09-03T00:00:00Z","timestamp":1662163200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,3]],"date-time":"2022-09-03T00:00:00Z","timestamp":1662163200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Spanish Ministry of Education","award":["Archive ID 18C01\/003730"],"award-info":[{"award-number":["Archive ID 18C01\/003730"]}]},{"name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades","award":["PID2020-116567GB-C22","PID2020-112796RB-C22"],"award-info":[{"award-number":["PID2020-116567GB-C22","PID2020-112796RB-C22"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Adv Data Anal Classif"],"published-print":{"date-parts":[[2023,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The <jats:italic>k<\/jats:italic>-Means algorithm is one of the most popular choices for clustering data but is well-known to be sensitive to the initialization process. There is a substantial number of methods that aim at finding optimal initial seeds for <jats:italic>k<\/jats:italic>-Means, though none of them is universally valid. This paper presents an extension to longitudinal data of one of such methods, the BRIk algorithm, that relies on clustering a set of centroids derived from bootstrap replicates of the data and on the use of the versatile Modified Band Depth. In our approach we improve the BRIk method by adding a step where we fit appropriate B-splines to our observations and a resampling process that allows computational feasibility and handling issues such as noise or missing data. We have derived two techniques for providing suitable initial seeds, each of them stressing respectively the multivariate or the functional nature of the data. Our results with simulated and real data sets indicate that our <jats:italic>F<\/jats:italic>unctional Data <jats:italic>A<\/jats:italic>pproach to the BRIK method (FABRIk) and our <jats:italic>F<\/jats:italic>unctional <jats:italic>D<\/jats:italic>ata <jats:italic>E<\/jats:italic>xtension of the BRIK method (FDEBRIk) are more effective than previous proposals at providing seeds to initialize <jats:italic>k<\/jats:italic>-Means in terms of clustering recovery.<\/jats:p>","DOI":"10.1007\/s11634-022-00510-w","type":"journal-article","created":{"date-parts":[[2022,9,3]],"date-time":"2022-09-03T21:02:20Z","timestamp":1662238940000},"page":"463-484","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Band depth based initialization of K-means for functional data clustering"],"prefix":"10.1007","volume":"17","author":[{"given":"Javier","family":"Albert-Smet","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9183-8367","authenticated-orcid":false,"given":"Aurora","family":"Torrente","sequence":"additional","affiliation":[]},{"given":"Juan","family":"Romo","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,3]]},"reference":[{"key":"510_CR1","doi-asserted-by":"publisher","first-page":"581","DOI":"10.1111\/1467-9469.00350","volume":"30","author":"C Abraham","year":"2003","unstructured":"Abraham C, Cornillon PA, Matzner-L\u00f8ber E et al (2003) Unsupervised curve clustering using B-Splines. Scandinavian J Stat 30:581\u2013595. https:\/\/doi.org\/10.1111\/1467-9469.00350","journal-title":"Scandinavian J Stat"},{"issue":"3","key":"510_CR2","doi-asserted-by":"publisher","first-page":"398","DOI":"10.1093\/biostatistics\/kxr037","volume":"13","author":"A Arribas-Gil","year":"2011","unstructured":"Arribas-Gil A, Romo J (2011) Robust depth-based estimation in the time warping model. Biostatistics 13(3):398\u2013414. https:\/\/doi.org\/10.1093\/biostatistics\/kxr037","journal-title":"Biostatistics"},{"issue":"4","key":"510_CR3","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1093\/biostatistics\/kxu006","volume":"15","author":"A Arribas-Gil","year":"2014","unstructured":"Arribas-Gil A, Romo J (2014) Shape outlier detection and visualization for functional data: the outliergram. Biostatistics 15(4):603\u2013619. https:\/\/doi.org\/10.1093\/biostatistics\/kxu006","journal-title":"Biostatistics"},{"key":"510_CR4","unstructured":"Arthur D, Vassilvitskii S (2007) k-means++: The advantages of careful seeding. Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms pp 1027\u20131035"},{"issue":"4","key":"510_CR5","doi-asserted-by":"publisher","first-page":"260","DOI":"10.1016\/j.imavis.2010.10.002","volume":"29","author":"ME Celebi","year":"2011","unstructured":"Celebi ME (2011) Improving the performance of k-means for color quantization. Image Vis Comput 29(4):260\u2013271. https:\/\/doi.org\/10.1016\/j.imavis.2010.10.002","journal-title":"Image Vis Comput"},{"issue":"1","key":"510_CR6","doi-asserted-by":"publisher","first-page":"200","DOI":"10.1016\/j.eswa.2012.07.021","volume":"40","author":"ME Celebi","year":"2013","unstructured":"Celebi ME, Kingravi HA, Vela PA (2013) A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Syst Appl 40(1):200\u2013210. https:\/\/doi.org\/10.1016\/j.eswa.2012.07.021","journal-title":"Expert Syst Appl"},{"key":"510_CR7","unstructured":"Dau HA, Keogh E, Kamgar K, et\u00a0al (2018) The UCR Time Series Classification Archive. https:\/\/www.cs.ucr.edu\/~eamonn\/time_series_data_2018\/"},{"issue":"9","key":"510_CR8","doi-asserted-by":"publisher","first-page":"1925","DOI":"10.1080\/03610910903168603","volume":"38","author":"L Ferreira","year":"2009","unstructured":"Ferreira L, Hitchcock DB (2009) A comparison of hierarchical methods for clustering functional data. Communications in Statistics - Simulation and Computation 38(9):1925\u20131949. https:\/\/doi.org\/10.1080\/03610910903168603","journal-title":"Communications in Statistics - Simulation and Computation"},{"key":"510_CR9","first-page":"768","volume":"21","author":"E Forgy","year":"1965","unstructured":"Forgy E (1965) Cluster analysis of multivariate data: efficiency versus interpretability of classifications. Biometrics 21:768\u2013780","journal-title":"Biometrics"},{"key":"510_CR10","volume-title":"Computers and intractability: a guide to the theory of NP-completeness","author":"MR Garey","year":"1979","unstructured":"Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. Freeman, New York"},{"key":"510_CR11","doi-asserted-by":"publisher","unstructured":"Hall P (2018) Principal component analysis for functional data: Methodology, theory, and discussion. Oxford University Press Inc., New York. https:\/\/doi.org\/10.1093\/oxfordhb\/9780199568444.013.8","DOI":"10.1093\/oxfordhb\/9780199568444.013.8"},{"key":"510_CR12","doi-asserted-by":"publisher","unstructured":"He J, Lan M, Tan CL, et\u00a0al (2004) Initialization of cluster refinement algorithms: a review and comparative study. In: IEEE International Joint Conference on Neural Networks, Budapest, Hungary, pp 297\u2013302, https:\/\/doi.org\/10.1109\/IJCNN.2004.1379917","DOI":"10.1109\/IJCNN.2004.1379917"},{"key":"510_CR13","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1007\/BF01908075","volume":"2","author":"L Hubert","year":"1985","unstructured":"Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193\u2013198. https:\/\/doi.org\/10.1007\/BF01908075","journal-title":"J Classif"},{"key":"510_CR14","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1007\/BF01898350","volume":"1","author":"A Inselberg","year":"1985","unstructured":"Inselberg A (1985) The plane with parallel coordinates. Vis Comput 1:69\u201391. https:\/\/doi.org\/10.1007\/BF01898350","journal-title":"Vis Comput"},{"issue":"3","key":"510_CR15","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1007\/s11634-013-0158-y","volume":"8","author":"J Jacques","year":"2014","unstructured":"Jacques J, Preda C (2014) Functional data clustering: a survey. Adv Data Anal Classif 8(3):231\u2013255. https:\/\/doi.org\/10.1007\/s11634-013-0158-y","journal-title":"Adv Data Anal Classif"},{"key":"510_CR16","doi-asserted-by":"publisher","DOI":"10.1002\/9780470316801","volume-title":"Finding groups in data: an introduction to cluster analysis","author":"L Kaufman","year":"1990","unstructured":"Kaufman L, Rousseeuw PJ (1990) Finding groups in data: an introduction to cluster analysis. Wiley, New York"},{"issue":"1","key":"510_CR17","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1093\/bioinformatics\/bti742","volume":"22","author":"X Leng","year":"2006","unstructured":"Leng X, M\u00fcller HG (2006) Classification using functional data analysis for temporal gene expression data. Bioinformatics 22(1):68\u201376. https:\/\/doi.org\/10.1093\/bioinformatics\/bti742","journal-title":"Bioinformatics"},{"issue":"10","key":"510_CR18","doi-asserted-by":"publisher","first-page":"1766","DOI":"10.3390\/app8101766","volume":"8","author":"A Leroy","year":"2018","unstructured":"Leroy A, Marc A, Dupas O et al (2018) Functional data analysis in sport science: Example of swimmers\u2019 progression curves clustering. Appl Sci 8(10):1766. https:\/\/doi.org\/10.3390\/app8101766","journal-title":"Appl Sci"},{"key":"510_CR19","doi-asserted-by":"publisher","unstructured":"L\u00f3pez-Pintado S, Romo J (2003) Depth-based classification for functional data. In: Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications, pp 103\u2013119, https:\/\/doi.org\/10.1090\/dimacs\/072\/08","DOI":"10.1090\/dimacs\/072\/08"},{"issue":"486","key":"510_CR20","doi-asserted-by":"publisher","first-page":"718","DOI":"10.1198\/jasa.2009.0108","volume":"104","author":"S L\u00f3pez-Pintado","year":"2009","unstructured":"L\u00f3pez-Pintado S, Romo J (2009) On the concept of depth for functional data. J Am Stat Assoc 104(486):718\u2013734. https:\/\/doi.org\/10.1198\/jasa.2009.0108","journal-title":"J Am Stat Assoc"},{"key":"510_CR21","unstructured":"Olszewski R (2001) Generalized feature extraction for structural pattern recognition in time-series data. PhD thesis, School of Computer Science, Carnegie Mellon University"},{"key":"510_CR22","unstructured":"R Core Team (2017) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, https:\/\/www.R-project.org\/"},{"issue":"2","key":"510_CR23","doi-asserted-by":"publisher","first-page":"351","DOI":"10.1111\/1467-9868.00129","volume":"60","author":"JO Ramsay","year":"1998","unstructured":"Ramsay JO, Li X (1998) Curve registration. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 60(2):351\u2013363. https:\/\/doi.org\/10.1111\/1467-9868.00129","journal-title":"Journal of the Royal Statistical Society: Series B (Statistical Methodology)"},{"issue":"5","key":"510_CR24","doi-asserted-by":"publisher","first-page":"1219","DOI":"10.1016\/j.csda.2009.12.008","volume":"54","author":"LM Sangalli","year":"2010","unstructured":"Sangalli LM, Secchi P, Vantini S et al (2010) K-mean alignment for curve clustering. Comput Stat Data Anal 54(5):1219\u20131233. https:\/\/doi.org\/10.1016\/j.csda.2009.12.008","journal-title":"Comput Stat Data Anal"},{"issue":"1","key":"510_CR25","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1109\/TPAMI.1984.4767478","volume":"6","author":"SZ Selim","year":"1984","unstructured":"Selim SZ, Ismail MA (1984) K-means type algorithms: a generalized convergence theorem and characterization of local optimality. IEEE Transactions on Pattern Analysis 6(1):81\u201387. https:\/\/doi.org\/10.1109\/TPAMI.1984.4767478","journal-title":"IEEE Transactions on Pattern Analysis"},{"issue":"2","key":"510_CR26","doi-asserted-by":"publisher","first-page":"178","DOI":"10.1037\/1082-989X.11.2.178","volume":"11","author":"D Steinley","year":"2006","unstructured":"Steinley D (2006) Profiling local optima in k-means clustering: developing a diagnostic technique. Psychol Methods 11(2):178\u2013192. https:\/\/doi.org\/10.1037\/1082-989X.11.2.178","journal-title":"Psychol Methods"},{"key":"510_CR27","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1007\/s00357-007-0003-0","volume":"24","author":"D Steinley","year":"2007","unstructured":"Steinley D, Brusco MJ (2007) Initializing k-means batch clustering: a critical evaluation of several techniques. J Classif 24:99\u2013121. https:\/\/doi.org\/10.1007\/s00357-007-0003-0","journal-title":"J Classif"},{"issue":"2","key":"510_CR28","doi-asserted-by":"publisher","first-page":"316","DOI":"10.1198\/jcgs.2011.09224","volume":"20","author":"Y Sun","year":"2011","unstructured":"Sun Y, Genton M (2011) Functional boxplots. J Comput Graph Stat 20(2):316\u2013334. https:\/\/doi.org\/10.1198\/jcgs.2011.09224","journal-title":"J Comput Graph Stat"},{"issue":"2","key":"510_CR29","doi-asserted-by":"publisher","first-page":"232","DOI":"10.1007\/s00357-020-09372-3","volume":"38","author":"A Torrente","year":"2021","unstructured":"Torrente A, Romo J (2021) Initializing k-means clustering by bootstrap and data depth. J Classif 38(2):232\u2013256. https:\/\/doi.org\/10.1007\/s00357-020-09372-3","journal-title":"J Classif"},{"key":"510_CR30","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1186\/1471-2105-14-237","volume":"14","author":"A Torrente","year":"2013","unstructured":"Torrente A, L\u00f3pez-Pintado S, Romo J (2013) DepthTools: an R package for a robust analysis of gene expression data. BMC Bioinformatics 14:237. https:\/\/doi.org\/10.1186\/1471-2105-14-237","journal-title":"BMC Bioinformatics"},{"issue":"301","key":"510_CR31","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1080\/01621459.1963.10500845","volume":"58","author":"JH Ward Jr","year":"1963","unstructured":"Ward JH Jr (1963) Hierarchical grouping to optimize an objective function. J Am Stat Assoc 58(301):236\u2013244. https:\/\/doi.org\/10.1080\/01621459.1963.10500845","journal-title":"J Am Stat Assoc"},{"key":"510_CR32","doi-asserted-by":"publisher","DOI":"10.1201\/b15005","volume-title":"Analysis of variance for functional data","author":"JT Zhang","year":"2013","unstructured":"Zhang JT (2013) Analysis of variance for functional data. Chapman and Hall, New York"}],"container-title":["Advances in Data Analysis and Classification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00510-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11634-022-00510-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00510-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T13:29:38Z","timestamp":1684330178000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11634-022-00510-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,3]]},"references-count":32,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["510"],"URL":"https:\/\/doi.org\/10.1007\/s11634-022-00510-w","relation":{},"ISSN":["1862-5347","1862-5355"],"issn-type":[{"type":"print","value":"1862-5347"},{"type":"electronic","value":"1862-5355"}],"subject":[],"published":{"date-parts":[[2022,9,3]]},"assertion":[{"value":"24 March 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 July 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 July 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 September 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 November 2022","order":5,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Clarification","order":6,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Missing Open Access funding information has been added in the Funding Note.","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Authors of this manuscript have no financial or non-financial interests that are directly or indirectly related to the work submitted for publication.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"The three authors consent to the publication of this manuscript.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}