{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:13:17Z","timestamp":1760058797214,"version":"build-2065373602"},"reference-count":47,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2025,4,26]],"date-time":"2025-04-26T00:00:00Z","timestamp":1745625600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["5R01DE032366-02A1"],"award-info":[{"award-number":["5R01DE032366-02A1"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Functional data, including one-dimensional curves and higher-dimensional surfaces, have become increasingly prominent across scientific disciplines. They offer a continuous perspective that captures subtle dynamics and richer structures compared to discrete representations, thereby preserving essential information and facilitating the more natural modeling of real-world phenomena, especially in sparse or irregularly sampled settings. A key challenge lies in identifying low-dimensional representations and estimating covariance structures that capture population statistics effectively. We propose a novel Bayesian framework with a nonparametric kernel expansion and a sparse prior, enabling the direct modeling of measured data and avoiding the artificial biases from regridding. Our method, Bayesian scalable functional data analysis (BSFDA), automatically selects both subspace dimensionalities and basis functions, reducing the computational overhead through an efficient variational optimization strategy. We further propose a faster approximate variant that maintains comparable accuracy but accelerates computations significantly on large-scale datasets. Extensive simulation studies demonstrate that our framework outperforms conventional techniques in covariance estimation and dimensionality selection, showing resilience to high dimensionality and irregular sampling. The proposed methodology proves effective for multidimensional functional data and showcases practical applicability in biomedical and meteorological datasets. Overall, BSFDA offers an adaptive, continuous, and scalable solution for modern functional data analysis across diverse scientific domains.<\/jats:p>","DOI":"10.3390\/a18050254","type":"journal-article","created":{"date-parts":[[2025,4,28]],"date-time":"2025-04-28T06:23:32Z","timestamp":1745821412000},"page":"254","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Integrated Model Selection and Scalability in Functional Data Analysis Through Bayesian Learning"],"prefix":"10.3390","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8745-5996","authenticated-orcid":false,"given":"Wenzheng","family":"Tao","sequence":"first","affiliation":[{"name":"School of Computing, The University of Utah, Salt Lake City, UT 84112, USA"},{"name":"Scientific Computing and Imaging Institute, The University of Utah, Salt Lake City, UT 84112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3446-4810","authenticated-orcid":false,"given":"Sarang","family":"Joshi","sequence":"additional","affiliation":[{"name":"Scientific Computing and Imaging Institute, The University of Utah, Salt Lake City, UT 84112, USA"},{"name":"Biomedical Engineering, The University of Utah, Salt Lake City, UT 84112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ross","family":"Whitaker","sequence":"additional","affiliation":[{"name":"School of Computing, The University of Utah, Salt Lake City, UT 84112, USA"},{"name":"Scientific Computing and Imaging Institute, The University of Utah, Salt Lake City, UT 84112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,4,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1146\/annurev-statistics-041715-033624","article-title":"Functional Data Analysis","volume":"3","author":"Wang","year":"2016","journal-title":"Annu. Rev. Stat. Its Appl."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Ramsay, J.O., and Silverman, B.W. (2002). Applied Functional Data Analysis: Methods and Case Studies, Springer.","DOI":"10.1007\/b98886"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1111\/j.2517-6161.1991.tb01821.x","article-title":"Estimating the Mean and Covariance Structure Nonparametrically When the Data are Curves","volume":"53","author":"Rice","year":"1991","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.jmva.2018.11.007","article-title":"Recent advances in functional data analysis and high-dimensional statistics","volume":"170","author":"Aneiros","year":"2019","journal-title":"J. Multivar. Anal."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"104806","DOI":"10.1016\/j.jmva.2021.104806","article-title":"From multivariate to functional data analysis: Fundamentals, recent developments, and emerging areas","volume":"188","author":"Li","year":"2022","journal-title":"J. Multivar. Anal."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"649","DOI":"10.1080\/01621459.2016.1273115","article-title":"Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains","volume":"113","author":"Happ","year":"2018","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1214\/23-BA1410","article-title":"Semiparametric Functional Factor Models with Bayesian Rank Selection","volume":"18","author":"Kowal","year":"2023","journal-title":"Bayesian Anal."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1214\/16-BA1003","article-title":"Bayesian Estimation of Principal Components for Functional Data","volume":"12","author":"Suarez","year":"2017","journal-title":"Bayesian Anal."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1080\/10618600.2024.2362227","article-title":"Ultra-Efficient MCMC for Bayesian Longitudinal Functional Data Analysis","volume":"34","author":"Sun","year":"2024","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Rasmussen, C.E., and Williams, C.K.I. (2006). Gaussian Processes for Machine Learning, MIT Press. Adaptive Computation and Machine Learning.","DOI":"10.7551\/mitpress\/3206.001.0001"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1198\/016214504000001745","article-title":"Functional Data Analysis for Sparse Longitudinal Data","volume":"100","author":"Yao","year":"2005","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1214\/08-AOAS206","article-title":"Multilevel functional principal component analysis","volume":"3","author":"Di","year":"2009","journal-title":"Ann. Appl. Stat."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"995","DOI":"10.1198\/jcgs.2009.08011","article-title":"A geometric approach to maximum likelihood estimation of the functional principal components from sparse longitudinal data","volume":"18","author":"Peng","year":"2009","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_14","first-page":"1571","article-title":"Multivariate functional principal component analysis: A normalization approach","volume":"24","author":"Chiou","year":"2014","journal-title":"Stat. Sin."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Trefethen, L.N. (2019). Approximation Theory and Approximation Practice, Extended Edition, SIAM.","DOI":"10.1137\/1.9781611975949"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1017\/S0962492904000182","article-title":"Sparse grids","volume":"13","author":"Bungartz","year":"2004","journal-title":"Acta Numer."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1127","DOI":"10.1080\/10618600.2022.2035738","article-title":"Two-Dimensional Functional Principal Component Analysis for Image Feature Extraction","volume":"31","author":"Shi","year":"2022","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1064","DOI":"10.1111\/j.1541-0420.2012.01788.x","article-title":"Bayesian Latent Factor Regression for Functional and Longitudinal Data","volume":"68","author":"Montagna","year":"2012","journal-title":"Biometrics"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1080\/10618600.2019.1710837","article-title":"Bayesian Function-on-Scalars Regression for High-Dimensional Data","volume":"29","author":"Kowal","year":"2020","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1080\/02664763.2023.2172143","article-title":"Bayesian adaptive selection of basis functions for functional data representation","volume":"51","author":"Sousa","year":"2024","journal-title":"J. Appl. Stat."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1284","DOI":"10.1080\/01621459.2013.788980","article-title":"Selecting the Number of Principal Components in Functional Data","volume":"108","author":"Li","year":"2013","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"558","DOI":"10.1093\/biostatistics\/kxaa041","article-title":"Bayesian analysis of longitudinal and multidimensional functional data","volume":"23","author":"Shamshoian","year":"2022","journal-title":"Biostatistics"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1080\/10618600.2022.2107532","article-title":"Ultra-Fast Approximate Inference Using Variational Functional Mixed Models","volume":"32","author":"Huo","year":"2023","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_24","unstructured":"Liu, Y., Qiao, X., Pei, Y., and Wang, L. (2024, January 21\u201327). Deep Functional Factor Models: Forecasting High-Dimensional Functional Time Series via Bayesian Nonparametric Factorization. Proceedings of the International Conference on Machine Learning, PMLR, Vienna, Austria."},{"key":"ref_25","first-page":"20150202","article-title":"Principal component analysis: A review and recent developments","volume":"374","author":"Jolliffe","year":"2016","journal-title":"Philos. Trans. R. Soc. Math. Phys. Eng. Sci."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1111\/1467-9868.00196","article-title":"Probabilistic principal component analysis","volume":"61","author":"Tipping","year":"1999","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."},{"key":"ref_27","first-page":"1957","article-title":"Practical approaches to principal component analysis in the presence of missing values","volume":"11","author":"Ilin","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Bishop, C.M. (1999, January 7\u201310). Variational Principal Components. Proceedings of the Ninth International Conference on Artificial Neural Networks, ICANN\u201999, Edinburgh, UK.","DOI":"10.1049\/cp:19991160"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1162\/089976699300016728","article-title":"Mixtures of probabilistic principal component analyzers","volume":"11","author":"Tipping","year":"1999","journal-title":"Neural Comput."},{"key":"ref_30","first-page":"211","article-title":"Sparse Bayesian learning and the relevance vector machine","volume":"1","author":"Tipping","year":"2001","journal-title":"J. Mach. Learn. Res."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"MacKay, D.J. (1996). Bayesian methods for backpropagation networks. Models of Neural Networks III: Association, Generalization, and Representation, Springer.","DOI":"10.1007\/978-1-4612-0723-8_6"},{"key":"ref_32","unstructured":"Neal, R.M. (2012). Bayesian Learning for Neural Networks, Springer Science & Business Media."},{"key":"ref_33","unstructured":"Wipf, D., and Nagarajan, S. (2007, January 3\u20136). A new view of automatic relevance determination. Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Girolami, M., and Rogers, S. (2005, January 7\u201311). Hierarchic Bayesian models for kernel learning. Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany.","DOI":"10.1145\/1102351.1102382"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"108826","DOI":"10.1016\/j.sigpro.2022.108826","article-title":"Bayesian low-rank matrix completion with dual-graph embedding: Prior analysis and tuning-free inference","volume":"204","author":"Chen","year":"2023","journal-title":"Signal Process."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/MSP.2022.3198201","article-title":"Rethinking Bayesian Learning for Data Analysis: The Art of Prior and Inference in Sparsity-Aware Modeling","volume":"39","author":"Cheng","year":"2022","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Wong, A.P.S., Wijffels, S.E., Riser, S.C., Pouliquen, S., Hosoda, S., Roemmich, D., Gilson, J., Johnson, G.C., Martini, K., and Murphy, D.J. (2020). Argo Data 1999\u20132019: Two Million Temperature-Salinity Profiles and Subsurface Velocity Observations From a Global Array of Profiling Floats. Front. Mar. Sci., 7.","DOI":"10.3389\/fmars.2020.00700"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1080\/01621459.2017.1285773","article-title":"Variational Inference: A Review for Statisticians","volume":"112","author":"Blei","year":"2017","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_39","unstructured":"Tipping, M.E., and Faul, A.C. (2003, January 3\u20136). Fast marginal likelihood maximisation for sparse Bayesian models. Proceedings of the International Workshop on Artificial Intelligence and Statistics, PMLR, Key West, FL, USA."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"e1546","DOI":"10.1002\/wics.1546","article-title":"Improving the Gibbs sampler","volume":"14","author":"Park","year":"2022","journal-title":"Wiley Interdiscip. Rev. Comput. Stat."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1016\/S0047-259X(03)00096-4","article-title":"A well-conditioned estimator for large-dimensional covariance matrices","volume":"88","author":"Ledoit","year":"2004","journal-title":"J. Multivar. Anal."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1093\/aje\/126.2.310","article-title":"The Multicenter AIDS Cohort Study: Rationale, organization, and selected characteristics of the participants","volume":"126","author":"Kaslow","year":"1987","journal-title":"Am. J. Epidemiol."},{"key":"ref_43","unstructured":"(2024, November 29). Argo Float Data and Metadata from Global Data Assembly Centre (Argo GDAC)-Snapshot of Argo GDAC of 9 November 2024. Available online: https:\/\/www.seanoe.org\/data\/00311\/42182\/."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1214\/21-AOAS1477","article-title":"A functional-data approach to the Argo data","volume":"16","author":"Yarger","year":"2022","journal-title":"Ann. Appl. Stat."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"de Boyer Mont\u00e9gut, C., Madec, G., Fischer, A.S., Lazar, A., and Iudicone, D. (2004). Mixed layer depth over the global ocean: An examination of profile data and a profile-based climatology. J. Geophys. Res. Ocean., 109.","DOI":"10.1029\/2004JC002378"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.pocean.2009.03.004","article-title":"The 2004\u20132008 mean and annual cycle of temperature, salinity, and steric height in the global ocean from the Argo Program","volume":"82","author":"Roemmich","year":"2009","journal-title":"Prog. Oceanogr."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"20180400","DOI":"10.1098\/rspa.2018.0400","article-title":"Locally stationary spatio-temporal interpolation of Argo profiling float data","volume":"474","author":"Kuusela","year":"2018","journal-title":"Proc. R. Soc. A"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/5\/254\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:22:07Z","timestamp":1760030527000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/5\/254"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,26]]},"references-count":47,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2025,5]]}},"alternative-id":["a18050254"],"URL":"https:\/\/doi.org\/10.3390\/a18050254","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2025,4,26]]}}}