{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T07:17:56Z","timestamp":1781248676430,"version":"3.54.1"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2022,3,18]],"date-time":"2022-03-18T00:00:00Z","timestamp":1647561600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,18]],"date-time":"2022-03-18T00:00:00Z","timestamp":1647561600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Research Grants Council of the Hong Kong SAR","award":["CUHK14173817"],"award-info":[{"award-number":["CUHK14173817"]}]},{"name":"Research Grants Council of the Hong Kong SAR","award":["CUHK14303819"],"award-info":[{"award-number":["CUHK14303819"]}]},{"DOI":"10.13039\/501100004853","name":"Chinese University of Hong Kong","doi-asserted-by":"publisher","award":["4053357"],"award-info":[{"award-number":["4053357"]}],"id":[{"id":"10.13039\/501100004853","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002703","name":"Jiangsu University","doi-asserted-by":"publisher","award":["5501190012"],"award-info":[{"award-number":["5501190012"]}],"id":[{"id":"10.13039\/501100002703","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2022,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>High-dimensional covariance matrix estimation plays a central role in multivariate statistical analysis. It is well-known that the sample covariance matrix is singular when the sample size is smaller than the dimension of the variable, but the covariance estimate must be positive-definite. This motivates some modifications of the sample covariance matrix to preserve its efficient estimation of pairwise covariance. In this paper, we modify the sample correlation matrix using the Bagging technique. The proposed Bagging estimator is flexible for general continuous data. Under some mild conditions, we show theoretically that the Bagging estimator can ensure positive-definiteness with probability one in finite samples. We also prove the consistency of the bootstrap estimator of Pearson correlation and the consistency of our Bagging estimator when the dimension <jats:italic>p<\/jats:italic> is fixed. Simulation results and a real application are provided to demonstrate that our method strikes a better balance between RMSE and likelihood, and is more robust, than other existing estimators.<\/jats:p>","DOI":"10.1007\/s10994-022-06138-3","type":"journal-article","created":{"date-parts":[[2022,3,18]],"date-time":"2022-03-18T21:02:52Z","timestamp":1647637372000},"page":"2905-2927","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["High-dimensional correlation matrix estimation for general continuous data with Bagging technique"],"prefix":"10.1007","volume":"111","author":[{"given":"Chaojie","family":"Wang","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jin","family":"Du","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2744-9030","authenticated-orcid":false,"given":"Xiaodan","family":"Fan","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,3,18]]},"reference":[{"issue":"4","key":"6138_CR1","first-page":"1281","volume":"10","author":"J Barnard","year":"2000","unstructured":"Barnard, J., McCulloch, R., & Meng, X. L. (2000). Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage. Statistica Sinica, 10(4), 1281\u20131311.","journal-title":"Statistica Sinica"},{"issue":"5","key":"6138_CR2","doi-asserted-by":"publisher","first-page":"666","DOI":"10.1016\/j.ccell.2015.09.018","volume":"28","author":"MG Best","year":"2015","unstructured":"Best, M. G., Sol, N., Kooi, I., Tannous, J., Westerman, B. A., Rustenburg, F., et al. (2015). RNA-seq of tumor-educated platelets enables blood-based pan-cancer, multiclass, and molecular pathway cancer diagnostics. Cancer Cell, 28(5), 666\u2013676.","journal-title":"Cancer Cell"},{"key":"6138_CR3","doi-asserted-by":"crossref","unstructured":"Bhattacharjee, A., Richards, W. G., Staunton, J., Li, C., Monti, S., Vasa, P., Ladd, C., Beheshti, J., Bueno, R., Gillette, M., et al. (2001). Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proceedings of the National Academy of Sciences, 98(24), 13790\u201313795.","DOI":"10.1073\/pnas.191502998"},{"issue":"6","key":"6138_CR4","first-page":"2577","volume":"36","author":"PJ Bickel","year":"2008","unstructured":"Bickel, P. J., & Levina, E. (2008). Covariance regularization by thresholding. The Annals of Statistics, 36(6), 2577\u20132604.","journal-title":"The Annals of Statistics"},{"issue":"1","key":"6138_CR5","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1214\/009053607000000758","volume":"36","author":"PJ Bickel","year":"2008","unstructured":"Bickel, P. J., Levina, E., et al. (2008). Regularized estimation of large covariance matrices. The Annals of Statistics, 36(1), 199\u2013227.","journal-title":"The Annals of Statistics"},{"issue":"1","key":"6138_CR6","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1016\/j.ejor.2017.09.028","volume":"266","author":"T Bodnar","year":"2018","unstructured":"Bodnar, T., Parolya, N., & Schmid, W. (2018). Estimation of the global minimum variance portfolio in high dimensions. European Journal of Operational Research, 266(1), 371\u2013390.","journal-title":"European Journal of Operational Research"},{"issue":"2","key":"6138_CR7","first-page":"123","volume":"24","author":"L Breiman","year":"1996","unstructured":"Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123\u2013140.","journal-title":"Machine Learning"},{"issue":"494","key":"6138_CR8","doi-asserted-by":"publisher","first-page":"672","DOI":"10.1198\/jasa.2011.tm10560","volume":"106","author":"T Cai","year":"2011","unstructured":"Cai, T., & Liu, W. (2011). Adaptive thresholding for sparse covariance matrix estimation. Journal of the American Statistical Association, 106(494), 672\u2013684.","journal-title":"Journal of the American Statistical Association"},{"key":"6138_CR9","doi-asserted-by":"crossref","unstructured":"Cai, T. T., & Zhou, H. H. (2012). Minimax estimation of large covariance matrices under $$\\ell _1$$-norm. Statistica Sinica (pp. 1319\u20131349).","DOI":"10.5705\/ss.2010.253"},{"issue":"4","key":"6138_CR10","doi-asserted-by":"publisher","first-page":"2118","DOI":"10.1214\/09-AOS752","volume":"38","author":"TT Cai","year":"2010","unstructured":"Cai, T. T., Zhang, C. H., Zhou, H. H., et al. (2010). Optimal rates of convergence for covariance matrix estimation. The Annals of Statistics, 38(4), 2118\u20132144.","journal-title":"The Annals of Statistics"},{"issue":"1","key":"6138_CR11","doi-asserted-by":"publisher","first-page":"C1","DOI":"10.1111\/ectj.12061","volume":"19","author":"J Fan","year":"2016","unstructured":"Fan, J., Liao, Y., & Liu, H. (2016). An overview of the estimation of large covariance and precision matrices. The Econometrics Journal, 19(1), C1\u2013C32.","journal-title":"The Econometrics Journal"},{"issue":"3","key":"6138_CR12","doi-asserted-by":"publisher","first-page":"432","DOI":"10.1093\/biostatistics\/kxm045","volume":"9","author":"J Friedman","year":"2008","unstructured":"Friedman, J., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 432\u2013441.","journal-title":"Biostatistics"},{"key":"6138_CR13","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-01689-9","volume-title":"Computation of multivariate normal and t probabilities","author":"A Genz","year":"2009","unstructured":"Genz, A., & Bretz, F. (2009). Computation of multivariate normal and t probabilities. New York: Springer."},{"issue":"11","key":"6138_CR14","doi-asserted-by":"publisher","first-page":"4143","DOI":"10.1016\/j.laa.2012.01.013","volume":"436","author":"D Guillot","year":"2012","unstructured":"Guillot, D., & Rajaratnam, B. (2012). Retaining positive definiteness in thresholded matrices. Linear Algebra and its Applications, 436(11), 4143\u20134160.","journal-title":"Linear Algebra and its Applications"},{"issue":"2","key":"6138_CR15","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1214\/aos\/1009210544","volume":"29","author":"IM Johnstone","year":"2001","unstructured":"Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. The Annals of Statistics, 29(2), 295\u2013327.","journal-title":"The Annals of Statistics"},{"issue":"2","key":"6138_CR16","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1016\/S0047-259X(03)00096-4","volume":"88","author":"O Ledoit","year":"2004","unstructured":"Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2), 365\u2013411.","journal-title":"Journal of Multivariate Analysis"},{"key":"6138_CR17","doi-asserted-by":"crossref","unstructured":"Lehmann, E. L. (1999). Elements of large-sample theory. Springer, 1999.","DOI":"10.1007\/b98855"},{"issue":"9","key":"6138_CR18","doi-asserted-by":"publisher","first-page":"6256","DOI":"10.1109\/TIT.2011.2162175","volume":"57","author":"TL Marzetta","year":"2011","unstructured":"Marzetta, T. L., Tucci, G. H., & Simon, S. H. (2011). A random matrix-theoretic approach to handling singular covariance estimates. IEEE Transactions on Information Theory, 57(9), 6256\u20136271.","journal-title":"IEEE Transactions on Information Theory"},{"issue":"11","key":"6138_CR19","doi-asserted-by":"publisher","first-page":"5113","DOI":"10.1109\/TIT.2008.929938","volume":"54","author":"X Mestre","year":"2008","unstructured":"Mestre, X. (2008). Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates. IEEE Transactions on Information Theory, 54(11), 5113\u20135129.","journal-title":"IEEE Transactions on Information Theory"},{"key":"6138_CR20","doi-asserted-by":"crossref","unstructured":"Mestre, X., & Lagunas, M. A. (2005). Diagonal loading for finite sample size beamforming: An asymptotic approach. Robust Adaptive Beamforming (pp. 200\u2013266).","DOI":"10.1002\/0471733482.ch4"},{"key":"6138_CR21","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1016\/0024-3795(90)90363-H","volume":"127","author":"H Neudecker","year":"1990","unstructured":"Neudecker, H., & Wesselman, A. M. (1990). The asymptotic variance matrix of the sample correlation matrix. Linear Algebra and its Applications, 127, 589\u2013599.","journal-title":"Linear Algebra and its Applications"},{"key":"6138_CR22","unstructured":"Pishro-Nik, H. (2016). Introduction to probability, statistics, and random processes."},{"issue":"485","key":"6138_CR23","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1198\/jasa.2009.0101","volume":"104","author":"AJ Rothman","year":"2009","unstructured":"Rothman, A. J., Levina, E., & Zhu, J. (2009). Generalized thresholding of large covariance matrices. Journal of the American Statistical Association, 104(485), 177\u2013186.","journal-title":"Journal of the American Statistical Association"},{"issue":"2","key":"6138_CR24","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1007\/s00220-010-1044-5","volume":"298","author":"T Tao","year":"2010","unstructured":"Tao, T., & Vu, V. (2010). Random matrices: Universality of local eigenvalue statistics up to the edge. Communications in Mathematical Physics, 298(2), 549\u2013572.","journal-title":"Communications in Mathematical Physics"},{"issue":"2","key":"6138_CR25","doi-asserted-by":"publisher","first-page":"770","DOI":"10.1109\/TIT.2018.2888734","volume":"65","author":"GH Tucci","year":"2019","unstructured":"Tucci, G. H., & Wang, K. (2019). New methods for handling singular sample covariance matrices. IEEE Transactions on Information Theory, 65(2), 770\u2013786.","journal-title":"IEEE Transactions on Information Theory"},{"issue":"4","key":"6138_CR26","first-page":"1755","volume":"19","author":"WB Wu","year":"2009","unstructured":"Wu, W. B., & Pourahmadi, M. (2009). Banding sample autocovariance matrices of stationary processes. Statistica Sinica, 19(4), 1755\u20131768.","journal-title":"Statistica Sinica"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06138-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-022-06138-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06138-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,1]],"date-time":"2022-08-01T20:07:47Z","timestamp":1659384467000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-022-06138-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,18]]},"references-count":26,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2022,8]]}},"alternative-id":["6138"],"URL":"https:\/\/doi.org\/10.1007\/s10994-022-06138-3","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,18]]},"assertion":[{"value":"3 January 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 November 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 January 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 March 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Code on results is available upon request.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}}]}}