{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T06:53:09Z","timestamp":1774335189892,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Model-based clustering has been widely used, e.g. in microarray data analysis. Since for high-dimensional data variable selection is necessary, several penalized model-based clustering methods have been proposed to realize simultaneous variable selection and clustering. However, the existing methods all assume that the variables are independent with the use of diagonal covariance matrices.<\/jats:p><jats:p>Results: To model non-independence of variables (e.g. correlated gene expressions) while alleviating the problem with the large number of unknown parameters associated with a general non-diagonal covariance matrix, we generalize the mixture of factor analyzers to that with penalization, which, among others, can effectively realize variable selection. 
We use simulated data and real microarray data to illustrate the utility and advantages of the proposed method over several existing ones.<\/jats:p><jats:p>Contact: \u00a0weip@biostat.umn.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp707","type":"journal-article","created":{"date-parts":[[2009,12,24]],"date-time":"2009-12-24T01:52:12Z","timestamp":1261619532000},"page":"501-508","source":"Crossref","is-referenced-by-count":23,"title":["Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data"],"prefix":"10.1093","volume":"26","author":[{"given":"Benhuai","family":"Xie","sequence":"first","affiliation":[{"name":"1 Division of Biostatistics, School of Public Health and 2 School of Statistics, University of Minnesota, Minneapolis, MN, USA"}]},{"given":"Wei","family":"Pan","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, School of Public Health and 2 School of Statistics, University of Minnesota, Minneapolis, MN, USA"}]},{"given":"Xiaotong","family":"Shen","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, School of Public Health and 2 School of Statistics, University of Minnesota, Minneapolis, MN, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,12,23]]},"reference":[{"key":"2023012508030393100_B1","volume-title":"Mixtures of factor analyzers with common factor loadings for the clustering and visualisation of high-dimensional data.","author":"Baek","year":"2008"},{"key":"2023012508030393100_B2","article-title":"Mixtures of factor analyzers with common factor loadings: applications to the clustering and visualisation of high-dimensional data","author":"Baek","year":"2009","journal-title":"To appear in IEEE Transactions on Pattern Analysis and Machine 
Intelligence."},{"key":"2023012508030393100_B3","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1038\/nm733","article-title":"Gene-expression profiles predict survival of patients with lung adenocarcinoma","volume":"8","author":"Beer","year":"2002","journal-title":"Nat. Med."},{"key":"2023012508030393100_B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm (with discussion)","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. Series B"},{"key":"2023012508030393100_B5","doi-asserted-by":"crossref","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","article-title":"Cluster analysis and display of genome-wide expression patterns","volume":"95","author":"Eisen","year":"1998","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012508030393100_B6","doi-asserted-by":"crossref","DOI":"10.1007\/978-94-009-5564-6","volume-title":"An Introduction to Latent Variable Models.","author":"Everitt","year":"1984"},{"key":"2023012508030393100_B7","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1111\/j.1467-9868.2008.00674.x","article-title":"Sure independence screening for ultrahigh dimensional feature space (with discussion)","volume":"70","author":"Fan","year":"2008","journal-title":"J. R. Stat. Soc. Series B"},{"key":"2023012508030393100_B8","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1198\/016214502760047131","article-title":"Model-based clustering, discriminant analysis, and density estimation","volume":"97","author":"Fraley","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012508030393100_B9","doi-asserted-by":"crossref","DOI":"10.18637\/jss.v018.i06","article-title":"Model-based methods of classification: using the mclust software in chemometrics","volume":"18","author":"Fraley","year":"2007","journal-title":"J. Stat. 
Software"},{"key":"2023012508030393100_B10","article-title":"The EM algorithm for mixtures of factor analyzers","volume-title":"Technical Report CRG-TR-96-1.","author":"Ghahramani","year":"1997"},{"key":"2023012508030393100_B11","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"key":"2023012508030393100_B12","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1109\/72.554192","article-title":"Modeling the manifolds of images of handwritten digits","volume":"8","author":"Hinton","year":"1997","journal-title":"IEEE Trans. Neural Networks"},{"key":"2023012508030393100_B13","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1093\/biomet\/93.1.85","article-title":"Covariance selection and estimation via penalised normal likelihood","volume":"93","author":"Huang","year":"2006","journal-title":"Biometrika"},{"key":"2023012508030393100_B14","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. 
Classif."},{"key":"2023012508030393100_B15","first-page":"599","article-title":"Mixtures of factor analyzers","volume-title":"Proceedings of the Seventeenth International Conference on Machine Learning.","author":"McLachlan","year":"2000"},{"key":"2023012508030393100_B16","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1093\/bioinformatics\/18.3.413","article-title":"A mixture model-based approach to the clustering of microarray expression data","volume":"18","author":"McLachlan","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012508030393100_B17","volume-title":"Finite Mixture Model.","author":"McLachlan","year":"2002"},{"key":"2023012508030393100_B18","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1016\/S0167-9473(02)00183-4","article-title":"Modeling high-dimensional data by mixtures of factor analyzers","volume":"41","author":"McLachlan","year":"2003","journal-title":"Comput. Stat. Data Analysis"},{"key":"2023012508030393100_B19","doi-asserted-by":"crossref","first-page":"5327","DOI":"10.1016\/j.csda.2006.09.015","article-title":"Extension of the mixture of factor analyzers model to incorporate the multivariate t-distribution","volume":"51","author":"McLachlan","year":"2007","journal-title":"Comput. Stat. Data Analysis"},{"key":"2023012508030393100_B20","first-page":"1145","article-title":"Penalized model-based clustering with application to variable selection","volume":"8","author":"Pan","year":"2007","journal-title":"J. Mach. Learn. Res."},{"key":"2023012508030393100_B21","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1198\/016214506000000113","article-title":"Variable selection for model-based clustering","volume":"101","author":"Raftery","year":"2006","journal-title":"J. Am. Stat. 
Assoc."},{"key":"2023012508030393100_B22","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1080\/01621459.1971.10482356","article-title":"Objective criteria for the evaluation of clustering methods","volume":"66","author":"Rand","year":"1971","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012508030393100_B23","doi-asserted-by":"crossref","first-page":"2405","DOI":"10.1093\/bioinformatics\/btl406","article-title":"Evaluation and comparison of gene clustering methods in microarray analysis","volume":"22","author":"Thalamuthu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012508030393100_B24","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1111\/j.1541-0420.2007.00922.x","article-title":"Variable selection for model-based high-dimensional clustering and its application to microarray data","volume":"64","author":"Wang","year":"2008","journal-title":"Biometrics"},{"key":"2023012508030393100_B25","doi-asserted-by":"crossref","first-page":"921","DOI":"10.1111\/j.1541-0420.2007.00955.x","article-title":"Variable selection in penalized model-based clustering via regularization on grouped parameters","volume":"64","author":"Xie","year":"2008","journal-title":"Biometrics"},{"key":"2023012508030393100_B26","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1214\/08-EJS194","article-title":"Penalized model-based clustering with cluster-specific diagonal covariances and grouped variables","volume":"2","author":"Xie","year":"2008","journal-title":"Electron. J. Stat."},{"key":"2023012508030393100_B27","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1111\/j.1467-9868.2005.00532.x","article-title":"Model selection and estimation in regression with grouped variables","volume":"68","author":"Yuan","year":"2006","journal-title":"J. R. Stat. Soc. 
Series B"},{"key":"2023012508030393100_B28","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1093\/biomet\/asm018","article-title":"Model selection and estimation in the Gaussian graphical model","volume":"94","author":"Yuan","year":"2007","journal-title":"Biometrika"},{"key":"2023012508030393100_B29","doi-asserted-by":"crossref","first-page":"1473","DOI":"10.1214\/09-EJS487","article-title":"Penalized model-based clustering with unconstrained covariance matrices","volume":"3","author":"Zhou","year":"2009","journal-title":"Electron. J. Stat."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/4\/501\/48856080\/bioinformatics_26_4_501.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/4\/501\/48856080\/bioinformatics_26_4_501.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,15]],"date-time":"2025-02-15T21:48:05Z","timestamp":1739656085000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/4\/501\/244318"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,12,23]]},"references-count":29,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2010,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp707","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,2,15]]},"published":{"date-parts":[[2009,12,23]]}}}