{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T18:27:16Z","timestamp":1776796036128,"version":"3.51.2"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2020,12,10]],"date-time":"2020-12-10T00:00:00Z","timestamp":1607558400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61863010"],"award-info":[{"award-number":["61863010"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Key Research and Development Program of Shandong Province of China","award":["2019GGX101001"],"award-info":[{"award-number":["2019GGX101001"]}]},{"name":"Natural Science Foundation of Shandong Province of China","award":["ZR2018MC007"],"award-info":[{"award-number":["ZR2018MC007"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,7,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>The rapid development of single-cell RNA sequencing (scRNA-Seq) technology provides strong technical support for accurate and efficient analyzing single-cell gene expression data. However, the analysis of scRNA-Seq is accompanied by many obstacles, including dropout events and the curse of dimensionality. Here, we propose the scGMAI, which is a new single-cell Gaussian mixture clustering method based on autoencoder networks and the fast independent component analysis (FastICA). Specifically, scGMAI utilizes autoencoder networks to reconstruct gene expression values from scRNA-Seq data and FastICA is used to reduce the dimensions of reconstructed data. 
The integration of these computational techniques in scGMAI leads to outperforming results compared to existing tools, including Seurat, in clustering cells from 17 public scRNA-Seq datasets. In summary, scGMAI is an effective tool for accurately clustering and identifying cell types from scRNA-Seq data and shows the great potential of its applicative power in scRNA-Seq data analysis. The source code is available at https:\/\/github.com\/QUST-AIBBDRC\/scGMAI\/.<\/jats:p>","DOI":"10.1093\/bib\/bbaa316","type":"journal-article","created":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T19:13:25Z","timestamp":1603134805000},"source":"Crossref","is-referenced-by-count":60,"title":["scGMAI: a Gaussian mixture model for clustering single-cell RNA-Seq data based on deep autoencoder"],"prefix":"10.1093","volume":"22","author":[{"given":"Bin","family":"Yu","sequence":"first","affiliation":[{"name":"College of Mathematics and Physics, Qingdao University of Science and Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chen","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Mathematics and Physics, Qingdao University of Science and Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ren","family":"Qi","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruiqing","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Central South University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Patrick J","family":"Skillman-Lawrence","sequence":"additional","affiliation":[{"name":"College of Medicine, The Ohio State University, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaolin","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Mathematics and 
Physics, Qingdao University of Science and Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anjun","family":"Ma","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, The Ohio State University, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haiming","family":"Gu","sequence":"additional","affiliation":[{"name":"College of Mathematics and Physics, Qingdao University of Science and Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2020,12,10]]},"reference":[{"issue":"7453","key":"2021072117024681700_ref1","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1038\/nature12172","article-title":"Single cell transcriptomics reveals bimodality in expression and splicing in immune cells","volume":"498","author":"Shalek","year":"2013","journal-title":"Nature"},{"issue":"2","key":"2021072117024681700_ref2","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1038\/nbt.3102","article-title":"Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells","volume":"33","author":"Buettner","year":"2015","journal-title":"Nat Biotechnol"},{"key":"2021072117024681700_ref3","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.gde.2013.12.004","article-title":"Single cell analysis of cancer genomes","volume":"24","author":"Van Loo","year":"2014","journal-title":"Curr Opin Genet Dev"},{"key":"2021072117024681700_ref4","doi-asserted-by":"crossref","first-page":"407","DOI":"10.3389\/fgene.2020.00407","article-title":"An adaptive sparse subspace clustering for cell type identification","volume":"11","author":"Zheng","year":"2020","journal-title":"Front Genet"},{"key":"2021072117024681700_ref5","doi-asserted-by":"crossref","DOI":"10.1186\/s13059-016-0927-y","article-title":"Design and computational analysis of single-cell RNA-sequencing 
experiments","volume":"17","author":"Bacher","year":"2016","journal-title":"Genome Biol"},{"issue":"4","key":"2021072117024681700_ref6","doi-asserted-by":"crossref","first-page":"1209","DOI":"10.1093\/bib\/bbz063","article-title":"Machine learning and statistical methods for clustering single-cell RNA-sequencing data","volume":"21","author":"Petegrosso","year":"2020","journal-title":"Brief Bioinform"},{"issue":"7","key":"2021072117024681700_ref7","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1038\/s41592-018-0033-z","article-title":"SAVER: gene expression recovery for single-cell RNA sequencing","volume":"15","author":"Huang","year":"2018","journal-title":"Nat Methods"},{"issue":"1","key":"2021072117024681700_ref8","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1038\/s41467-018-07931-2","article-title":"Single-cell RNA-seq denoising using a deep count autoencoder","volume":"10","author":"Eraslan","year":"2019","journal-title":"Nat Commun"},{"issue":"3","key":"2021072117024681700_ref9","doi-asserted-by":"crossref","first-page":"716","DOI":"10.1016\/j.cell.2018.05.061","article-title":"Recovering gene interactions from single-cell data using data diffusion","volume":"174","author":"Van Dijk","year":"2018","journal-title":"Cell"},{"issue":"1","key":"2021072117024681700_ref10","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1038\/s41467-018-03405-7","article-title":"An accurate and robust imputation method scImpute for single-cell RNA-seq data","volume":"9","author":"Li","year":"2018","journal-title":"Nat Commun"},{"issue":"4","key":"2021072117024681700_ref11","doi-asserted-by":"crossref","first-page":"1196","DOI":"10.1093\/bib\/bbz062","article-title":"Clustering and classification methods for single-cell RNA-sequencing data","volume":"21","author":"Qi","year":"2020","journal-title":"Brief 
Bioinform"},{"issue":"1","key":"2021072117024681700_ref12","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/0169-7439(87)80084-9","article-title":"Principal component analysis","volume":"2","author":"Wold","year":"1987","journal-title":"Chemom Intel Lab Syst"},{"key":"2021072117024681700_ref13","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Der Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"issue":"1","key":"2021072117024681700_ref14","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1038\/nbt.4314","article-title":"Dimensionality reduction for visualizing single-cell data using UMAP","volume":"37","author":"Becht","year":"2019","journal-title":"Nat Biotechnol"},{"issue":"1","key":"2021072117024681700_ref15","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1186\/s13059-015-0805-z","article-title":"ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis","volume":"16","author":"Pierson","year":"2015","journal-title":"Genome Biol"},{"issue":"17","key":"2021072117024681700_ref16","doi-asserted-by":"crossref","first-page":"e156","DOI":"10.1093\/nar\/gkx681","article-title":"Using neural networks for reducing the dimensions of single-cell RNA-Seq data","volume":"45","author":"Lin","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2021072117024681700_ref17","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1093\/bioinformatics\/btx490","article-title":"DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data","volume":"34","author":"Sun","year":"2018","journal-title":"Bioinformatics"},{"issue":"19","key":"2021072117024681700_ref18","doi-asserted-by":"crossref","first-page":"3642","DOI":"10.1093\/bioinformatics\/btz139","article-title":"SinNLRR: a robust subspace clustering method for cell type detection by non-negative and low-rank 
representation","volume":"35","author":"Zheng","year":"2019","journal-title":"Bioinformatics"},{"issue":"4","key":"2021072117024681700_ref19","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1038\/nmeth.4207","article-title":"Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning","volume":"14","author":"Wang","year":"2017","journal-title":"Nat Methods"},{"issue":"12","key":"2021072117024681700_ref20","doi-asserted-by":"crossref","first-page":"1974","DOI":"10.1093\/bioinformatics\/btv088","article-title":"Identification of cell types from single-cell transcriptomes using a novel clustering method","volume":"31","author":"Xu","year":"2015","journal-title":"Bioinformatics"},{"issue":"1","key":"2021072117024681700_ref21","doi-asserted-by":"crossref","DOI":"10.1186\/s12859-016-1175-6","article-title":"CellTree: an R\/bioconductor package to infer the hierarchical structure of cell populations from single-cell RNA-seq data","volume":"17","author":"Duverle","year":"2016","journal-title":"BMC Bioinf"},{"issue":"5","key":"2021072117024681700_ref22","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nmeth.4236","article-title":"SC3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat Methods"},{"issue":"5","key":"2021072117024681700_ref23","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nbt.3192","article-title":"Spatial reconstruction of single-cell gene expression data","volume":"33","author":"Satija","year":"2015","journal-title":"Nat Biotechnol"},{"issue":"10","key":"2021072117024681700_ref24","doi-asserted-by":"crossref","first-page":"3156","DOI":"10.1093\/bioinformatics\/btaa139","article-title":"scRMD: imputation for single cell RNA-seq data via robust matrix 
decomposition","volume":"36","author":"Chen","year":"2020","journal-title":"Bioinformatics"},{"issue":"Jan","key":"2021072117024681700_ref25","first-page":"993","article-title":"Latent dirichlet\u00a0allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J Mach Learn Res"},{"issue":"5786","key":"2021072117024681700_ref26","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1126\/science.1127647","article-title":"Reducing the dimensionality of data with neural networks","volume":"313","author":"Hinton","year":"2006","journal-title":"Science"},{"issue":"4","key":"2021072117024681700_ref27","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1038\/s41592-019-0353-7","article-title":"Scalable analysis of cell-type composition from single-cell transcriptomics using deep recurrent learning","volume":"16","author":"Deng","year":"2019","journal-title":"Nat Methods"},{"issue":"1","key":"2021072117024681700_ref28","first-page":"67","article-title":"Pairwise input neural network for target-ligand interaction prediction","author":"Wang","year":"2014","journal-title":"Int Conf Bioinf Biomed"},{"issue":"13","key":"2021072117024681700_ref29","first-page":"472","article-title":"Incorporating second-order functional knowledge for better option pricing","author":"Dugas","year":"2002","journal-title":"Neural Inf Process Syst"},{"issue":"4\u20135","key":"2021072117024681700_ref30","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1016\/S0893-6080(00)00026-5","article-title":"Independent component analysis: algorithms and applications","volume":"13","author":"Hyv\u00e4rinen","year":"2000","journal-title":"Neural Netw"},{"issue":"7","key":"2021072117024681700_ref31","doi-asserted-by":"crossref","first-page":"e0181195","DOI":"10.1371\/journal.pone.0181195","article-title":"Independent component analysis (ICA) based-clustering of temporal RNA-seq data","volume":"12","author":"Nascimento","year":"2017","journal-title":"PLoS 
One"},{"issue":"3","key":"2021072117024681700_ref32","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1002\/hbm.1048","article-title":"A method for making group inferences from functional MRI data using independent component analysis","volume":"14","author":"Calhoun","year":"2001","journal-title":"Hum Brain Mapp"},{"issue":"3","key":"2021072117024681700_ref33","first-page":"425","article-title":"Transformation and model choice for RNA-seq co-expression analysis","volume":"19","author":"Rau","year":"2018","journal-title":"Brief Bioinform"},{"issue":"8","key":"2021072117024681700_ref34","doi-asserted-by":"crossref","first-page":"897","DOI":"10.1038\/nbt1406","article-title":"What is the expectation maximization algorithm?","volume":"26","author":"Do","year":"2008","journal-title":"Nat Biotechnol"},{"key":"2021072117024681700_ref35","first-page":"1027","volume-title":"Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms","author":"Arthur","year":"2007"},{"issue":"3","key":"2021072117024681700_ref36","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1093\/biomet\/asn034","article-title":"Extended Bayesian information criteria for model selection with large model spaces","volume":"95","author":"Chen","year":"2008","journal-title":"Biometrika"},{"issue":"10","key":"2021072117024681700_ref37","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2013","journal-title":"J Mach Learn Res"},{"issue":"3","key":"2021072117024681700_ref38","first-page":"583","article-title":"Cluster ensembles - a knowledge reuse framework for combining multiple partitions","volume":"3","author":"Strehl","year":"2002","journal-title":"J Mach Learn Res"},{"issue":"1","key":"2021072117024681700_ref39","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J 
Classif"},{"key":"2021072117024681700_ref40","first-page":"2837","article-title":"Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance","volume":"11","author":"Vinh","year":"2010","journal-title":"J Mach Learn Res"},{"issue":"1","key":"2021072117024681700_ref41","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1186\/s13059-017-1188-0","article-title":"Ultrafast and accurate clustering through imputation for single-cell RNA-seq data","volume":"18","author":"Lin","year":"2017","journal-title":"Genome Biol"},{"issue":"2","key":"2021072117024681700_ref42","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/bioinformatics\/btw607","article-title":"Robust classification of single-cell transcriptome data by nonnegative matrix factorization","volume":"33","author":"Shao","year":"2017","journal-title":"Bioinformatics"},{"issue":"2","key":"2021072117024681700_ref43","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1101\/gr.254557.119","article-title":"SHARP: hyper-fast and accurate processing of single-cell RNA-seq via ensemble random projection","volume":"30","author":"Wan","year":"2020","journal-title":"Genome Res"},{"issue":"1","key":"2021072117024681700_ref44","first-page":"100","article-title":"Algorithm AS 136: a k-means clustering algorithm","volume":"28","author":"Hartigan","year":"1979","journal-title":"J R I State Dent Soc"},{"issue":"4","key":"2021072117024681700_ref45","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1007\/s11222-007-9033-z","article-title":"A tutorial on spectral clustering","volume":"17","author":"Luxburg","year":"2007","journal-title":"Stat Comput"},{"issue":"11","key":"2021072117024681700_ref46","doi-asserted-by":"crossref","first-page":"1893","DOI":"10.1093\/bioinformatics\/bty908","article-title":"Bixgboost: a scalable, flexible boosting based method for reconstructing gene regulatory 
networks","volume":"35","author":"Zheng","year":"2019","journal-title":"Bioinformatics"},{"issue":"4","key":"2021072117024681700_ref47","doi-asserted-by":"crossref","first-page":"1074","DOI":"10.1093\/bioinformatics\/btz734","article-title":"SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting","volume":"36","author":"Yu","year":"2020","journal-title":"Bioinformatics"},{"issue":"14","key":"2021072117024681700_ref48","doi-asserted-by":"crossref","first-page":"2395","DOI":"10.1093\/bioinformatics\/bty995","article-title":"Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique","volume":"35","author":"Wang","year":"2019","journal-title":"Bioinformatics"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/22\/4\/bbaa316\/39139781\/bbaa316.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/22\/4\/bbaa316\/39139781\/bbaa316.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,21]],"date-time":"2021-07-21T17:23:09Z","timestamp":1626888189000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaa316\/6029147"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,10]]},"references-count":48,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2021,7,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaa316","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,7]]},"published":{"date-parts":[[2020,12,10]]},"article-number":"bbaa316"}}