{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T13:22:05Z","timestamp":1771680125704,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2018,12,28]],"date-time":"2018-12-28T00:00:00Z","timestamp":1545955200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Council of the Hong Kong Special Administrative Region","award":["CityU 21200816"],"award-info":[{"award-number":["CityU 21200816"]}]},{"name":"Council of the Hong Kong Special Administrative Region","award":["CityU 11203217"],"award-info":[{"award-number":["CityU 11203217"]}]},{"name":"Council of the Hong Kong Special Administrative Region","award":["CityU 11200218"],"award-info":[{"award-number":["CityU 11200218"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61603087"],"award-info":[{"award-number":["61603087"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007847","name":"Natural Science Foundation of Jilin Province","doi-asserted-by":"publisher","award":["20190103006JH"],"award-info":[{"award-number":["20190103006JH"]}],"id":[{"id":"10.13039\/100007847","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["2412017FZ026"],"award-info":[{"award-number":["2412017FZ026"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               
<jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>In recent years, single-cell RNA sequencing has enabled us to discover cell types or even subtypes. Its increasing availability provides opportunities to identify cell populations from single-cell RNA-seq data. Computational methods have been employed to reveal the gene expression variations among multiple cell populations. Unfortunately, the existing ones can suffer from realistic restrictions such as experimental noise, numerical instability, high dimensionality and computational scalability.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose an evolutionary multiobjective ensemble pruning algorithm (EMEP) that addresses those realistic restrictions. Our EMEP algorithm first applies unsupervised dimensionality reduction to project data from the original high dimensions to low-dimensional subspaces; basic clustering algorithms are applied in those new subspaces to generate different clustering results to form cluster ensembles. However, most of those cluster ensembles are unnecessarily bulky, at the expense of extra time costs and memory consumption. To overcome that problem, EMEP is designed to dynamically select suitable clustering results from the ensembles. Moreover, to guide the multiobjective ensemble evolution, three cluster validity indices, including the overall cluster deviation, the within-cluster compactness and the number of basic partition clusters, are formulated as the objective functions to unleash its cell type discovery performance using evolutionary multiobjective optimization. We applied EMEP to 55 simulated datasets and seven real single-cell RNA-seq datasets, including six single-cell RNA-seq datasets and one large-scale dataset with 3005 cells and 4412 genes. 
Two case studies are also conducted to reveal mechanistic insights into the biological relevance of EMEP. We found that EMEP can achieve superior performance over the other clustering algorithms, demonstrating that EMEP can identify cell populations clearly.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>EMEP is written in Matlab and available at https:\/\/github.com\/lixt314\/EMEP<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty1056","type":"journal-article","created":{"date-parts":[[2018,12,21]],"date-time":"2018-12-21T14:17:15Z","timestamp":1545401835000},"page":"2809-2817","source":"Crossref","is-referenced-by-count":23,"title":["Single-cell RNA-seq interpretations using evolutionary multiobjective ensemble pruning"],"prefix":"10.1093","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9089-1799","authenticated-orcid":false,"given":"Xiangtao","family":"Li","sequence":"first","affiliation":[{"name":"School of Computer Science and Information Technology, Northeast Normal University, Changchun, Jilin, China"},{"name":"Department of Computer Science, City University of Hong Kong, Hong Kong SAR"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0314-9199","authenticated-orcid":false,"given":"Shixiong","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Hong Kong SAR"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6062-733X","authenticated-orcid":false,"given":"Ka-Chun","family":"Wong","sequence":"additional","affiliation":[{"name":"Department of Computer Science, City University of Hong Kong, Hong Kong 
SAR"}]}],"member":"286","published-online":{"date-parts":[[2018,12,28]]},"reference":[{"key":"2023062708580627100_bty1056-B1","doi-asserted-by":"crossref","first-page":"i29","DOI":"10.1093\/bioinformatics\/btm212","article-title":"An ensemble framework for clustering protein\u2013protein interaction networks","volume":"23","author":"Asur","year":"2007","journal-title":"Bioinformatics"},{"key":"2023062708580627100_bty1056-B2","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.artmed.2008.07.014","article-title":"Fuzzy ensemble clustering based on random projections for DNA microarray data analysis","volume":"45","author":"Avogadri","year":"2009","journal-title":"Artif. Intell. Med"},{"key":"2023062708580627100_bty1056-B3","doi-asserted-by":"crossref","first-page":"155.","DOI":"10.1038\/nbt.3102","article-title":"Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells","volume":"33","author":"Buettner","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023062708580627100_bty1056-B4","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1002\/asi.20266","article-title":"Link-based similarity measures for the classification of web documents","volume":"57","author":"Calado","year":"2006","journal-title":"J. Am. Soc. Inform. Sci. Technol"},{"key":"2023062708580627100_bty1056-B5","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/TEVC.2010.2059031","article-title":"Differential evolution: a survey of the state-of-the-art","volume":"15","author":"Das","year":"2011","journal-title":"IEEE Trans. Evol. 
Comput"},{"key":"2023062708580627100_bty1056-B6","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1109\/TEVC.2013.2281535","article-title":"An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: solving problems with box constraints","volume":"18","author":"Deb","year":"2014","journal-title":"IEEE Trans. Evol. Comput"},{"key":"2023062708580627100_bty1056-B7","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1126\/science.1245316","article-title":"Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells","volume":"343","author":"Deng","year":"2014","journal-title":"Science"},{"key":"2023062708580627100_bty1056-B8","author":"Greene","year":"2004"},{"key":"2023062708580627100_bty1056-B9","doi-asserted-by":"crossref","first-page":"1722","DOI":"10.1093\/bioinformatics\/btn286","article-title":"Ensemble non-negative matrix factorization methods for clustering protein\u2013protein interactions","volume":"24","author":"Greene","year":"2008","journal-title":"Bioinformatics"},{"key":"2023062708580627100_bty1056-B10","author":"Gupta","year":"2011"},{"key":"2023062708580627100_bty1056-B11","doi-asserted-by":"crossref","first-page":"1513","DOI":"10.1093\/bioinformatics\/btq226","article-title":"LCE: a link-based cluster ensemble method for improved gene expression data analysis","volume":"26","author":"Iam-On","year":"2010","journal-title":"Bioinformatics"},{"key":"2023062708580627100_bty1056-B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v036.i09","article-title":"LinkCLUE: a MATLAB package for link-based cluster ensembles","volume":"36","author":"Iam-On","year":"2010","journal-title":"J. Stat. 
Softw"},{"key":"2023062708580627100_bty1056-B13","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1109\/TKDE.2010.268","article-title":"A link-based cluster ensemble approach for categorical data clustering","volume":"24","author":"Iam-On","year":"2012","journal-title":"IEEE Trans. Knowl. Data Eng"},{"key":"2023062708580627100_bty1056-B14","first-page":"11","article-title":"Single cell clustering based on cell-pair differentiability correlation and variance analysis","volume":"1","author":"Jiang","year":"2018","journal-title":"Bioinformatics"},{"key":"2023062708580627100_bty1056-B15","doi-asserted-by":"crossref","first-page":"10220","DOI":"10.1038\/ncomms10220","article-title":"A microfluidic platform enabling single-cell RNA-seq of multigenerational lineages","volume":"7","author":"Kimmerling","year":"2016","journal-title":"Nat. Commun"},{"key":"2023062708580627100_bty1056-B16","doi-asserted-by":"crossref","first-page":"483.","DOI":"10.1038\/nmeth.4236","article-title":"Sc3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat. Methods"},{"key":"2023062708580627100_bty1056-B17","author":"Klink","year":"2006"},{"key":"2023062708580627100_bty1056-B18","first-page":"556","article-title":"Algorithms for non-negative matrix factorization","author":"Lee","year":"2001","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023062708580627100_bty1056-B19","first-page":"1","article-title":"Evolutionary multiobjective clustering and its applications to patient stratification","volume":"99","author":"Li","year":"2018","journal-title":"IEEE Trans. Cybernetics"},{"key":"2023062708580627100_bty1056-B20","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1109\/TNB.2017.2725991","article-title":"Evolving spatial clusters of genomic regions from high-throughput chromatin conformation capture data","volume":"16","author":"Li","year":"2017","journal-title":"IEEE Trans. 
Nanobiosci"},{"key":"2023062708580627100_bty1056-B21","doi-asserted-by":"crossref","first-page":"2691","DOI":"10.1093\/bioinformatics\/btx167","article-title":"Entropy-based consensus clustering for patient stratification","volume":"33","author":"Liu","year":"2017","journal-title":"Bioinformatics"},{"key":"2023062708580627100_bty1056-B22","first-page":"2579","article-title":"Visualizing data using t-sne","volume":"9","author":"Maaten","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023062708580627100_bty1056-B23","doi-asserted-by":"crossref","first-page":"1.","DOI":"10.1145\/2742642","article-title":"A survey of multiobjective evolutionary clustering","volume":"47","author":"Mukhopadhyay","year":"2015","journal-title":"ACM Comput. Surveys"},{"key":"2023062708580627100_bty1056-B24","first-page":"8","article-title":"Spectral clustering based on learning similarity matrix","volume":"1","author":"Park","year":"2018","journal-title":"Bioinformatics"},{"key":"2023062708580627100_bty1056-B25","doi-asserted-by":"crossref","first-page":"1053.","DOI":"10.1038\/nbt.2967","article-title":"Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex","volume":"32","author":"Pollen","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023062708580627100_bty1056-B26","doi-asserted-by":"crossref","first-page":"777.","DOI":"10.1038\/nbt.2282","article-title":"Full-length mRNA-seq from single-cell levels of RNA and individual circulating tumor cells","volume":"30","author":"Ramsk\u00f6ld","year":"2012","journal-title":"Nat. 
Biotechnol"},{"key":"2023062708580627100_bty1056-B27","doi-asserted-by":"crossref","first-page":"1492","DOI":"10.1126\/science.1242072","article-title":"Clustering by fast search and find of density peaks","volume":"344","author":"Rodriguez","year":"2014","journal-title":"Science"},{"key":"2023062708580627100_bty1056-B28","doi-asserted-by":"crossref","first-page":"718.","DOI":"10.1038\/ni.3200","article-title":"Identification of cdc1-and cdc2-committed dc progenitors reveals early lineage priming at the common dc progenitor stage in the bone marrow","volume":"16","author":"Schlitzer","year":"2015","journal-title":"Nat. Immunol"},{"key":"2023062708580627100_bty1056-B29","doi-asserted-by":"crossref","first-page":"1005.","DOI":"10.1038\/nbt.3039","article-title":"How deep is enough in single-cell RNA-seq?","volume":"32","author":"Streets","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023062708580627100_bty1056-B30","doi-asserted-by":"crossref","first-page":"1905","DOI":"10.1016\/j.celrep.2014.08.029","article-title":"Single-cell RNA sequencing identifies extracellular matrix gene expression by pancreatic circulating tumor cells","volume":"8","author":"Ting","year":"2014","journal-title":"Cell Rep"},{"key":"2023062708580627100_bty1056-B31","doi-asserted-by":"crossref","first-page":"371.","DOI":"10.1038\/nature13173","article-title":"Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq","volume":"509","author":"Treutlein","year":"2014","journal-title":"Nature"},{"key":"2023062708580627100_bty1056-B32","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1007\/s11222-007-9033-z","article-title":"A tutorial on spectral clustering","volume":"17","author":"Von Luxburg","year":"2007","journal-title":"Stat. 
Comput"},{"key":"2023062708580627100_bty1056-B33","doi-asserted-by":"crossref","first-page":"414.","DOI":"10.1038\/nmeth.4207","article-title":"Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning","volume":"14","author":"Wang","year":"2017","journal-title":"Nat. Methods"},{"key":"2023062708580627100_bty1056-B34","doi-asserted-by":"crossref","first-page":"689.","DOI":"10.1186\/s12864-017-4019-5","article-title":"Saic: an iterative clustering approach for analysis of single cell RNA-seq data","volume":"18","author":"Yang","year":"2017","journal-title":"BMC Genomics"},{"key":"2023062708580627100_bty1056-B35","doi-asserted-by":"crossref","first-page":"296","DOI":"10.2174\/157489310794072508","article-title":"A review of ensemble methods in bioinformatics","volume":"5","author":"Yang","year":"2010","journal-title":"Curr. Bioinformatics"},{"key":"2023062708580627100_bty1056-B36","doi-asserted-by":"crossref","first-page":"2888","DOI":"10.1093\/bioinformatics\/btm463","article-title":"Graph-based consensus clustering for class discovery from gene expression data","volume":"23","author":"Yu","year":"2007","journal-title":"Bioinformatics"},{"key":"2023062708580627100_bty1056-B37","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1109\/TNB.2011.2144997","article-title":"Knowledge based cluster ensemble for cancer discovery from biomolecular data","volume":"10","author":"Yu","year":"2011","journal-title":"IEEE Trans. 
Nanobiosci"},{"key":"2023062708580627100_bty1056-B38","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1126\/science.aaa1934","article-title":"Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq","volume":"347","author":"Zeisel","year":"2015","journal-title":"Science"},{"key":"2023062708580627100_bty1056-B39","doi-asserted-by":"crossref","first-page":"e1006053.","DOI":"10.1371\/journal.pcbi.1006053","article-title":"A multitask clustering approach for single-cell RNA-seq analysis in recessive dystrophic epidermolysis bullosa","volume":"14","author":"Zhang","year":"2018","journal-title":"PLoS Comput. Biol"},{"key":"2023062708580627100_bty1056-B40","doi-asserted-by":"crossref","first-page":"93.","DOI":"10.1186\/s12859-018-2092-7","article-title":"An interpretable framework for clustering single-cell RNA-seq datasets","volume":"19","author":"Zhang","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023062708580627100_bty1056-B41","doi-asserted-by":"crossref","first-page":"712","DOI":"10.1109\/TEVC.2007.892759","article-title":"Moea\/d: a multiobjective evolutionary algorithm based on decomposition","volume":"11","author":"Zhang","year":"2007","journal-title":"IEEE Trans. Evolution. 
Comput"},{"key":"2023062708580627100_bty1056-B42","doi-asserted-by":"crossref","first-page":"e2888.","DOI":"10.7717\/peerj.2888","article-title":"Detecting heterogeneity in single-cell RNA-seq data by non-negative matrix factorization","volume":"5","author":"Zhu","year":"2017","journal-title":"PeerJ"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/16\/2809\/50719264\/bty1056.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/16\/2809\/50719264\/bty1056.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T09:02:28Z","timestamp":1687856548000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/16\/2809\/5265329"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,12,28]]},"references-count":42,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2019,8,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty1056","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,8,15]]},"published":{"date-parts":[[2018,12,28]]}}}