{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T20:09:54Z","timestamp":1773778194625,"version":"3.50.1"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2024,4,29]],"date-time":"2024-04-29T00:00:00Z","timestamp":1714348800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62372303"],"award-info":[{"award-number":["62372303"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62002234"],"award-info":[{"award-number":["62002234"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62131004"],"award-info":[{"award-number":["62131004"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["12271522"],"award-info":[{"award-number":["12271522"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["11901575"],"award-info":[{"award-number":["11901575"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,5,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Continuous advancements in single-cell RNA sequencing (scRNA-seq) technology have enabled researchers to further explore the study of cell heterogeneity, trajectory inference, identification of rare cell types, and neurology. Accurate scRNA-seq data clustering is crucial in single-cell sequencing data analysis. However, the high dimensionality, sparsity, and presence of \u201cfalse\u201d zero values in the data can pose challenges to clustering. Furthermore, current unsupervised clustering algorithms have not effectively leveraged prior biological knowledge, making cell clustering even more challenging.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>This study investigates a semisupervised clustering model called scTPC, which integrates the triplet constraint, pairwise constraint, and cross-entropy constraint based on deep learning. Specifically, the model begins by pretraining a denoising autoencoder based on a zero-inflated negative binomial distribution. Deep clustering is then performed in the learned latent feature space using triplet constraints and pairwise constraints generated from partial labeled cells. Finally, to address imbalanced cell-type datasets, a weighted cross-entropy loss is introduced to optimize the model. A series of experimental results on 10 real scRNA-seq datasets and five simulated datasets demonstrate that scTPC achieves accurate clustering with a well-designed framework.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>scTPC is a Python-based algorithm, and the code is available from https:\/\/github.com\/LF-Yang\/Code or https:\/\/zenodo.org\/records\/10951780.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae293","type":"journal-article","created":{"date-parts":[[2024,4,29]],"date-time":"2024-04-29T22:55:33Z","timestamp":1714431333000},"source":"Crossref","is-referenced-by-count":15,"title":["scTPC: a novel semisupervised deep clustering model for scRNA-seq data"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9393-3648","authenticated-orcid":false,"given":"Yushan","family":"Qiu","sequence":"first","affiliation":[{"name":"School of Mathematical Sciences, Shenzhen University, Shenzhen, Guangdong 518000, China"}]},{"given":"Lingfei","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Mathematical Sciences, Shenzhen University, Shenzhen, Guangdong 518000, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5891-6044","authenticated-orcid":false,"given":"Hao","family":"Jiang","sequence":"additional","affiliation":[{"name":"School of Mathematics, Renmin University of China , Haidian District , Beijing 100872, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6406-1142","authenticated-orcid":false,"given":"Quan","family":"Zou","sequence":"additional","affiliation":[{"name":"Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China , Chengdu 610056, China"}]}],"member":"286","published-online":{"date-parts":[[2024,4,29]]},"reference":[{"key":"2024051408293080100_btae293-B1","doi-asserted-by":"crossref","DOI":"10.1201\/9781584889977","volume-title":"Constrained Clustering: Advances in Algorithms, Theory, and Applications","author":"Basu","year":"2008"},{"key":"2024051408293080100_btae293-B2","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1038\/nbt.4096","article-title":"Integrating single-cell transcriptomic data across different conditions, technologies, and species","volume":"36","author":"Butler","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2024051408293080100_btae293-B3","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1126\/science.aam8940","article-title":"Comprehensive single-cell transcriptional profiling of a multicellular organism","volume":"357","author":"Cao","year":"2017","journal-title":"Science"},{"key":"2024051408293080100_btae293-B4","doi-asserted-by":"crossref","first-page":"775","DOI":"10.1093\/bioinformatics\/btaa908","article-title":"Single-cell RNA-seq data semi-supervised clustering and annotation via structural regularized domain adaptation","volume":"37","author":"Chen","year":"2021","journal-title":"Bioinformatics"},{"key":"2024051408293080100_btae293-B5","doi-asserted-by":"crossref","first-page":"lqaa039","DOI":"10.1093\/nargab\/lqaa039","article-title":"Deep soft k-means clustering with self-training for single-cell RNA sequence data","volume":"2","author":"Chen","year":"2020","journal-title":"NAR Genom Bioinform"},{"key":"2024051408293080100_btae293-B6","doi-asserted-by":"crossref","first-page":"3227","DOI":"10.1016\/j.celrep.2017.03.004","article-title":"Single-cell RNA-seq reveals hypothalamic cell diversity","volume":"18","author":"Chen","year":"2017","journal-title":"Cell Rep"},{"key":"2024051408293080100_btae293-B7","doi-asserted-by":"crossref","first-page":"1111","DOI":"10.1016\/j.neuron.2019.04.010","article-title":"Single-cell RNA-seq analysis of retinal development identifies NFI factors as regulating mitotic exit and late-born cell specification","volume":"102","author":"Clark","year":"2019","journal-title":"Neuron"},{"key":"2024051408293080100_btae293-B8","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1038\/s41576-019-0130-6","article-title":"Challenges in measuring and understanding biological noise","volume":"20","author":"Eling","year":"2019","journal-title":"Nat Rev Genet"},{"key":"2024051408293080100_btae293-B9","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1038\/s41467-018-07931-2","article-title":"Single-cell RNA-seq denoising using a deep count autoencoder","volume":"10","author":"Eraslan","year":"2019","journal-title":"Nat Commun"},{"key":"2024051408293080100_btae293-B10","first-page":"1753","article-title":"Improved deep embedded clustering with local structure preservation","volume":"17","author":"Guo","year":"2017","journal-title":"IJCAI"},{"key":"2024051408293080100_btae293-B11","doi-asserted-by":"crossref","first-page":"1091","DOI":"10.1016\/j.cell.2018.02.001","article-title":"Mapping the mouse cell atlas by Microwell-seq","volume":"172","author":"Han","year":"2018","journal-title":"Cell"},{"key":"2024051408293080100_btae293-B12","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.mam.2017.07.003","article-title":"Single-cell RNA sequencing: technical advancements and biological applications","volume":"59","year":"2018","journal-title":"Mol Aspects Med"},{"key":"2024051408293080100_btae293-B13","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J Classif"},{"key":"2024051408293080100_btae293-B14","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/j.patrec.2021.07.017","article-title":"Imbalanced image classification with complement cross entropy","volume":"151","author":"Kim","year":"2021","journal-title":"Pattern Recognit Lett"},{"key":"2024051408293080100_btae293-B15","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nmeth.4236","article-title":"Sc3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat Methods"},{"key":"2024051408293080100_btae293-B16","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1016\/j.cell.2015.04.044","article-title":"Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells","volume":"161","author":"Klein","year":"2015","journal-title":"Cell"},{"key":"2024051408293080100_btae293-B17","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"2024051408293080100_btae293-B18","doi-asserted-by":"crossref","first-page":"7705","DOI":"10.1038\/s41467-022-35031-9","article-title":"Clustering of single-cell multi-omics data with a multimodal deep learning method","volume":"13","author":"Lin","year":"2022","journal-title":"Nat Commun"},{"key":"2024051408293080100_btae293-B19","first-page":"861","article-title":"UMAP: Uniform Manifold Approximation and Projection","volume-title":"J Open Source Softw","year":"2018"},{"key":"2024051408293080100_btae293-B20","first-page":"86","author":"Nigam","year":"2000"},{"key":"2024051408293080100_btae293-B21","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1038\/s41586-018-0394-6","article-title":"A single-cell atlas of the airway epithelium reveals the CFTR-rich pulmonary ionocyte","volume":"560","author":"Plasschaert","year":"2018","journal-title":"Nature"},{"key":"2024051408293080100_btae293-B22","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1038\/s41592-019-0535-3","article-title":"Supervised classification enables rapid annotation of cell atlases","volume":"16","author":"Pliner","year":"2019","journal-title":"Nat Methods"},{"key":"2024051408293080100_btae293-B23","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1038\/s41586-018-0590-4","article-title":"Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris: the Tabula Muris Consortium","volume":"562","author":"Schaum","year":"2018","journal-title":"Nature"},{"key":"2024051408293080100_btae293-B24","first-page":"815","author":"Schroff","year":"2015"},{"key":"2024051408293080100_btae293-B25","first-page":"583","article-title":"Cluster ensembles\u2014a knowledge reuse framework for combining multiple partitions","volume":"3","author":"Strehl","year":"2002","journal-title":"J Mach Learn Res"},{"key":"2024051408293080100_btae293-B26","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1038\/nmeth.1315","article-title":"mRNA-seq whole-transcriptome analysis of a single cell","volume":"6","author":"Tang","year":"2009","journal-title":"Nat Methods"},{"key":"2024051408293080100_btae293-B27","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1038\/s42256-019-0037-0","article-title":"Clustering single-cell RNA-seq data with a model-based deep learning approach","volume":"1","author":"Tian","year":"2019","journal-title":"Nat Mach Intell"},{"key":"2024051408293080100_btae293-B28","doi-asserted-by":"crossref","first-page":"1873","DOI":"10.1038\/s41467-021-22008-3","article-title":"Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data","volume":"12","author":"Tian","year":"2021","journal-title":"Nat Commun"},{"key":"2024051408293080100_btae293-B29","doi-asserted-by":"crossref","first-page":"e100041","DOI":"10.18547\/gcb.2018.vol4.iss2.e100041","article-title":"Principal components analysis: theory and application to gene expression data analysis","volume":"4","author":"Todorov","year":"2018","journal-title":"Genomics Comput Biol"},{"key":"2024051408293080100_btae293-B49","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1007\/s10994-019-05855-6","article-title":"A survey on semi-supervised learning","volume":"109","year":"2020","journal-title":"Mach Learn"},{"key":"2024051408293080100_btae293-B30","first-page":"384","volume-title":"Artificial Intelligence and Statistics","author":"Van Der Maaten","year":"2009"},{"key":"2024051408293080100_btae293-B31","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2024051408293080100_btae293-B32","first-page":"2837","article-title":"Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance","volume-title":"J Mach Learn Res","year":"2010"},{"key":"2024051408293080100_btae293-B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-10-99","article-title":"Markov clustering versus affinity propagation for the partitioning of protein interaction graphs","volume":"10","author":"Vlasblom","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2024051408293080100_btae293-B34","doi-asserted-by":"crossref","first-page":"1575","DOI":"10.1093\/bioinformatics\/btac011","article-title":"scName: neighborhood contrastive clustering with ancillary mask estimation for scRNA-seq data","volume":"38","author":"Wan","year":"2022","journal-title":"Bioinformatics"},{"key":"2024051408293080100_btae293-B35","doi-asserted-by":"crossref","first-page":"1700232","DOI":"10.1002\/pmic.201700232","article-title":"SIMLR: a tool for large-scale genomic analyses by multi-kernel learning","volume":"18","author":"Wang","year":"2018","journal-title":"Proteomics"},{"key":"2024051408293080100_btae293-B36","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1038\/nmeth.4207","article-title":"Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning","volume":"14","author":"Wang","year":"2017","journal-title":"Nat Methods"},{"key":"2024051408293080100_btae293-B37","doi-asserted-by":"crossref","first-page":"2407","DOI":"10.1073\/pnas.1719474115","article-title":"Pulmonary alveolar type I cell population consists of two distinct subtypes that differ in cell fate","volume":"115","author":"Wang","year":"2018","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024051408293080100_btae293-B38","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1186\/s12859-023-05339-4","article-title":"scSemiAAE: a semi-supervised clustering model for single-cell RNA-seq data","volume":"24","author":"Wang","year":"2023","journal-title":"BMC Bioinformatics"},{"key":"2024051408293080100_btae293-B39","doi-asserted-by":"crossref","first-page":"3825","DOI":"10.1093\/bioinformatics\/btaa231","article-title":"Joint learning dimension reduction and clustering of single-cell RNA-sequencing data","volume":"36","author":"Wu","year":"2020","journal-title":"Bioinformatics"},{"key":"2024051408293080100_btae293-B40","doi-asserted-by":"crossref","first-page":"566","DOI":"10.1109\/TCBB.2022.3161131","article-title":"Network-based structural learning nonnegative matrix factorization algorithm for clustering of scRNA-seq data","volume":"20","author":"Wu","year":"2023","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2024051408293080100_btae293-B41","doi-asserted-by":"crossref","first-page":"bbaa433","DOI":"10.1093\/bib\/bbaa433","article-title":"jSRC: a flexible and accurate joint learning algorithm for clustering of single-cell RNA-sequencing data","volume":"22","author":"Wu","year":"2021","journal-title":"Brief Bioinform"},{"key":"2024051408293080100_btae293-B42","doi-asserted-by":"crossref","first-page":"bbab546","DOI":"10.1093\/bib\/bbab546","article-title":"Network-based integrative analysis of single-cell transcriptomic and epigenomic data for cell types","volume":"23","author":"Wu","year":"2022","journal-title":"Brief Bioinform"},{"key":"2024051408293080100_btae293-B43","first-page":"478","author":"Xie","year":"2016"},{"key":"2024051408293080100_btae293-B44","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1109\/TPAMI.2004.1261097","article-title":"Two-dimensional PCA: a new approach to appearance-based face representation and recognition","volume":"26","author":"Yang","year":"2004","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2024051408293080100_btae293-B45","doi-asserted-by":"crossref","first-page":"594","DOI":"10.1126\/science.aat1699","article-title":"Single-cell transcriptomes from human kidneys reveal the cellular identity of renal tumors","volume":"361","author":"Young","year":"2018","journal-title":"Science"},{"key":"2024051408293080100_btae293-B46","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1186\/s13059-017-1305-0","article-title":"Splatter: simulation of single-cell RNA sequencing data","volume":"18","author":"Zappia","year":"2017","journal-title":"Genome Biol"},{"key":"2024051408293080100_btae293-B47","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.ymeth.2022.10.006","article-title":"scSSA: a clustering method for single cell RNA-seq data based on semi-supervised autoencoder","volume":"208","author":"Zhao","year":"2022","journal-title":"Methods"},{"key":"2024051408293080100_btae293-B48","doi-asserted-by":"crossref","first-page":"14049","DOI":"10.1038\/ncomms14049","article-title":"Massively parallel digital transcriptional profiling of single cells","volume":"8","author":"Zheng","year":"2017","journal-title":"Nat Commun"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae293\/57356526\/btae293.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae293\/57572752\/btae293.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae293\/57572752\/btae293.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,17]],"date-time":"2024-11-17T04:57:11Z","timestamp":1731819431000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae293\/7659796"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,4,29]]},"references-count":49,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,5,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae293","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5,1]]},"published":{"date-parts":[[2024,4,29]]},"article-number":"btae293"}}