{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T00:18:18Z","timestamp":1767917898946,"version":"3.49.0"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2025,9,22]],"date-time":"2025-09-22T00:00:00Z","timestamp":1758499200000},"content-version":"vor","delay-in-days":22,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004826","name":"Beijing Natural Science Foundation","doi-asserted-by":"publisher","award":["5254050"],"award-info":[{"award-number":["5254050"]}],"id":[{"id":"10.13039\/501100004826","id-type":"DOI","asserted-by":"publisher"}]},{"name":"MOE Project of Key Research Institute of Humanities and Social Sciences","award":["22JJD910001"],"award-info":[{"award-number":["22JJD910001"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,8,31]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Single-cell RNA sequencing (scRNA-seq) provides high-throughput information about the genome-wide gene expression levels at the single-cell resolution, bringing a precise understanding on the transcriptome of individual cells. Unfortunately, the rapidly growing scRNA-seq data and the prevalence of dropout events pose substantial challenges for clustering and cell type annotation. Here, we propose a deep learning method, scHSC, that employs hard sample mining through contrastive learning for clustering scRNA-seq data. Focusing on hard samples, this approach simultaneously integrates gene expression and topological structure information between cells to improve clustering accuracy. By adjusting the weights of hard positive and hard negative samples during the iterative training process, scHSC employs an adaptive weighting strategy to integrate contrastive learning with a ZINB model for single-cell clustering tasks. Extensive experiments on 18 single-cell RNA-seq real datasets demonstrate that scHSC exhibits significant superiority in clustering performance compared to existing deep learning-based clustering methods. scHSC is implemented in Python based on the PyTorch framework. The source code and datasets are available via https:\/\/github.com\/fangs25\/scHSC.<\/jats:p>","DOI":"10.1093\/bib\/bbaf485","type":"journal-article","created":{"date-parts":[[2025,8,29]],"date-time":"2025-08-29T11:41:15Z","timestamp":1756467675000},"source":"Crossref","is-referenced-by-count":1,"title":["scHSC: enhancing single-cell RNA-seq clustering via hard sample contrastive learning"],"prefix":"10.1093","volume":"26","author":[{"given":"Sheng","family":"Fang","sequence":"first","affiliation":[{"name":"Center for Applied Statistics, School of Statistics, Renmin University of China , Beijing 100872 ,","place":["China"]}]},{"given":"Xiaokang","family":"Yu","sequence":"additional","affiliation":[{"name":"Center for Applied Statistics, School of Statistics, Renmin University of China , Beijing 100872 ,","place":["China"]},{"name":"Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania , Philadelphia, PA 19104 ,","place":["United States"]}]},{"given":"Xinyi","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Statistics and Mathematics, Central University of Finance and Economics , Beijing 100081 ,","place":["China"]}]},{"given":"Jingxiao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Center for Applied Statistics, School of Statistics, Renmin University of China , Beijing 100872 ,","place":["China"]}]},{"given":"Xiangjie","family":"Li","sequence":"additional","affiliation":[{"name":"National Clinical Research Center for Cardiovascular Diseases, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College , Beijing 100037 ,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,9,22]]},"reference":[{"key":"2025092202360989400_ref1","doi-asserted-by":"publisher","first-page":"610","DOI":"10.1016\/j.molcel.2015.04.005","article-title":"The technology and biology of single-cell RNA sequencing","volume":"58","author":"Kolodziejczyk","year":"2015","journal-title":"Mol Cell"},{"key":"2025092202360989400_ref2","doi-asserted-by":"publisher","first-page":"15710","DOI":"10.1609\/aaai.v35i18.17852","article-title":"Effective clustering of scRNA-seq data to identify biomarkers without user input","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Chowdhury","year":"2021"},{"key":"2025092202360989400_ref3","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1038\/ejhg.2012.129","article-title":"RNA-seq and human complex diseases: recent accomplishments and future perspectives","volume":"21","author":"Costa","year":"2013","journal-title":"Eur J Hum Genet"},{"key":"2025092202360989400_ref4","doi-asserted-by":"publisher","first-page":"e1008205","DOI":"10.1371\/journal.pcbi.1008205","article-title":"Tempora: cell trajectory inference using time-series single-cell RNA sequencing data","volume":"16","author":"Tran","year":"2020","journal-title":"PLoS Comput Biol"},{"key":"2025092202360989400_ref5","doi-asserted-by":"publisher","first-page":"1202","DOI":"10.1016\/j.cell.2015.05.002","article-title":"Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets","volume":"161","author":"Macosko","year":"2015","journal-title":"Cell"},{"key":"2025092202360989400_ref6","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1038\/s41576-018-0088-9","article-title":"Challenges in unsupervised clustering of single-cell RNA-seq data","volume":"20","author":"Kiselev","year":"2019","journal-title":"Nat Rev Genet"},{"key":"2025092202360989400_ref7","first-page":"281","article-title":"Some methods for classification and analysis of multivariate observations","volume-title":"Proceedings of the fifth Berkeley Symposium on Mathematical Statistics and Probability","author":"Macqueen","year":"1967"},{"key":"2025092202360989400_ref8","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1007\/BF02289588","article-title":"Hierarchical clustering schemes","volume":"32","author":"Johnson","year":"1967","journal-title":"Psychometrika"},{"key":"2025092202360989400_ref9","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1002\/widm.30","article-title":"Density-based clustering","volume":"1","author":"Kriegel","year":"2011","journal-title":"Wiley Interdiscip Rev: Data Min Knowl Discovery"},{"key":"2025092202360989400_ref10","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1016\/j.coisb.2017.07.004","article-title":"Single cells make big data: new challenges and opportunities in transcriptomics","volume":"4","author":"Angerer","year":"2017","journal-title":"Curr Opin Syst Biol"},{"key":"2025092202360989400_ref11","doi-asserted-by":"publisher","first-page":"637","DOI":"10.1038\/nmeth.2930","article-title":"Validation of noise models for single-cell transcriptomics","volume":"11","author":"Gr\u00fcn","year":"2014","journal-title":"Nat Methods"},{"key":"2025092202360989400_ref12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-017-1188-0","article-title":"CIDR: ultrafast and accurate clustering through imputation for single-cell RNA-seq data","volume":"18","author":"Lin","year":"2017","journal-title":"Genome Biol"},{"key":"2025092202360989400_ref13","doi-asserted-by":"publisher","first-page":"483","DOI":"10.1038\/nmeth.4236","article-title":"SC3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat Methods"},{"key":"2025092202360989400_ref14","doi-asserted-by":"publisher","first-page":"414","DOI":"10.1038\/nmeth.4207","article-title":"Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning","volume":"14","author":"Wang","year":"2017","journal-title":"Nat Methods"},{"key":"2025092202360989400_ref15","doi-asserted-by":"publisher","first-page":"e1012679","DOI":"10.1371\/journal.pcbi.1012679","article-title":"scMoMtF: an interpretable multitask learning framework for single-cell multi-omics data analysis","volume":"20","author":"Lan","year":"2024","journal-title":"PLoS Comput Biol"},{"key":"2025092202360989400_ref16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.ymeth.2023.11.019","article-title":"JLONMFSC: clustering scRNA-seq data based on joint learning of non-negative matrix factorization and subspace clustering","volume":"222","author":"Lan","year":"2024","journal-title":"Methods"},{"key":"2025092202360989400_ref17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TCBB.2024.3387911","article-title":"Deep imputation bi-stochastic graph regularized matrix factorization for clustering single-cell RNA-sequencing data","author":"Lan","year":"2024","journal-title":"IEEE Trans Comput Biol Bioinf"},{"key":"2025092202360989400_ref18","doi-asserted-by":"publisher","first-page":"2338","DOI":"10.1038\/s41467-020-15851-3","article-title":"Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis","volume":"11","author":"Li","year":"2020","journal-title":"Nat Commun"},{"key":"2025092202360989400_ref19","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1038\/s42256-019-0037-0","article-title":"Clustering single-cell RNA-seq data with a model-based deep learning approach","volume":"1","author":"Tian","year":"2019","journal-title":"Nat Mach Intell"},{"key":"2025092202360989400_ref20","doi-asserted-by":"publisher","first-page":"1873","DOI":"10.1038\/s41467-021-22008-3","article-title":"Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data","volume":"12","author":"Tian","year":"2021","journal-title":"Nat Commun"},{"key":"2025092202360989400_ref21","doi-asserted-by":"publisher","first-page":"lqaa039","DOI":"10.1093\/nargab\/lqaa039","article-title":"Deep soft K-means clustering with self-training for single-cell RNA sequence data","volume":"2","author":"Chen","year":"2020","journal-title":"NAR Genomics Bioinf"},{"key":"2025092202360989400_ref22","doi-asserted-by":"publisher","first-page":"390","DOI":"10.1038\/s41467-018-07931-2","article-title":"Single-cell RNA-seq denoising using a deep count autoencoder","volume":"10","author":"Eraslan","year":"2019","journal-title":"Nat Commun"},{"key":"2025092202360989400_ref23","doi-asserted-by":"publisher","first-page":"btae020","DOI":"10.1093\/bioinformatics\/btae020","article-title":"scMAE: a masked autoencoder for single-cell RNA-seq clustering","volume":"40","author":"Fang","year":"2024","journal-title":"Bioinformatics"},{"key":"2025092202360989400_ref24","first-page":"519","article-title":"Accurately clustering single-cell RNA-seq data by capturing structural relations between cells through graph convolutional network","volume-title":"IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","author":"Zeng","year":"2020"},{"key":"2025092202360989400_ref25","doi-asserted-by":"publisher","first-page":"4671","DOI":"10.1609\/aaai.v36i4.20392","article-title":"scTAG-ZINB-based graph embedding autoencoder for single-cell RNA-seq interpretations","volume":"36","author":"Yu","year":"2022","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"2025092202360989400_ref26","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1038\/s41467-023-36134-7","article-title":"Topological identification and interpretation for single-cell gene regulation elucidation across multiple platforms using scMGCA","volume":"14","author":"Yu","year":"2023","journal-title":"Nat Commun"},{"key":"2025092202360989400_ref27","doi-asserted-by":"publisher","first-page":"20028","DOI":"10.1038\/s41598-021-99003-7","article-title":"A topology-preserving dimensionality reduction method for single-cell RNA-seq data using graph autoencoder","volume":"11","author":"Luo","year":"2021","journal-title":"Sci Rep"},{"key":"2025092202360989400_ref28","doi-asserted-by":"crossref","DOI":"10.1609\/aaai.v31i1.10814","article-title":"Unsupervised large graph embedding","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Nie","year":"2017"},{"key":"2025092202360989400_ref29","doi-asserted-by":"publisher","first-page":"bbac625","DOI":"10.1093\/bib\/bbac625","article-title":"scDCCA: Deep contrastive clustering for single-cell RNA-seq data based on auto-encoder network","volume":"24","author":"Wang","year":"2023","journal-title":"Brief Bioinf"},{"key":"2025092202360989400_ref30","doi-asserted-by":"publisher","first-page":"bbac018","DOI":"10.1093\/bib\/bbac018","article-title":"Deep structural clustering for single-cell RNA-seq data jointly through autoencoder and graph neural network","volume":"23","author":"Gan","year":"2022","journal-title":"Brief Bioinf"},{"key":"2025092202360989400_ref31","doi-asserted-by":"publisher","first-page":"280","DOI":"10.1186\/s12859-021-04210-8","article-title":"Contrastive self-supervised clustering of scRNA-seq data","volume":"22","author":"Ciortan","year":"2021","journal-title":"BMC Bioinf"},{"key":"2025092202360989400_ref32","doi-asserted-by":"publisher","first-page":"8914","DOI":"10.1609\/aaai.v37i7.26071","article-title":"Hard sample aware network for contrastive deep graph clustering","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Liu","year":"2023"},{"key":"2025092202360989400_ref33","doi-asserted-by":"publisher","first-page":"1770","DOI":"10.1109\/TIP.2017.2651400","article-title":"Graph Laplacian regularization for image denoising: analysis in the continuous domain","volume":"26","author":"Pang","year":"2017","journal-title":"IEEE Trans Image Process"},{"key":"2025092202360989400_ref34","doi-asserted-by":"crossref","first-page":"976","DOI":"10.1145\/3394486.3403140","article-title":"Adaptive graph encoder for attributed graph embedding","volume-title":"Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"Cui","year":"2020"},{"key":"2025092202360989400_ref35","article-title":"A simple framework for contrastive learning of visual representations","volume-title":"Proceedings of the International Conference on Machine Learning (ICML)","author":"Chen","year":"2020"},{"key":"2025092202360989400_ref36","doi-asserted-by":"publisher","first-page":"696","DOI":"10.1038\/s42256-022-00518-z","article-title":"Contrastive learning enables rapid mapping to multimodal single-cell atlas of multimillion scale","volume":"4","author":"Yang","year":"2022","journal-title":"Nat Mach Intell"},{"key":"2025092202360989400_ref37","first-page":"8765","article-title":"Debiased contrastive learning","volume":"33","author":"Chuang","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2025092202360989400_ref38","first-page":"7482","article-title":"Multi-task learning using uncertainty to weigh losses for scene geometry and semantics","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Kendall","year":"2018"},{"key":"2025092202360989400_ref39","first-page":"1871","article-title":"End-to-end multi-task learning with attention","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu","year":"2019"},{"key":"2025092202360989400_ref40","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1038\/s42256-019-0037-0","article-title":"scDeepcluster: clustering single-cell RNA-seq data with a model-based deep learning approach","volume":"1","author":"Tian","year":"2019","journal-title":"Nat Mach Intell"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/5\/bbaf485\/64339201\/bbaf485.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/5\/bbaf485\/64339201\/bbaf485.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,22]],"date-time":"2025-09-22T06:36:19Z","timestamp":1758522979000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf485\/8260785"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,31]]},"references-count":40,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,8,31]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf485","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,9]]},"published":{"date-parts":[[2025,8,31]]},"article-number":"bbaf485"}}