{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T12:09:32Z","timestamp":1776082172575,"version":"3.50.1"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2021,11,19]],"date-time":"2021-11-19T00:00:00Z","timestamp":1637280000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Single-cell RNA sequencing (scRNA-seq) provides transcriptomic profiling for individual cells, allowing researchers to study the heterogeneity of tissues, recognize rare cell identities and discover new cellular subtypes. Clustering analysis is usually used to predict cell class assignments and infer cell identities. However, the high sparsity of scRNA-seq data, accentuated by dropout events generates challenges that have motivated the development of numerous dedicated clustering methods. Nevertheless, there is still no consensus on the best performing method.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>graph-sc is a new method leveraging a graph autoencoder network to create embeddings for scRNA-seq cell data. While this work analyzes the performance of clustering the embeddings with various clustering algorithms, other downstream tasks can also be performed. A broad experimental study has been performed on both simulated and scRNA-seq datasets. 
The results indicate that although there is no consistently best method across all the analyzed datasets,\u00a0graph-sc\u00a0compares favorably to competing techniques across all types of datasets. Furthermore, the proposed method is stable across consecutive runs, robust to input down-sampling, generally insensitive to changes in the network architecture or training parameters and more computationally efficient than other competing methods based on neural networks. Modeling the data as a graph provides increased flexibility to define custom features characterizing the genes, the cells and their interactions. Moreover, external data (e.g. gene network) can easily be integrated into the graph and used seamlessly under the same optimization task.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/ciortanmadalina\/graph-sc.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab787","type":"journal-article","created":{"date-parts":[[2021,11,15]],"date-time":"2021-11-15T13:48:17Z","timestamp":1636984097000},"page":"1037-1044","source":"Crossref","is-referenced-by-count":71,"title":["GNN-based embedding for clustering scRNA-seq data"],"prefix":"10.1093","volume":"38","author":[{"given":"Madalina","family":"Ciortan","sequence":"first","affiliation":[{"name":"Interuniversity Institute of Bioinformatics in Brussels, Universit\u00e9 Libre de Bruxelles , Brussels, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3090-3142","authenticated-orcid":false,"given":"Matthieu","family":"Defrance","sequence":"additional","affiliation":[{"name":"Interuniversity Institute of Bioinformatics in Brussels, 
Universit\u00e9 Libre de Bruxelles , Brussels, Belgium"}]}],"member":"286","published-online":{"date-parts":[[2021,11,19]]},"reference":[{"key":"2023020108515565300_btab787-B1","first-page":"3625","article-title":"Psychrophilic proteases dramatically reduce single-cell RNA-seq artifacts: a molecular atlas of kidney development","volume":"144","author":"Adam","year":"2017","journal-title":"Development (Cambridge)"},{"key":"2023020108515565300_btab787-B2","first-page":"1","article-title":"A dendrite method for cluster analysis","volume":"3","author":"Cali\u0144ski","year":"1974","journal-title":"Commun. Stat"},{"key":"2023020108515565300_btab787-B3","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1126\/science.aam8940","article-title":"Comprehensive single-cell transcriptional profiling of a multicellular organism","volume":"357","author":"Cao","year":"2017","journal-title":"Science"},{"key":"2023020108515565300_btab787-B4","doi-asserted-by":"crossref","first-page":"lqaa039","DOI":"10.1093\/nargab\/lqaa039","article-title":"Deep soft K-means clustering with self-training for single-cell RNA sequence data","volume":"2","author":"Chen","year":"2020","journal-title":"NAR Genomics Bioinf"},{"key":"2023020108515565300_btab787-B5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-018-07931-2","article-title":"Single-cell RNA-seq denoising using a deep count autoencoder","volume":"10","author":"Eraslan","year":"2019","journal-title":"Nat. 
Commun"},{"key":"2023020108515565300_btab787-B6","author":"Freytag","year":"2017"},{"key":"2023020108515565300_btab787-B7","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1038\/nature14966","article-title":"Single-cell messenger RNA sequencing reveals rare intestinal cell types","volume":"525","author":"Gr\u00fcn","year":"2015","journal-title":"Nature"},{"key":"2023020108515565300_btab787-B8","doi-asserted-by":"crossref","first-page":"1091","DOI":"10.1016\/j.cell.2018.02.001","article-title":"Mapping the mouse cell atlas by microwell-seq","volume":"172","author":"Han","year":"2018","journal-title":"Cell"},{"key":"2023020108515565300_btab787-B9","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. Classif"},{"key":"2023020108515565300_btab787-B10","author":"Kipf","year":"2016"},{"key":"2023020108515565300_btab787-B11","author":"Kipf","year":"2017"},{"key":"2023020108515565300_btab787-B12","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1038\/s41576-018-0088-9","article-title":"Challenges in unsupervised clustering of single-cell RNA-seq data","volume":"20","author":"Kiselev","year":"2019","journal-title":"Nat. Rev. Genet"},{"key":"2023020108515565300_btab787-B13","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1016\/j.cell.2015.04.044","article-title":"Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells","volume":"161","author":"Klein","year":"2015","journal-title":"Cell"},{"key":"2023020108515565300_btab787-B14","doi-asserted-by":"crossref","first-page":"4200","DOI":"10.1038\/s41598-017-04520-z","article-title":"A comprehensive mouse transcriptomic BodyMap across 17 tissues by RNA-seq","volume":"7","author":"Li","year":"2017","journal-title":"Sci. 
Rep"},{"key":"2023020108515565300_btab787-B15","first-page":"1","article-title":"Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis","volume":"11","author":"Li","year":"2020","journal-title":"Nat. Commun"},{"key":"2023020108515565300_btab787-B16","author":"Lin","year":"2017"},{"key":"2023020108515565300_btab787-B17","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1038\/s41592-018-0229-2","article-title":"Deep generative modeling for single-cell transcriptomics","volume":"15","author":"Lopez","year":"2018","journal-title":"Nat. Methods"},{"key":"2023020108515565300_btab787-B18","doi-asserted-by":"crossref","DOI":"10.1093\/bfgp\/ely001","article-title":"Clustering single cells: a review of approaches on high-and low-depth single-cell RNA-seq data","author":"Menon","year":"2019","journal-title":"Brief. Funct. Genomics"},{"key":"2023020108515565300_btab787-B19","doi-asserted-by":"crossref","first-page":"20353","DOI":"10.1038\/s41598-019-56911-z","article-title":"Using transfer learning from prior reference knowledge to improve the clustering of single-cell RNA-Seq data","volume":"9","author":"Mieth","year":"2019","journal-title":"Sci. Rep"},{"key":"2023020108515565300_btab787-B20","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1016\/j.cels.2016.09.002","article-title":"A single-cell transcriptome atlas of the human pancreas","volume":"3","author":"Muraro","year":"2016","journal-title":"Cell Syst"},{"key":"2023020108515565300_btab787-B21","doi-asserted-by":"crossref","first-page":"1196","DOI":"10.1093\/bib\/bbz062","article-title":"Clustering and classification methods for single-cell RNA-sequencing data","volume":"21","author":"Qi","year":"2020","journal-title":"Brief. 
Bioinf"},{"key":"2023020108515565300_btab787-B22","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1038\/nn.4462","article-title":"Molecular interrogation of hypothalamic organization reveals distinct dopamine neuronal subtypes","volume":"20","author":"Romanov","year":"2017","journal-title":"Nat. Neurosci"},{"key":"2023020108515565300_btab787-B23","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: a graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math"},{"key":"2023020108515565300_btab787-B24","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nbt.3192","article-title":"Spatial reconstruction of single-cell gene expression data","volume":"33","author":"Satija","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023020108515565300_btab787-B25","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1038\/s41586-018-0590-4","article-title":"Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris","volume":"562","author":"Schaum","year":"2018","journal-title":"Nature"},{"key":"2023020108515565300_btab787-B26","author":"Shao","year":"2021"},{"key":"2023020108515565300_btab787-B27","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1038\/s42256-019-0037-0","article-title":"Clustering single-cell RNA-seq data with a model-based deep learning approach","volume":"1","author":"Tian","year":"2019","journal-title":"Nat. Mach. 
Intell"},{"key":"2023020108515565300_btab787-B28","author":"Wang","year":"2021"},{"key":"2023020108515565300_btab787-B29","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/s13059-017-1382-0","article-title":"SCANPY: large-scale single-cell gene expression data analysis","volume":"19","author":"Wolf","year":"2018","journal-title":"Genome Biol"},{"key":"2023020108515565300_btab787-B30","first-page":"478","author":"Xie","year":"2016"},{"key":"2023020108515565300_btab787-B31","doi-asserted-by":"crossref","first-page":"594","DOI":"10.1126\/science.aat1699","article-title":"Single-cell transcriptomes from human kidneys reveal the cellular identity of renal tumors","volume":"361","author":"Young","year":"2018","journal-title":"Science"},{"key":"2023020108515565300_btab787-B32","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1186\/s13059-017-1305-0","article-title":"Splatter: simulation of single-cell RNA sequencing data","volume":"18","author":"Zappia","year":"2017","journal-title":"Genome Biol"},{"key":"2023020108515565300_btab787-B33","doi-asserted-by":"crossref","first-page":"e1007794","DOI":"10.1371\/journal.pcbi.1007794","article-title":"ScEDAR: a scalable Python package for single-cell RNA-seq exploratory data analysis","volume":"16","author":"Zhang","year":"2020","journal-title":"PLoS Comput. Biol"},{"key":"2023020108515565300_btab787-B34","doi-asserted-by":"crossref","first-page":"14049","DOI":"10.1038\/ncomms14049","article-title":"Massively parallel digital transcriptional profiling of single cells","volume":"8","author":"Zheng","year":"2017","journal-title":"Nat. Commun"},{"key":"2023020108515565300_btab787-B35","doi-asserted-by":"crossref","first-page":"466","DOI":"10.1073\/pnas.1817715116","article-title":"Semisoft clustering of single-cell data","volume":"116","author":"Zhu","year":"2019","journal-title":"Proc. Natl. Acad. Sci. 
USA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab787\/41390144\/btab787.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/4\/1037\/49008146\/btab787.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/4\/1037\/49008146\/btab787.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T20:11:09Z","timestamp":1675282269000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/4\/1037\/6432030"}},"subtitle":[],"editor":[{"given":"Valentina","family":"Boeva","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,11,19]]},"references-count":35,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,1,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab787","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,2,15]]},"published":{"date-parts":[[2021,11,19]]}}}