{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,1]],"date-time":"2026-07-01T02:03:45Z","timestamp":1782871425721,"version":"3.54.5"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2021,12,2]],"date-time":"2021-12-02T00:00:00Z","timestamp":1638403200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,2,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Single-cell RNA sequencing allows high-resolution views of individual cells for libraries of up to millions of samples, thus motivating the use of deep learning for analysis. In this study, we introduce the use of graph neural networks for the unsupervised exploration of scRNA-seq data by developing a variational graph autoencoder architecture with graph attention layers that operates directly on the connectivity between cells, focusing on dimensionality reduction and clustering. With the help of several case studies, we show that our model, named CellVGAE, can be effectively used for exploratory analysis even on challenging datasets, by extracting meaningful features from the data and providing the means to visualize and interpret different aspects of the model.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We show that CellVGAE is more interpretable than existing scRNA-seq variational architectures by analysing the graph attention coefficients. 
By drawing parallels with other scRNA-seq studies on interpretability, we assess the validity of the relationships modelled by attention, and furthermore, we show that CellVGAE can intrinsically capture information such as pseudotime and NF-\u03baB activation dynamics, the latter being a property that is not generally shared by existing neural alternatives. We then evaluate the dimensionality reduction and clustering performance on 9 difficult and well-annotated datasets by comparing with three leading neural and non-neural techniques, concluding that CellVGAE outperforms competing methods. Finally, we report a decrease in training times of up to \u00d7 20 on a dataset of 1.3 million cells compared to existing deep learning architectures.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The CellVGAE code is available at https:\/\/github.com\/davidbuterez\/CellVGAE.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab804","type":"journal-article","created":{"date-parts":[[2021,11,24]],"date-time":"2021-11-24T15:20:34Z","timestamp":1637767234000},"page":"1277-1286","source":"Crossref","is-referenced-by-count":38,"title":["CellVGAE: an unsupervised scRNA-seq analysis workflow with graph attention networks"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6558-0833","authenticated-orcid":false,"given":"David","family":"Buterez","sequence":"first","affiliation":[{"name":"Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, 
UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ioana","family":"Bica","sequence":"additional","affiliation":[{"name":"Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, UK"},{"name":"The Alan Turing Institute, London NW1 2DB, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ifrah","family":"Tariq","sequence":"additional","affiliation":[{"name":"Computational and Systems Biology Program, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Helena","family":"Andr\u00e9s-Terr\u00e9","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Pietro","family":"Li\u00f2","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2021,12,2]]},"reference":[{"key":"2023020108545511000_btab804-B1","doi-asserted-by":"crossref","first-page":"9790","DOI":"10.1038\/s41598-020-66166-8","article-title":"Unsupervised generative and graph representation learning for modelling cell differentiation","volume":"10","author":"Bica","year":"2020","journal-title":"Sci. 
Rep"},{"key":"2023020108545511000_btab804-B2","doi-asserted-by":"crossref","first-page":"2223","DOI":"10.1093\/bioinformatics\/btab085","article-title":"Normalization of single-cell RNA-seq counts by log(x + 1) or log(1 + x)","volume":"37","author":"Booeshaghi","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020108545511000_btab804-B3","author":"Brody","year":"2022"},{"key":"2023020108545511000_btab804-B4","doi-asserted-by":"crossref","first-page":"317","DOI":"10.3389\/fgene.2019.00317","article-title":"Single-cell RNA-seq technologies and related computational data analysis","volume":"10","author":"Chen","year":"2019","journal-title":"Front. Genet"},{"key":"2023020108545511000_btab804-B5","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1038\/s41467-018-07931-2","article-title":"Single-cell RNA-seq denoising using a deep count autoencoder","volume":"10","author":"Eraslan","year":"2019","journal-title":"Nat. Commun"},{"key":"2023020108545511000_btab804-B6","doi-asserted-by":"crossref","first-page":"4415","DOI":"10.1093\/bioinformatics\/btaa293","article-title":"scVAE: variational auto-encoders for single-cell gene expression data","volume":"36","author":"Gr\u00f8nbech","year":"2020","journal-title":"Bioinformatics"},{"key":"2023020108545511000_btab804-B7","first-page":"115","volume-title":"Methods in Molecular Biology","author":"Ji","year":"2019"},{"key":"2023020108545511000_btab804-B8","author":"Johnson","year":"2021"},{"key":"2023020108545511000_btab804-B9","doi-asserted-by":"crossref","first-page":"2316","DOI":"10.1093\/bib\/bby076","article-title":"Impact of similarity metrics on single-cell RNA-seq data clustering","volume":"20","author":"Kim","year":"2019","journal-title":"Brief. 
Bioinf"},{"key":"2023020108545511000_btab804-B10","author":"Kingma","year":"2014"},{"key":"2023020108545511000_btab804-B11","author":"Kipf","year":"2016"},{"key":"2023020108545511000_btab804-B12","author":"Kipf","year":"2016"},{"key":"2023020108545511000_btab804-B13","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nmeth.4236","article-title":"Sc3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat. Methods"},{"key":"2023020108545511000_btab804-B14","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1016\/j.cels.2017.03.010","article-title":"Measuring signaling and RNA-seq in the same cell links gene expression to dynamic patterns of nf-kb activation","volume":"4","author":"Lane","year":"2017","journal-title":"Cell Syst"},{"key":"2023020108545511000_btab804-B15","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1038\/s41592-018-0229-2","article-title":"Deep generative modeling for single-cell transcriptomics","volume":"15","author":"Lopez","year":"2018","journal-title":"Nat. Methods"},{"key":"2023020108545511000_btab804-B16","author":"Maas","year":"2013"},{"key":"2023020108545511000_btab804-B17","doi-asserted-by":"crossref","first-page":"205","DOI":"10.21105\/joss.00205","article-title":"hdbscan: hierarchical density based clustering","volume":"2","author":"McInnes","year":"2017","journal-title":"J. 
Open Source Softw"},{"key":"2023020108545511000_btab804-B18","doi-asserted-by":"publisher","author":"McInnes","year":"2018","DOI":"10.21105\/joss.00861"},{"key":"2023020108545511000_btab804-B19","volume-title":"Hematopathology: Morphology, Immunophenotype, Cytogenetics and Molecular Approaches","author":"Naeim","year":"2008","edition":"1st ed"},{"key":"2023020108545511000_btab804-B20","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1016\/j.csbj.2021.01.015","article-title":"Automated methods for cell type annotation on scRNA-seq data","volume":"19","author":"Pasquini","year":"2021","journal-title":"Comput. Struct. Biotechnol. J"},{"key":"2023020108545511000_btab804-B21","doi-asserted-by":"crossref","first-page":"1663","DOI":"10.1016\/j.cell.2015.11.013","article-title":"Transcriptional heterogeneity and lineage commitment in myeloid progenitors","volume":"163","author":"Paul","year":"2015","journal-title":"Cell"},{"key":"2023020108545511000_btab804-B22","volume-title":"The Graph-Tool Python Library","author":"Peixoto","year":"2014"},{"key":"2023020108545511000_btab804-B23","doi-asserted-by":"crossref","first-page":"1308","DOI":"10.1016\/j.cell.2016.07.054","article-title":"Comprehensive classification of retinal bipolar neurons by single-cell transcriptomics","volume":"166","author":"Shekhar","year":"2016","journal-title":"Cell"},{"key":"2023020108545511000_btab804-B24","doi-asserted-by":"crossref","first-page":"1888","DOI":"10.1016\/j.cell.2019.05.031","article-title":"Comprehensive integration of single-cell data","volume":"177","author":"Stuart","year":"2019","journal-title":"Cell"},{"key":"2023020108545511000_btab804-B25","doi-asserted-by":"crossref","first-page":"3418","DOI":"10.1093\/bioinformatics\/btaa169","article-title":"Interpretable factor models of single-cell RNA-seq via variational 
autoencoders","volume":"36","author":"Svensson","year":"2020","journal-title":"Bioinformatics"},{"key":"2023020108545511000_btab804-B26","doi-asserted-by":"crossref","first-page":"e48994","DOI":"10.7554\/eLife.48994","article-title":"Self-assembling manifolds in single-cell RNA sequencing data","volume":"8","author":"Tarashansky","year":"2019","journal-title":"Elife"},{"key":"2023020108545511000_btab804-B27","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"van der Maaten","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023020108545511000_btab804-B28","author":"Veli\u010dkovi\u0107","year":"2018"},{"key":"2023020108545511000_btab804-B29","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1038\/nmeth.4207","article-title":"Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning","volume":"14","author":"Wang","year":"2017","journal-title":"Nat. Methods"},{"key":"2023020108545511000_btab804-B30","doi-asserted-by":"crossref","first-page":"11626","DOI":"10.1128\/JVI.01515-13","article-title":"Klrg1 negatively regulates natural killer cell functions through the AKT pathway in individuals with chronic hepatitis c virus infection","volume":"87","author":"Wang","year":"2013","journal-title":"J. Virol"},{"key":"2023020108545511000_btab804-B31","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1186\/s13059-019-1663-x","article-title":"Paga: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells","volume":"20","author":"Wolf","year":"2019","journal-title":"Genome Biol"},{"key":"2023020108545511000_btab804-B32","doi-asserted-by":"crossref","first-page":"1583","DOI":"10.1093\/bib\/bby011","article-title":"Evaluation of tools for highly variable gene discovery from single-cell RNA-seq data","volume":"20","author":"Yip","year":"2019","journal-title":"Brief. 
Bioinf"},{"key":"2023020108545511000_btab804-B33","author":"Zappia","year":"2021"},{"key":"2023020108545511000_btab804-B34","doi-asserted-by":"publisher","first-page":"5885","DOI":"10.1609\/aaai.v33i01.33015885","author":"Zhao","year":"2019"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab804\/41818027\/btab804.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/5\/1277\/49009403\/btab804.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/5\/1277\/49009403\/btab804.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T15:23:06Z","timestamp":1675264986000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/5\/1277\/6448212"}},"subtitle":[],"editor":[{"given":"Valentina","family":"Boeva","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2021,12,2]]},"references-count":34,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,2,7]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab804","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.12.20.423645","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,3,1]]},"published":{"date-parts":[[2021,12,2]]}}}