{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T22:09:13Z","timestamp":1774476553160,"version":"3.50.1"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2021,10,8]],"date-time":"2021-10-08T00:00:00Z","timestamp":1633651200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2016YFA0502303"],"award-info":[{"award-number":["2016YFA0502303"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Key Basic Research Project of China","award":["2015CB910303"],"award-info":[{"award-number":["2015CB910303"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["31871342"],"award-info":[{"award-number":["31871342"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Single-cell RNA-seq (scRNA-seq) has been widely used to resolve cellular heterogeneity. After collecting scRNA-seq data, the natural next step is to integrate the accumulated data to achieve a common ontology of cell types and states. Thus, an effective and efficient cell-type identification method is urgently needed. Meanwhile, high-quality reference data remain a necessity for precise annotation. However, such tailored reference data are always lacking in practice. To address this, we aggregated multiple datasets into a meta-dataset on which annotation is conducted. Existing supervised or semi-supervised annotation methods suffer from batch effects caused by different sequencing platforms, the effect of which increases in severity with multiple reference datasets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Herein, a robust deep learning-based single-cell Multiple Reference Annotator (scMRA) is introduced. In scMRA, a knowledge graph is constructed to represent the characteristics of cell types in different datasets, and a graphic convolutional network serves as a discriminator based on this graph. scMRA keeps intra-cell-type closeness and the relative position of cell types across datasets. scMRA is remarkably powerful at transferring knowledge from multiple reference datasets, to the unlabeled target domain, thereby gaining an advantage over other state-of-the-art annotation methods in multi-reference data experiments. Furthermore, scMRA can remove batch effects. To the best of our knowledge, this is the first attempt to use multiple insufficient reference datasets to annotate target data, and it is, comparatively, the best annotation method for multiple scRNA-seq datasets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>An implementation of scMRA is available from https:\/\/github.com\/ddb-qiwang\/scMRA-torch.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab700","type":"journal-article","created":{"date-parts":[[2021,10,6]],"date-time":"2021-10-06T21:41:28Z","timestamp":1633556488000},"page":"738-745","source":"Crossref","is-referenced-by-count":36,"title":["scMRA: a robust deep learning method to annotate scRNA-seq data with multiple reference datasets"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5674-1639","authenticated-orcid":false,"given":"Musu","family":"Yuan","sequence":"first","affiliation":[{"name":"School of Mathematical Sciences, Peking University , Beijing 100871, China"},{"name":"Center for Quantitative Biology, Peking University , Beijing 100871, China"}]},{"given":"Liang","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Mathematical Sciences, Peking University , Beijing 100871, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9143-1898","authenticated-orcid":false,"given":"Minghua","family":"Deng","sequence":"additional","affiliation":[{"name":"School of Mathematical Sciences, Peking University , Beijing 100871, China"},{"name":"Center for Quantitative Biology, Peking University , Beijing 100871, China"},{"name":"Center for Statistical Science, Peking University , Beijing 100871, China"}]}],"member":"286","published-online":{"date-parts":[[2021,10,8]]},"reference":[{"key":"2023020108495934900_btab700-B1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-019-1795-z","article-title":"A comparison of automatic cell identification methods for single-cell RNA sequencing data","volume":"20","author":"Abdelaal","year":"2019","journal-title":"Genome Biol"},{"key":"2023020108495934900_btab700-B2","author":"Blitzer","year":"2008"},{"key":"2023020108495934900_btab700-B3","doi-asserted-by":"crossref","first-page":"1200","DOI":"10.1038\/s41592-020-00979-3","article-title":"Mars: discovering novel cell types across heterogeneous single-cell experiments","volume":"17","author":"Brbi\u0107","year":"2020","journal-title":"Nat. Methods"},{"key":"2023020108495934900_btab700-B4","doi-asserted-by":"crossref","first-page":"496","DOI":"10.1038\/s41586-019-0969-x","article-title":"The single-cell transcriptional landscape of mammalian organogenesis","volume":"566","author":"Cao","year":"2019","journal-title":"Nature"},{"key":"2023020108495934900_btab700-B5","doi-asserted-by":"crossref","first-page":"lqaa039","DOI":"10.1093\/nargab\/lqaa039","article-title":"Deep soft k-means clustering with self-training for single-cell RNA sequence data","volume":"2","author":"Chen","year":"2020","journal-title":"NAR Genomics Bioinf"},{"key":"2023020108495934900_btab700-B6","doi-asserted-by":"crossref","first-page":"792","DOI":"10.3390\/genes11070792","article-title":"Integrating deep supervised, self-supervised and unsupervised learning for single-cell RNA-seq clustering and annotation","volume":"11","author":"Chen","year":"2020","journal-title":"Genes"},{"key":"2023020108495934900_btab700-B7","doi-asserted-by":"crossref","first-page":"295","DOI":"10.3389\/fgene.2020.00295","article-title":"Single-cell transcriptome data clustering via multinomial modeling and adaptive fuzzy k-means algorithm","volume":"11","author":"Chen","year":"2020","journal-title":"Front. Genet"},{"key":"2023020108495934900_btab700-B8","doi-asserted-by":"crossref","first-page":"775","DOI":"10.1093\/bioinformatics\/btaa908","article-title":"Single-cell rna-seq data semi-supervised clustering and annotation via structural regularized domain adaptation","volume":"37","author":"Chen","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020108495934900_btab700-B9","first-page":"1180","author":"Ganin","year":"2015"},{"key":"2023020108495934900_btab700-B10","doi-asserted-by":"crossref","first-page":"666","DOI":"10.1016\/j.celrep.2012.08.003","article-title":"Cel-seq: single-cell RNA-seq by multiplexed linear amplification","volume":"2","author":"Hashimshony","year":"2012","journal-title":"Cell Rep"},{"key":"2023020108495934900_btab700-B11","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1038\/s42256-020-00233-7","article-title":"Iterative transfer learning with neural network for clustering and cell type classification in single-cell RNA-seq analysis","volume":"2","author":"Hu","year":"2020","journal-title":"Nat. Mach. Intell"},{"key":"2023020108495934900_btab700-B12","author":"Kipf","year":"2017"},{"key":"2023020108495934900_btab700-B13","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nmeth.4236","article-title":"Sc3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat. Methods"},{"key":"2023020108495934900_btab700-B14","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1038\/nmeth.4644","article-title":"scmap: projection of single-cell RNA-seq data across data sets","volume":"15","author":"Kiselev","year":"2018","journal-title":"Nat. Methods"},{"key":"2023020108495934900_btab700-B15","doi-asserted-by":"crossref","first-page":"2540","DOI":"10.1039\/C7LC90070H","article-title":"Indrops and drop-seq technologies for single-cell sequencing","volume":"17","author":"Klein","year":"2017","journal-title":"Lab Chip"},{"key":"2023020108495934900_btab700-B16","first-page":"97","author":"Long","year":"2015"},{"key":"2023020108495934900_btab700-B17","first-page":"1041","article-title":"Domain adaptation with multiple sources","volume":"21","author":"Mansour","year":"2008","journal-title":"Adv. Neural Inf. Process Syst"},{"key":"2023020108495934900_btab700-B18","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/s41587-020-0469-4","article-title":"Benchmarking single-cell RNA-sequencing protocols for cell atlas projects","volume":"38","author":"Mereu","year":"2020","journal-title":"Nat. Biotechnol"},{"key":"2023020108495934900_btab700-B19","doi-asserted-by":"crossref","first-page":"1096","DOI":"10.1038\/nmeth.2639","article-title":"Smart-seq2 for sensitive full-length transcriptome profiling in single cells","volume":"10","author":"Picelli","year":"2013","journal-title":"Nat. Methods"},{"key":"2023020108495934900_btab700-B20","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1038\/s41592-019-0535-3","article-title":"Supervised classification enables rapid annotation of cell atlases","volume":"16","author":"Pliner","year":"2019","journal-title":"Nat. Methods"},{"key":"2023020108495934900_btab700-B21","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nbt.3192","article-title":"Spatial reconstruction of single-cell gene expression data","volume":"33","author":"Satija","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023020108495934900_btab700-B22","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1038\/s41586-018-0590-4","article-title":"Single-cell transcriptomics of 20 mouse organs creates a tabula muris: the tabula muris consortium","volume":"562","author":"Schaum","year":"2018","journal-title":"Nature"},{"key":"2023020108495934900_btab700-B23","doi-asserted-by":"crossref","first-page":"1888","DOI":"10.1016\/j.cell.2019.05.031","article-title":"Comprehensive integration of single-cell data","volume":"177","author":"Stuart","year":"2019","journal-title":"Cell"},{"key":"2023020108495934900_btab700-B24","first-page":"443","author":"Sun","year":"2016"},{"key":"2023020108495934900_btab700-B25","first-page":"7167","author":"Tzeng","year":"2017"},{"key":"2023020108495934900_btab700-B26","first-page":"727","author":"Wang","year":"2020"},{"key":"2023020108495934900_btab700-B27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-017-1305-0","article-title":"Splatter: simulation of single-cell RNA sequencing data","volume":"18","author":"Zappia","year":"2017","journal-title":"Genome Biol"},{"key":"2023020108495934900_btab700-B28","first-page":"3801","author":"Zhang","year":"2018"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab700\/41149090\/btab700.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/3\/738\/49008567\/btab700.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/3\/738\/49008567\/btab700.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T20:07:55Z","timestamp":1675282075000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/3\/738\/6384568"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,10,8]]},"references-count":28,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,1,12]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab700","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,2,1]]},"published":{"date-parts":[[2021,10,8]]}}}