{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T13:40:56Z","timestamp":1776001256496,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2023,11,27]],"date-time":"2023-11-27T00:00:00Z","timestamp":1701043200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R21CA248122"],"award-info":[{"award-number":["R21CA248122"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Cell-type annotation is a time-consuming yet critical first step in the analysis of single-cell RNA-seq data, especially when multiple similar cell subtypes with overlapping marker genes are present. Existing automated annotation methods have a number of limitations, including requiring large reference datasets, high computation time, shallow annotation resolution, and difficulty in identifying cancer cells or their most likely cell of origin.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed Census, a biologically intuitive and fully automated cell-type identification method for single-cell RNA-seq data that can deeply annotate normal cells in mammalian tissues and identify malignant cells and their likely cell of origin. 
Motivated by the inherently stratified developmental programs of cellular differentiation, Census infers hierarchical cell-type relationships and uses gradient-boosted decision trees that capitalize on nodal cell-type relationships to achieve high prediction speed and accuracy. When benchmarked on 44 atlas-scale normal and cancer, human and mouse tissues, Census significantly outperforms state-of-the-art methods across multiple metrics and naturally predicts the cell-of-origin of different cancers. Census is pretrained on the Tabula Sapiens to classify 175 cell-types from 24 organs; however, users can seamlessly train their own models for customized applications.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Census is available at Zenodo https:\/\/zenodo.org\/records\/7017103 and on our Github https:\/\/github.com\/sjdlabgroup\/Census.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad714","type":"journal-article","created":{"date-parts":[[2023,11,25]],"date-time":"2023-11-25T22:14:10Z","timestamp":1700950450000},"source":"Crossref","is-referenced-by-count":6,"title":["Hierarchical and automated cell-type annotation and inference of cancer cell of origin with Census"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2472-6140","authenticated-orcid":false,"given":"Bassel","family":"Ghaddar","sequence":"first","affiliation":[{"name":"Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ 08901, United States"}]},{"given":"Subhajyoti","family":"De","sequence":"additional","affiliation":[{"name":"Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ 08901, United 
States"}]}],"member":"286","published-online":{"date-parts":[[2023,11,27]]},"reference":[{"key":"2023121122511219000_btad714-B1","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1186\/s13059-019-1795-z","article-title":"A comparison of automatic cell identification methods for single-cell RNA sequencing data","volume":"20","author":"Abdelaal","year":"2019","journal-title":"Genome Biol"},{"key":"2023121122511219000_btad714-B2","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1038\/s41590-018-0276-y","article-title":"Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage","volume":"20","author":"Aran","year":"2019","journal-title":"Nat Immunol"},{"key":"2023121122511219000_btad714-B3","doi-asserted-by":"crossref","first-page":"649","DOI":"10.1016\/j.ccell.2021.02.015","article-title":"Tumor and immune reprogramming during immunotherapy in advanced renal cell carcinoma","volume":"39","author":"Bi","year":"2021","journal-title":"Cancer Cell"},{"key":"2023121122511219000_btad714-B4","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1011288","article-title":"The specious art of single-cell genomics","volume-title":"PLOS Comput. 
Biol.","author":"Chari","year":"2023"},{"key":"2023121122511219000_btad714-B5","doi-asserted-by":"crossref","first-page":"104318","DOI":"10.1016\/j.isci.2022.104318","article-title":"hECA: the cell-centric assembly of a cell atlas","volume":"25","author":"Chen","year":"2022","journal-title":"iScience"},{"key":"2023121122511219000_btad714-B6","author":"Chen","year":"2016"},{"key":"2023121122511219000_btad714-B7","doi-asserted-by":"crossref","first-page":"2749","DOI":"10.1038\/s41596-021-00534-0","article-title":"Tutorial: guidelines for annotating single-cell transcriptomic maps using automated and manual methods","volume":"16","author":"Clarke","year":"2021","journal-title":"Nat Protoc"},{"key":"2023121122511219000_btad714-B8","doi-asserted-by":"crossref","first-page":"1095","DOI":"10.1038\/s41587-021-00896-6","article-title":"Gene signature extraction and cell identity recognition at the single-cell level with Cell-ID","volume":"39","author":"Cortal","year":"2021","journal-title":"Nat Biotechnol"},{"key":"2023121122511219000_btad714-B9","doi-asserted-by":"crossref","first-page":"e95","DOI":"10.1093\/nar\/gkz543","article-title":"CHETAH: a selective, hierarchical cell type identification method for single-cell RNA sequencing","volume":"47","author":"de Kanter","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023121122511219000_btad714-B10","doi-asserted-by":"crossref","first-page":"eabl5197","DOI":"10.1126\/science.abl5197","article-title":"Cross-tissue immune cell analysis reveals tissue-specific features in humans","volume":"376","author":"Dom\u00ednguez Conde","year":"2023","journal-title":"Science"},{"key":"2023121122511219000_btad714-B11","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1186\/s13059-021-02281-7","article-title":"scSorter: assigning cells to known cell types according to marker genes","volume":"22","author":"Guo","year":"2021","journal-title":"Genome 
Biol"},{"key":"2023121122511219000_btad714-B12","doi-asserted-by":"crossref","first-page":"1178","DOI":"10.1038\/s41588-022-01134-8","article-title":"Single-nucleus and spatial transcriptome profiling of pancreatic cancer identifies multicellular dynamics associated with neoadjuvant treatment","volume":"54","author":"Hwang","year":"2022","journal-title":"Nat Genet"},{"key":"2023121122511219000_btad714-B13","doi-asserted-by":"crossref","first-page":"1246","DOI":"10.1038\/s41467-022-28803-w","article-title":"Fully-automated and ultra-fast cell-type identification using specific marker combinations from single-cell transcriptomic data","volume":"13","author":"Ianevski","year":"2022","journal-title":"Nat Commun"},{"key":"2023121122511219000_btad714-B14","doi-asserted-by":"crossref","first-page":"eabl4896","DOI":"10.1126\/science.abl4896","article-title":"The Tabula Sapiens: a multiple-organ, single-cell transcriptomic atlas of humans","volume":"376","author":"Jones","year":"2022","journal-title":"Science"},{"key":"2023121122511219000_btad714-B15","doi-asserted-by":"crossref","first-page":"2285","DOI":"10.1038\/s41467-020-16164-1","article-title":"Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma","volume":"11","author":"Kim","year":"2020","journal-title":"Nat Commun"},{"key":"2023121122511219000_btad714-B16","doi-asserted-by":"crossref","first-page":"1208","DOI":"10.1038\/s41588-020-00726-6","article-title":"Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity","volume":"52","author":"Kinker","year":"2020","journal-title":"Nat Genet"},{"key":"2023121122511219000_btad714-B17","doi-asserted-by":"crossref","first-page":"662","DOI":"10.1016\/j.ccell.2021.03.007","article-title":"Single-cell sequencing links multiregional immune landscapes and tissue-resident T cells in ccRCC to tumor topology and therapy 
efficacy","volume":"39","author":"Krishna","year":"2021","journal-title":"Cancer Cell"},{"key":"2023121122511219000_btad714-B18","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1186\/s13059-020-1926-6","article-title":"Eleven grand challenges in single-cell data science","volume":"21","author":"L\u00e4hnemann","year":"2020","journal-title":"Genome Biol"},{"key":"2023121122511219000_btad714-B19","doi-asserted-by":"crossref","first-page":"594","DOI":"10.1038\/s41588-020-0636-z","article-title":"Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer","volume":"52","author":"Lee","year":"2020","journal-title":"Nat Genet"},{"key":"2023121122511219000_btad714-B20","doi-asserted-by":"crossref","first-page":"1818","DOI":"10.1038\/s41467-020-15523-2","article-title":"SciBet as a portable and fast single cell type identifier","volume":"11","author":"Li","year":"2020","journal-title":"Nat Commun"},{"key":"2023121122511219000_btad714-B21","doi-asserted-by":"crossref","first-page":"e9389","DOI":"10.15252\/msb.20199389","article-title":"scClassify: sample size estimation and multiscale classification of cells using single and multiple reference","volume":"16","author":"Lin","year":"2020","journal-title":"Mol Syst Biol"},{"key":"2023121122511219000_btad714-B22","doi-asserted-by":"crossref","first-page":"466","DOI":"10.1038\/s41586-020-2797-4","article-title":"Cells of the adult human heart","volume":"588","author":"Litvi\u0148ukov\u00e1","year":"2020","journal-title":"Nature"},{"key":"2023121122511219000_btad714-B23","doi-asserted-by":"crossref","first-page":"1397","DOI":"10.1016\/j.jhep.2021.06.028","article-title":"Single-cell atlas of tumor cell evolution in response to therapy in hepatocellular carcinoma and intrahepatic cholangiocarcinoma","volume":"75","author":"Ma","year":"2021","journal-title":"J 
Hepatol"},{"key":"2023121122511219000_btad714-B24","doi-asserted-by":"crossref","first-page":"e9682","DOI":"10.15252\/msb.20209682","article-title":"A single cell atlas of the human liver tumor microenvironment","volume":"16","author":"Massalha","year":"2020","journal-title":"Mol Syst Biol"},{"key":"2023121122511219000_btad714-B25","volume-title":"et al.","author":"Nofech-Mozes","year":"2023"},{"key":"2023121122511219000_btad714-B26","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1016\/j.csbj.2021.01.015","article-title":"Automated methods for cell type annotation on scRNA-seq data","volume":"19","author":"Pasquini","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"key":"2023121122511219000_btad714-B27","doi-asserted-by":"crossref","first-page":"4734","DOI":"10.1016\/j.cell.2021.08.003","article-title":"Spatially organized multicellular immune hubs in human colorectal cancer","volume":"184","author":"Pelka","year":"2021","journal-title":"Cell"},{"key":"2023121122511219000_btad714-B28","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1038\/s41422-019-0195-y","article-title":"Single-cell RNA-seq highlights intra-tumoral heterogeneity and malignant progression in pancreatic ductal adenocarcinoma","volume":"29","author":"Peng","year":"2019","journal-title":"Nat Cell Res"},{"key":"2023121122511219000_btad714-B29","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.cell.2020.03.053","article-title":"The human tumor atlas network: charting tumor transitions across space and time at single-cell resolution","volume":"181","author":"Rozenblatt-Rosen","year":"2020","journal-title":"Cell"},{"key":"2023121122511219000_btad714-B30","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1038\/s41586-018-0590-4","article-title":"Single-cell transcriptomics of 20 mouse organs creates a Tabula 
Muris","volume":"562","author":"Schaum","year":"2018","journal-title":"Nature"},{"key":"2023121122511219000_btad714-B31","doi-asserted-by":"crossref","first-page":"100882","DOI":"10.1016\/j.isci.2020.100882","article-title":"scCATCH: automatic annotation on cell types of clusters from single-cell RNA sequencing data","volume":"23","author":"Shao","year":"2020","journal-title":"iScience"},{"key":"2023121122511219000_btad714-B32","first-page":"45","author":"Shekhar","year":"2019"},{"key":"2023121122511219000_btad714-B33","doi-asserted-by":"crossref","first-page":"714","DOI":"10.1016\/j.cell.2019.06.029","article-title":"Intra- and inter-cellular rewiring of the human colon during ulcerative colitis","volume":"178","author":"Smillie","year":"2019","journal-title":"Cell"},{"key":"2023121122511219000_btad714-B34","doi-asserted-by":"crossref","first-page":"1888","DOI":"10.1016\/j.cell.2019.05.031","article-title":"Comprehensive integration of single-cell data","volume":"177","author":"Stuart","year":"2019","journal-title":"Cell"},{"key":"2023121122511219000_btad714-B35","doi-asserted-by":"crossref","first-page":"619","DOI":"10.1038\/s41586-020-2922-4","article-title":"A molecular cell atlas of the human lung from single-cell RNA sequencing","volume":"587","author":"Travaglini","year":"2020","journal-title":"Nature"},{"key":"2023121122511219000_btad714-B36","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-21706-2","volume-title":"Modern Applied Statistics with S Fourth","author":"Venables","year":"2002"},{"key":"2023121122511219000_btad714-B37","doi-asserted-by":"crossref","first-page":"2540","DOI":"10.1038\/s41467-021-22801-0","article-title":"Single-cell profiling of tumor heterogeneity and the microenvironment in advanced non-small cell lung cancer","volume":"12","author":"Wu","year":"2021","journal-title":"Nat Commun"},{"key":"2023121122511219000_btad714-B38","doi-asserted-by":"crossref","first-page":"1334","DOI":"10.1038\/s41588-021-00911-1","article-title":"A 
single-cell and spatially resolved atlas of human breast cancers","volume":"53","author":"Wu","year":"2021","journal-title":"Nat Genet"},{"key":"2023121122511219000_btad714-B39","doi-asserted-by":"crossref","first-page":"e104063","DOI":"10.15252\/embj.2019104063","article-title":"Stromal cell diversity associated with immune evasion in human triple-negative breast cancer","volume":"39","author":"Wu","year":"2020","journal-title":"EMBO J"},{"key":"2023121122511219000_btad714-B40","doi-asserted-by":"crossref","first-page":"5874","DOI":"10.1016\/j.csbj.2021.10.027","article-title":"Automatic cell type identification methods for single-cell RNA sequencing","volume":"19","author":"Xie","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"key":"2023121122511219000_btad714-B41","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1038\/s41592-019-0529-1","article-title":"Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling","volume":"16","author":"Zhang","year":"2019","journal-title":"Nat Methods"},{"key":"2023121122511219000_btad714-B42","doi-asserted-by":"crossref","first-page":"e43","DOI":"10.1093\/nar\/gkab1275","article-title":"scMAGIC: accurately annotating single cells using two rounds of reference-based classification","volume":"50","author":"Zhang","year":"2022","journal-title":"Nucleic Acids 
Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad714\/53842747\/btad714.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/12\/btad714\/54257000\/btad714.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/12\/btad714\/54257000\/btad714.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,12]],"date-time":"2023-12-12T01:53:01Z","timestamp":1702345981000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad714\/7451827"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,11,27]]},"references-count":42,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2023,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad714","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,12,1]]},"published":{"date-parts":[[2023,11,27]]},"article-number":"btad714"}}