{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T13:07:48Z","timestamp":1769605668989,"version":"3.49.0"},"reference-count":17,"publisher":"Oxford University Press (OUP)","issue":"23","license":[{"start":{"date-parts":[[2021,7,13]],"date-time":"2021-07-13T00:00:00Z","timestamp":1626134400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100001003","name":"Boehringer Ingelheim","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100001003","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Harvard University Division of Science"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,12,7]]},"abstract":"<jats:title>ABSTRACT<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The emergence of single-cell RNA sequencing (scRNA-seq) has led to an explosion in novel methods to study biological variation among individual cells, and to classify cells into functional and biologically meaningful categories.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here, we present a new cell type projection tool, Hierarchical Random Forest for Information Transfer (HieRFIT), based on hierarchical random forests. HieRFIT uses a priori information about cell type relationships to improve classification accuracy, taking as input a hierarchical tree structure representing the class relationships, along with the reference data. We use an ensemble approach combining multiple random forest models, organized in a hierarchical decision tree structure. We show that our hierarchical classification approach improves accuracy and reduces incorrect predictions especially for inter-dataset tasks which reflect real-life applications. We use a scoring scheme that adjusts probability distributions for candidate class labels and resolves uncertainties while avoiding the assignment of cells to incorrect types by labeling cells at internal nodes of the hierarchy when necessary.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>HieRFIT is implemented as an R package, and it is available at (https:\/\/github.com\/yasinkaymaz\/HieRFIT\/releases\/tag\/v1.0.0).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab499","type":"journal-article","created":{"date-parts":[[2021,7,2]],"date-time":"2021-07-02T20:18:00Z","timestamp":1625257080000},"page":"4431-4436","source":"Crossref","is-referenced-by-count":9,"title":["HieRFIT: a hierarchical cell type classification tool for projections from complex single-cell atlas datasets"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9725-7536","authenticated-orcid":false,"given":"Yasin","family":"Kaymaz","sequence":"first","affiliation":[{"name":"Informatics Group , Cambridge, MA 02138, USA"},{"name":", Harvard University , Cambridge, MA 02138, USA"}]},{"given":"Florian","family":"Ganglberger","sequence":"additional","affiliation":[{"name":"VRVis Research Center , 1220 Vienna, Austria"}]},{"given":"Ming","family":"Tang","sequence":"additional","affiliation":[{"name":"Informatics Group , Cambridge, MA 02138, USA"},{"name":", Harvard University , Cambridge, MA 02138, USA"}]},{"given":"Christian","family":"Haslinger","sequence":"additional","affiliation":[{"name":"Global Computational Biology and Digital Sciences, , 88400 Biberach an der Ri\u00df, Germany"},{"name":"Boehringer Ingelheim Pharma GmbH and Co KG , 88400 Biberach an der Ri\u00df, Germany"}]},{"given":"Francesc","family":"Fernandez-Albert","sequence":"additional","affiliation":[{"name":"Global Computational Biology and Digital Sciences, , 88400 Biberach an der Ri\u00df, Germany"},{"name":"Boehringer Ingelheim Pharma GmbH and Co KG , 88400 Biberach an der Ri\u00df, Germany"}]},{"given":"Nathan","family":"Lawless","sequence":"additional","affiliation":[{"name":"Global Computational Biology and Digital Sciences, , 88400 Biberach an der Ri\u00df, Germany"},{"name":"Boehringer Ingelheim Pharma GmbH and Co KG , 88400 Biberach an der Ri\u00df, Germany"}]},{"given":"Timothy B","family":"Sackton","sequence":"additional","affiliation":[{"name":"Informatics Group , Cambridge, MA 02138, USA"},{"name":", Harvard University , Cambridge, MA 02138, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,7,13]]},"reference":[{"key":"2023061310490316200_btab499-B1","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1186\/s13059-019-1795-z","article-title":"A comparison of automatic cell identification methods for single-cell RNA sequencing data","volume":"20","author":"Abdelaal","year":"2019","journal-title":"Genome Biol"},{"key":"2023061310490316200_btab499-B2","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1145\/136035.136043","article-title":"Symbolic Boolean manipulation with ordered binary-decision diagrams","volume":"24","author":"Bryant","year":"1992","journal-title":"ACM Comput. Surv"},{"key":"2023061310490316200_btab499-B3","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1038\/nbt.4096","article-title":"Integrating single-cell transcriptomic data across different conditions, technologies, and species","volume":"36","author":"Butler","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2023061310490316200_btab499-B4","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1038\/nn.4495","article-title":"A molecular census of arcuate hypothalamus and median eminence cell types","volume":"20","author":"Campbell","year":"2017","journal-title":"Nat. Neurosci"},{"key":"2023061310490316200_btab499-B5","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1126\/science.aam8940","article-title":"Comprehensive single-cell transcriptional profiling of a multicellular organism","volume":"357","author":"Cao","year":"2017","journal-title":"Science"},{"key":"2023061310490316200_btab499-B6","doi-asserted-by":"crossref","first-page":"737","DOI":"10.1038\/s41587-020-0465-8","article-title":"Systematic comparison of single-cell and single-nucleus RNA-sequencing methods","volume":"38","author":"Ding","year":"2020","journal-title":"Nat. Biotechnol"},{"key":"2023061310490316200_btab499-B7","doi-asserted-by":"crossref","first-page":"e95","DOI":"10.1093\/nar\/gkz543","article-title":"CHETAH: a selective, hierarchical cell type identification method for single-cell RNA sequencing","volume":"47","author":"Kanter","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023061310490316200_btab499-B8","article-title":"Functional Annotation of Genes using Hierarchical Text Categorization","author":"Kiritchenko","year":"2005"},{"key":"2023061310490316200_btab499-B9","doi-asserted-by":"crossref","first-page":"e9389","DOI":"10.15252\/msb.20199389","article-title":"scClassify: sample size estimation and multiscale classification of cells using single and multiple reference","volume":"16","author":"Lin","year":"2020","journal-title":"Mol. Syst. Biol"},{"key":"2023061310490316200_btab499-B10","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1093\/bioinformatics\/btz592","article-title":"ACTINN: automated identification of cell types in single cell RNA sequencing","volume":"36","author":"Ma","year":"2020","journal-title":"Bioinformatics"},{"key":"2023061310490316200_btab499-B11","doi-asserted-by":"crossref","first-page":"1209","DOI":"10.1093\/bib\/bbz063","article-title":"Machine learning and statistical methods for clustering single-cell RNA-sequencing data","volume":"21","author":"Petegrosso","year":"2020","journal-title":"Brief. Bioinform"},{"key":"2023061310490316200_btab499-B12","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1038\/s41592-019-0535-3","article-title":"Supervised classification enables rapid annotation of cell atlases","volume":"16","author":"Pliner","year":"2019","journal-title":"Nat. Methods"},{"key":"2023061310490316200_btab499-B13","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1126\/science.aam8999","article-title":"Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding","volume":"360","author":"Rosenberg","year":"2018","journal-title":"Science"},{"key":"2023061310490316200_btab499-B14","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1016\/j.cels.2019.06.004","article-title":"SingleCellNet: a computational tool to classify single cell RNA-seq data across platforms and across species","volume":"9","author":"Tan","year":"2019","journal-title":"Cell Syst"},{"key":"2023061310490316200_btab499-B15","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1016\/j.cell.2018.06.021","article-title":"Molecular architecture of the mouse nervous system","volume":"174","author":"Zeisel","year":"2018","journal-title":"Cell"},{"key":"2023061310490316200_btab499-B16","doi-asserted-by":"crossref","first-page":"14049","DOI":"10.1038\/ncomms14049","article-title":"Massively parallel digital transcriptional profiling of single cells","volume":"8","author":"Zheng","year":"2017","journal-title":"Nat. Commun"},{"key":"2023061310490316200_btab499-B17","first-page":"27","article-title":"Asymmetric and sample size sensitive entropy measures for supervised learning","author":"Zighed","year":"2010","journal-title":"Adv. Intel. Inform. Syst"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab499\/39511000\/btab499.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/23\/4431\/50579318\/btab499.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/23\/4431\/50579318\/btab499.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T10:49:48Z","timestamp":1686653388000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/23\/4431\/6320801"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,7,13]]},"references-count":17,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2021,12,7]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab499","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,12,1]]},"published":{"date-parts":[[2021,7,13]]}}}