{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:11Z","timestamp":1772138051982,"version":"3.50.1"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2022,3,7]],"date-time":"2022-03-07T00:00:00Z","timestamp":1646611200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,4,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>An important step in the transcriptomic analysis of individual cells involves manually determining the cellular identities. To ease this labor-intensive annotation of cell-types, there has been a growing interest in automated cell annotation, which can be achieved by training classification algorithms on previously annotated datasets. Existing pipelines employ dataset integration methods to remove potential batch effects between source (annotated) and target (unannotated) datasets. However, the integration and classification steps are usually independent of each other and performed by different tools. We propose JIND (joint integration and discrimination for automated single-cell annotation), a neural-network-based framework for automated cell-type identification that performs integration in a space suitably chosen to facilitate cell classification. To account for batch effects, JIND performs a novel asymmetric alignment in which unseen cells are mapped onto the previously learned latent space, avoiding the need of retraining the classification model for new datasets. JIND also learns cell-type-specific confidence thresholds to identify cells that cannot be reliably classified.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We show on several batched datasets that the joint approach to integration and classification of JIND outperforms in accuracy existing pipelines, and a smaller fraction of cells is rejected as unlabeled as a result of the cell-specific confidence thresholds. Moreover, we investigate cells misclassified by JIND and provide evidence suggesting that they could be due to outliers in the annotated datasets or errors in the original approach used for annotation of the target batch.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Implementation for JIND is available at https:\/\/github.com\/mohit1997\/JIND and the data underlying this article can be accessed at https:\/\/doi.org\/10.5281\/zenodo.6246322.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac140","type":"journal-article","created":{"date-parts":[[2022,3,3]],"date-time":"2022-03-03T07:25:53Z","timestamp":1646292353000},"page":"2488-2495","source":"Crossref","is-referenced-by-count":10,"title":["JIND: joint integration and discrimination for automated single-cell annotation"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8835-158X","authenticated-orcid":false,"given":"Mohit","family":"Goyal","sequence":"first","affiliation":[{"name":"Electrical and Computer Engineering Department, University of Illinois , Urbana, IL 61801, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3972-6186","authenticated-orcid":false,"given":"Guillermo","family":"Serrano","sequence":"additional","affiliation":[{"name":"Computational Biology Program, Center for Applied Medical Research (CIMA), University of Navarra , Pamplona 31008, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Josepmaria","family":"Argemi","sequence":"additional","affiliation":[{"name":"Center for Liver Diseases, Pittsburgh Liver Research Center, Division of Gastroenterology, Hepatology and Nutrition, University of Pittsburgh Medical Center , Pittsburgh, PA 15213, USA"},{"name":"Centro de Investigaci\u00f3n Biom\u00e9dica en Red de Enfermedades Hep\u00e1ticas y Digestivas , Madrid 28029, Spain"},{"name":"Liver Unit, Clinica Universitaria de Navarra , Pamplona 31008, Spain"},{"name":"Hepatology Program, Center for Applied Medical Research (CIMA) Universidad de Navarra , Pamplona 31008, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ilan","family":"Shomorony","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, University of Illinois , Urbana, IL 61801, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mikel","family":"Hernaez","sequence":"additional","affiliation":[{"name":"Computational Biology Program, Center for Applied Medical Research (CIMA), University of Navarra , Pamplona 31008, Spain"},{"name":"Carl R. Woese Institute for Genomic Biology, University of Illinois , Urbana, IL 61801, USA"},{"name":"Artificial Intelligence and Data Science Institute (DATAI), University of Navarra , Pamplona 31008, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1864-7868","authenticated-orcid":false,"given":"Idoia","family":"Ochoa","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, University of Illinois , Urbana, IL 61801, USA"},{"name":"Artificial Intelligence and Data Science Institute (DATAI), University of Navarra , Pamplona 31008, Spain"},{"name":"Department of Electrical Engineering, Tecnun, University of Navarra , Donostia 20018, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2022,3,7]]},"reference":[{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1186\/s13059-019-1795-z","article-title":"A comparison of automatic cell identification methods for single-cell RNA sequencing data","volume":"20","author":"Abdelaal","year":"2019","journal-title":"Genome Biol"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"4768","DOI":"10.1038\/s41467-018-07165-2","article-title":"A web server for comparative analysis of single-cell RNA-seq data","volume":"9","author":"Alavi","year":"2018","journal-title":"Nat. Commun"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1186\/s13059-019-1862-5","article-title":"scpred: accurate supervised method for cell-type classification from single-cell RNA-seq data","volume":"20","author":"Alquicira-Hernandez","year":"2019","journal-title":"Genome Biol"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-019-40481-1","article-title":"Adult human pancreatic acinar cells dedifferentiate into an embryonic progenitor-like state in 3D suspension culture","volume":"9","author":"Baldan","year":"2019","journal-title":"Sci. Rep"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cels.2016.08.011","article-title":"A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure","volume":"3","author":"Baron","year":"2016","journal-title":"Cell Syst"},{"key":"2023041402564199900_","article-title":"SCID: identification of equivalent transcriptional cell populations across single cell RNA-seq data using discriminant analysis","author":"Boufea","year":"2020"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1200","DOI":"10.1038\/s41592-020-00979-3","article-title":"Mars: discovering novel cell types across heterogeneous single-cell experiments","volume":"17","author":"Brbi\u0107","year":"2020","journal-title":"Nat. Methods"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"296","DOI":"10.12688\/f1000research.18490.1","article-title":"Evaluation of methods to assign cell type labels to cell clusters from single-cell RNA-sequencing data","volume":"8","author":"Diaz-Mejia","year":"2019","journal-title":"F1000Research"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","DOI":"10.1093\/database\/baz046","article-title":"PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data","volume":"2019","author":"Franz\u00e9n","year":"2019","journal-title":"Database"},{"key":"2023041402564199900_","author":"Goodfellow","year":"2014"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1458","DOI":"10.1038\/s41587-019-0332-7","article-title":"Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia","volume":"37","author":"Granja","year":"2019","journal-title":"Nat. Biotechnol"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1016\/j.tem.2011.05.003","article-title":"Mafa and mafb activity in pancreatic \u03b2 cells","volume":"22","author":"Hang","year":"2011","journal-title":"Trends Endocrinol. Metab"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1038\/s41587-019-0113-3","article-title":"Efficient integration of heterogeneous single-cell transcriptomes using scanorama","volume":"37","author":"Hie","year":"2019","journal-title":"Nat. Biotechnol"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1038\/s42256-020-00233-7","article-title":"Iterative transfer learning with neural network for clustering and cell type classification in single-cell RNA-seq analysis","volume":"2","author":"Hu","year":"2020","journal-title":"Nat. Mach. Intell"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1093\/biostatistics\/kxj037","article-title":"Adjusting batch effects in microarray expression data using empirical Bayes methods","volume":"8","author":"Johnson","year":"2007","journal-title":"Biostatistics"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1038\/nmeth.4644","article-title":"scmap: projection of single-cell RNA-seq data across data sets","volume":"15","author":"Kiselev","year":"2018","journal-title":"Nat. Methods"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1289","DOI":"10.1038\/s41592-019-0619-0","article-title":"Fast, sensitive and accurate integration of single-cell data with harmony","volume":"16","author":"Korsunsky","year":"2019","journal-title":"Nat. Methods"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"R29","DOI":"10.1186\/gb-2014-15-2-r29","article-title":"voom: precision weights unlock linear model analysis tools for RNA-seq read counts","volume":"15","author":"Law","year":"2014","journal-title":"Genome Biol"},{"key":"2023041402564199900_","first-page":"896","author":"Lee","year":"2013"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"2338","DOI":"10.1038\/s41467-020-15851-3","article-title":"Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis","volume":"11","author":"Li","year":"2020","journal-title":"Nat. Commun"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1038\/s41592-018-0229-2","article-title":"Deep generative modeling for single-cell transcriptomics","volume":"15","author":"Lopez","year":"2018","journal-title":"Nat. Methods"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","DOI":"10.1101\/532093","article-title":"Automated identification of cell types in single cell RNA sequencing","author":"Ma","year":"2019"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1016\/j.cels.2016.09.002","article-title":"A single-cell transcriptome atlas of the human pancreas","volume":"3","author":"Muraro","year":"2016","journal-title":"Cell Syst"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","DOI":"10.1101\/397042","article-title":"Fast batch alignment of single cell transcriptomes unifies multiple mouse cell atlases into an integrated landscape","author":"Park","year":"2018"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"e27041","DOI":"10.7554\/eLife.27041","article-title":"Science forum: the human cell atlas","volume":"6","author":"Regev","year":"2017","journal-title":"eLife"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1726","DOI":"10.3389\/fimmu.2018.01726","article-title":"Monocyte subsets: phenotypes and function in tuberculosis infection","volume":"9","author":"Sampath","year":"2018","journal-title":"Front. Immunol"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1038\/s41586-018-0590-4","article-title":"Single-cell transcriptomics of 20 mouse organs creates a Tabula muris","volume":"562","author":"Schaum","year":"2018","journal-title":"Nature"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1016\/j.cmet.2016.08.020","article-title":"Single-cell transcriptome profiling of human pancreatic islets in health and type 2 diabetes","volume":"24","author":"Segerstolpe","year":"2016","journal-title":"Cell Metab"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1888","DOI":"10.1016\/j.cell.2019.05.031","article-title":"Comprehensive integration of single-cell data","volume":"177","author":"Stuart","year":"2019","journal-title":"Cell"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1186\/s13059-019-1850-9","article-title":"A benchmark of batch-effect correction methods for single-cell RNA sequencing data","volume":"21","author":"Tran","year":"2020","journal-title":"Genome Biol"},{"key":"2023041402564199900_","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1126\/science.aaa1934","article-title":"Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq","volume":"347","author":"Zeisel","year":"2015","journal-title":"Science"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac140\/42946826\/btac140.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/9\/2488\/49874473\/btac140.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/9\/2488\/49874473\/btac140.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,18]],"date-time":"2023-11-18T05:21:11Z","timestamp":1700284871000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/9\/2488\/6543609"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2022,3,7]]},"references-count":31,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2022,4,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac140","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.10.06.327601","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,5,1]]},"published":{"date-parts":[[2022,3,7]]}}}