{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T00:17:32Z","timestamp":1760660252720,"version":"build-2065373602"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2025,9,18]],"date-time":"2025-09-18T00:00:00Z","timestamp":1758153600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Gobierno de Navarra through the ANDIA 2021 Program","award":["0011-3947-2021-000023"],"award-info":[{"award-number":["0011-3947-2021-000023"]}]},{"name":"ERA PerMed JTC2022 PORTRAIT Project","award":["0011-2750-2022-000000"],"award-info":[{"award-number":["0011-2750-2022-000000"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,10,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Despite the inherent complexity associated to automatic cell type assignments, most supervised learning models overlook rigorous uncertainty quantification on the annotations. Although some existing pipelines incorporate rejection options under predefined circumstances, they usually rely on arbitrary assumptions and do not provide statistical guarantees. In this work, we propose a methodology based on the conformal prediction framework to provide reliable single-cell annotations. Conformal prediction provides statistical guarantees on the outcome predictions without making any assumption about the underlying distribution of the data. Our methodological proposal leverages conformal inference to address two critical challenges in single-cell RNA sequencing annotations: (i) detect out-of-distribution cell types in the query data; and, (ii) perform reliable uncertainty quantification of the cell annotations through well-calibrated prediction sets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We evaluated the anomaly detector and the uncertainty-aware annotator in 10 batched experiments derived from various tissues. Specifically, we studied three different annotation taxonomies (standard, classwise, and cluster) alongside three different non-conformity measures. The results showed that our anomaly detector effectively identified previously unseen cell types, producing well-calibrated prediction sets. This rigorous annotation helped maintain coverage probabilities at the expected significance level. Finally, we illustrate how the integration of conformal prediction outputs enhanced further downstream analyses.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The automatic scRNA-seq annotator is available at https:\/\/github.com\/digital-medicine-research-group-UNAV\/conformalized_single_cell_annotator and https:\/\/doi.org\/10.5281\/zenodo.15870599.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf521","type":"journal-article","created":{"date-parts":[[2025,9,20]],"date-time":"2025-09-20T00:33:43Z","timestamp":1758328423000},"source":"Crossref","is-referenced-by-count":0,"title":["Conformal inference for reliable single cell RNA-seq annotation"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8179-3839","authenticated-orcid":false,"given":"Marcos","family":"L\u00f3pez-De-Castro","sequence":"first","affiliation":[{"name":"Institute of Data Science and Artificial Intelligence (DATAI), University of Navarra , Pamplona, Navarra 31009,","place":["Spain"]},{"name":"TECNUN School of Engineering, University of Navarra , Donostia-San Sebasti\u00e1n, Basque Country, 20018,","place":["Spain"]},{"name":"Cancer Center CCUN, Cl\u00ednica Universidad de Navarra , Pamplona, Navarra, 31080,","place":["Spain"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7692-9113","authenticated-orcid":false,"given":"Alberto","family":"Garc\u00eda-Galindo","sequence":"additional","affiliation":[{"name":"Institute of Data Science and Artificial Intelligence (DATAI), University of Navarra , Pamplona, Navarra 31009,","place":["Spain"]},{"name":"TECNUN School of Engineering, University of Navarra , Donostia-San Sebasti\u00e1n, Basque Country, 20018,","place":["Spain"]},{"name":"Cancer Center CCUN, Cl\u00ednica Universidad de Navarra , Pamplona, Navarra, 31080,","place":["Spain"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8549-0453","authenticated-orcid":false,"given":"Jos\u00e9","family":"Gonz\u00e1lez-Gomariz","sequence":"additional","affiliation":[{"name":"Institute of Data Science and Artificial Intelligence (DATAI), University of Navarra , Pamplona, Navarra 31009,","place":["Spain"]},{"name":"TECNUN School of Engineering, University of Navarra , Donostia-San Sebasti\u00e1n, Basque Country, 20018,","place":["Spain"]},{"name":"Cancer Center CCUN, Cl\u00ednica Universidad de Navarra , Pamplona, Navarra, 31080,","place":["Spain"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4049-0000","authenticated-orcid":false,"given":"Rub\u00e9n","family":"Arma\u00f1anzas","sequence":"additional","affiliation":[{"name":"Institute of Data Science and Artificial Intelligence (DATAI), University of Navarra , Pamplona, Navarra 31009,","place":["Spain"]},{"name":"TECNUN School of Engineering, University of Navarra , Donostia-San Sebasti\u00e1n, Basque Country, 20018,","place":["Spain"]},{"name":"Cancer Center CCUN, Cl\u00ednica Universidad de Navarra , Pamplona, Navarra, 31080,","place":["Spain"]}]}],"member":"286","published-online":{"date-parts":[[2025,9,18]]},"reference":[{"key":"2025101607420256800_btaf521-B1","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1186\/s13059-019-1795-z","article-title":"A comparison of automatic cell identification methods for single-cell RNA sequencing data","volume":"20","author":"Abdelaal","year":"2019","journal-title":"Genome Biol"},{"key":"2025101607420256800_btaf521-B2","doi-asserted-by":"crossref","first-page":"494","DOI":"10.1561\/2200000101","article-title":"Conformal prediction: a gentle introduction","volume":"16","author":"Angelopoulos","year":"2023","journal-title":"FNT in Machine Learning"},{"year":"2021","author":"Angelopoulos","key":"2025101607420256800_btaf521-B3"},{"key":"2025101607420256800_btaf521-B4","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1038\/s41590-018-0276-y","article-title":"Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage","volume":"20","author":"Aran","year":"2019","journal-title":"Nat Immunol"},{"key":"2025101607420256800_btaf521-B5","first-page":"111","article-title":"A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure","volume":"3","author":"Baron","year":"2016","journal-title":"Cell Syst"},{"key":"2025101607420256800_btaf521-B6","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1214\/22-AOS2244","article-title":"Testing for outliers with conformal p-values","volume":"51","author":"Bates","year":"2023","journal-title":"Ann Stat"},{"key":"2025101607420256800_btaf521-B7","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1109\/ICMLA.2008.107","volume-title":"2008 Seventh International Conference on Machine Learning and Applications","author":"Bostr\u00f6m","year":"2008"},{"key":"2025101607420256800_btaf521-B8","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1038\/nbt.4096","article-title":"Integrating single-cell transcriptomic data across different conditions, technologies, and species","volume":"36","author":"Butler","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2025101607420256800_btaf521-B9","doi-asserted-by":"crossref","first-page":"2749","DOI":"10.1038\/s41596-021-00534-0","article-title":"Tutorial: guidelines for annotating single-cell transcriptomic maps using automated and manual methods","volume":"16","author":"Clarke","year":"2021","journal-title":"Nat Protoc"},{"key":"2025101607420256800_btaf521-B10","doi-asserted-by":"crossref","first-page":"eabl5197","DOI":"10.1126\/science.abl5197","article-title":"Cross-tissue immune cell analysis reveals tissue-specific features in humans","volume":"376","author":"Dom\u00ednguez Conde","year":"2022","journal-title":"Science"},{"key":"2025101607420256800_btaf521-B11","first-page":"64555","volume-title":"Advances in Neural Information Processing Systems","author":"Ding","year":"2023"},{"key":"2025101607420256800_btaf521-B12","doi-asserted-by":"crossref","first-page":"2488","DOI":"10.1093\/bioinformatics\/btac140","article-title":"JIND: joint integration and discrimination for automated single-cell annotation","volume":"38","author":"Goyal","year":"2022","journal-title":"Bioinformatics"},{"key":"2025101607420256800_btaf521-B13","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1038\/s41576-023-00586-w","article-title":"Best practices for single-cell analysis across modalities","volume":"24","author":"Heumos","year":"2023","journal-title":"Nat Rev Genet"},{"year":"2024","author":"Huang","key":"2025101607420256800_btaf521-B14"},{"key":"2025101607420256800_btaf521-B15","doi-asserted-by":"crossref","first-page":"e694","DOI":"10.1002\/ctm2.694","article-title":"Single-cell RNA sequencing technologies and applications: a brief overview","volume":"12","author":"Jovic","year":"2022","journal-title":"Clin Transl Med"},{"key":"2025101607420256800_btaf521-B16","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1038\/s41586-022-04541-3","article-title":"Human distal lung maps and lineage hierarchies reveal a bipotent progenitor","volume":"604","author":"Kadur","year":"2022","journal-title":"Nature"},{"first-page":"109","year":"2022","author":"Khatri","key":"2025101607420256800_btaf521-B17"},{"key":"2025101607420256800_btaf521-B18","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1038\/s41576-018-0088-9","article-title":"Challenges in unsupervised clustering of single-cell RNA-seq data","volume":"20","author":"Kiselev","year":"2019","journal-title":"Nat Rev Genet"},{"key":"2025101607420256800_btaf521-B19","doi-asserted-by":"crossref","first-page":"1289","DOI":"10.1038\/s41592-019-0619-0","article-title":"Fast, sensitive and accurate integration of single-cell data with harmony","volume":"16","author":"Korsunsky","year":"2019","journal-title":"Nat Methods"},{"key":"2025101607420256800_btaf521-B20","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1145\/1833280.1833287","volume-title":"Proceedings of the First International Workshop on Novel Data Stream Pattern Mining Techniques, StreamKDD\u201910","author":"Laxhammar","year":"2010"},{"key":"2025101607420256800_btaf521-B21","first-page":"15682","volume-title":"Advances in Neural Information Processing Systems","author":"Minderer","year":"2021"},{"key":"2025101607420256800_btaf521-B22","first-page":"345","volume-title":"Machine Learning: ECML 2002: Proceedings of the 13th European Conference on Machine Learning, Helsinki, Finland, August 19\u201323, 2002","author":"Papadopoulos","year":"2002"},{"key":"2025101607420256800_btaf521-B23","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1038\/s41586-019-0933-9","article-title":"A single-cell molecular map of mouse gastrulation and early organogenesis","volume":"566","author":"Pijuan-Sala","year":"2019","journal-title":"Nature"},{"key":"2025101607420256800_btaf521-B24","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/j.sigpro.2013.12.026","article-title":"A review of novelty detection","volume":"99","author":"Pimentel","year":"2014","journal-title":"Signal Process"},{"key":"2025101607420256800_btaf521-B25","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1038\/s41592-019-0535-3","article-title":"Supervised classification enables rapid annotation of cell atlases","volume":"16","author":"Pliner","year":"2019","journal-title":"Nat Methods"},{"volume-title":"Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS\u201920","year":"2020","author":"Romano","key":"2025101607420256800_btaf521-B26"},{"key":"2025101607420256800_btaf521-B27","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1080\/01621459.2017.1395341","article-title":"Least ambiguous set-valued classifiers with bounded error levels","volume":"114","author":"Sadinle","year":"2019","journal-title":"J Am Stat Assoc"},{"key":"2025101607420256800_btaf521-B28","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nbt.3192","article-title":"Spatial reconstruction of single-cell gene expression data","volume":"33","author":"Satija","year":"2015","journal-title":"Nat Biotechnol"},{"key":"2025101607420256800_btaf521-B29","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1038\/s41592-024-02184-y","article-title":"Tissue: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses","volume":"21","author":"Sun","year":"2024","journal-title":"Nat Methods"},{"key":"2025101607420256800_btaf521-B30","doi-asserted-by":"crossref","first-page":"btae128","DOI":"10.1093\/bioinformatics\/btae128","article-title":"Uncertainty-aware single-cell annotation with a hierarchical reject option","volume":"40","author":"Theunissen","year":"2024","journal-title":"Bioinformatics"},{"key":"2025101607420256800_btaf521-B31","doi-asserted-by":"crossref","first-page":"bbaf239","DOI":"10.1093\/bib\/bbaf239","article-title":"Evaluation of out-of-distribution detection methods for data shifts in single-cell transcriptomics","volume":"26","author":"Theunissen","year":"2025","journal-title":"Brief Bioinform"},{"key":"2025101607420256800_btaf521-B32","doi-asserted-by":"crossref","first-page":"108507","DOI":"10.1016\/j.patcog.2021.108507","article-title":"Introduction to conformal predictors","volume":"124","author":"Toccaceli","year":"2022","journal-title":"Pattern Recognit"},{"key":"2025101607420256800_btaf521-B33","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-06649-8","volume-title":"Algorithmic Learning in a Random World","author":"Vovk","year":"2022","edition":"2nd edn"},{"key":"2025101607420256800_btaf521-B34","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1038\/s41592-019-0691-5","article-title":"Single-cell multimodal omics: the power of many","volume":"17","author":"Zhu","year":"2020","journal-title":"Nat Methods"},{"key":"2025101607420256800_btaf521-B35","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.1016\/j.immuni.2019.03.009","article-title":"Single-cell transcriptomics of human and mouse lung cancers reveals conserved myeloid populations across individuals and species","volume":"50","author":"Zilionis","year":"2019","journal-title":"Immunity"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf521\/64316557\/btaf521.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/10\/btaf521\/64316557\/btaf521.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/10\/btaf521\/64316557\/btaf521.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T11:42:13Z","timestamp":1760614933000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf521\/8257682"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,9,18]]},"references-count":35,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,10,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf521","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,10]]},"published":{"date-parts":[[2025,9,18]]},"article-number":"btaf521"}}