{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T17:52:50Z","timestamp":1776275570901,"version":"3.50.1"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2023,12,13]],"date-time":"2023-12-13T00:00:00Z","timestamp":1702425600000},"content-version":"vor","delay-in-days":12,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100012338","name":"Alan Turing Institute","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100012338","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000289","name":"Cancer Research UK","doi-asserted-by":"publisher","award":["FC001002"],"award-info":[{"award-number":["FC001002"]}],"id":[{"id":"10.13039\/501100000289","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000289","name":"Cancer Research UK","doi-asserted-by":"publisher","award":["FC001169"],"award-info":[{"award-number":["FC001169"]}],"id":[{"id":"10.13039\/501100000289","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000289","name":"Cancer Research UK","doi-asserted-by":"publisher","award":["FC001745"],"award-info":[{"award-number":["FC001745"]}],"id":[{"id":"10.13039\/501100000289","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000289","name":"Cancer Research UK","doi-asserted-by":"publisher","award":["FC001130"],"award-info":[{"award-number":["FC001130"]}],"id":[{"id":"10.13039\/501100000289","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000265","name":"UK Medical Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000265","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Cell type identification plays an important role in the analysis and interpretation of single-cell data and can be carried out via supervised or unsupervised clustering approaches. Supervised methods are best suited where we can list all cell types and their respective marker genes a priori, while unsupervised clustering algorithms look for groups of cells with similar expression properties. This property permits the identification of both known and unknown cell populations, making unsupervised methods suitable for discovery. Success is dependent on the relative strength of the expression signature of each group as well as the number of cells. Rare cell types therefore present a particular challenge that is magnified when they are defined by differentially expressing a small number of genes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Typical unsupervised approaches fail to identify such rare subpopulations, and these cells tend to be absorbed into more prevalent cell types. In order to balance these competing demands, we have developed a novel statistical framework for unsupervised clustering, named Rarity, that enables the discovery process for rare cell types to be more robust, consistent, and interpretable. We achieve this by devising a novel clustering method based on a Bayesian latent variable model in which we assign cells to inferred latent binary on\/off expression profiles. This lets us achieve increased sensitivity to rare cell populations while also allowing us to control and interpret potential false positive discoveries. We systematically study the challenges associated with rare cell type identification and demonstrate the utility of Rarity on various IMC datasets.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Implementation of Rarity together with examples is available from the Github repository (https:\/\/github.com\/kasparmartens\/rarity).<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad750","type":"journal-article","created":{"date-parts":[[2023,12,12]],"date-time":"2023-12-12T07:59:34Z","timestamp":1702367974000},"source":"Crossref","is-referenced-by-count":9,"title":["Rarity: discovering rare cell populations from single-cell imaging data"],"prefix":"10.1093","volume":"39","author":[{"given":"Kaspar","family":"M\u00e4rtens","sequence":"first","affiliation":[{"name":"The Alan Turing Institute , London NW1 2DB, United Kingdom"}]},{"given":"Michele","family":"Bortolomeazzi","sequence":"additional","affiliation":[{"name":"Francis Crick Institute , London NW1 1AT, United Kingdom"},{"name":"King\u2019s College London , London WC2R 2LS, United Kingdom"}]},{"given":"Lucia","family":"Montorsi","sequence":"additional","affiliation":[{"name":"Francis Crick Institute , London NW1 1AT, United Kingdom"},{"name":"King\u2019s College London , London WC2R 2LS, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7202-2431","authenticated-orcid":false,"given":"Jo","family":"Spencer","sequence":"additional","affiliation":[{"name":"King\u2019s College London , London WC2R 2LS, United Kingdom"}]},{"given":"Francesca","family":"Ciccarelli","sequence":"additional","affiliation":[{"name":"Francis Crick Institute , London NW1 1AT, United Kingdom"},{"name":"Bart\u2019s Cancer Institute - Centre for Cancer Genomics & Computational Biology, Queen Mary University of London , Charterhouse Square , London, EC1M 6BQ, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7615-8523","authenticated-orcid":false,"given":"Christopher","family":"Yau","sequence":"additional","affiliation":[{"name":"The Alan Turing Institute , London NW1 2DB, United Kingdom"},{"name":"Nuffield Department for Women\u2019s & Reproductive Health, University of Oxford, Women\u2019s Centre (Level 3), John Radcliffe Hospital , Oxford OX3 9DU, United Kingdom"}]}],"member":"286","published-online":{"date-parts":[[2023,12,13]]},"reference":[{"key":"2023122701325631600_btad750-B1","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1002\/cyto.a.23738","article-title":"Predicting cell populations in single cell mass cytometry data","volume":"95","author":"Abdelaal","year":"2019","journal-title":"Cytometry A"},{"key":"2023122701325631600_btad750-B2","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nm.3488","article-title":"Multiplexed ion beam imaging of human breast tumors","volume":"20","author":"Angelo","year":"2014","journal-title":"Nat Med"},{"key":"2023122701325631600_btad750-B3","article-title":"Dimensionality reduction for visualizing single-cell data using UMAP","author":"Becht","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2023122701325631600_btad750-B4","doi-asserted-by":"crossref","first-page":"194445","DOI":"10.1016\/j.bbagrm.2019.194445","article-title":"Identification of non-cancer cells from cancer transcriptomic data","volume":"1863","author":"Bortolomeazzi","year":"2020","journal-title":"Biochim Biophys Acta Gene Regul Mech"},{"key":"2023122701325631600_btad750-B5","doi-asserted-by":"crossref","first-page":"781","DOI":"10.1038\/s41467-022-28470-x","article-title":"A SIMPLI (single-cell identification from MultiPLexed images) approach for spatially-resolved tissue phenotyping at single-cell resolution","volume":"13","author":"Bortolomeazzi","year":"2022","journal-title":"Nat Commun"},{"key":"2023122701325631600_btad750-B6","doi-asserted-by":"crossref","first-page":"e1003130","DOI":"10.1371\/journal.pcbi.1003130","article-title":"Hierarchical modeling for rare event detection and cell subset alignment across flow cytometry samples","volume":"9","author":"Cron","year":"2013","journal-title":"PLoS Comput Biol"},{"key":"2023122701325631600_btad750-B7","first-page":"2023","author":"Cui","year":"2023"},{"key":"2023122701325631600_btad750-B8","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1016\/j.cmet.2018.11.014","article-title":"A map of human type 1 diabetes progression by imaging mass cytometry","volume":"29","author":"Damond","year":"2019","journal-title":"Cell Metab"},{"key":"2023122701325631600_btad750-B9","doi-asserted-by":"crossref","first-page":"5706","DOI":"10.1093\/bioinformatics\/btaa1061","article-title":"Cytomapper: an R\/Bioconductor package for visualization of highly multiplexed imaging data","volume":"36","author":"Eling","year":"2020","journal-title":"Bioinformatics"},{"key":"2023122701325631600_btad750-B10","doi-asserted-by":"crossref","first-page":"4197","DOI":"10.1038\/s41467-021-24489-8","article-title":"GapClust is a light-weight approach distinguishing rare cells from voluminous single cell expression profiles","volume":"12","author":"Fa","year":"2021","journal-title":"Nat Commun"},{"key":"2023122701325631600_btad750-B11","doi-asserted-by":"crossref","first-page":"11982","DOI":"10.1073\/pnas.1300136110","article-title":"Highly multiplexed single-cell analysis of formalin-fixed, paraffin-embedded cancer tissue","volume":"110","author":"Gerdes","year":"2013","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2023122701325631600_btad750-B12","doi-asserted-by":"crossref","first-page":"1173","DOI":"10.1016\/j.cels.2021.08.012","article-title":"Automated assignment of cell identity from single-cell multiplexed imaging and proteomic data","volume":"12","author":"Geuenich","year":"2021","journal-title":"Cell Syst"},{"key":"2023122701325631600_btad750-B13","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1038\/nmeth.2869","article-title":"Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry","volume":"11","author":"Giesen","year":"2014","journal-title":"Nat Methods"},{"key":"2023122701325631600_btad750-B14","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1016\/j.stem.2016.05.010","article-title":"De novo prediction of stem cell identity using single-cell transcriptome data","volume":"19","author":"Gr\u00fcn","year":"2016","journal-title":"Cell Stem Cell"},{"key":"2023122701325631600_btad750-B15","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1038\/mi.2008.8","article-title":"Brokering the peace: the origin of intestinal T cells","volume":"1","author":"Hayday","year":"2008","journal-title":"Mucosal Immunol"},{"key":"2023122701325631600_btad750-B16","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1038\/s41586-019-1876-x","article-title":"The single-cell pathology landscape of breast cancer","volume":"578","author":"Jackson","year":"2020","journal-title":"Nature"},{"key":"2023122701325631600_btad750-B17","doi-asserted-by":"crossref","first-page":"4719","DOI":"10.1038\/s41467-018-07234-6","article-title":"Discovery of rare cells from voluminous single cell expression data","volume":"9","author":"Jindal","year":"2018","journal-title":"Nat Commun"},{"key":"2023122701325631600_btad750-B18","doi-asserted-by":"crossref","first-page":"1179","DOI":"10.1093\/bioinformatics\/btr095","article-title":"Improved structure, function and compatibility for CellProfiler: modular high-throughput image analysis software","volume":"27","author":"Kamentsky","year":"2011","journal-title":"Bioinformatics"},{"key":"2023122701325631600_btad750-B19","doi-asserted-by":"crossref","first-page":"1373","DOI":"10.1016\/j.cell.2018.08.039","article-title":"A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging","volume":"174","author":"Keren","year":"2018","journal-title":"Cell"},{"key":"2023122701325631600_btad750-B20","author":"Kingma","year":"2014"},{"key":"2023122701325631600_btad750-B21","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1016\/j.immuni.2014.08.004","article-title":"Unconventional intraepithelial gut T cells: the TCR says it all","volume":"41","author":"Kurd","year":"2014","journal-title":"Immunity"},{"key":"2023122701325631600_btad750-B22","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1016\/j.cell.2015.05.047","article-title":"Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis","volume":"162","author":"Levine","year":"2015","journal-title":"Cell"},{"key":"2023122701325631600_btad750-B23","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.31657","article-title":"Highly multiplexed immunofluorescence imaging of human tissues and tumors using t-CyCIF and conventional optical microscopes","volume":"7","author":"Lin","year":"2018","journal-title":"Elife"},{"key":"2023122701325631600_btad750-B24","author":"M\u00e4rtens","year":"2022"},{"key":"2023122701325631600_btad750-B25","author":"McInnes","year":"2018"},{"key":"2023122701325631600_btad750-B26","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1002\/cyto.a.22446","article-title":"SWIFT-scalable clustering for automated identification of rare cell populations in large, high-dimensional flow cytometry datasets, part 1: algorithm design","volume":"85","author":"Naim","year":"2014","journal-title":"Cytometry Pt A"},{"key":"2023122701325631600_btad750-B27","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.62915","article-title":"ImmunoCluster provides a computational framework for the nonspecialist to profile high-dimensional cytometry data","volume":"10","author":"Opzoomer","year":"2021","journal-title":"Elife"},{"key":"2023122701325631600_btad750-B28","author":"Rezende","year":"2014"},{"key":"2023122701325631600_btad750-B29","first-page":"410","author":"Rosenberg","year":"2007"},{"key":"2023122701325631600_btad750-B30","doi-asserted-by":"crossref","first-page":"1888","DOI":"10.1016\/j.cell.2019.05.031","article-title":"Comprehensive integration of single-cell data","volume":"177","author":"Stuart","year":"2019","journal-title":"Cell"},{"key":"2023122701325631600_btad750-B31","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1186\/s13059-018-1431-3","article-title":"GiniClust2: a cluster-aware, weighted ensemble clustering method for cell-type detection","volume":"19","author":"Tsoucas","year":"2018","journal-title":"Genome Biol"},{"key":"2023122701325631600_btad750-B32","first-page":"2579","article-title":"Visualizing data using t-sne","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2023122701325631600_btad750-B33","doi-asserted-by":"crossref","first-page":"636","DOI":"10.1002\/cyto.a.22625","article-title":"FlowSOM: using self-organizing maps for visualization and interpretation of cytometry data","volume":"87","author":"Van Gassen","year":"2015","journal-title":"Cytometry Pt A"},{"issue":"8","key":"2023122701325631600_btad750-B37","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1136\/gut.31.8.841","article-title":"Gamma\/delta T cells in the gut epithelium","volume":"31","author":"Viney","year":"1990","journal-title":"Gut"},{"key":"2023122701325631600_btad750-B34","doi-asserted-by":"crossref","first-page":"1084","DOI":"10.1002\/cyto.a.23030","article-title":"Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data","volume":"89","author":"Weber","year":"2016","journal-title":"Cytometry Pt A"},{"key":"2023122701325631600_btad750-B35","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1186\/s13059-019-1739-7","article-title":"CellSIUS provides sensitive and specific detection of rare cell populations from complex single-cell RNA-seq data","volume":"20","author":"Wegmann","year":"2019","journal-title":"Genome Biol"},{"issue":"4","key":"2023122701325631600_btad750-B36","article-title":"scAIDE: clustering of large-scale single-cell RNA-seq data reveals putative and rare cell types","volume":"2","author":"Xie","year":"2020","journal-title":"NAR Genom Bioinform"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad750\/54408128\/btad750.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/12\/btad750\/54879279\/btad750.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/12\/btad750\/54879279\/btad750.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,26]],"date-time":"2023-12-26T20:33:41Z","timestamp":1703622821000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad750\/7471872"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,12,1]]},"references-count":37,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2023,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad750","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.07.15.500256","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,12,1]]},"published":{"date-parts":[[2023,12,1]]},"article-number":"btad750"}}