{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,14]],"date-time":"2026-06-14T09:16:24Z","timestamp":1781428584436,"version":"3.54.1"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2019,9,10]],"date-time":"2019-09-10T00:00:00Z","timestamp":1568073600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Institute of General Medical Sciences of the National Institutes of Health","award":["R01GM115836"],"award-info":[{"award-number":["R01GM115836"]}]},{"name":"University of Pittsburgh School of Medicine"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Single-cell RNA sequencing (scRNA-seq) technologies enable the study of transcriptional heterogeneity at the resolution of individual cells and have an increasing impact on biomedical research. However, it is known that these methods sometimes wrongly consider two or more cells as single cells, and that a number of so-called doublets is present in the output of such experiments. Treating doublets as single cells in downstream analyses can severely bias a study\u2019s conclusions, and therefore computational strategies for the identification of doublets are needed.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>With scds, we propose two new approaches for in silico doublet identification: Co-expression based doublet scoring (cxds) and binary classification based doublet scoring (bcds). The co-expression based approach, cxds, utilizes binarized (absence\/presence) gene expression data and, employing a binomial model for the co-expression of pairs of genes, yields interpretable doublet annotations. bcds, on the other hand, uses a binary classification approach to discriminate artificial doublets from original data. We apply our methods and existing computational doublet identification approaches to four datasets with experimental doublet annotations and find that our methods perform at least as well as the state of the art, at comparably little computational cost. We observe appreciable differences between methods and across datasets and that no approach dominates all others. In summary, scds presents a scalable, competitive approach that allows for doublet annotation of datasets with thousands of cells in a matter of seconds.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>scds is implemented as a Bioconductor R package (doi: 10.18129\/B9.bioc.scds).<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz698","type":"journal-article","created":{"date-parts":[[2019,9,5]],"date-time":"2019-09-05T15:27:43Z","timestamp":1567697263000},"page":"1150-1158","source":"Crossref","is-referenced-by-count":270,"title":["scds: computational annotation of doublets in single-cell RNA sequencing data"],"prefix":"10.1093","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5910-1306","authenticated-orcid":false,"given":"Abha S","family":"Bais","sequence":"first","affiliation":[{"name":"Department of Developmental Biology , USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1460-5487","authenticated-orcid":false,"given":"Dennis","family":"Kostka","sequence":"additional","affiliation":[{"name":"Department of Developmental Biology , USA"},{"name":"Department of Computational and Systems Biology and Pittsburgh Center for Evolutionary Biology and Medicine, University of Pittsburgh School of Medicine , Pittsburgh, PA 15201, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2019,9,10]]},"reference":[{"key":"2023013110162274500_btz698-B1","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1016\/j.omtm.2018.07.003","article-title":"An introduction to the analysis of single-cell RNA-sequencing data. Mol. Ther.","volume":"10","author":"AlJanahi","year":"2018","journal-title":"Methods Clin. Dev"},{"key":"2023013110162274500_btz698-B2","doi-asserted-by":"crossref","first-page":"44.","DOI":"10.1186\/s12915-017-0383-5","article-title":"Cell fixation and preservation for droplet-based single-cell transcriptomics","volume":"15","author":"Alles","year":"2017","journal-title":"BMC Biol"},{"key":"2023013110162274500_btz698-B3","doi-asserted-by":"crossref","first-page":"2128.","DOI":"10.1038\/s41467-017-02001-5","article-title":"Differentiation dynamics of mammary epithelial cells revealed by single-cell RNA sequencing","volume":"8","author":"Bach","year":"2017","journal-title":"Nat. Commun"},{"key":"2023013110162274500_btz698-B4","doi-asserted-by":"crossref","first-page":"411.","DOI":"10.1038\/nbt.4096","article-title":"Integrating single-cell transcriptomic data across different conditions, technologies, and species","volume":"36","author":"Butler","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2023013110162274500_btz698-B5","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1145\/2939672.2939785","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD \u201916","author":"Chen","year":"2016"},{"key":"2023013110162274500_btz698-B6","author":"Chen","year":"2019"},{"key":"2023013110162274500_btz698-B7","doi-asserted-by":"crossref","first-page":"2938","DOI":"10.1093\/bioinformatics\/btx364","article-title":"UpSetR: an R package for the visualization of intersecting sets and their properties","volume":"33","author":"Conway","year":"2017","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023013110162274500_btz698-B8","first-page":"233","author":"Davis","year":"2006"},{"key":"2023013110162274500_btz698-B9","author":"DePasquale","year":"2018"},{"key":"2023013110162274500_btz698-B10","doi-asserted-by":"crossref","first-page":"1184.","DOI":"10.1038\/nprot.2009.97","article-title":"Mapping identifiers for the integration of genomic datasets with the R\/Bioconductor package biomaRt","volume":"4","author":"Durinck","year":"2009","journal-title":"Nat. Protocols"},{"key":"2023013110162274500_btz698-B11","author":"Erichson","year":"2016"},{"key":"2023013110162274500_btz698-B12","author":"Gehring","year":"2018"},{"key":"2023013110162274500_btz698-B13","first-page":"1083","article-title":"DeconRNASeq: a statistical framework for deconvolution of heterogeneous tissue samples based on mRNA-Seq data","volume":"29","author":"Gong","year":"2013","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023013110162274500_btz698-B14","volume-title":"The Elements of Statistical Learning, Data Mining, Inference, and Prediction","author":"Hastie","year":"2001","edition":"2003 edition"},{"key":"2023013110162274500_btz698-B15","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1038\/s41556-017-0013-z","article-title":"Defining murine organogenesis at single-cell resolution reveals a role for the leukotriene pathway in regulating blood progenitor formation","volume":"20","author":"Ibarra-Soria","year":"2018","journal-title":"Nat. Cell Biol"},{"key":"2023013110162274500_btz698-B16","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1038\/nbt.4042","article-title":"Multiplexed droplet single-cell RNA-sequencing using natural genetic variation","volume":"36","author":"Kang","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2023013110162274500_btz698-B17","doi-asserted-by":"crossref","first-page":"e92209.","DOI":"10.1371\/journal.pone.0092209","article-title":"Area under precision-recall curves for weighted and unweighted data","volume":"9","author":"Keilwagen","year":"2014","journal-title":"PLoS One"},{"key":"2023013110162274500_btz698-B18","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1038\/s41576-018-0088-9","article-title":"Challenges in unsupervised clustering of single-cell RNA-seq data","volume":"20","author":"Kiselev","year":"2019","journal-title":"Nat. Rev. Genet"},{"key":"2023013110162274500_btz698-B19","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1016\/j.cell.2015.04.044","article-title":"Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells","volume":"161","author":"Klein","year":"2015","journal-title":"Cell"},{"key":"2023013110162274500_btz698-B20","doi-asserted-by":"crossref","first-page":"1551","DOI":"10.1016\/j.stemcr.2018.11.008","article-title":"Single-cell transcriptome profiling of mouse and hESC-derived pancreatic progenitors","volume":"11","author":"Krentz","year":"2018","journal-title":"Stem Cell Rep"},{"key":"2023013110162274500_btz698-B21","author":"Krijthe","year":"2015"},{"key":"2023013110162274500_btz698-B22","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1016\/j.cell.2015.05.047","article-title":"Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis","volume":"162","author":"Levine","year":"2015","journal-title":"Cell"},{"key":"2023013110162274500_btz698-B23","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1038\/ng.3818","article-title":"Reference component analysis of single-cell transcriptomes elucidates cellular heterogeneity in human colorectal tumors","volume":"49","author":"Li","year":"2017","journal-title":"Nat. Genet"},{"key":"2023013110162274500_btz698-B24","first-page":"2122.","article-title":"A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor","volume":"5","author":"Lun","year":"2016","journal-title":"F1000 Res"},{"key":"2023013110162274500_btz698-B25","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1016\/j.cels.2019.03.003","article-title":"DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors","volume":"8","author":"McGinnis","year":"2019","journal-title":"Cell Syst"},{"key":"2023013110162274500_btz698-B26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41592-019-0433-8","article-title":"MULTI-seq: sample multiplexing for single-cell RNA sequencing using lipid-tagged indices","volume":"16","author":"McGinnis","year":"2019","journal-title":"Nat. Methods"},{"key":"2023013110162274500_btz698-B27","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1038\/s41581-018-0021-7","article-title":"Single-cell RNA sequencing for the study of development, physiology and disease","volume":"14","author":"Potter","year":"2018","journal-title":"Nat. Rev. Nephrol"},{"key":"2023013110162274500_btz698-B28","doi-asserted-by":"crossref","first-page":"103.","DOI":"10.1186\/s13059-016-0957-5","article-title":"Single-cell analysis of CD4+ T-cell differentiation reveals three major cell states and progressive acceleration of proliferation","volume":"17","author":"Proserpio","year":"2016","journal-title":"Genome Biol"},{"key":"2023013110162274500_btz698-B29","volume-title":"R: A Language and Environment for Statistical Computing","year":"2018"},{"key":"2023013110162274500_btz698-B30","doi-asserted-by":"crossref","first-page":"77.","DOI":"10.1186\/1471-2105-12-77","article-title":"pROC: an open-source package for R and S+ to analyze and compare ROC curves","volume":"12","author":"Robin","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023013110162274500_btz698-B31","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1126\/science.aam8999","article-title":"Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding","volume":"360","author":"Rosenberg","year":"2018","journal-title":"Science (New York, NY)"},{"key":"2023013110162274500_btz698-B32","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1016\/j.cmet.2016.08.020","article-title":"Single-cell transcriptome profiling of human pancreatic islets in health and type 2 diabetes","volume":"24","author":"Segerstolpe","year":"2016","journal-title":"Cell Metab"},{"key":"2023013110162274500_btz698-B33","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1038\/nrg3833","article-title":"Computational and analytical challenges in single-cell transcriptomics","volume":"16","author":"Stegle","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2023013110162274500_btz698-B34","doi-asserted-by":"crossref","first-page":"224.","DOI":"10.1186\/s13059-018-1603-1","article-title":"Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics","volume":"19","author":"Stoeckius","year":"2018","journal-title":"Genome Biol"},{"key":"2023013110162274500_btz698-B35","doi-asserted-by":"crossref","first-page":"565","DOI":"10.1038\/nmeth.4292","article-title":"Normalizing single-cell RNA sequencing data: challenges and opportunities","volume":"14","author":"Vallejos","year":"2017","journal-title":"Nat. Methods"},{"key":"2023013110162274500_btz698-B36","doi-asserted-by":"crossref","first-page":"3028","DOI":"10.2337\/db16-0405","article-title":"Single-cell transcriptomics of the human endocrine pancreas","volume":"65","author":"Wang","year":"2016","journal-title":"Diabetes"},{"key":"2023013110162274500_btz698-B37","doi-asserted-by":"crossref","first-page":"281.","DOI":"10.1016\/j.cels.2018.11.005","article-title":"Scrublet: computational identification of cell doublets in single-cell transcriptomic data","volume":"8","author":"Wolock","year":"2019","journal-title":"Cell Syst"},{"key":"2023013110162274500_btz698-B38","doi-asserted-by":"crossref","first-page":"D754.","DOI":"10.1093\/nar\/gkx1098","article-title":"Ensembl 2018","volume":"46","author":"Zerbino","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023013110162274500_btz698-B39","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1016\/j.molcel.2017.01.023","article-title":"Comparative analysis of single-cell RNA sequencing methods","volume":"65","author":"Ziegenhain","year":"2017","journal-title":"Mol. Cell"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz698\/30102941\/btz698.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1150\/48982328\/btz698.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1150\/48982328\/btz698.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T15:22:55Z","timestamp":1675178575000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/4\/1150\/5566507"}},"subtitle":[],"editor":[{"given":"Arne","family":"Elofsson","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2019,9,10]]},"references-count":39,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz698","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/564021","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,2,15]]},"published":{"date-parts":[[2019,9,10]]}}}