{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:59Z","timestamp":1772138039038,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2020,7,13]],"date-time":"2020-07-13T00:00:00Z","timestamp":1594598400000},"content-version":"vor","delay-in-days":12,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100022769","name":"Yad Hanadiv","doi-asserted-by":"crossref","award":["#9960"],"award-info":[{"award-number":["#9960"]}],"id":[{"id":"10.13039\/100022769","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7,1]]},"abstract":"<jats:title>ABSTRACT<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Summary<\/jats:title>\n                    <jats:p>Current technologies for single-cell transcriptomics allow thousands of cells to be analyzed in a single experiment. The increased scale of these methods raises the risk of cell doublets contamination. Available tools and algorithms for identifying doublets and estimating their occurrence in single-cell experimental data focus on doublets of different species, cell types or individuals. In this study, we analyze transcriptomic data from single cells having an identical genetic background. We claim that the ratio of monoallelic to biallelic expression provides a discriminating power toward doublets\u2019 identification. We present a pipeline called BIallelic Ratio for Doublets (BIRD) that relies on heterologous genetic variations, from single-cell RNA sequencing. For each dataset, doublets were artificially created from the actual data and used to train a predictive model. BIRD was applied on Smart-seq data from 163 primary fibroblast single cells. The model achieved 100% accuracy in annotating the randomly simulated doublets. Bonafide doublets were verified based on a biallelic expression signal amongst X-chromosome of female fibroblasts. Data from 10X Genomics microfluidics of human peripheral blood cells achieved in average 83% (\u00b13.7%) accuracy, and an area under the curve of 0.88 (\u00b10.04) for a collection of \u223c13\u00a0300 single cells. BIRD addresses instances of doublets, which were formed from cell mixtures of identical genetic background and cell identity. Maximal performance is achieved for high-coverage data from Smart-seq. Success in identifying doublets is data specific which varies according to the experimental methodology, genomic diversity between haplotypes, sequence coverage and depth.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa474","type":"journal-article","created":{"date-parts":[[2020,7,11]],"date-time":"2020-07-11T07:10:28Z","timestamp":1594451428000},"page":"i251-i257","source":"Crossref","is-referenced-by-count":1,"title":["BIRD: identifying cell doublets via biallelic expression from single cells"],"prefix":"10.1093","volume":"36","author":[{"given":"Kerem","family":"Wainer-Katsir","sequence":"first","affiliation":[{"name":"Department of Biological Chemistry, The Institute of Life Sciences, The Hebrew University of Jerusalem , Jerusalem, Givat Ram 91904, Israel"}]},{"given":"Michal","family":"Linial","sequence":"additional","affiliation":[{"name":"Department of Biological Chemistry, The Institute of Life Sciences, The Hebrew University of Jerusalem , Jerusalem, Givat Ram 91904, Israel"}]}],"member":"286","published-online":{"date-parts":[[2020,7,13]]},"reference":[{"key":"2024021913343291800_btaa474-B1","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1093\/bioinformatics\/btu638","article-title":"HTSeq\u2013a Python framework to work with high-throughput sequencing data","volume":"31","author":"Anders","year":"2015","journal-title":"Bioinformatics"},{"key":"2024021913343291800_btaa474-B2","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1186\/s13059-016-0927-y","article-title":"Design and computational analysis of single-cell RNA-sequencing experiments","volume":"17","author":"Bacher","year":"2016","journal-title":"Genome Biol"},{"key":"2024021913343291800_btaa474-B3","doi-asserted-by":"crossref","first-page":"2114","DOI":"10.1093\/bioinformatics\/btu170","article-title":"Trimmomatic: a flexible trimmer for Illumina sequence data","volume":"30","author":"Bolger","year":"2014","journal-title":"Bioinformatics"},{"key":"2024021913343291800_btaa474-B4","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1016\/j.ajhg.2014.12.001","article-title":"Biased allelic expression in human primary fibroblast single cells","volume":"96","author":"Borel","year":"2015","journal-title":"Am. J. Hum. Genet"},{"key":"2024021913343291800_btaa474-B5","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1038\/nbt.3102","article-title":"Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells","volume":"33","author":"Buettner","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2024021913343291800_btaa474-B6","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1186\/s13059-015-0762-6","article-title":"Tools and best practices for data processing in allelic expression analysis","volume":"16","author":"Castel","year":"2015","journal-title":"Genome Biol"},{"key":"2024021913343291800_btaa474-B7","doi-asserted-by":"crossref","first-page":"317","DOI":"10.3389\/fgene.2019.00317","article-title":"Single-cell RNA-seq technologies and related computational data analysis","volume":"10","author":"Chen","year":"2019","journal-title":"Front. Genet"},{"key":"2024021913343291800_btaa474-B8","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1126\/science.1245316","article-title":"Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells","volume":"343","author":"Deng","year":"2014","journal-title":"Science"},{"key":"2024021913343291800_btaa474-B9","doi-asserted-by":"crossref","first-page":"11.14.1","DOI":"10.1002\/0471250953.bi1114s51","article-title":"Mapping RNA-seq reads with STAR","volume":"51","author":"Dobin","year":"2015","journal-title":"Curr. Protoc. Bioinformatics"},{"key":"2024021913343291800_btaa474-B10","doi-asserted-by":"crossref","first-page":"1258367","DOI":"10.1126\/science.1258367","article-title":"Expression profiling. Combinatorial labeling of single cells for gene expression cytometry","volume":"347","author":"Fan","year":"2015","journal-title":"Science"},{"key":"2024021913343291800_btaa474-B11","doi-asserted-by":"crossref","first-page":"13015","DOI":"10.1073\/pnas.1806811115","article-title":"Extensive cellular heterogeneity of X inactivation revealed by single-cell allele-specific expression in human fibroblasts","volume":"115","author":"Garieri","year":"2018","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2024021913343291800_btaa474-B12","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1186\/s13073-017-0467-4","article-title":"A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications","volume":"9","author":"Haque","year":"2017","journal-title":"Genome Med"},{"key":"2024021913343291800_btaa474-B13","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1186\/s13059-016-0938-8","article-title":"CEL-Seq2: sensitive highly-multiplexed single-cell RNA-Seq","volume":"17","author":"Hashimshony","year":"2016","journal-title":"Genome Biol"},{"key":"2024021913343291800_btaa474-B14","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1186\/s13059-016-0888-1","article-title":"Classification of low quality cells from single-cell RNA-seq data","volume":"17","author":"Ilicic","year":"2016","journal-title":"Genome Biol"},{"key":"2024021913343291800_btaa474-B15","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1186\/s13059-017-1200-8","article-title":"SCALE: modeling allele-specific gene expression by single-cell RNA sequencing","volume":"18","author":"Jiang","year":"2017","journal-title":"Genome Biol"},{"key":"2024021913343291800_btaa474-B16","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1038\/nbt.4042","article-title":"Multiplexed droplet single-cell RNA-sequencing using natural genetic variation","volume":"36","author":"Kang","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2024021913343291800_btaa474-B17","doi-asserted-by":"crossref","first-page":"8687","DOI":"10.1038\/ncomms9687","article-title":"Characterizing noise structure in single-cell RNA-seq distinguishes genuine from technical stochastic allelic expression","volume":"6","author":"Kim","year":"2015","journal-title":"Nat. Commun"},{"key":"2024021913343291800_btaa474-B18","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1016\/j.cell.2015.04.044","article-title":"Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells","volume":"161","author":"Klein","year":"2015","journal-title":"Cell"},{"key":"2024021913343291800_btaa474-B19","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1016\/j.molcel.2015.04.005","article-title":"The technology and biology of single-cell RNA sequencing","volume":"58","author":"Kolodziejczyk","year":"2015","journal-title":"Mol. Cell"},{"key":"2024021913343291800_btaa474-B20","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1038\/nbt.3880","article-title":"Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding","volume":"35","author":"Lan","year":"2017","journal-title":"Nat. Biotechnol"},{"key":"2024021913343291800_btaa474-B21","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1038\/s41586-018-0836-1","article-title":"Genomic encoding of transcriptional burst kinetics","volume":"565","author":"Larsson","year":"2019","journal-title":"Nature"},{"key":"2024021913343291800_btaa474-B22","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1186\/s13059-016-0947-7","article-title":"Pooling across cells to normalize single-cell RNA sequencing data with many zero counts","volume":"17","author":"Lun","year":"2016","journal-title":"Genome Biol"},{"key":"2024021913343291800_btaa474-B23","first-page":"2122","article-title":"A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor","volume":"5","author":"Lun","year":"2016","journal-title":"F1000Res"},{"key":"2024021913343291800_btaa474-B24","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1016\/j.tig.2016.12.003","article-title":"Single-cell multiomics: multiple measurements from single cells","volume":"33","author":"Macaulay","year":"2017","journal-title":"Trends Genet"},{"key":"2024021913343291800_btaa474-B25","doi-asserted-by":"crossref","first-page":"1179","DOI":"10.1093\/bioinformatics\/btw777","article-title":"Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R","volume":"33","author":"McCarthy","year":"2017","journal-title":"Bioinformatics"},{"key":"2024021913343291800_btaa474-B26","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1016\/j.cels.2019.03.003","article-title":"DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors","volume":"8","author":"McGinnis","year":"2019","journal-title":"Cell Syst"},{"key":"2024021913343291800_btaa474-B27","doi-asserted-by":"crossref","first-page":"619","DOI":"10.1038\/s41592-019-0433-8","article-title":"MULTI-seq: sample multiplexing for single-cell RNA sequencing using lipid-tagged indices","volume":"16","author":"McGinnis","year":"2019","journal-title":"Nat. Methods"},{"key":"2024021913343291800_btaa474-B28","doi-asserted-by":"crossref","first-page":"1739","DOI":"10.1109\/TVCG.2016.2570755","article-title":"Approximated and user steerable tSNE for progressive visual analytics","volume":"23","author":"Pezzotti","year":"2017","journal-title":"IEEE Trans. Vis. Comput. Graph"},{"key":"2024021913343291800_btaa474-B29","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1038\/nprot.2014.006","article-title":"Full-length RNA-seq from single cells using Smart-seq2","volume":"9","author":"Picelli","year":"2014","journal-title":"Nat. Protoc"},{"key":"2024021913343291800_btaa474-B30","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1038\/nrg3888","article-title":"Random monoallelic expression of autosomal genes: stochastic transcription and allele-level regulation","volume":"16","author":"Reinius","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2024021913343291800_btaa474-B31","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1038\/s41467-017-02554-5","article-title":"A general and flexible method for signal extraction from single-cell RNA-seq data","volume":"9","author":"Risso","year":"2018","journal-title":"Nat. Commun"},{"key":"2024021913343291800_btaa474-B32","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1038\/nmeth.4145","article-title":"Effective detection of variation in single-cell transcriptomes using MATQ-seq","volume":"14","author":"Sheng","year":"2017","journal-title":"Nat. Methods"},{"key":"2024021913343291800_btaa474-B33","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1007\/978-1-4939-9240-9_5","article-title":"Single-cell RNA-seq by multiple annealing and tailing-based quantitative single-cell RNA-seq (MATQ-Seq)","volume":"1979","author":"Sheng","year":"2019","journal-title":"Methods Mol. Biol"},{"key":"2024021913343291800_btaa474-B34","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1038\/nrg3833","article-title":"Computational and analytical challenges in single-cell transcriptomics","volume":"16","author":"Stegle","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2024021913343291800_btaa474-B35","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1186\/s13059-018-1603-1","article-title":"Cell hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics","volume":"19","author":"Stoeckius","year":"2018","journal-title":"Genome Biol"},{"key":"2024021913343291800_btaa474-B36","doi-asserted-by":"crossref","first-page":"e21208","DOI":"10.1371\/journal.pone.0021208","article-title":"Deterministic and stochastic allele specific gene expression in single mouse blastomeres","volume":"6","author":"Tang","year":"2011","journal-title":"PLoS One"},{"key":"2024021913343291800_btaa474-B37","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1038\/nature24265","article-title":"Landscape of X chromosome inactivation across human tissues","volume":"550","author":"Tukiainen","year":"2017","journal-title":"Nature"},{"key":"2024021913343291800_btaa474-B38","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1038\/nn.3881","article-title":"Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing","volume":"18","author":"Usoskin","year":"2015","journal-title":"Nat. Neurosci"},{"key":"2024021913343291800_btaa474-B39","doi-asserted-by":"crossref","first-page":"11.10.1","DOI":"10.1002\/0471250953.bi1110s43","article-title":"From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline","volume":"43","author":"Van der Auwera","year":"2013","journal-title":"Curr. Protoc. Bioinformatics"},{"key":"2024021913343291800_btaa474-B40","doi-asserted-by":"crossref","first-page":"eaah4573","DOI":"10.1126\/science.aah4573","article-title":"Single-cell RNA-seq reveals new types of human blood dendritic cells, monocytes, and progenitors","volume":"356","author":"Villani","year":"2017","journal-title":"Science"},{"key":"2024021913343291800_btaa474-B41","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.ymben.2018.04.015","article-title":"A comparative analysis of single cell and droplet-based FACS for improving production phenotypes: riboflavin overproduction in Yarrowia lipolytica","volume":"47","author":"Wagner","year":"2018","journal-title":"Metab. Eng"},{"key":"2024021913343291800_btaa474-B42","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1186\/s12864-019-5507-6","article-title":"Human genes escaping X-inactivation revealed by single cell expression data","volume":"20","author":"Wainer-Katsir","year":"2019","journal-title":"BMC Genomics"},{"key":"2024021913343291800_btaa474-B43","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/j.cels.2018.11.005","article-title":"Scrublet: computational identification of cell doublets in single-cell transcriptomic data","volume":"8","author":"Wolock","year":"2019","journal-title":"Cell Syst"},{"key":"2024021913343291800_btaa474-B44","doi-asserted-by":"crossref","first-page":"3293","DOI":"10.1073\/pnas.1602306113","article-title":"Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells","volume":"113","author":"Xin","year":"2016","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2024021913343291800_btaa474-B45","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1126\/science.aaa1934","article-title":"Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq","volume":"347","author":"Zeisel","year":"2015","journal-title":"Science"},{"key":"2024021913343291800_btaa474-B46","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1016\/j.molcel.2018.10.020","article-title":"Comparative analysis of droplet-based ultra-high-throughput single-cell RNA-seq systems","volume":"73","author":"Zhang","year":"2019","journal-title":"Mol. Cell"},{"key":"2024021913343291800_btaa474-B47","doi-asserted-by":"crossref","first-page":"14049","DOI":"10.1038\/ncomms14049","article-title":"Massively parallel digital transcriptional profiling of single cells","volume":"8","author":"Zheng","year":"2017","journal-title":"Nat. Commun"},{"key":"2024021913343291800_btaa474-B48","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1038\/nprot.2016.154","article-title":"Single-cell barcoding and sequencing using droplet microfluidics","volume":"12","author":"Zilionis","year":"2017","journal-title":"Nat. Protoc"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i251\/56702496\/bioinformatics_36_supplement1_i251.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i251\/56702496\/bioinformatics_36_supplement1_i251.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,19]],"date-time":"2024-02-19T08:44:26Z","timestamp":1708332266000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/Supplement_1\/i251\/5870510"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,1]]},"references-count":48,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2020,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa474","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/709451","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,7]]},"published":{"date-parts":[[2020,7,1]]}}}