{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T21:41:44Z","timestamp":1781127704889,"version":"3.54.1"},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2021,6,19]],"date-time":"2021-06-19T00:00:00Z","timestamp":1624060800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Library of Medicine Bio-Data Science Training program","award":["T32LM012413"],"award-info":[{"award-number":["T32LM012413"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["NIHGM102756"],"award-info":[{"award-number":["NIHGM102756"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,11,18]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Normalization to remove technical or experimental artifacts is critical in the analysis of single-cell RNA-sequencing experiments, even those for which unique molecular identifiers are available. The majority of methods for normalizing single-cell RNA-sequencing data adjust average expression for library size (LS), allowing the variance and other properties of the gene-specific expression distribution to be non-constant in LS. This often results in reduced power and increased false discoveries in downstream analyses, a problem which is exacerbated by the high proportion of zeros present in most datasets.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>To address this, we present Dino, a normalization method based on a flexible negative-binomial mixture model of gene expression. As demonstrated in both simulated and case study datasets, by normalizing the entire gene expression distribution, Dino is robust to shallow sequencing, sample heterogeneity and varying zero proportions, leading to improved performance in downstream analyses in a number of settings.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The R package, Dino, is available on GitHub at https:\/\/github.com\/JBrownBiostat\/Dino. The Dino package is further archived and freely available on Zenodo at https:\/\/doi.org\/10.5281\/zenodo.4897558.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab450","type":"journal-article","created":{"date-parts":[[2021,6,15]],"date-time":"2021-06-15T15:14:17Z","timestamp":1623770057000},"page":"4123-4128","source":"Crossref","is-referenced-by-count":20,"title":["Normalization by distributional resampling of high throughput single-cell RNA-sequencing data"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9151-4386","authenticated-orcid":false,"given":"Jared","family":"Brown","sequence":"first","affiliation":[{"name":"Department of Statistics, University of Wisconsin Madison , Madison, WI 53706, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zijian","family":"Ni","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Wisconsin Madison , Madison, WI 53706, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chitrasen","family":"Mohanty","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Medical Informatics, University of Wisconsin Madison , Madison, WI 53792, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Rhonda","family":"Bacher","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Florida Gainesville , Gainesville, FL 32603, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christina","family":"Kendziorski","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Medical Informatics, University of Wisconsin Madison , Madison, WI 53792, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2021,6,19]]},"reference":[{"key":"2023051607120735100_btab450-B1","doi-asserted-by":"crossref","first-page":"R106","DOI":"10.1186\/gb-2010-11-10-r106","article-title":"Differential expression analysis for sequence count data","volume":"11","author":"Anders","year":"2010","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B2","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1038\/nmeth.4263","article-title":"SCnorm: robust normalization of single-cell RNA-seq data","volume":"14","author":"Bacher","year":"2017","journal-title":"Nat. Methods"},{"key":"2023051607120735100_btab450-B3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-016-0927-y","article-title":"Design and computational analysis of single-cell RNA-sequencing experiments","volume":"17","author":"Bacher","year":"2016","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B4","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023051607120735100_btab450-B5","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1086\/113176","article-title":"Alternatives to least squares","volume":"87","author":"Branham","year":"1982","journal-title":"Astron. J"},{"key":"2023051607120735100_btab450-B6","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1038\/nbt.4096","article-title":"Integrating single-cell transcriptomic data across different conditions, technologies, and species","volume":"36","author":"Butler","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2023051607120735100_btab450-B7","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1023\/A:1004165218295","article-title":"Probability density function estimation using gamma kernels","volume":"52","author":"Chen","year":"2000","journal-title":"Ann. Inst. Stat. Math"},{"key":"2023051607120735100_btab450-B8","doi-asserted-by":"crossref","first-page":"1459","DOI":"10.1080\/01621459.1997.10473667","article-title":"Deconvolution of a distribution function","volume":"92","author":"Cordy","year":"1997","journal-title":"J. Am. Stat. Assoc"},{"key":"2023051607120735100_btab450-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-015-0844-5","article-title":"MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data","volume":"16","author":"Finak","year":"2015","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B10","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1038\/nmeth.2930","article-title":"Validation of noise models for single-cell transcriptomics","volume":"11","author":"Gr\u00fcn","year":"2014","journal-title":"Nat. Methods"},{"key":"2023051607120735100_btab450-B11","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1186\/s13059-019-1874-1","article-title":"Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression","volume":"20","author":"Hafemeister","year":"2019","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13073-017-0467-4","article-title":"A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications","volume":"9","author":"Haque","year":"2017","journal-title":"Genome Med"},{"key":"2023051607120735100_btab450-B13","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1038\/s41592-018-0033-z","article-title":"SAVER: gene expression recovery for single-cell RNA sequencing","volume":"15","author":"Huang","year":"2018","journal-title":"Nat. Methods"},{"key":"2023051607120735100_btab450-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s12276-018-0071-8","article-title":"Single-cell RNA sequencing technologies and bioinformatics pipelines","volume":"50","author":"Hwang","year":"2018","journal-title":"Exp. Mol. Med"},{"key":"2023051607120735100_btab450-B15","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1038\/nmeth.2772","article-title":"Quantitative single-cell RNA-seq with unique molecular identifiers","volume":"11","author":"Islam","year":"2014","journal-title":"Nat. Methods"},{"key":"2023051607120735100_btab450-B16","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1111\/1467-9868.00083","article-title":"Acceleration of the EM Algorithm by using Quasi-Newton Methods","volume":"59","author":"Jamshidian","year":"1997","journal-title":"J. R. Stat. Soc. Ser. B (Stat. Methodol.)"},{"key":"2023051607120735100_btab450-B17","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1016\/j.cell.2015.04.044","article-title":"Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells","volume":"161","author":"Klein","year":"2015","journal-title":"Cell"},{"key":"2023051607120735100_btab450-B18","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1016\/j.molcel.2015.04.005","article-title":"The technology and biology of single-cell RNA sequencing","volume":"58","author":"Kolodziejczyk","year":"2015","journal-title":"Mol. Cell"},{"key":"2023051607120735100_btab450-B19","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1016\/j.cels.2015.12.004","article-title":"The molecular signatures database hallmark gene set collection","volume":"1","author":"Liberzon","year":"2015","journal-title":"Cell Syst"},{"key":"2023051607120735100_btab450-B20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B4874545","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1186\/s13059-019-1662-y","article-title":"EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data","volume":"20","author":"Lun","year":"2019","journal-title":"Genome Biol."},{"key":"2023051607120735100_btab450-B22","first-page":"1","article-title":"Pooling across cells to normalize single-cell RNA sequencing data with many zero counts","volume":"17","author":"Lun","year":"2016","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B23","first-page":"3221","article-title":"Accelerating t-SNE using tree-based algorithms","volume":"15","author":"Van Der Maaten","year":"2014","journal-title":"J. Mach. Learn. Res"},{"key":"2023051607120735100_btab450-B24","first-page":"2579","article-title":"Visualizing high-dimensional data using t-SNE","volume":"9","author":"van der Maaten","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023051607120735100_btab450-B25","doi-asserted-by":"crossref","first-page":"1202","DOI":"10.1016\/j.cell.2015.05.002","article-title":"Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets","volume":"161","author":"Macosko","year":"2015","journal-title":"Cell"},{"key":"2023051607120735100_btab450-B26","doi-asserted-by":"crossref","first-page":"1389","DOI":"10.1038\/s41588-019-0489-5","article-title":"A pooled single-cell genetic screen identifies regulatory checkpoints in the continuum of the epithelial-to-mesenchymal transition","volume":"51","author":"McFaline-Figueroa","year":"2019","journal-title":"Nat. Genet"},{"key":"2023051607120735100_btab450-B27","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1016\/0304-4076(86)90016-3","article-title":"Censored regression quantiles","volume":"32","author":"Powell","year":"1986","journal-title":"J. Econom"},{"key":"2023051607120735100_btab450-B28","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1016\/0304-4076(84)90004-6","article-title":"Least absolute deviations estimation for the censored regression model","volume":"25","author":"Powell","year":"1984","journal-title":"J. Econom"},{"key":"2023051607120735100_btab450-B29","doi-asserted-by":"crossref","first-page":"979","DOI":"10.1038\/nmeth.4402","article-title":"Reversed graph embedding resolves complex single-cell trajectories","volume":"14","author":"Qiu","year":"2017","journal-title":"Nat. Methods"},{"key":"2023051607120735100_btab450-B30","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1038\/s41467-017-02554-5","article-title":"A general and flexible method for signal extraction from single-cell RNA-seq data","volume":"9","author":"Risso","year":"2018","journal-title":"Nat. Commun"},{"key":"2023051607120735100_btab450-B31","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nbt.3192","article-title":"Spatial reconstruction of single-cell gene expression data","volume":"33","author":"Satija","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023051607120735100_btab450-B32","first-page":"275","article-title":"False discovery rates: a new deal","volume":"18","author":"Stephens","year":"2017","journal-title":"Biostatistics"},{"key":"2023051607120735100_btab450-B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-019-1861-6","article-title":"Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model","volume":"20","author":"Townes","year":"2019","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B34","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1038\/nbt.2859","article-title":"The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells","volume":"32","author":"Trapnell","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023051607120735100_btab450-B35","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1186\/s13059-018-1431-3","article-title":"GiniClust2: a cluster-aware, weighted ensemble clustering method for cell-type detection","volume":"19","author":"Tsoucas","year":"2018","journal-title":"Genome Biol"},{"key":"2023051607120735100_btab450-B36","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/srep39921","article-title":"Batch effects and the effective design of single-cell gene expression studies","volume":"7","author":"Tung","year":"2017","journal-title":"Sci. Rep"},{"key":"2023051607120735100_btab450-B37","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1146\/annurev-anchem-061516-045228","article-title":"Single-cell transcriptional analysis","volume":"10","author":"Wu","year":"2017","journal-title":"Annu. Rev. Anal. Chem"},{"key":"2023051607120735100_btab450-B38","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/ncomms14049","article-title":"Massively parallel digital transcriptional profiling of single cells","volume":"8","author":"Zheng","year":"2017","journal-title":"Nat. Commun"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab450\/38854631\/btab450.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/22\/4123\/50335293\/btab450.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/22\/4123\/50335293\/btab450.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,1]],"date-time":"2024-09-01T17:34:04Z","timestamp":1725212044000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/22\/4123\/6306403"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2021,6,19]]},"references-count":38,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2021,11,18]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab450","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.10.28.359901","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,11,15]]},"published":{"date-parts":[[2021,6,19]]}}}