{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T12:57:33Z","timestamp":1760101053630,"version":"3.37.3"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"19","license":[{"start":{"date-parts":[[2018,4,24]],"date-time":"2018-04-24T00:00:00Z","timestamp":1524528000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CCF-1553281"],"award-info":[{"award-number":["CCF-1553281"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100005825","name":"USDA NIFA","doi-asserted-by":"crossref","award":["06-505570-01006"],"award-info":[{"award-number":["06-505570-01006"]}],"id":[{"id":"10.13039\/100005825","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Rapid adoption of high-throughput sequencing technologies has enabled better understanding of genome-wide molecular profile changes associated with phenotypic differences in biomedical studies. Often, these changes are due to multiple interacting factors. Existing methods are mostly considering differential expression across two conditions studying one main factor without considering other confounding factors. In addition, they are often coupled with essential sophisticated ad-hoc pre-processing steps such as normalization, restricting their adaptability to general experimental setups. Complex multi-factor experimental design to accurately decipher genotype-phenotype relationships signifies the need for developing effective statistical tools for genome-scale sequencing data profiled under multi-factor conditions.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We have developed a novel Bayesian negative binomial regression (BNB-R) method for the analysis of RNA sequencing (RNA-seq) count data. In particular, the natural model parameterization removes the needs for the normalization step, while the method is capable of tackling complex experimental design involving multi-variate dependence structures. Efficient Bayesian inference of model parameters is obtained by exploiting conditional conjugacy via novel data augmentation techniques. Comprehensive studies on both synthetic and real-world RNA-seq data demonstrate the superior performance of BNB-R in terms of the areas under both the receiver operating characteristic and precision-recall curves.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>BNB-R is implemented in R language and is available at https:\/\/github.com\/siamakz\/BNBR.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty330","type":"journal-article","created":{"date-parts":[[2018,4,20]],"date-time":"2018-04-20T19:10:23Z","timestamp":1524251423000},"page":"3349-3356","source":"Crossref","is-referenced-by-count":13,"title":["Bayesian negative binomial regression for differential expression with confounding factors"],"prefix":"10.1093","volume":"34","author":[{"given":"Siamak Zamani","family":"Dadaneh","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX, USA"}]},{"given":"Mingyuan","family":"Zhou","sequence":"additional","affiliation":[{"name":"Department of Information, Risk, and Operations Management, The University of Texas at Austin, Austin, TX, USA"}]},{"given":"Xiaoning","family":"Qian","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2018,4,24]]},"reference":[{"key":"2023012712491158100_bty330-B1","doi-asserted-by":"crossref","first-page":"i113","DOI":"10.1093\/bioinformatics\/btu274","article-title":"Methods for time series analysis of RNA-seq data with application to human Th17 cell differentiation","volume":"30","author":"\u00c4ij\u00f6","year":"2014","journal-title":"Bioinformatics"},{"key":"2023012712491158100_bty330-B2","doi-asserted-by":"crossref","first-page":"R106","DOI":"10.1186\/gb-2010-11-10-r106","article-title":"Differential expression analysis for sequence count data","volume":"11","author":"Anders","year":"2010","journal-title":"Genome Biol"},{"key":"2023012712491158100_bty330-B3","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1020281327116","article-title":"An introduction to mcmc for machine learning","volume":"50","author":"Andrieu","year":"2003","journal-title":"Mach. Learn"},{"year":"2017","author":"Boluki","key":"2023012712491158100_bty330-B4"},{"key":"2023012712491158100_bty330-B5","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1186\/s12859-017-1893-4","article-title":"Incorporating biological prior knowledge for Bayesian learning via maximal knowledge-driven information priors","volume":"18","author":"Boluki","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2023012712491158100_bty330-B6","doi-asserted-by":"crossref","first-page":"3710","DOI":"10.1093\/bioinformatics\/bth456","article-title":"GO:: termFinder\u2014open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes","volume":"20","author":"Boyle","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012712491158100_bty330-B7","doi-asserted-by":"crossref","first-page":"3306","DOI":"10.1093\/bioinformatics\/btw395","article-title":"A subpopulation model to analyze heterogeneous cell differentiation dynamics","volume":"32","author":"Chan","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012712491158100_bty330-B8","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1080\/00031305.1995.10476177","article-title":"Understanding the Metropolis-Hastings algorithm","volume":"49","author":"Chib","year":"1995","journal-title":"Am. Stat"},{"key":"2023012712491158100_bty330-B9","article-title":"BNP-Seq: Bayesian nonparametric differential expression analysis of sequencing count data","author":"Dadaneh","year":"2017","journal-title":"J. Am. Stat. Assoc"},{"key":"2023012712491158100_bty330-B10","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-07212-8","volume-title":"Statistical Analysis of Next Generation Sequencing Data","author":"Datta","year":"2014"},{"key":"2023012712491158100_bty330-B11","doi-asserted-by":"crossref","first-page":"5748","DOI":"10.4049\/jimmunol.0801162","article-title":"IL-27 blocks RORc expression to inhibit lineage commitment of Th17 cells","volume":"182","author":"Diveu","year":"2009","journal-title":"J. Immunol"},{"key":"2023012712491158100_bty330-B12","doi-asserted-by":"crossref","first-page":"392.","DOI":"10.1037\/0033-2909.118.3.392","article-title":"Regression analyses of counts and rates: poisson, overdispersed Poisson, and negative binomial models","volume":"118","author":"Gardner","year":"1995","journal-title":"Psychol. Bull"},{"key":"2023012712491158100_bty330-B13","doi-asserted-by":"crossref","first-page":"R80.","DOI":"10.1186\/gb-2004-5-10-r80","article-title":"Bioconductor: open software development for computational biology and bioinformatics","volume":"5","author":"Gentleman","year":"2004","journal-title":"Genome Biol"},{"key":"2023012712491158100_bty330-B14","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511973420","volume-title":"Negative Binomial Regression","author":"Hilbe","year":"2011"},{"key":"2023012712491158100_bty330-B15","doi-asserted-by":"crossref","DOI":"10.1002\/0471715816","volume-title":"Univariate Discrete Distributions, Volume 444","author":"Johnson","year":"2005"},{"key":"2023012712491158100_bty330-B16","article-title":"Quantitative RT-PCR: a review of current methodologies, RT-PCR Protocols","author":"Joyce","year":"2002","journal-title":"Methods Mol. Biol"},{"first-page":"1078","year":"2017","author":"Karbalayghareh","key":"2023012712491158100_bty330-B17"},{"key":"2023012712491158100_bty330-B18","doi-asserted-by":"crossref","first-page":"R29.","DOI":"10.1186\/gb-2014-15-2-r29","article-title":"Voom: precision weights unlock linear model analysis tools for RNA-seq read counts","volume":"15","author":"Law","year":"2014","journal-title":"Genome Biol"},{"key":"2023012712491158100_bty330-B19","doi-asserted-by":"crossref","first-page":"e161","DOI":"10.1093\/nar\/gku864","article-title":"Svaseq: removing batch effects and other unwanted noise from sequencing data","volume":"42","author":"Leek","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023012712491158100_bty330-B20","doi-asserted-by":"crossref","first-page":"882","DOI":"10.1093\/bioinformatics\/bts034","article-title":"The sva package for removing batch effects and other unwanted variation in high-throughput experiments","volume":"28","author":"Leek","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012712491158100_bty330-B21","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1177\/0962280211428386","article-title":"Finding consistent patterns: a nonparametric approach for identifying differential expression in RNA-Seq data","volume":"22","author":"Li","year":"2013","journal-title":"Stat. Methods Med. Res"},{"key":"2023012712491158100_bty330-B22","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol"},{"key":"2023012712491158100_bty330-B23","doi-asserted-by":"crossref","first-page":"1151.","DOI":"10.1038\/nbt1239","article-title":"The MicroArray Quality Control (MAQC) project shows inter-and intraplatform reproducibility of gene expression measurements","volume":"24","author":"Maqc Consortium and Others","year":"2006","journal-title":"Nat. Biotechnol"},{"key":"2023012712491158100_bty330-B24","doi-asserted-by":"crossref","first-page":"157.","DOI":"10.1038\/gene.2011.9","article-title":"LIF in the regulation of T-cell fate and as a potential therapeutic","volume":"12","author":"Metcalfe","year":"2011","journal-title":"Genes Immun"},{"key":"2023012712491158100_bty330-B25","doi-asserted-by":"crossref","first-page":"5155","DOI":"10.1210\/jc.2009-0947","article-title":"Adipose tissue collagen VI in obesity","volume":"94","author":"Pasarica","year":"2009","journal-title":"J. Clin. Endocrinol. Metab"},{"year":"2011","author":"Polson","key":"2023012712491158100_bty330-B26"},{"key":"2023012712491158100_bty330-B27","doi-asserted-by":"crossref","first-page":"R95","DOI":"10.1186\/gb-2013-14-9-r95","article-title":"Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data","volume":"14","author":"Rapaport","year":"2013","journal-title":"Genome Biol"},{"key":"2023012712491158100_bty330-B28","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1093\/bioinformatics\/btp616","article-title":"edgeR: a Bioconductor package for differential expression analysis of digital gene expression data","volume":"26","author":"Robinson","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012712491158100_bty330-B29","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1038\/nbt.2957","article-title":"A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium","volume":"32","author":"SEQC\/MAQC-III Consortium","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023012712491158100_bty330-B30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2202\/1544-6115.1027","article-title":"Linear models and empirical bayes methods for assessing differential expression in microarray experiments","volume":"3","author":"Smyth","year":"2004","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023012712491158100_bty330-B31","doi-asserted-by":"crossref","first-page":"397","DOI":"10.1007\/0-387-29362-0_23","volume-title":"Bioinformatics and Computational Biology Solutions Using R and Bioconductor","author":"Smyth","year":"2005"},{"key":"2023012712491158100_bty330-B32","doi-asserted-by":"crossref","first-page":"91.","DOI":"10.1186\/1471-2105-14-91","article-title":"A comparison of methods for differential expression analysis of RNA-seq data","volume":"14","author":"Soneson","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023012712491158100_bty330-B33","doi-asserted-by":"crossref","first-page":"e151","DOI":"10.1182\/blood-2012-01-407528","article-title":"Identification of early gene expression changes during human Th17 cell differentiation","volume":"119","author":"Tuomela","year":"2012","journal-title":"Blood"},{"key":"2023012712491158100_bty330-B34","doi-asserted-by":"crossref","first-page":"13416.","DOI":"10.18632\/oncotarget.7963","article-title":"Comparative analysis of human and mouse transcriptomes of Th17 cell priming","volume":"7","author":"Tuomela","year":"2016","journal-title":"Oncotarget"},{"key":"2023012712491158100_bty330-B35","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrg2484","article-title":"RNA-Seq: a revolutionary tool for transcriptomics","volume":"10","author":"Wang","year":"2009","journal-title":"Nat. Rev. Genet"},{"volume-title":"Econometric Analysis of Count Data","year":"2013","author":"Winkelmann","key":"2023012712491158100_bty330-B36"},{"key":"2023012712491158100_bty330-B37","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1109\/TPAMI.2013.211","article-title":"Negative binomial process count and mixture modeling","volume":"37","author":"Zhou","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intel"},{"year":"2012","author":"Zhou","key":"2023012712491158100_bty330-B38"},{"key":"2023012712491158100_bty330-B39","doi-asserted-by":"crossref","first-page":"1144","DOI":"10.1080\/01621459.2015.1075407","article-title":"Priors for random count matrices derived from a family of negative binomial processes","volume":"111","author":"Zhou","year":"2016","journal-title":"J. Am. Stat. Assoc"},{"key":"2023012712491158100_bty330-B40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2015\/621690","article-title":"The impact of normalization methods on rna-seq data analysis","volume":"2015","author":"Zyprych-Walczak","year":"2015","journal-title":"BioMed Res. Int"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/19\/3349\/48919004\/bioinformatics_34_19_3349.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/19\/3349\/48919004\/bioinformatics_34_19_3349.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,6]],"date-time":"2024-07-06T05:55:26Z","timestamp":1720245326000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/19\/3349\/4984508"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,4,24]]},"references-count":40,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2018,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty330","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2018,10,1]]},"published":{"date-parts":[[2018,4,24]]}}}