{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,17]],"date-time":"2025-12-17T17:58:23Z","timestamp":1765994303345},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Modern high-throughput biotechnologies such as microarray are capable of producing a massive amount of information for each sample. However, in a typical high-throughput experiment, only limited number of samples were assayed, thus the classical \u2018large p, small n\u2019 problem. On the other hand, rapid propagation of these high-throughput technologies has resulted in a substantial collection of data, often carried out on the same platform and using the same protocol. It is highly desirable to utilize the existing data when performing analysis and inference on a new dataset.<\/jats:p><jats:p>Results: Utilizing existing data can be carried out in a straightforward fashion under the Bayesian framework in which the repository of historical data can be exploited to build informative priors and used in new data analysis. In this work, using microarray data, we investigate the feasibility and effectiveness of deriving informative priors from historical data and using them in the problem of detecting differentially expressed genes. Through simulation and real data analysis, we show that the proposed strategy significantly outperforms existing methods including the popular and state-of-the-art Bayesian hierarchical model-based approaches. Our work illustrates the feasibility and benefits of exploiting the increasingly available genomics big data in statistical inference and presents a promising practical strategy for dealing with the \u2018large p, small n\u2019 problem.<\/jats:p><jats:p>Availability and implementation: Our method is implemented in R package IPBT, which is freely available from https:\/\/github.com\/benliemory\/IPBT.<\/jats:p><jats:p>Contact: \u00a0yuzhu@purdue.edu; zhaohui.qin@emory.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv631","type":"journal-article","created":{"date-parts":[[2015,10,31]],"date-time":"2015-10-31T02:38:11Z","timestamp":1446259091000},"page":"682-689","source":"Crossref","is-referenced-by-count":13,"title":["Bayesian inference with historical data-based informative priors improves detection of differentially expressed genes"],"prefix":"10.1093","volume":"32","author":[{"given":"Ben","family":"Li","sequence":"first","affiliation":[{"name":"1 Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhaonan","family":"Sun","sequence":"additional","affiliation":[{"name":"2 Department of Statistics, Purdue University, West Lafayette, IN 47906, USA and"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qing","family":"He","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yu","family":"Zhu","sequence":"additional","affiliation":[{"name":"2 Department of Statistics, Purdue University, West Lafayette, IN 47906, USA and"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhaohui S.","family":"Qin","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA,"},{"name":"3 Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA 30322, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2015,10,30]]},"reference":[{"key":"2023020110432432200_btv631-B1","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1177\/1471082X1001100201","article-title":"Exploiting blank spots for model-based background correction in discovering genes with DNA array data","volume":"11","author":"Arima","year":"2011","journal-title":"Stat. Modell."},{"key":"2023020110432432200_btv631-B2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"2023020110432432200_btv631-B3","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."},{"key":"2023020110432432200_btv631-B4","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1186\/1471-2105-8-80","article-title":"Bayesian meta-analysis models for microarray data: a comparative study","volume":"8","author":"Conlon","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023020110432432200_btv631-B5","first-page":"101","article-title":"A selective overview of variable selection in high dimensional feature space","volume":"20","author":"Fan","year":"2010","journal-title":"Stat. Sin."},{"key":"2023020110432432200_btv631-B6","doi-asserted-by":"crossref","first-page":"e0123791","DOI":"10.1371\/journal.pone.0123791","article-title":"Robust modeling of differential gene expression data using normal\/independent distributions: a Bayesian approach","volume":"10","author":"Ganjali","year":"2015","journal-title":"PLoS One"},{"key":"2023020110432432200_btv631-B7","volume-title":"Bayesian Data Analysis","author":"Gelman","year":"2004"},{"key":"2023020110432432200_btv631-B8","volume-title":"The Estimation of Probabilities: An Essay on Modern Bayesian Methods","author":"Good","year":"1965"},{"key":"2023020110432432200_btv631-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/nar\/gkn923","article-title":"Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists","volume":"37","author":"Huang da","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023020110432432200_btv631-B10","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1038\/nprot.2008.211","article-title":"Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources","volume":"4","author":"Huang da","year":"2009","journal-title":"Nat. Protoc."},{"key":"2023020110432432200_btv631-B11","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1093\/biostatistics\/4.2.249","article-title":"Exploration, normalization, and summaries of high density oligonucleotide array probe level data","volume":"4","author":"Irizarry","year":"2003","journal-title":"Biostatistics"},{"key":"2023020110432432200_btv631-B12","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1038\/nbt.1619","article-title":"Analyzing 'omics data using hierarchical models","volume":"28","author":"Ji","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023020110432432200_btv631-B13","doi-asserted-by":"crossref","first-page":"3629","DOI":"10.1093\/bioinformatics\/bti593","article-title":"TileMap: create chromosomal map of tiling array hybridizations","volume":"21","author":"Ji","year":"2005","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023020110432432200_btv631-B14","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1093\/biostatistics\/2.2.183","article-title":"Experimental design for gene expression microarrays","volume":"2","author":"Kerr","year":"2001","journal-title":"Biostatistics"},{"key":"2023020110432432200_btv631-B15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.1541-0420.2005.00394.x","article-title":"Bayesian modeling of differential gene expression","volume":"62","author":"Lewin","year":"2006","journal-title":"Biometrics"},{"key":"2023020110432432200_btv631-B16","doi-asserted-by":"crossref","first-page":"1550018","DOI":"10.1142\/S0219720015500183","article-title":"A quantum leap in the reproducibility, precision, and sensitivity of gene expression profile analysis even when sample size is extremely small","volume":"13","author":"Lim","year":"2015","journal-title":"J. Bioinform. Comput. Biol."},{"key":"2023020110432432200_btv631-B17","first-page":"189","article-title":"Finding consistent disease subnetworks using PFSNet","volume":"30","author":"Lim","year":"2014","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023020110432432200_btv631-B18","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1093\/bioinformatics\/btl612","article-title":"Flexible empirical Bayes models for differential gene expression","volume":"23","author":"Lo","year":"2007","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023020110432432200_btv631-B19","doi-asserted-by":"crossref","first-page":"322","DOI":"10.1038\/nbt0410-322","article-title":"A global map of human gene expression","volume":"28","author":"Lukk","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023020110432432200_btv631-B20","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1093\/biostatistics\/kxp059","article-title":"Frozen robust multiarray analysis (fRMA)","volume":"11","author":"McCall","year":"2010","journal-title":"Biostatistics"},{"key":"2023020110432432200_btv631-B21","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1038\/nmeth.1226","article-title":"Mapping and quantifying mammalian transcriptomes by RNA-Seq","volume":"5","author":"Mortazavi","year":"2008","journal-title":"Nat. Methods"},{"key":"2023020110432432200_btv631-B22","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1089\/106652701300099074","article-title":"On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data","volume":"8","author":"Newton","year":"2001","journal-title":"J. Comput. Biol. J. Comput. Mol. Cell Biol."},{"key":"2023020110432432200_btv631-B23","doi-asserted-by":"crossref","DOI":"10.1007\/b97411","volume-title":"The Analysis of Gene Expression Data : Methods and Software","author":"Parmigiani","year":"2003"},{"key":"2023020110432432200_btv631-B24","doi-asserted-by":"crossref","first-page":"Article3","DOI":"10.2202\/1544-6115.1027","article-title":"Linear models and empirical bayes methods for assessing differential expression in microarray experiments","volume":"3","author":"Smyth","year":"2004","journal-title":"Stat. Appl. Genet. Mol. Biol."},{"key":"2023020110432432200_btv631-B25","doi-asserted-by":"crossref","first-page":"S15","DOI":"10.1186\/1471-2105-12-S13-S15","article-title":"Finding consistent disease subnetworks across microarray datasets","volume":"12","author":"Soh","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023020110432432200_btv631-B26","doi-asserted-by":"crossref","first-page":"1501","DOI":"10.1089\/cmb.2009.0063","article-title":"Background adjustment for DNA microarrays using a database of microarray experiments","volume":"16","author":"Sui","year":"2009","journal-title":"J. Comput. Biol. J. Comput. Mol. Cell Biol."},{"key":"2023020110432432200_btv631-B27","doi-asserted-by":"crossref","first-page":"3785","DOI":"10.1093\/nar\/gkr1265","article-title":"Comprehensive literature review and statistical considerations for microarray meta-analysis","volume":"40","author":"Tseng","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023020110432432200_btv631-B28","doi-asserted-by":"crossref","first-page":"5116","DOI":"10.1073\/pnas.091062498","article-title":"Significance analysis of microarrays applied to the ionizing radiation response","volume":"98","author":"Tusher","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020110432432200_btv631-B29","doi-asserted-by":"crossref","first-page":"656","DOI":"10.1038\/nbt0604-656b","article-title":"Preprocessing of oligonucleotide array data","volume":"22","author":"Wu","year":"2004","journal-title":"Nat. Biotechnol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/5\/682\/49017612\/bioinformatics_32_5_682.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/5\/682\/49017612\/bioinformatics_32_5_682.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,12]],"date-time":"2024-06-12T04:25:16Z","timestamp":1718166316000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/5\/682\/1743658"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,10,30]]},"references-count":29,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2016,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv631","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,3,1]]},"published":{"date-parts":[[2015,10,30]]}}}