{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,14]],"date-time":"2025-12-14T15:57:13Z","timestamp":1765727833545,"version":"3.37.3"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2018,5,30]],"date-time":"2018-05-30T00:00:00Z","timestamp":1527638400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2018,5,30]],"date-time":"2018-05-30T00:00:00Z","timestamp":1527638400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2018,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>One of the most important and often neglected components of a successful RNA sequencing (RNA-Seq) experiment is sample size estimation. A few negative binomial model-based methods have been developed to estimate sample size based on the parameters of a single gene. However, thousands of genes are quantified and tested for differential expression simultaneously in RNA-Seq experiments. Thus, additional issues should be carefully addressed, including the false discovery rate for multiple statistic tests, widely distributed read counts and dispersions for different genes.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>To solve these issues, we developed a sample size and power estimation method named RnaSeqSampleSize, based on the distributions of gene average read counts and dispersions estimated from real RNA-seq data. Datasets from previous, similar experiments such as the Cancer Genome Atlas (TCGA) can be used as a point of reference. Read counts and their dispersions were estimated from the reference\u2019s distribution; using that information, we estimated and summarized the power and sample size. RnaSeqSampleSize is implemented in R language and can be installed from Bioconductor website. A user friendly web graphic interface is provided at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/cqs.app.vumc.org\/shiny\/RnaSeqSampleSize\/\">https:\/\/cqs.app.vumc.org\/shiny\/RnaSeqSampleSize\/<\/jats:ext-link>.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>RnaSeqSampleSize provides a convenient and powerful way for power and sample size estimation for an RNAseq experiment. It is also equipped with several unique features, including estimation for interested genes or pathway, power curve visualization, and parameter optimization.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-018-2191-5","type":"journal-article","created":{"date-parts":[[2018,5,30]],"date-time":"2018-05-30T00:40:06Z","timestamp":1527640806000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":77,"title":["RnaSeqSampleSize: real data based sample size estimation for RNA sequencing"],"prefix":"10.1186","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3921-3965","authenticated-orcid":false,"given":"Shilin","family":"Zhao","sequence":"first","affiliation":[]},{"given":"Chung-I","family":"Li","sequence":"additional","affiliation":[]},{"given":"Yan","family":"Guo","sequence":"additional","affiliation":[]},{"given":"Quanhu","family":"Sheng","sequence":"additional","affiliation":[]},{"given":"Yu","family":"Shyr","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2018,5,30]]},"reference":[{"issue":"1","key":"2191_CR1","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1038\/nrg2484","volume":"10","author":"Z Wang","year":"2009","unstructured":"Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10(1):57\u201363.","journal-title":"Nat Rev Genet"},{"issue":"1","key":"2191_CR2","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1093\/biostatistics\/kxh026","volume":"6","author":"SH Jung","year":"2005","unstructured":"Jung SH, Bang H, Young S. Sample size calculation for multiple testing in microarray data analysis. Biostatistics. 2005;6(1):157\u201369.","journal-title":"Biostatistics"},{"issue":"468","key":"2191_CR3","doi-asserted-by":"publisher","first-page":"990","DOI":"10.1198\/016214504000001646","volume":"99","author":"P M\u00fcller","year":"2004","unstructured":"M\u00fcller P, Parmigiani G, Robert C, Rousseau J. Optimal sample size for multiple testing: the case of gene expression microarrays. J Am Stat Assoc. 2004;99(468):990\u20131001.","journal-title":"J Am Stat Assoc"},{"issue":"5","key":"2191_CR4","doi-asserted-by":"publisher","first-page":"656","DOI":"10.1093\/bioinformatics\/btt015","volume":"29","author":"MA Busby","year":"2013","unstructured":"Busby MA, Stewart C, Miller CA, Grzeda KR, Marth GT. Scotty: a web tool for designing RNA-Seq experiments to measure differential gene expression. Bioinformatics. 2013;29(5):656\u20137.","journal-title":"Bioinformatics"},{"issue":"Suppl 3","key":"2191_CR5","doi-asserted-by":"publisher","first-page":"S1","DOI":"10.1186\/1752-0509-5-S3-S1","volume":"5","author":"Z Chen","year":"2011","unstructured":"Chen Z, Liu J, Ng HK, Nadarajah S, Kaufman HL, Yang JY, Deng Y. Statistical methods on detecting differentially expressed genes for RNA-seq data. BMC Syst Biol. 2011;5(Suppl 3):S1.","journal-title":"BMC Syst Biol"},{"issue":"3","key":"2191_CR6","doi-asserted-by":"publisher","first-page":"280","DOI":"10.1093\/bib\/bbr004","volume":"12","author":"Z Fang","year":"2011","unstructured":"Fang Z, Cui X. Design and validation issues in RNA-seq experiments. Brief Bioinform. 2011;12(3):280\u20137.","journal-title":"Brief Bioinform"},{"issue":"3","key":"2191_CR7","doi-asserted-by":"publisher","first-page":"R25","DOI":"10.1186\/gb-2010-11-3-r25","volume":"11","author":"MD Robinson","year":"2010","unstructured":"Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11(3):R25.","journal-title":"Genome Biol"},{"issue":"10","key":"2191_CR8","doi-asserted-by":"publisher","first-page":"R106","DOI":"10.1186\/gb-2010-11-10-r106","volume":"11","author":"S Anders","year":"2010","unstructured":"Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11(10):R106.","journal-title":"Genome Biol"},{"issue":"12","key":"2191_CR9","doi-asserted-by":"publisher","first-page":"970","DOI":"10.1089\/cmb.2012.0283","volume":"20","author":"SN Hart","year":"2013","unstructured":"Hart SN, Therneau TM, Zhang Y, Poland GA, Kocher JP. Calculating sample size estimates for RNA sequencing data. J Comput Biol. 2013;20(12):970\u20138.","journal-title":"J Comput Biol"},{"key":"2191_CR10","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1186\/1471-2105-14-357","volume":"14","author":"CI Li","year":"2013","unstructured":"Li CI, Su PF, Shyr Y. Sample size calculation based on exact test for assessing differential expression analysis in RNA-seq data. BMC bioinformatics. 2013;14:357.","journal-title":"BMC bioinformatics"},{"issue":"3","key":"2191_CR11","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1093\/bioinformatics\/btt688","volume":"30","author":"Y Liu","year":"2014","unstructured":"Liu Y, Zhou J, White KP. RNA-seq differential expression studies: more sequence or more replication? Bioinformatics. 2014;30(3):301\u20134.","journal-title":"Bioinformatics"},{"issue":"11","key":"2191_CR12","doi-asserted-by":"publisher","first-page":"1684","DOI":"10.1261\/rna.046011.114","volume":"20","author":"T Ching","year":"2014","unstructured":"Ching T, Huang S, Garmire LX. Power analysis and sample size estimation for RNA-Seq differential expression. RNA. 2014;20(11):1684\u201396.","journal-title":"RNA"},{"doi-asserted-by":"crossref","unstructured":"Li CI, Samuels DC, Zhao YY, Shyr Y, Guo Y. Power and sample size calculations for high-throughput sequencing-based experiments. Brief Bioinform. 2017; https:\/\/www.ncbi.nlm.nih.gov\/pubmed\/28605403.","key":"2191_CR13","DOI":"10.1093\/bib\/bbx061"},{"unstructured":"Therneau TM, Hart SN, Kocher JP. RNASeqPower: Calculating samples Size estimates for RNA Seq studies. R package version 1.18.0. 2013.","key":"2191_CR14"},{"issue":"9","key":"2191_CR15","doi-asserted-by":"publisher","first-page":"1210","DOI":"10.1093\/bioinformatics\/btt118","volume":"29","author":"Y Guo","year":"2013","unstructured":"Guo Y, Li J, Li CI, Shyr Y, Samuels DC. MitoSeek: extracting mitochondria information and performing high-throughput mitochondria sequencing analysis. Bioinformatics. 2013;29(9):1210\u20131.","journal-title":"Bioinformatics"},{"issue":"2","key":"2191_CR16","doi-asserted-by":"publisher","first-page":"233","DOI":"10.1093\/bioinformatics\/btu640","volume":"31","author":"H Wu","year":"2015","unstructured":"Wu H, Wang C, Wu ZJ. PROPER: comprehensive power evaluation for differential expression using RNA-seq. Bioinformatics. 2015;31(2):233\u201341.","journal-title":"Bioinformatics"},{"issue":"11","key":"2191_CR17","doi-asserted-by":"publisher","first-page":"e91","DOI":"10.1093\/nar\/gku310","volume":"42","author":"X Zhou","year":"2014","unstructured":"Zhou X, Lindsay H, Robinson MD. Robustly detecting differential expression in RNA sequencing data using observation weights. Nucleic Acids Res. 2014;42(11):e91.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2191_CR18","doi-asserted-by":"publisher","first-page":"234","DOI":"10.1186\/s12859-017-1648-2","volume":"18","author":"L Yu","year":"2017","unstructured":"Yu L, Fernandez S, Brock G. Power analysis for RNA-Seq differential expression studies. BMC Bioinformatics. 2017;18(1):234.","journal-title":"BMC Bioinformatics"},{"issue":"Database issue","key":"2191_CR19","doi-asserted-by":"publisher","first-page":"D691","DOI":"10.1093\/nar\/gkq1018","volume":"39","author":"D Croft","year":"2011","unstructured":"Croft D, O\u2019Kelly G, Wu G, Haw R, Gillespie M, Matthews L, Caudy M, Garapati P, Gopinath G, Jassal B, et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 2011;39(Database issue):D691\u20137.","journal-title":"Nucleic Acids Res"},{"issue":"9","key":"2191_CR20","doi-asserted-by":"publisher","first-page":"R95","DOI":"10.1186\/gb-2013-14-9-r95","volume":"14","author":"F Rapaport","year":"2013","unstructured":"Rapaport F, Khanin R, Liang Y, Pirun M, Krek A, Zumbo P, Mason CE, Socci ND, Betel D. Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data. Genome Biol. 2013;14(9):R95.","journal-title":"Genome Biol"},{"key":"2191_CR21","volume-title":"R foundation for statistical computing","author":"R Core Team","year":"2016","unstructured":"R Core Team. R: a language and environment for statistical computing. In: R foundation for statistical computing; 2016. https:\/\/www.R-project.org\/:."},{"issue":"2","key":"2191_CR22","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1038\/nmeth.3252","volume":"12","author":"W Huber","year":"2015","unstructured":"Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, Carvalho BS, Bravo HC, Davis S, Gatto L, Girke T, et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods. 2015;12(2):115\u201321.","journal-title":"Nat Methods"},{"issue":"2","key":"2191_CR23","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1093\/biostatistics\/kxm030","volume":"9","author":"MD Robinson","year":"2008","unstructured":"Robinson MD, Smyth GK. Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics. 2008;9(2):321\u201332.","journal-title":"Biostatistics"},{"issue":"21","key":"2191_CR24","doi-asserted-by":"publisher","first-page":"2881","DOI":"10.1093\/bioinformatics\/btm453","volume":"23","author":"MD Robinson","year":"2007","unstructured":"Robinson MD, Smyth GK. Moderated statistical tests for assessing differences in tag abundance. Bioinformatics. 2007;23(21):2881\u20137.","journal-title":"Bioinformatics"},{"issue":"1","key":"2191_CR25","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1093\/bioinformatics\/btp616","volume":"26","author":"MD Robinson","year":"2010","unstructured":"Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139\u201340.","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-018-2191-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-018-2191-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-018-2191-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,30]],"date-time":"2022-05-30T07:04:41Z","timestamp":1653894281000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-018-2191-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,5,30]]},"references-count":25,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2018,12]]}},"alternative-id":["2191"],"URL":"https:\/\/doi.org\/10.1186\/s12859-018-2191-5","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2018,5,30]]},"assertion":[{"value":"8 August 2017","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 May 2018","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 May 2018","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Publisher\u2019s Note"}}],"article-number":"191"}}