{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T15:19:52Z","timestamp":1759331992977,"version":"3.33.0"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Choosing the appropriate sample size is an important step in the design of a microarray experiment, and recently methods have been proposed that estimate sample sizes for control of the False Discovery Rate (FDR). Many of these methods require knowledge of the distribution of effect sizes among the differentially expressed genes. If this distribution can be determined then accurate sample size requirements can be calculated.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We present a mixture model approach to estimating the distribution of effect sizes in data from two-sample comparative studies. Specifically, we present a novel, closed form, algorithm for estimating the noncentrality parameters in the test statistic distributions of differentially expressed genes. We then show how our model can be used to estimate sample sizes that control the FDR together with other statistical measures like average power or the false nondiscovery rate. Method performance is evaluated through a comparison with existing methods for sample size estimation, and is found to be very good.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>A novel method for estimating the appropriate sample size for a two-sample comparative microarray study is presented. 
The method is shown to perform very well when compared to existing methods.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-9-117","type":"journal-article","created":{"date-parts":[[2008,2,26]],"date-time":"2008-02-26T07:14:15Z","timestamp":1204010055000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["A mixture model approach to sample size estimation in two-sample comparative microarray experiments"],"prefix":"10.1186","volume":"9","author":[{"given":"Tommy S","family":"J\u00f8rstad","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Herman","family":"Midelfart","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Atle M","family":"Bones","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2008,2,25]]},"reference":[{"key":"2102_CR1","doi-asserted-by":"publisher","first-page":"2022","DOI":"10.1101\/gr.10.12.2022","volume":"10","author":"MJ Callow","year":"2000","unstructured":"Callow MJ, Dudoit S, Gong EL, Speed TP, Rubin EM: Microarray Expression Profiling Identifies Genes with Altered Expression in HDL-Deficient Mice. Genome Res 2000, 10: 2022\u20132029.","journal-title":"Genome Res"},{"key":"2102_CR2","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 1995, 57: 289\u2013300.","journal-title":"J R Stat Soc Ser B"},{"key":"2102_CR3","doi-asserted-by":"publisher","first-page":"479","DOI":"10.1111\/1467-9868.00346","volume":"64","author":"JD Storey","year":"2002","unstructured":"Storey JD: A direct approach to false discovery rates. 
J R Stat Soc Ser B 2002, 64: 479\u2013498.","journal-title":"J R Stat Soc Ser B"},{"key":"2102_CR4","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1016\/j.tplants.2007.01.001","volume":"12","author":"TS J\u00f8rstad","year":"2007","unstructured":"J\u00f8rstad TS, Langaas M, Bones AM: Understanding sample size: what determines the required number of microarrays for an experiment? Trends Plant Sci 2007, 12: 46\u201350.","journal-title":"Trends Plant Sci"},{"key":"2102_CR5","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1191\/0962280204sm369ra","volume":"13","author":"GL Gadbury","year":"2004","unstructured":"Gadbury GL, Page GP, Edwards J, Kayo T, Prolla TA, Weindruch R, Permana PA, Mountz JD, Allison DB: Power and sample size estimation in high dimensional biology. Stat Methods Med Res 2004, 13: 325\u2013338.","journal-title":"Stat Methods Med Res"},{"key":"2102_CR6","doi-asserted-by":"publisher","first-page":"990","DOI":"10.1198\/016214504000001646","volume":"99","author":"P M\u00fcller","year":"2004","unstructured":"M\u00fcller P, Parmigiani G, Robert C, Rousseau J: Optimal Sample Size for Multiple Testing: the Case of Gene Expression Microarrays. J Am Stat Assoc 2004, 99: 990\u20131001.","journal-title":"J Am Stat Assoc"},{"key":"2102_CR7","doi-asserted-by":"publisher","first-page":"3097","DOI":"10.1093\/bioinformatics\/bti456","volume":"21","author":"SH Jung","year":"2005","unstructured":"Jung SH: Sample size for FDR-control in microarray data analysis. Bioinformatics 2005, 21: 3097\u20133104.","journal-title":"Bioinformatics"},{"key":"2102_CR8","doi-asserted-by":"publisher","first-page":"2267","DOI":"10.1002\/sim.2119","volume":"24","author":"SS Li","year":"2005","unstructured":"Li SS, Bigler J, Lampe JW, Potter JD, Feng Z: FDR-controlling testing procedures and sample size determination for microarrays. 
Stat Med 2005, 24: 2267\u20132280.","journal-title":"Stat Med"},{"key":"2102_CR9","doi-asserted-by":"publisher","first-page":"3017","DOI":"10.1093\/bioinformatics\/bti448","volume":"21","author":"Y Pawitan","year":"2005","unstructured":"Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A: False discovery rate, sensitivity and sample size for microarray studies. Bioinformatics 2005, 21: 3017\u20133024.","journal-title":"Bioinformatics"},{"key":"2102_CR10","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1186\/1471-2105-7-106","volume":"7","author":"R Tibshirani","year":"2006","unstructured":"Tibshirani R: A simple method for assessing sample sizes in microarray experiments. BMC Bioinformatics 2006, 7: 106.","journal-title":"BMC Bioinformatics"},{"key":"2102_CR11","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1093\/bioinformatics\/btl664","volume":"23","author":"P Liu","year":"2007","unstructured":"Liu P, Hwang JTG: Quick Calculation for Sample Size while Controlling False Discovery Rate with Application to Microarray Analysis. Bioinformatics 2007, 23: 739\u2013746.","journal-title":"Bioinformatics"},{"key":"2102_CR12","first-page":"Article 8","volume":"2","author":"JA Ferreira","year":"2007","unstructured":"Ferreira JA, Zwinderman AH: Approximate Power and Sample Size Calculations with the Benjamini-Hochberg Method. Int J Biostat 2007, 2: Article 8.","journal-title":"Int J Biostat"},{"key":"2102_CR13","doi-asserted-by":"publisher","first-page":"3264","DOI":"10.1093\/bioinformatics\/bti519","volume":"21","author":"J Hu","year":"2005","unstructured":"Hu J, Zou F, Wright FA: Practical FDR-based sample size calculations in microarray experiments. 
Bioinformatics 2005, 21: 3264\u20133272.","journal-title":"Bioinformatics"},{"key":"2102_CR14","doi-asserted-by":"publisher","first-page":"4263","DOI":"10.1093\/bioinformatics\/bti699","volume":"21","author":"S Pounds","year":"2005","unstructured":"Pounds S, Cheng C: Sample Size Determination for the False Discovery Rate. Bioinformatics 2005, 21: 4263\u20134271.","journal-title":"Bioinformatics"},{"key":"2102_CR15","doi-asserted-by":"publisher","first-page":"2013","DOI":"10.1214\/aos\/1074290335","volume":"31","author":"JD Storey","year":"2003","unstructured":"Storey JD: The Positive False Discovery Rate: A Bayesian Interpretation and the q-value. Ann Stat 2003, 31: 2013\u20132035.","journal-title":"Ann Stat"},{"key":"2102_CR16","doi-asserted-by":"publisher","first-page":"3865","DOI":"10.1093\/bioinformatics\/bti626","volume":"21","author":"Y Pawitan","year":"2005","unstructured":"Pawitan Y, Murthy KRK, Michiels S, Ploner A: Bias in the estimation of the false discovery rate in microarray studies. Bioinformatics 2005, 21: 3865\u20133872.","journal-title":"Bioinformatics"},{"key":"2102_CR17","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1214\/aos\/1176346059","volume":"11","author":"BG Lindsay","year":"1983","unstructured":"Lindsay BG: The Geometry of Mixture Likelihoods: A General Theory. Ann Stat 1983, 11: 86\u201394.","journal-title":"Ann Stat"},{"key":"2102_CR18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","volume":"39","author":"AP Dempster","year":"1977","unstructured":"Dempster AP, Laird NM, Rubin DB: Maximum Likelihood from Incomplete Data via the EM Algorithm. J R Stat Soc Ser B 1977, 39: 1\u201338.","journal-title":"J R Stat Soc Ser B"},{"key":"2102_CR19","volume-title":"Continuous Univariate Distributions","author":"NL Johnson","year":"1995","unstructured":"Johnson NL, Kotz S, Balakrishnan N: Continuous Univariate Distributions. Volume 2. second edition. 
John Wiley and Sons, Inc; 1995.","edition":"second"},{"key":"2102_CR20","doi-asserted-by":"publisher","DOI":"10.1002\/0471721182","volume-title":"Finite Mixture Models","author":"G McLachlan","year":"2000","unstructured":"McLachlan G, Peel D: Finite Mixture Models. John Wiley and Sons, Inc; 2000."},{"key":"2102_CR21","volume-title":"Methods of Mathematical Physics","author":"H Jeffreys","year":"1972","unstructured":"Jeffreys H, Jeffreys BS: Methods of Mathematical Physics. third edition. Cambridge University Press; 1972.","edition":"third"},{"key":"2102_CR22","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1093\/biomet\/69.3.493","volume":"69","author":"T Schweder","year":"1982","unstructured":"Schweder T, Spj\u00f8tvoll E: Plots of p-values to evaluate many tests simultaneously. Biometrika 1982, 69: 493\u2013502.","journal-title":"Biometrika"},{"key":"2102_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/S0167-9473(01)00046-9","volume":"39","author":"DB Allison","year":"2002","unstructured":"Allison DB, Gadbury GL, Heo M, Fernandez JR, Lee CK, Prolla TA, Weindruch R: A mixture model approach for the analysis of microarray gene expression data. Comput Stat Data An 2002, 39: 1\u201320.","journal-title":"Comput Stat Data An"},{"key":"2102_CR24","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1111\/j.1467-9868.2005.00515.x","volume":"67","author":"M Langaas","year":"2005","unstructured":"Langaas M, Lindqvist BH, Ferkingstad E: Estimating the proportion of true null hypotheses, with application to DNA microarray data. J R Stat Soc Ser B 2005, 67: 555\u2013572.","journal-title":"J R Stat Soc Ser B"},{"key":"2102_CR25","first-page":"267","volume-title":"Second International Symposium on Information Theory","author":"H Akaike","year":"1973","unstructured":"Akaike H: Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory Edited by: Petrov BN, Csaki F. 
1973, 267\u2013281."},{"key":"2102_CR26","doi-asserted-by":"publisher","first-page":"461","DOI":"10.1214\/aos\/1176344136","volume":"6","author":"G Schwarz","year":"1978","unstructured":"Schwarz G: Estimating the Dimension of a Model. Ann Stat 1978, 6: 461\u2013464.","journal-title":"Ann Stat"},{"key":"2102_CR27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/BF02595811","volume":"12","author":"JD Storey","year":"2003","unstructured":"Storey JD: Invited comment on 'Resampling-based multiple testing for DNA microarray data analysis' by Ge, Dudoit, and Speed. Test 2003, 12: 1\u201377.","journal-title":"Test"},{"key":"2102_CR28","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1093\/imanum\/22.3.329","volume":"22","author":"NJ Higham","year":"2002","unstructured":"Higham NJ: Computing the nearest correlation matrix \u2013 a problem from finance. IMA J Numer Anal 2002, 22: 329\u2013343.","journal-title":"IMA J Numer Anal"},{"key":"2102_CR29","doi-asserted-by":"publisher","first-page":"397","DOI":"10.1007\/0-387-29362-0_23","volume-title":"Bioinformatics and Computational Biology Solutions using R and Bioconductor","author":"GK Smyth","year":"2005","unstructured":"Smyth GK: Limma: linear models for microarray data. In Bioinformatics and Computational Biology Solutions using R and Bioconductor. Edited by: Gentleman R, Carey V, Dudoit S, Irizarry R, Huber W. Springer, New York; 2005:397\u2013420."},{"key":"2102_CR30","doi-asserted-by":"publisher","first-page":"557","DOI":"10.1089\/106652701753307485","volume":"8","author":"DM Rocke","year":"2001","unstructured":"Rocke DM, Durbin B: A Model for Measurement Error for Gene Expression Arrays. 
J Comput Biol 2001, 8: 557\u2013569.","journal-title":"J Comput Biol"},{"key":"2102_CR31","doi-asserted-by":"crossref","first-page":"Article 3","DOI":"10.2202\/1544-6115.1027","volume":"3","author":"GK Smyth","year":"2004","unstructured":"Smyth GK: Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments. Stat Appl Genet Mol Biol 2004, 3: Article 3.","journal-title":"Stat Appl Genet Mol Biol"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-117.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,28]],"date-time":"2025-01-28T19:25:32Z","timestamp":1738092332000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-117"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,2,25]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2102"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-117","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2008,2,25]]},"assertion":[{"value":"3 July 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 February 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 February 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"117"}}