{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,11]],"date-time":"2025-02-11T05:15:02Z","timestamp":1739250902428,"version":"3.37.0"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Large-scale statistical analyses have become hallmarks of post-genomic era biological research due to advances in high-throughput assays and the integration of large biological databases. One accompanying issue is the simultaneous estimation of p-values for a large number of hypothesis tests. In many applications, a parametric assumption in the null distribution such as normality may be unreasonable, and resampling-based p-values are the preferred procedure for establishing statistical significance. Using resampling-based procedures for multiple testing is computationally intensive and typically requires large numbers of resamples.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We present a new approach to more efficiently assign resamples (such as bootstrap samples or permutations) within a nonparametric multiple testing framework. We formulated a Bayesian-inspired approach to this problem, and devised an algorithm that adapts the assignment of resamples iteratively with negligible space and running time overhead. 
In two experimental studies, a breast cancer microarray dataset and a genome wide association study dataset for Parkinson's disease, we demonstrated that our differential allocation procedure is substantially more accurate compared to the traditional uniform resample allocation.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>Our experiments demonstrate that using a more sophisticated allocation strategy can improve our inference for hypothesis testing without a drastic increase in the amount of computation on randomized data. Moreover, we gain more improvement in efficiency when the number of tests is large. R code for our algorithm and the shortcut method are available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/people.pcbi.upenn.edu\/~lswang\/pub\/bmc2009\/\" ext-link-type=\"uri\">http:\/\/people.pcbi.upenn.edu\/~lswang\/pub\/bmc2009\/<\/jats:ext-link>.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-10-198","type":"journal-article","created":{"date-parts":[[2009,6,30]],"date-time":"2009-06-30T06:13:31Z","timestamp":1246342411000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["A Bayesian approach to efficient differential allocation for resampling-based significance testing"],"prefix":"10.1186","volume":"10","author":[{"given":"Shane T","family":"Jensen","sequence":"first","affiliation":[]},{"given":"Sameer","family":"Soi","sequence":"additional","affiliation":[]},{"given":"Li-San","family":"Wang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2009,6,28]]},"reference":[{"key":"2928_CR1","doi-asserted-by":"publisher","first-page":"5116","DOI":"10.1073\/pnas.091062498","volume":"98","author":"VG Tusher","year":"2001","unstructured":"Tusher VG, Tibshirani RJ, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. 
Proc Nat Acad Sci 2001, 98: 5116\u20135121.","journal-title":"Proc Nat Acad Sci"},{"key":"2928_CR2","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1093\/bioinformatics\/btg093","volume":"19","author":"SD Peddada","year":"2003","unstructured":"Peddada SD, Lobenhofer EK, Li L, Afshari CA, Weinberg CR, Umbach DM: Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference. Bioinformatics 2003, 19: 834\u2013841.","journal-title":"Bioinformatics"},{"key":"2928_CR3","volume-title":"Practical Nonparametric Statistics","author":"W Conover","year":"1998","unstructured":"Conover W: Practical Nonparametric Statistics. 3rd edition. New York, NY, USA: Wiley; 1998.","edition":"3"},{"key":"2928_CR4","doi-asserted-by":"crossref","DOI":"10.1201\/9780429246593","volume-title":"An Introduction to the Bootstrap","author":"B Efron","year":"1994","unstructured":"Efron B, Tibshirani RJ: An Introduction to the Bootstrap. Boca Raton, FL, USA: Chapman & Hall\/CRC; 1994."},{"key":"2928_CR5","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4613-8122-8","volume-title":"Simultaneous statistical inference","author":"RG Miller","year":"1981","unstructured":"Miller RG: Simultaneous statistical inference. 2nd edition. New York, NY, USA: Springer Verlag; 1981.","edition":"2"},{"key":"2928_CR6","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1002\/gepi.1124","volume":"23","author":"B Efron","year":"2002","unstructured":"Efron B, Tibshirani R: Empirical Bayes methods and false discovery rates for microarrays. Genetic Epidemiology 2002, 23: 70\u201386.","journal-title":"Genetic Epidemiology"},{"key":"2928_CR7","doi-asserted-by":"publisher","first-page":"9440","DOI":"10.1073\/pnas.1530509100","volume":"100","author":"JD Storey","year":"2003","unstructured":"Storey JD, Tibshirani R: Statistical significance for genomewide studies. 
Proceedings of the National Academy of Sciences 2003, 100: 9440\u20139445.","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"3","key":"2928_CR8","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1109\/TCBB.2004.24","volume":"1","author":"S Scheid","year":"2004","unstructured":"Scheid S, Spang R: A Stochastic Downhill Search Algorithm for Estimating the Local False Discovery Rate. IEEE Transactions on Computational Biology and Bioinformatics 2004, 1(3):98\u2013108.","journal-title":"IEEE Transactions on Computational Biology and Bioinformatics"},{"key":"2928_CR9","doi-asserted-by":"publisher","first-page":"368","DOI":"10.1093\/bioinformatics\/btf877","volume":"19","author":"A Reiner","year":"2003","unstructured":"Reiner A, Yekutieli D, Benjamini Y: Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 2003, 19: 368\u2013375.","journal-title":"Bioinformatics"},{"key":"2928_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/BF02595811","volume":"12","author":"Y Ge","year":"2007","unstructured":"Ge Y, Dudoit S, Speed TP: Resampling-based multiple testing for microarray data analysis. TEST 2007, 12: 1\u201377.","journal-title":"TEST"},{"key":"2928_CR11","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1186\/1471-2105-6-187","volume":"6","author":"N Jain","year":"2005","unstructured":"Jain N, Cho H, O'Connell M, Lee JK: Rank-invariant resampling based estimation of false discovery rate for analysis of small sample microarray data. BMC Bioinformatics 2005, 6: 187.","journal-title":"BMC Bioinformatics"},{"key":"2928_CR12","doi-asserted-by":"publisher","first-page":"5116","DOI":"10.1073\/pnas.091062498","volume":"98","author":"VG Tusher","year":"2001","unstructured":"Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. 
Proceedings of the National Academy of Sciences 2001, 98: 5116\u20135121.","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"2928_CR13","doi-asserted-by":"publisher","first-page":"4280","DOI":"10.1093\/bioinformatics\/bti685","volume":"21","author":"Y Xie","year":"2005","unstructured":"Xie Y, Pan W, Khodursky A: A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data. Bioinformatics 2005, 21: 4280\u20134288.","journal-title":"Bioinformatics"},{"key":"2928_CR14","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1093\/bioinformatics\/btl548","volume":"23","author":"H Yang","year":"2007","unstructured":"Yang H, Churchill G: Estimating p-values in small microarray experiments. Bioinformatics 2007, 23: 38\u201343.","journal-title":"Bioinformatics"},{"key":"2928_CR15","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1093\/biomet\/78.2.301","volume":"78","author":"J Besag","year":"1991","unstructured":"Besag J, Clifford P: Sequential Monte Carlo p-values. Biometrika 1991, 78: 301\u2013304.","journal-title":"Biometrika"},{"issue":"3","key":"2928_CR16","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1086\/519795","volume":"81","author":"S Purcell","year":"2007","unstructured":"Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira M, Bender D, Maller J, Sklar P, de Bakker P, Daly M, Sham P: PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics 2007, 81(3):559\u2013575.","journal-title":"American Journal of Human Genetics"},{"key":"2928_CR17","doi-asserted-by":"publisher","first-page":"539","DOI":"10.1056\/NEJM200102223440801","volume":"344","author":"I Hedenfalk","year":"2001","unstructured":"Hedenfalk I, Duggan D, Chen YD, Radmacher M, Bittner M, Simon R, Meltzer P, Gusterson B, Esteller M, Kallioniemi OP, et al.: Gene-Expression Profiles in Hereditary Breast Cancer. 
N Engl J Med 2001, 344: 539\u2013548.","journal-title":"N Engl J Med"},{"issue":"11","key":"2928_CR18","doi-asserted-by":"publisher","first-page":"911","DOI":"10.1016\/S1474-4422(06)70578-6","volume":"5","author":"HC Fung","year":"2006","unstructured":"Fung HC, Scholz S, Matarin M, Sim\u00f3n-S\u00e1nchez J, Hernandez D, Britton A, Gibbs JR, Langefeld C, Stiegert ML, et al.: Genome-wide genotyping in Parkinson's disease and neurologically normal controls: first stage analysis and public release of data. Lancet Neurology 2006, 5(11):911\u2013916.","journal-title":"Lancet Neurology"},{"key":"2928_CR19","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 1995, 57: 289\u2013300.","journal-title":"J R Stat Soc Ser B"},{"key":"2928_CR20","doi-asserted-by":"publisher","first-page":"1149","DOI":"10.1101\/gr.5076506","volume":"16","author":"SJ Diskin","year":"2006","unstructured":"Diskin SJ, Eck T, Greshock J, Mosse YP, Naylor T, Christian J, Stoeckert J, Weber BL, Maris JM, Grant GR: STAC: A method for testing the significance of DNA copy-number aberrations across multiple array-CGH experiments. Genome Research 2006, 16: 1149\u20131158.","journal-title":"Genome Research"},{"key":"2928_CR21","volume-title":"R: A Language and Environment for Statistical Computing","author":"R Development Core Team","year":"2005","unstructured":"R Development Core Team:R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2005. 
[http:\/\/www.R-project.org] [ISBN 3-900051-07-0]"},{"key":"2928_CR22","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1214\/ss\/1009213286","volume":"16","author":"LD Brown","year":"2001","unstructured":"Brown LD, Cai TT, DasGupta A: Interval Estimation for a Binomial Proportion. Statistical Science 2001, 16: 101\u2013133.","journal-title":"Statistical Science"},{"key":"2928_CR23","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1002\/9780471781141.ch12","volume-title":"Genetic Analysis of Complex Disease","author":"ER Martin","year":"2006","unstructured":"Martin ER: Linkage Disequilibrium and Association Analysis. In Genetic Analysis of Complex Disease. 2nd edition. Edited by: Haines JL, Pericak-Vance M. New York, NY, USA: Wiley; 2006:329\u2013354.","edition":"2"},{"key":"2928_CR24","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1198\/016214504000000287","volume":"99","author":"DP Foster","year":"2004","unstructured":"Foster DP, Stine RA: Variable Selection in Data Mining: Building a Predictive Model for Bankruptcy. Journal of the American Statistical Association 2004, 99: 303\u2013313.","journal-title":"Journal of the American Statistical Association"},{"key":"2928_CR25","doi-asserted-by":"publisher","first-page":"15545","DOI":"10.1073\/pnas.0506580102","volume":"102","author":"A Subramanian","year":"2005","unstructured":"Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. 
Proceedings of National Academy of Sciences 2005, 102: 15545\u201315550.","journal-title":"Proceedings of National Academy of Sciences"},{"key":"2928_CR26","doi-asserted-by":"publisher","first-page":"385","DOI":"10.1038\/nrg1839","volume":"7","author":"NM Laird","year":"2006","unstructured":"Laird NM, Lange C: Family-based designs in the age of large-scale gene-association studies. Nature Reviews Genetics 2006, 7: 385\u2013394.","journal-title":"Nature Reviews Genetics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-10-198.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,10]],"date-time":"2025-02-10T14:11:17Z","timestamp":1739196677000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-10-198"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,6,28]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,12]]}},"alternative-id":["2928"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-10-198","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2009,6,28]]},"assertion":[{"value":"23 December 2008","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 June 2009","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 June 2009","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"198"}}