{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T15:12:19Z","timestamp":1773328339166,"version":"3.50.1"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T00:00:00Z","timestamp":1698624000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T00:00:00Z","timestamp":1698624000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Gene-wise differential expression is usually the first major step in the statistical analysis of high-throughput data obtained from techniques such as microarrays or RNA-sequencing. The analysis at gene level is often complemented by interrogating the data in a broader biological context that considers as unit of measure groups of genes that may have a common function or biological trait. Among the vast number of publications about gene set analysis (GSA), the rotation test for gene set analysis, also referred to as roast, is a general sample randomization approach that maintains the integrity of the intra-gene set correlation structure in defining the null distribution of the test.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We present <jats:italic>roastgsa<\/jats:italic>, an R package that contains several enrichment score functions that feed the roast algorithm for hypothesis testing. These implemented methods are evaluated using both simulated and benchmarking data in microarray and RNA-seq datasets. We find that computationally intensive measures based on Kolmogorov-Smirnov (KS) statistics fail to improve the rates of simpler measures of GSA like mean and maxmean scores. We also show the importance of accounting for the gene linear dependence structure of the testing set, which is linked to the loss of effective signature size. Complete graphical representation of the results, including an approximation for the effective signature size, can be obtained as part of the <jats:italic>roastgsa<\/jats:italic> output.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>We encourage the usage of the absmean (non-directional), mean (directional) and maxmean (directional) scores for roast GSA analysis as these are simple measures of enrichment that have presented dominant results in all provided analyses in comparison to the more complex KS measures.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-023-05510-x","type":"journal-article","created":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T15:03:00Z","timestamp":1698678180000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Roastgsa: a comparison of rotation-based scores for gene set enrichment analysis"],"prefix":"10.1186","volume":"24","author":[{"given":"Adri\u00e0","family":"Caball\u00e9-Mestres","sequence":"first","affiliation":[]},{"given":"Antoni","family":"Berenguer-Llergo","sequence":"additional","affiliation":[]},{"given":"Camille","family":"Stephan-Otto Attolini","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,10,30]]},"reference":[{"issue":"8","key":"5510_CR1","doi-asserted-by":"publisher","first-page":"980","DOI":"10.1093\/bioinformatics\/btm051","volume":"23","author":"JJ Goeman","year":"2007","unstructured":"Goeman JJ, Buhlmann P. Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics. 2007;23(8):980\u20137. https:\/\/doi.org\/10.1093\/bioinformatics\/btm051.","journal-title":"Bioinformatics"},{"issue":"17","key":"5510_CR2","doi-asserted-by":"publisher","first-page":"2176","DOI":"10.1093\/bioinformatics\/btq401","volume":"26","author":"E Lim","year":"2010","unstructured":"Lim E, Wu D, Smyth GK, Asselin-Labat M-L, Vaillant F, Visvader JE. ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics. 2010;26(17):2176\u201382. https:\/\/doi.org\/10.1093\/bioinformatics\/btq401.","journal-title":"Bioinformatics"},{"issue":"13","key":"5510_CR3","doi-asserted-by":"publisher","first-page":"511","DOI":"10.1093\/bioinformatics\/btq380","volume":"27","author":"D Nam","year":"2011","unstructured":"Nam D. De-correlating expression in gene-set analysis. Bioinformatics. 2011;27(13):511\u20136. https:\/\/doi.org\/10.1093\/bioinformatics\/btq380.","journal-title":"Bioinformatics"},{"issue":"1","key":"5510_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-015-0571-7","volume":"16","author":"JL Larson","year":"2015","unstructured":"Larson JL, Owen AB. Moment based gene set tests. BMC Bioinf. 2015;16(1):1\u201317. https:\/\/doi.org\/10.1186\/s12859-015-0571-7.","journal-title":"BMC Bioinf"},{"issue":"9","key":"5510_CR5","doi-asserted-by":"publisher","first-page":"1943","DOI":"10.1093\/bioinformatics\/bti260","volume":"21","author":"WT Barry","year":"2005","unstructured":"Barry WT, Nobel AB, Wright FA. Significance analysis of functional categories in gene expression studies: A structured permutation approach. Bioinformatics. 2005;21(9):1943\u20139. https:\/\/doi.org\/10.1093\/bioinformatics\/bti260.","journal-title":"Bioinformatics"},{"issue":"43","key":"5510_CR6","doi-asserted-by":"publisher","first-page":"15545","DOI":"10.1073\/pnas.0506580102","volume":"102","author":"A Subramanian","year":"2005","unstructured":"Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceed Natl Academy Sci. 2005;102(43):15545\u201350. https:\/\/doi.org\/10.1073\/pnas.0506580102.","journal-title":"Proceed Natl Academy Sci"},{"issue":"1","key":"5510_CR7","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1214\/07-aoas101.0610667v2","volume":"1","author":"B Efron","year":"2007","unstructured":"Efron B, Tibshirani R. On testing the significance of sets of genes. Annals Appl Statist. 2007;1(1):107\u201329. https:\/\/doi.org\/10.1214\/07-aoas101.0610667v2.","journal-title":"Annals Appl Statist"},{"issue":"17","key":"5510_CR8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/nar\/gks461","volume":"40","author":"D Wu","year":"2012","unstructured":"Wu D, Smyth GK. Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Res. 2012;40(17):1\u201312. https:\/\/doi.org\/10.1093\/nar\/gks461.","journal-title":"Nucleic Acids Res"},{"key":"5510_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-6-144","volume":"6","author":"SY Kim","year":"2005","unstructured":"Kim SY, Volsky DJ. PAGE: parametric analysis of gene set enrichment. BMC Bioinf. 2005;6:1\u201312. https:\/\/doi.org\/10.1186\/1471-2105-6-144.","journal-title":"BMC Bioinf"},{"key":"5510_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-10-161","volume":"10","author":"W Luo","year":"2009","unstructured":"Luo W, Friedman MS, Shedden K, Hankenson KD, Woolf PJ. GAGE: generally applicable gene set enrichment for pathway analysis. BMC Bioinf. 2009;10:1\u201317. https:\/\/doi.org\/10.1186\/1471-2105-10-161.","journal-title":"BMC Bioinf"},{"issue":"18","key":"5510_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/nar\/gkt660","volume":"41","author":"G Yaari","year":"2013","unstructured":"Yaari G, Bolen CR, Thakar J, Kleinstein SH. Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations. Nucleic Acids Res. 2013;41(18):1\u201311. https:\/\/doi.org\/10.1093\/nar\/gkt660.","journal-title":"Nucleic Acids Res"},{"issue":"19","key":"5510_CR12","doi-asserted-by":"publisher","first-page":"2747","DOI":"10.1093\/bioinformatics\/btu374","volume":"30","author":"P Mishra","year":"2014","unstructured":"Mishra P, T\u00f6r\u00f6nen P, Leino Y, Holm L. Gene set analysis: limitations in popular existing methods and proposed improvements. Bioinformatics. 2014;30(19):2747\u201356. https:\/\/doi.org\/10.1093\/bioinformatics\/btu374.","journal-title":"Bioinformatics"},{"key":"5510_CR13","doi-asserted-by":"publisher","DOI":"10.1101\/060012","author":"A Sergushichev","year":"2016","unstructured":"Sergushichev A. An algorithm for fast preranked gene set enrichment analysis using cumulative statistic calculation. bioRxiv. 2016. https:\/\/doi.org\/10.1101\/060012.","journal-title":"bioRxiv"},{"key":"5510_CR14","unstructured":"GSEA-MSigDB Documentation. https:\/\/docs.gsea-msigdb.org\/. Accessed: 2023-01-30"},{"issue":"1","key":"5510_CR15","doi-asserted-by":"publisher","first-page":"472","DOI":"10.1007\/s11065-015-9294-9","volume":"25","author":"P Tamayo","year":"2016","unstructured":"Tamayo P, Steinhardt G, Liberzon A, Mesirov JP. The limitations of simple gene set enrichment analysis assuming gene independence. Stat Methods Med Res. 2016;25(1):472\u201387. https:\/\/doi.org\/10.1007\/s11065-015-9294-9. (Functional.15334406).","journal-title":"Stat Methods Med Res"},{"issue":"7","key":"5510_CR16","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1093\/nar\/gkv007","volume":"43","author":"ME Ritchie","year":"2015","unstructured":"Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):47. https:\/\/doi.org\/10.1093\/nar\/gkv007.","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"5510_CR17","doi-asserted-by":"publisher","first-page":"393","DOI":"10.1093\/bib\/bbv069","volume":"17","author":"Y Rahmatallah","year":"2016","unstructured":"Rahmatallah Y, Emmert-Streib F, Glazko G. Gene set analysis approaches for RNA-seq data: performance evaluation and application guideline. Briefings Bioinf. 2016;17(3):393\u2013407. https:\/\/doi.org\/10.1093\/bib\/bbv069.","journal-title":"Briefings Bioinf"},{"issue":"1","key":"5510_CR18","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1007\/s11222-005-4789-5","volume":"15","author":"\u00d8 Langsrud","year":"2005","unstructured":"Langsrud \u00d8. Rotation tests. Stat Comput. 2005;15(1):53\u201360. https:\/\/doi.org\/10.1007\/s11222-005-4789-5.","journal-title":"Stat Comput"},{"key":"5510_CR19","doi-asserted-by":"publisher","unstructured":"Hanzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-Seq data 14(1), 7 (2013). https:\/\/doi.org\/10.1186\/1471-2105-14-7","DOI":"10.1186\/1471-2105-14-7"},{"issue":"11","key":"5510_CR20","doi-asserted-by":"publisher","first-page":"79217","DOI":"10.1371\/journal.pone.0079217","volume":"8","author":"AL Tarca","year":"2013","unstructured":"Tarca AL, Bhatti G, Romero R. A comparison of gene set analysis methods in terms of sensitivity, prioritization and specificity. PLoS ONE. 2013;8(11):79217. https:\/\/doi.org\/10.1371\/journal.pone.0079217.","journal-title":"PLoS ONE"},{"issue":"6","key":"5510_CR21","doi-asserted-by":"publisher","first-page":"417","DOI":"10.1016\/j.cels.2015.12.004","volume":"1","author":"A Liberzon","year":"2015","unstructured":"Liberzon A, Birger C, Thorvaldsd\u00f3ttir H, Ghandi M, Mesirov JP, Tamayo P. The molecular signatures database Hallmark gene set collection. Cell Syst. 2015;1(6):417\u201325. https:\/\/doi.org\/10.1016\/j.cels.2015.12.004.","journal-title":"Cell Syst"},{"key":"5510_CR22","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-020-3450-9","author":"D Gerard","year":"2020","unstructured":"Gerard D. Data-based RNA-seq simulations by binomial thinning. BMC Bioinf. 2020. https:\/\/doi.org\/10.1186\/s12859-020-3450-9.","journal-title":"BMC Bioinf"},{"issue":"550","key":"5510_CR23","doi-asserted-by":"publisher","first-page":"877","DOI":"10.1186\/s13059-014-0550-8","volume":"15","author":"MI Love","year":"2014","unstructured":"Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(550):877\u201387. https:\/\/doi.org\/10.1186\/s13059-014-0550-8.","journal-title":"Genome Biol"},{"key":"5510_CR24","unstructured":"Geistlinger L, Csaba G, Santarelli M, Schiffer L, Ramos M, Zimmer R, Waldron L. GSEABenchmarkeR: Reproducible GSEA Benchmarking. (2019). R package version 1.2.1. https:\/\/github.com\/waldronlab\/GSEABenchmarkeR"},{"key":"5510_CR25","doi-asserted-by":"publisher","first-page":"877","DOI":"10.1093\/nar\/gkw1012","volume":"45","author":"N Rappaport","year":"2017","unstructured":"Rappaport N, Twik M, Plaschkes I, Nudel R, Stein TI, Levitt J, Gershoni M, Morrey CP, Safran M, Lancet D. MalaCards: an amalgamated human disease compendium with diverse clinical and genetic annotation and structured search. Nucleic Acids Res. 2017;45:877\u201387. https:\/\/doi.org\/10.1093\/nar\/gkw1012.","journal-title":"Nucleic Acids Res"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05510-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-023-05510-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05510-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T15:03:28Z","timestamp":1698678208000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-023-05510-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,30]]},"references-count":25,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["5510"],"URL":"https:\/\/doi.org\/10.1186\/s12859-023-05510-x","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,30]]},"assertion":[{"value":"7 February 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 October 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 October 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"408"}}