{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,31]],"date-time":"2026-05-31T14:03:56Z","timestamp":1780236236007,"version":"3.54.0"},"reference-count":58,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2026,5,31]],"date-time":"2026-05-31T00:00:00Z","timestamp":1780185600000},"content-version":"vor","delay-in-days":30,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"ERA-NET on Translational Cancer Research"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,5,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Gene set analysis evaluates the collective impact of groups of genes on an outcome of interest, such as disease occurrence. By incorporating biological knowledge through predefined gene sets, this approach enhances the interpretability of results and improves statistical power compared with gene-wise analyses. In the context of time-to-event data, existing methods are limited and fail to account for potentially strong correlations within gene sets. Given the strong performance of the Generalized Berk-Jones (GBJ) statistic, which effectively incorporates correlation within the test statistic, we adapted this method to the time-to-event framework using a Cox model. We then compared its performance with established methods, including the Cauchy, Harmonic Mean, Wald test, global test, and global boost test. We further benchmarked these methods in two different real-world datasets: gliomas and breast cancer. Our proposed method, sGBJ, shows an overcontrol of Type I error, leading to reduced statistical power compared with other methods in numerical studies particularly when the number of genes is greater than or equal to the number of observations. 
The Wald test and global boost test generally exhibited the highest power, except in very high-correlation settings for the global boost test, while the Wald test could not adjust for confounders in current implementations.<\/jats:p>","DOI":"10.1093\/bib\/bbag262","type":"journal-article","created":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T11:33:13Z","timestamp":1777980793000},"source":"Crossref","is-referenced-by-count":0,"title":["Gene set analysis for time-to-event outcome: comparison of a new approach based on the generalized Berk\u2013Jones statistic with existing methods in presence of intra gene-set correlation"],"prefix":"10.1093","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8455-4665","authenticated-orcid":false,"given":"Thomas","family":"Fert\u00e9","sequence":"first","affiliation":[{"name":"SISTM team, Inserm Bordeaux Population Health Research Center , UMR 1219, Univ. Bordeaux, 146 rue L\u00e9o Saignat, F-33000 Bordeaux, Gironde ,","place":["France"]},{"name":"SISTM team, Centre INRIA de l'Universit\u00e9 de Bordeaux , 200 Av. de la Vieille Tour, F-33400 Talence, Gironde ,","place":["France"]},{"name":"VRI, Vaccine Research Institute, H\u00f4pital Henri Mondor , 8 Rue du G\u00e9n\u00e9ral Sarrail, 94000 Cr\u00e9teil, F-94000 Cr\u00e9teil, Val-de-Marne ,","place":["France"]},{"name":"P\u00f4le de Sant\u00e9 Publique , CHU de Bordeaux, Pl. Am\u00e9lie Raba L\u00e9on, F-33000 Bordeaux ,","place":["France"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Laura","family":"Villain","sequence":"additional","affiliation":[{"name":"SISTM team, Inserm Bordeaux Population Health Research Center , UMR 1219, Univ. Bordeaux, 146 rue L\u00e9o Saignat, F-33000 Bordeaux, Gironde ,","place":["France"]},{"name":"SISTM team, Centre INRIA de l'Universit\u00e9 de Bordeaux , 200 Av. 
de la Vieille Tour, F-33400 Talence, Gironde ,","place":["France"]},{"name":"VRI, Vaccine Research Institute, H\u00f4pital Henri Mondor , 8 Rue du G\u00e9n\u00e9ral Sarrail, 94000 Cr\u00e9teil, F-94000 Cr\u00e9teil, Val-de-Marne ,","place":["France"]},{"name":"ESQlabs GmbH , Am Sportplatz 7, D-26683 Saterland, Niedersachsen ,","place":["Germany"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Rodolphe","family":"Thi\u00e9baut","sequence":"additional","affiliation":[{"name":"SISTM team, Inserm Bordeaux Population Health Research Center , UMR 1219, Univ. Bordeaux, 146 rue L\u00e9o Saignat, F-33000 Bordeaux, Gironde ,","place":["France"]},{"name":"SISTM team, Centre INRIA de l'Universit\u00e9 de Bordeaux , 200 Av. de la Vieille Tour, F-33400 Talence, Gironde ,","place":["France"]},{"name":"VRI, Vaccine Research Institute, H\u00f4pital Henri Mondor , 8 Rue du G\u00e9n\u00e9ral Sarrail, 94000 Cr\u00e9teil, F-94000 Cr\u00e9teil, Val-de-Marne ,","place":["France"]},{"name":"P\u00f4le de Sant\u00e9 Publique , CHU de Bordeaux, Pl. Am\u00e9lie Raba L\u00e9on, F-33000 Bordeaux ,","place":["France"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0646-452X","authenticated-orcid":false,"given":"Boris P","family":"Hejblum","sequence":"additional","affiliation":[{"name":"SISTM team, Inserm Bordeaux Population Health Research Center , UMR 1219, Univ. Bordeaux, 146 rue L\u00e9o Saignat, F-33000 Bordeaux, Gironde ,","place":["France"]},{"name":"SISTM team, Centre INRIA de l'Universit\u00e9 de Bordeaux , 200 Av. 
de la Vieille Tour, F-33400 Talence, Gironde ,","place":["France"]},{"name":"VRI, Vaccine Research Institute, H\u00f4pital Henri Mondor , 8 Rue du G\u00e9n\u00e9ral Sarrail, 94000 Cr\u00e9teil, F-94000 Cr\u00e9teil, Val-de-Marne ,","place":["France"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2026,5,31]]},"reference":[{"key":"2026053109465871100_ref1","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrg2484","article-title":"RNA-Seq: a revolutionary tool for transcriptomics","volume":"10","author":"Wang","year":"2009","journal-title":"Nat Rev Genet"},{"key":"2026053109465871100_ref2","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1093\/bioinformatics\/btp616","article-title":"edgeR: bioconductor package for differential expression analysis of digital gene expression data","volume":"26","author":"Robinson","year":"2010","journal-title":"Bioinformatics"},{"key":"2026053109465871100_ref3","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol"},{"key":"2026053109465871100_ref4","doi-asserted-by":"crossref","first-page":"R29","DOI":"10.1186\/gb-2014-15-2-r29","article-title":"Voom: precision weights unlock linear model analysis tools for RNA-seq read counts","volume":"15","author":"Law","year":"2014","journal-title":"Genome Biol"},{"key":"2026053109465871100_ref5","doi-asserted-by":"crossref","first-page":"lqaa093","DOI":"10.1093\/nargab\/lqaa093","article-title":"dearseq: a variance component score test for RNA-seq differential analysis that effectively controls the false discovery rate","volume":"2","author":"Gauthier","year":"2020","journal-title":"NAR Genom 
Bioinform"},{"key":"2026053109465871100_ref6","doi-asserted-by":"crossref","first-page":"1739","DOI":"10.1093\/bioinformatics\/btr260","article-title":"Molecular signatures database (MSigDB) 3.0","volume":"27","author":"Liberzon","year":"2011","journal-title":"Bioinformatics"},{"key":"2026053109465871100_ref7","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1093\/nar\/27.1.29","article-title":"KEGG: Kyoto encyclopedia of genes and genomes","volume":"27","author":"Ogata","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2026053109465871100_ref8","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat Genet"},{"key":"2026053109465871100_ref9","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1093\/bib\/bbt002","article-title":"Gene set analysis methods: statistical models and methodological differences","volume":"15","author":"Maciejewski","year":"2014","journal-title":"Brief Bioinform"},{"key":"2026053109465871100_ref10","doi-asserted-by":"crossref","first-page":"15545","DOI":"10.1073\/pnas.0506580102","article-title":"Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles","volume":"102","author":"Subramanian","year":"2005","journal-title":"Proc Natl Acad Sci"},{"key":"2026053109465871100_ref11","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1214\/07-AOAS101","article-title":"On testing the significance of sets of genes","volume":"1","author":"Efron","year":"2007","journal-title":"Ann Appl Stat"},{"key":"2026053109465871100_ref12","doi-asserted-by":"crossref","first-page":"e1004310","DOI":"10.1371\/journal.pcbi.1004310","article-title":"Time-course gene set analysis for longitudinal gene expression data","volume":"11","author":"Hejblum","year":"2015","journal-title":"PLoS Comput 
Biol"},{"key":"2026053109465871100_ref13","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1080\/01621459.2016.1192039","article-title":"The generalized higher criticism for testing SNP-set effects in genetic association studies","volume":"112","author":"Barnett","year":"2017","journal-title":"J Am Stat Assoc"},{"key":"2026053109465871100_ref14","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/biostatistics\/kxx005","article-title":"Variance component score test for time-course gene set analysis of longitudinal RNA-seq data","volume":"18","author":"Agniel","year":"2017","journal-title":"Biostatistics"},{"key":"2026053109465871100_ref15","doi-asserted-by":"crossref","first-page":"4568","DOI":"10.1093\/bioinformatics\/btz277","article-title":"Identification of differentially expressed gene sets using the generalized Berk\u2013Jones statistic","volume":"35","author":"Gaynor","year":"2019","journal-title":"Bioinformatics"},{"key":"2026053109465871100_ref16","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1214\/08-AOAS169","article-title":"Random survival forests","volume":"2","author":"Ishwaran","year":"2008","journal-title":"Ann Appl Stat"},{"key":"2026053109465871100_ref17","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1002\/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3","article-title":"The lasso method for variable selection in the Cox model","volume":"16","author":"Tibshirani","year":"1997","journal-title":"Stat Med"},{"key":"2026053109465871100_ref18","doi-asserted-by":"crossref","first-page":"975","DOI":"10.1111\/j.1541-0420.2010.01544.x","article-title":"Kernel machine approach to testing the significance of multiple genetic markers for risk prediction","volume":"67","author":"Cai","year":"2011","journal-title":"Biometrics"},{"key":"2026053109465871100_ref19","doi-asserted-by":"crossref","first-page":"1099","DOI":"10.1177\/0962280216653427","article-title":"Kernel machine score test for pathway analysis in 
the presence of semi-competing risks","volume":"27","author":"Neykov","year":"2018","journal-title":"Stat Methods Med Res"},{"key":"2026053109465871100_ref20","doi-asserted-by":"crossref","first-page":"1950","DOI":"10.1093\/bioinformatics\/bti267","article-title":"Testing association of a pathway with survival using gene expression data","volume":"21","author":"Goeman","year":"2005","journal-title":"Bioinformatics"},{"key":"2026053109465871100_ref21","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1089\/cmb.2008.0002","article-title":"Pathway analysis of microarray data via regression","volume":"15","author":"Adewale","year":"2008","journal-title":"J Comput Biol"},{"key":"2026053109465871100_ref22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-11-78","article-title":"Testing the additional predictive value of high-dimensional molecular data","volume":"11","author":"Boulesteix","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2026053109465871100_ref23","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1186\/1471-2105-12-377","article-title":"A comparative study on gene-set analysis methods for assessing differential expression associated with the survival phenotype","volume":"12","author":"Lee","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2026053109465871100_ref24","volume-title":"Journal of the American Statistical Association","author":"Sun"},{"key":"2026053109465871100_ref25","doi-asserted-by":"crossref","first-page":"e1007530","DOI":"10.1371\/journal.pgen.1007530","article-title":"Powerful gene set analysis in GWAS with the generalized Berk-Jones statistic","volume":"15","author":"Sun","year":"2019","journal-title":"PLoS Genet"},{"key":"2026053109465871100_ref26","doi-asserted-by":"crossref","first-page":"775","DOI":"10.1016\/S1573-4412(84)02005-5","article-title":"Chapter 13: Wald, likelihood ratio, and Lagrange multiplier tests in econometrics","volume-title":"Handbook of 
Econometrics","author":"Engle","year":"1984"},{"key":"2026053109465871100_ref27","doi-asserted-by":"crossref","first-page":"771","DOI":"10.1593\/neo.11806","article-title":"G-DOC: a systems medicine platform for personalized oncology","volume":"13","author":"Madhavan","year":"2011","journal-title":"Neoplasia"},{"key":"2026053109465871100_ref28","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1056\/NEJMoa021967","article-title":"A gene-expression signature as a predictor of survival in breast cancer","volume":"347","author":"Van De Vijver","year":"2002","journal-title":"N Engl J Med"},{"key":"2026053109465871100_ref29","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1093\/imanum\/22.3.329","article-title":"Computing the nearest correlation matrix\u2014a problem from finance","volume":"22","author":"Higham","journal-title":"IMA J Numer Anal"},{"key":"2026053109465871100_ref30","volume-title":"ICSKAT: Interval-Censored Sequence Kernel Association Test","author":"Sun","year":"2025"},{"key":"2026053109465871100_ref31","doi-asserted-by":"crossref","first-page":"1573","DOI":"10.1111\/biom.13636","article-title":"Inference for set-based effects in genetic association studies with interval-censored outcomes","volume":"79","author":"Sun","year":"2023","journal-title":"Biometrics"},{"key":"2026053109465871100_ref32","volume-title":"Proceedings of the National Academy of Sciences USA","author":"Wilson"},{"key":"2026053109465871100_ref33","doi-asserted-by":"crossref","first-page":"896","DOI":"10.1093\/neuonc\/nou087","article-title":"The epidemiology of glioma in adults: a \u2018state of the science\u2019 review","volume":"16","author":"Ostrom","year":"2014","journal-title":"Neuro-oncology"},{"key":"2026053109465871100_ref34","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1007\/s13311-017-0519-x","article-title":"Glioma subclassifications and their clinical 
significance","volume":"14","author":"Chen","year":"2017","journal-title":"Neurotherapeutics"},{"key":"2026053109465871100_ref35","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1097\/01.nrl.0000250928.26044.47","article-title":"Glioma therapy in adults","volume":"12","author":"Norden","year":"2006","journal-title":"Neurologist"},{"key":"2026053109465871100_ref36","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1002\/cncr.22741","article-title":"Molecularly targeted therapy for malignant glioma","volume":"110","author":"Sathornsumetee","year":"2007","journal-title":"Cancer"},{"key":"2026053109465871100_ref37","volume-title":"BMC Bioinformatics","author":"Bhuvaneshwar"},{"key":"2026053109465871100_ref38","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1158\/1541-7786.MCR-08-0435","article-title":"Rembrandt: helping personalized medicine become a reality through integrative translational research","volume":"7","author":"Madhavan","year":"2009","journal-title":"Mol Cancer Res"},{"key":"2026053109465871100_ref39","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J R Stat Soc Ser B Methodol"},{"key":"2026053109465871100_ref40","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1585","article-title":"Permutation p-values should never be zero: calculating exact p-values when permutations are randomly drawn","volume":"9","author":"Phipson","year":"2010","journal-title":"Stat Appl Genet Mol Biol"},{"key":"2026053109465871100_ref41","doi-asserted-by":"crossref","first-page":"697","DOI":"10.2217\/fon.12.61","article-title":"The global breast cancer burden","volume":"8","author":"Benson","year":"2012","journal-title":"Future Oncol"},{"key":"2026053109465871100_ref42","first-page":"311","article-title":"Breast cancer 
metastasis","volume":"9","author":"Scully","year":"2012","journal-title":"Cancer Genomics Proteomics"},{"key":"2026053109465871100_ref43","volume-title":"breastCancerNKI: Genexpression Dataset Published by van\u2019t Veer Et al. [2002] and Van de Vijver Et al. [2002] (NKI).","author":"Schroeder","year":"2025"},{"key":"2026053109465871100_ref44","doi-asserted-by":"crossref","first-page":"e1004791","DOI":"10.1371\/journal.pcbi.1004791","article-title":"Context specific and differential gene co-expression networks via Bayesian Biclustering","volume":"12","author":"Gao","year":"2016","journal-title":"PLoS Comput Biol"},{"key":"2026053109465871100_ref45","volume-title":"Impute: Impute: Imputation for Microarray Data","author":"Hastie","year":"2025"},{"key":"2026053109465871100_ref46","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1093\/bioinformatics\/19.2.185","article-title":"A comparison of normalization methods for high density oligonucleotide array data based on variance and bias","volume":"19","author":"Bolstad","year":"2003","journal-title":"Bioinformatics"},{"key":"2026053109465871100_ref47","doi-asserted-by":"crossref","first-page":"624","DOI":"10.1136\/bmj.321.7261.624","article-title":"Breast cancer\u2014epidemiology, risk factors, and genetics","volume":"321","author":"McPherson","year":"2000","journal-title":"BMJ"},{"key":"2026053109465871100_ref48","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkm882","article-title":"KEGG for linking genomes to life and the environment","volume":"36","author":"Kanehisa","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2026053109465871100_ref49","volume-title":"Msigdbr: Msigdb Gene Sets for Multiple Organisms in a Tidy Data Format","author":"Dolgalev","year":"2025"},{"key":"2026053109465871100_ref50","doi-asserted-by":"crossref","first-page":"23384","DOI":"10.1073\/pnas.1910684116","article-title":"Reply to Goeman et\u00a0al.: trade-offs in model averaging using multilevel 
tests","volume":"116","author":"Wilson","year":"2019","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2026053109465871100_ref51","volume-title":"Effective Positive Cauchy Combination Test","author":"Ouyang","year":"2024"},{"key":"2026053109465871100_ref52","volume-title":"Truncated Cauchy Combination Test: A Robust and Powerful P-Value Combination Method with Arbitrary Correlations","author":"Chen","year":"2025"},{"key":"2026053109465871100_ref53","doi-asserted-by":"crossref","first-page":"23382","DOI":"10.1073\/pnas.1909339116","article-title":"The harmonic mean p-value: strong versus weak control, and the assumption of independence","volume":"116","author":"Goeman","year":"2019","journal-title":"Proc Natl Acad Sci"},{"key":"2026053109465871100_ref54","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1016\/S0047-259X(03)00096-4","article-title":"A well-conditioned estimator for large-dimensional covariance matrices","volume":"88","author":"Ledoit","year":"2004","journal-title":"J Multivar Anal"},{"key":"2026053109465871100_ref55","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1016\/j.ncl.2018.04.002","article-title":"Evolving insights into the molecular neuropathology of diffuse gliomas in adults","volume":"36","author":"Barthel","year":"2018","journal-title":"Neurol Clin"},{"key":"2026053109465871100_ref56","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1186\/s10020-022-00454-z","article-title":"Molecular landscape of IDH-mutant astrocytoma and oligodendroglioma grade 2 indicate tumor purity as an underlying genomic factor","volume":"28","author":"Zhao","year":"2022","journal-title":"Mol Med"},{"key":"2026053109465871100_ref57","doi-asserted-by":"crossref","first-page":"2235","DOI":"10.1111\/j.1349-7006.2009.01308.x","article-title":"Genetic alterations and signaling pathways in the evolution of gliomas","volume":"100","author":"Ohgaki","year":"2009","journal-title":"Cancer 
Sci"},{"key":"2026053109465871100_ref58","volume-title":"Thomasferte\/sGBJ_computation: v1.1.0","author":"Fert\u00e9"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/27\/3\/bbag262\/68438298\/bbag262.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/27\/3\/bbag262\/68438298\/bbag262.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,31]],"date-time":"2026-05-31T13:47:17Z","timestamp":1780235237000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbag262\/8698818"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,5]]},"references-count":58,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,5,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbag262","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,5]]},"published":{"date-parts":[[2026,5]]},"article-number":"bbag262"}}