{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T16:28:37Z","timestamp":1759336117156,"version":"build-2065373602"},"reference-count":55,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T00:00:00Z","timestamp":1759276800000},"content-version":"vor","delay-in-days":31,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100016353","name":"SDSC","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100016353","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Swiss Government Excellence Scholarship","award":["2021.0468"],"award-info":[{"award-number":["2021.0468"]}]},{"DOI":"10.13039\/501100001711","name":"Swiss National Science Foundation","doi-asserted-by":"publisher","award":["CRSII5 193832"],"award-info":[{"award-number":["CRSII5 193832"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]},{"name":"European Union\u2019s Horizon 2020","award":["826121"],"award-info":[{"award-number":["826121"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,8,31]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Multi-omics data, which include genomic, transcriptomic, epigenetic, and proteomic data, are gaining increasing importance for determining the clinical outcomes of cancer patients. Several recent studies have evaluated various multimodal integration strategies for cancer survival prediction, highlighting the need for standardizing model performance results. Addressing this issue, we introduce SurvBoard, a benchmark framework that standardizes key experimental design choices. SurvBoard enables comparisons between single-cancer and pan-cancer data models and assesses the benefits of using patient data with missing modalities. 
We also address common pitfalls in preprocessing and validating multi-omics cancer survival models. We apply SurvBoard to several exemplary use cases, further confirming that statistical models tend to outperform deep learning methods, especially for metrics measuring survival function calibration. Moreover, most models exhibit better performance when trained in a pan-cancer context and can benefit from leveraging samples for which data of some omics modalities are missing. We provide a web service for model evaluation and to make our benchmark results easily accessible and viewable: https:\/\/www.survboard.science\/. All code is available on GitHub: https:\/\/github.com\/BoevaLab\/survboard\/. All benchmark outputs are available on Zenodo: 10.5281\/zenodo.11066226. A video tutorial on how to use the SurvBoard leaderboard is available on YouTube at https:\/\/youtu.be\/HJrdpJP8Vvk.<\/jats:p>","DOI":"10.1093\/bib\/bbaf521","type":"journal-article","created":{"date-parts":[[2025,9,15]],"date-time":"2025-09-15T11:32:52Z","timestamp":1757935972000},"source":"Crossref","is-referenced-by-count":0,"title":["SurvBoard: standardized benchmarking for multi-omics cancer survival models"],"prefix":"10.1093","volume":"26","author":[{"given":"David","family":"Wissel","sequence":"first","affiliation":[{"name":"Department of Computer Science , ETH Zurich, Zurich ,","place":["Switzerland"]},{"name":"Department of Molecular Life Sciences, University of Zurich, Zurich ,","place":["Switzerland"]},{"name":"Swiss Institute of Bioinformatics, Lausanne ,","place":["Switzerland"]}]},{"given":"Nikita","family":"Janakarajan","sequence":"additional","affiliation":[{"name":"Department of Computer Science , ETH Zurich, Zurich ,","place":["Switzerland"]},{"name":"IBM Research Europe, Zurich ,","place":["Switzerland"]}]},{"given":"Aayush","family":"Grover","sequence":"additional","affiliation":[{"name":"Department of Computer Science , ETH Zurich, Zurich ,","place":["Switzerland"]},{"name":"Swiss 
Institute of Bioinformatics, Lausanne ,","place":["Switzerland"]}]},{"given":"Enrico","family":"Toniato","sequence":"additional","affiliation":[{"name":"IBM Research Europe, Zurich ,","place":["Switzerland"]}]},{"given":"Maria Rodr\u00edguez","family":"Mart\u00ednez","sequence":"additional","affiliation":[{"name":"IBM Research Europe, Zurich ,","place":["Switzerland"]},{"name":"Yale School of Medicine , New Haven, CT 06510 ,","place":["USA"]}]},{"given":"Valentina","family":"Boeva","sequence":"additional","affiliation":[{"name":"Department of Computer Science , ETH Zurich, Zurich ,","place":["Switzerland"]},{"name":"Swiss Institute of Bioinformatics, Lausanne ,","place":["Switzerland"]},{"name":"Universit\u00e9 de Paris UMR-S1016 Institut Cochin , Inserm U1016 Paris ,","place":["France"]}]}],"member":"286","published-online":{"date-parts":[[2025,10,1]]},"reference":[{"key":"2025100108201461700_ref1","doi-asserted-by":"publisher","DOI":"10.1007\/b97377","volume-title":"Survival Analysis: Techniques for Censored and Truncated Data","author":"Klein","year":"2003"},{"key":"2025100108201461700_ref2","doi-asserted-by":"publisher","first-page":"1084","DOI":"10.1093\/jnci\/djy022","article-title":"Genomic amplifications and distal 6q loss: novel markers for poor survival in high-risk neuroblastoma patients","volume":"110","author":"Depuydt","year":"2018","journal-title":"J Natl Cancer Inst"},{"article-title":"Time-to-event prediction with neural networks and cox regression","year":"2019","author":"Kvamme","key":"2025100108201461700_ref3"},{"key":"2025100108201461700_ref4","first-page":"15111","article-title":"Deep extended hazard models for survival analysis","volume-title":"Advances in Neural Information Processing Systems","author":"Zhong","year":"2021"},{"key":"2025100108201461700_ref5","first-page":"1","article-title":"Survival analysis via ordinary differential equations","author":"Tang","year":"2022","journal-title":"J Am Stat 
Assoc"},{"key":"2025100108201461700_ref6","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11842","article-title":"Deephit: A deep learning approach to survival analysis with competing risks","volume-title":"Proceedings of the AAAI conference on artificial intelligence","author":"Lee","year":"2018"},{"key":"2025100108201461700_ref7","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1093\/bfgp\/elad031","article-title":"Omics-based deep learning approaches for lung cancer decision-making and therapeutics development","volume":"23","author":"Tran","year":"2024","journal-title":"Brief Funct Genomics"},{"key":"2025100108201461700_ref8","doi-asserted-by":"publisher","first-page":"A68","DOI":"10.5114\/wo.2014.47136","article-title":"The cancer genome atlas (TCGA): an immeasurable source of knowledge","volume":"19","author":"Tomczak","year":"2015","journal-title":"Contemp Oncol"},{"key":"2025100108201461700_ref9","doi-asserted-by":"publisher","first-page":"993","DOI":"10.1038\/nature08987","article-title":"International network of cancer genome projects","volume":"464","author":"International Cancer Genome Consortium","year":"2010","journal-title":"Nature"},{"key":"2025100108201461700_ref10","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1038\/nature25795","article-title":"Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours","volume":"555","author":"Ma","year":"2018","journal-title":"Nature"},{"key":"2025100108201461700_ref11","doi-asserted-by":"publisher","first-page":"166","DOI":"10.3389\/fgene.2019.00166","article-title":"Salmon: survival analysis learning with multi-omics neural networks on breast cancer","volume":"10","author":"Huang","year":"2019","journal-title":"Front Genet"},{"key":"2025100108201461700_ref12","doi-asserted-by":"publisher","first-page":"240","DOI":"10.3390\/genes10030240","article-title":"Group Lasso regularized deep learning for cancer prognosis from multi-omics and clinical 
features","volume":"10","author":"Xie","year":"2019","journal-title":"Genes"},{"key":"2025100108201461700_ref13","doi-asserted-by":"publisher","first-page":"3259","DOI":"10.1093\/bioinformatics\/btac286","article-title":"Tightly integrated multiomics-based deep tensor survival model for time-to-event prediction","volume":"38","author":"Zhang","year":"2022","journal-title":"Bioinformatics"},{"key":"2025100108201461700_ref14","doi-asserted-by":"publisher","first-page":"15761","DOI":"10.1038\/s41598-023-42365-x","article-title":"Autoencoder-based multimodal prediction of non-small cell lung cancer survival","volume":"13","author":"Ellen","year":"2023","journal-title":"Sci Rep"},{"key":"2025100108201461700_ref15","doi-asserted-by":"publisher","DOI":"10.1093\/bioadv\/vbad006","article-title":"Pancancer survival prediction using a deep learning architecture with multimodal representation and integration","volume":"3","author":"Fan","year":"2023","journal-title":"Bioinf Adv"},{"key":"2025100108201461700_ref16","doi-asserted-by":"publisher","first-page":"e1010921","DOI":"10.1371\/journal.pcbi.1010921","article-title":"Customics: a versatile deep-learning based strategy for multi-omics integration","volume":"19","author":"Benkirane","year":"2023","journal-title":"PLoS Comput Biol"},{"key":"2025100108201461700_ref17","doi-asserted-by":"publisher","first-page":"851","DOI":"10.1093\/bib\/bbw068","article-title":"Deep learning in bioinformatics","volume":"18","author":"Min","year":"2017","journal-title":"Brief Bioinform"},{"key":"2025100108201461700_ref18","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1007\/s10462-023-10681-3","article-title":"Deep learning for survival analysis: a review","volume":"57","author":"Wiegrebe","year":"2024","journal-title":"Artif Intell Rev"},{"key":"2025100108201461700_ref19","doi-asserted-by":"publisher","first-page":"394","DOI":"10.3389\/fbioe.2020.00394","article-title":"TOOme: a novel computational framework to infer cancer 
tissue-of-origin by integrating both gene mutation and expression","volume":"8","author":"He","year":"2020","journal-title":"Front Bioeng Biotechnol"},{"key":"2025100108201461700_ref20","doi-asserted-by":"publisher","first-page":"4013","DOI":"10.1038\/s41467-022-31666-w","article-title":"Machine learning-based tissue of origin classification for cancer of unknown primary diagnostics using genome-wide mutation features","volume":"13","author":"Nguyen","year":"2022","journal-title":"Nat Commun"},{"key":"2025100108201461700_ref21","doi-asserted-by":"publisher","first-page":"bbae028","DOI":"10.1093\/bib\/bbae028","article-title":"New techniques to identify the tissue of origin for cancer of unknown primary in the era of precision medicine: progress and challenges","volume":"25","author":"Ma","year":"2024","journal-title":"Brief Bioinform"},{"key":"2025100108201461700_ref22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-2942-y","article-title":"Block forests: random forests for blocks of clinical and omics covariate data","volume":"20","author":"Hornung","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2025100108201461700_ref23","doi-asserted-by":"publisher","first-page":"bbaa167","DOI":"10.1093\/bib\/bbaa167","article-title":"Large-scale benchmark study of survival prediction methods using multi-omics data","volume":"22","author":"Herrmann","year":"2021","journal-title":"Brief Bioinform"},{"key":"2025100108201461700_ref24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-021-92799-4","article-title":"Long-term cancer survival prediction using multimodal deep learning","volume":"11","author":"Vale-Silva","year":"2021","journal-title":"Sci Rep"},{"key":"2025100108201461700_ref25","doi-asserted-by":"publisher","first-page":"100461","DOI":"10.1016\/j.crmeth.2023.100461","article-title":"Systematic comparison of multi-omics survival models reveals a widespread lack of noise 
resistance","volume":"3","author":"Wissel","year":"2023","journal-title":"Cell Reports Methods"},{"key":"2025100108201461700_ref26","doi-asserted-by":"publisher","first-page":"i446","DOI":"10.1093\/bioinformatics\/btz342","article-title":"Deep learning with multimodal representation for pancancer prognosis prediction","volume":"35","author":"Cheerla","year":"2019","journal-title":"Bioinformatics"},{"key":"2025100108201461700_ref27","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1093\/bib\/bbu003","article-title":"Combining multidimensional genomic measurements for predicting cancer prognosis: observations from tcga","volume":"16","author":"Zhao","year":"2015","journal-title":"Brief Bioinform"},{"key":"2025100108201461700_ref28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-10-413","article-title":"Survival prediction from clinico-genomic models-a comparative study","volume":"10","author":"B\u00f8velstad","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2025100108201461700_ref29","first-page":"e1626","article-title":"Prediction approaches for partly missing multi-omics covariate data: a literature review and an empirical comparison study","author":"Hornung","year":"2023","journal-title":"Wiley Interdiscip Rev: Comput Stat"},{"key":"2025100108201461700_ref30","doi-asserted-by":"publisher","first-page":"e1441","DOI":"10.1002\/widm.1441","article-title":"Over-optimism in benchmark studies and the multiplicity of design and analysis options when interpreting their results","volume":"12","author":"Nie\u00dfl","year":"2022","journal-title":"Wiley Interdiscip Rev: Data Min Knowl Discovery"},{"key":"2025100108201461700_ref31","doi-asserted-by":"publisher","first-page":"100804","DOI":"10.1016\/j.patter.2023.100804","article-title":"Leakage and the reproducibility crisis in machine-learning-based 
science","volume":"4","author":"Kapoor","year":"2023","journal-title":"Patterns"},{"key":"2025100108201461700_ref32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-018-2344-6","article-title":"Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data","volume":"19","author":"Klau","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2025100108201461700_ref33","doi-asserted-by":"publisher","first-page":"258","DOI":"10.1080\/03610918.2010.535624","article-title":"Efficient estimation for a semiparametric extended hazards model","volume":"40","author":"Tseng","year":"2011","journal-title":"Commun Stat-Simul Comput"},{"key":"2025100108201461700_ref34","doi-asserted-by":"publisher","first-page":"e1006076","DOI":"10.1371\/journal.pcbi.1006076","article-title":"Cox-nnet: an artificial neural network method for prognosis prediction of high-throughput omics data","volume":"14","author":"Ching","year":"2018","journal-title":"PLoS Comput Biol"},{"key":"2025100108201461700_ref35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12874-018-0482-1","article-title":"DeepSurv: personalized treatment recommender system using a cox proportional hazards deep neural network","volume":"18","author":"Katzman","year":"2018","journal-title":"BMC Med Res Methodol"},{"key":"2025100108201461700_ref36","doi-asserted-by":"publisher","first-page":"11707","DOI":"10.1038\/s41598-017-11817-6","article-title":"Predicting clinical outcomes from large scale cancer genomic profiles with deep survival models","volume":"7","author":"Yousefi","year":"2017","journal-title":"Sci Rep"},{"key":"2025100108201461700_ref37","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1158\/2159-8290.CD-12-0095","article-title":"The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data","volume":"2","author":"Cerami","year":"2012","journal-title":"Cancer 
Discov"},{"key":"2025100108201461700_ref38","doi-asserted-by":"publisher","first-page":"1113","DOI":"10.1038\/ng.2764","article-title":"The cancer genome atlas pan-cancer analysis project","volume":"45","author":"Weinstein","year":"2013","journal-title":"Nat Genet"},{"key":"2025100108201461700_ref39","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1016\/j.cell.2018.02.052","article-title":"An integrated tcga pan-cancer clinical data resource to drive high-quality survival outcome analytics","volume":"173","author":"Liu","year":"2018","journal-title":"Cell"},{"key":"2025100108201461700_ref40","doi-asserted-by":"publisher","DOI":"10.1201\/9780429492259","volume-title":"Flexible Imputation of Missing Data","author":"Van Buuren","year":"2018"},{"key":"2025100108201461700_ref41","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12874-015-0088-9","article-title":"A measure of the impact of CV incompleteness on prediction error estimation with application to pca and normalization","volume":"15","author":"Hornung","year":"2015","journal-title":"BMC Med Res Methodol"},{"key":"2025100108201461700_ref42","doi-asserted-by":"publisher","first-page":"3927","DOI":"10.1002\/sim.2427","article-title":"A time-dependent discrimination index for survival data","volume":"24","author":"Antolini","year":"2005","journal-title":"Stat Med"},{"key":"2025100108201461700_ref43","doi-asserted-by":"publisher","first-page":"2529","DOI":"10.1002\/(sici)1097-0258(19990915\/30)18:17\/18&lt;2529::aid-sim274&gt;3.0.co;2-5","article-title":"Assessment and comparison of prognostic classification schemes for survival data","volume":"18","author":"Graf","year":"1999","journal-title":"Stat Med"},{"key":"2025100108201461700_ref44","first-page":"1","article-title":"Effective ways to build and evaluate individual survival distributions","volume":"21","author":"Haider","year":"2020","journal-title":"J Mach Learn 
Res"},{"key":"2025100108201461700_ref45","first-page":"1","article-title":"The brier score under administrative censoring: Problems and a solution","volume":"24","author":"Kvamme","year":"2023","journal-title":"J Mach Learn Res"},{"key":"2025100108201461700_ref46","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1080\/01621459.1958.10501452","article-title":"Nonparametric estimation from incomplete observations","volume":"53","author":"Kaplan","year":"1958","journal-title":"J Am Stat Assoc"},{"key":"2025100108201461700_ref47","doi-asserted-by":"publisher","first-page":"1775","DOI":"10.1093\/bioinformatics\/btp322","article-title":"Gradient lasso for cox proportional hazards model","volume":"25","author":"Sohn","year":"2009","journal-title":"Bioinformatics"},{"key":"2025100108201461700_ref48","doi-asserted-by":"publisher","first-page":"15534","DOI":"10.1038\/s41598-020-72664-6","article-title":"How to do quantile normalization correctly for gene expression data analyses","volume":"10","author":"Zhao","year":"2020","journal-title":"Sci Rep"},{"key":"2025100108201461700_ref49","first-page":"41303","article-title":"Conformalized survival distributions: a generic post-process to increase calibration","volume-title":"Proceedings of the 41st International Conference on Machine Learning","author":"Qi","year":"2024"},{"key":"2025100108201461700_ref50","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13014-014-0260-0","article-title":"Quality assurance in radiotherapy: analysis of the causes of not starting or early radiotherapy withdrawal","volume":"9","author":"Arenas","year":"2014","journal-title":"Radiat Oncol"},{"key":"2025100108201461700_ref51","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.neunet.2021.12.015","article-title":"Survnam: The machine learning survival model explanation","volume":"147","author":"Utkin","year":"2022","journal-title":"Neural Netw"},{"article-title":"Interpretable machine learning for survival 
analysis","year":"2024","author":"Langbein","key":"2025100108201461700_ref52"},{"key":"2025100108201461700_ref53","doi-asserted-by":"publisher","first-page":"bbae185","DOI":"10.1093\/bib\/bbae185","article-title":"Deepkegg: A multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery","volume":"25","author":"Lan","year":"2024","journal-title":"Brief Bioinform"},{"key":"2025100108201461700_ref54","doi-asserted-by":"publisher","first-page":"1292","DOI":"10.1080\/10618600.2022.2067548","article-title":"Survival regression with accelerated failure time model in xgboost","volume":"31","author":"Barnwal","year":"2022","journal-title":"J Comput Graph Stat"},{"key":"2025100108201461700_ref55","article-title":"Pervasive label errors in test sets destabilize machine learning benchmarks","volume-title":"Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks","author":"Northcutt","year":"2021"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/5\/bbaf521\/64460698\/bbaf521.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/5\/bbaf521\/64460698\/bbaf521.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T12:20:20Z","timestamp":1759321220000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf521\/8269886"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,31]]},"references-count":55,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,8,31]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf521","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2025,9]]},"published":{"date-parts":[[2025,8,31]]},"article-number":"bbaf521"}}