{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T14:49:29Z","timestamp":1776005369228,"version":"3.50.1"},"reference-count":88,"publisher":"Springer Science and Business Media LLC","issue":"7","license":[{"start":{"date-parts":[[2023,7,17]],"date-time":"2023-07-17T00:00:00Z","timestamp":1689552000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,7,17]],"date-time":"2023-07-17T00:00:00Z","timestamp":1689552000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Nat Mach Intell"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Medical artificial intelligence (AI) has tremendous potential to advance healthcare by supporting and contributing to the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving both healthcare provider and patient experience. Unlocking this potential requires systematic, quantitative evaluation of the performance of medical AI models on large-scale, heterogeneous data capturing diverse patient populations. Here, to meet this need, we introduce MedPerf, an open platform for benchmarking AI models in the medical domain. MedPerf focuses on enabling federated evaluation of AI models, by securely distributing them to different facilities, such as healthcare organizations. This process of bringing the model to the data empowers each facility to assess and verify the performance of AI models in an efficient and human-supervised process, while prioritizing privacy. 
We describe the current challenges healthcare and AI communities face, the need for an open platform, the design philosophy of MedPerf, its current implementation status and real-world deployment, our roadmap and, importantly, the use of MedPerf with multiple international institutions within cloud-based technology and on-premises scenarios. Finally, we welcome new contributions by researchers and organizations to further strengthen MedPerf as an open benchmarking platform.<\/jats:p>","DOI":"10.1038\/s42256-023-00652-2","type":"journal-article","created":{"date-parts":[[2023,7,17]],"date-time":"2023-07-17T16:02:14Z","timestamp":1689609734000},"page":"799-810","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":129,"title":["Federated benchmarking of medical artificial intelligence with MedPerf"],"prefix":"10.1038","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1930-3410","authenticated-orcid":false,"given":"Alexandros","family":"Karargyris","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5561-6932","authenticated-orcid":false,"given":"Renato","family":"Umeton","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6571-0850","authenticated-orcid":false,"given":"Micah 
J.","family":"Sheller","sequence":"additional","affiliation":[]},{"given":"Alejandro","family":"Aristizabal","sequence":"additional","affiliation":[]},{"given":"Johnu","family":"George","sequence":"additional","affiliation":[]},{"given":"Anna","family":"Wuest","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2243-8487","authenticated-orcid":false,"given":"Sarthak","family":"Pati","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5830-8890","authenticated-orcid":false,"given":"Hasan","family":"Kassem","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8933-5995","authenticated-orcid":false,"given":"Maximilian","family":"Zenk","sequence":"additional","affiliation":[]},{"given":"Ujjwal","family":"Baid","sequence":"additional","affiliation":[]},{"given":"Prakash","family":"Narayana Moorthy","sequence":"additional","affiliation":[]},{"given":"Alexander","family":"Chowdhury","sequence":"additional","affiliation":[]},{"given":"Junyi","family":"Guo","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5440-8357","authenticated-orcid":false,"given":"Sahil","family":"Nalawade","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1767-1826","authenticated-orcid":false,"given":"Jacob","family":"Rosenthal","sequence":"additional","affiliation":[]},{"given":"David","family":"Kanter","sequence":"additional","affiliation":[]},{"given":"Maria","family":"Xenochristou","sequence":"additional","affiliation":[]},{"given":"Daniel 
J.","family":"Beutel","sequence":"additional","affiliation":[]},{"given":"Verena","family":"Chung","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5614-8977","authenticated-orcid":false,"given":"Timothy","family":"Bergquist","sequence":"additional","affiliation":[]},{"given":"James","family":"Eddy","sequence":"additional","affiliation":[]},{"given":"Abubakar","family":"Abid","sequence":"additional","affiliation":[]},{"given":"Lewis","family":"Tunstall","sequence":"additional","affiliation":[]},{"given":"Omar","family":"Sanseviero","sequence":"additional","affiliation":[]},{"given":"Dimitrios","family":"Dimitriadis","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1795-2038","authenticated-orcid":false,"given":"Yiming","family":"Qian","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1449-3072","authenticated-orcid":false,"given":"Xinxing","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Yong","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Rick Siow Mong","family":"Goh","sequence":"additional","affiliation":[]},{"given":"Srini","family":"Bala","sequence":"additional","affiliation":[]},{"given":"Victor","family":"Bittorf","sequence":"additional","affiliation":[]},{"given":"Sreekar 
Reddy","family":"Puchala","sequence":"additional","affiliation":[]},{"given":"Biagio","family":"Ricciuti","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1056-8006","authenticated-orcid":false,"given":"Soujanya","family":"Samineni","sequence":"additional","affiliation":[]},{"given":"Eshna","family":"Sengupta","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3667-6796","authenticated-orcid":false,"given":"Akshay","family":"Chaudhari","sequence":"additional","affiliation":[]},{"given":"Cody","family":"Coleman","sequence":"additional","affiliation":[]},{"given":"Bala","family":"Desinghu","sequence":"additional","affiliation":[]},{"given":"Gregory","family":"Diamos","sequence":"additional","affiliation":[]},{"given":"Debo","family":"Dutta","sequence":"additional","affiliation":[]},{"given":"Diane","family":"Feddema","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7719-1624","authenticated-orcid":false,"given":"Grigori","family":"Fursin","sequence":"additional","affiliation":[]},{"given":"Xinyuan","family":"Huang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4624-5690","authenticated-orcid":false,"given":"Satyananda","family":"Kashyap","sequence":"additional","affiliation":[]},{"given":"Nicholas","family":"Lane","sequence":"additional","affiliation":[]},{"given":"Indranil","family":"Mallick","sequence":"additional","affiliation":[]},{"name":"FeTS Consortium","sequence":"additional","affiliation":[]},{"name":"BraTS-2020 Consortium","sequence":"additional","affiliation":[]},{"name":"AI4SafeChole 
Consortium","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7288-3023","authenticated-orcid":false,"given":"Pietro","family":"Mascagni","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9447-401X","authenticated-orcid":false,"given":"Virendra","family":"Mehta","sequence":"additional","affiliation":[]},{"given":"Cassiano Ferro","family":"Moraes","sequence":"additional","affiliation":[]},{"given":"Vivek","family":"Natarajan","sequence":"additional","affiliation":[]},{"given":"Nikola","family":"Nikolov","sequence":"additional","affiliation":[]},{"given":"Nicolas","family":"Padoy","sequence":"additional","affiliation":[]},{"given":"Gennady","family":"Pekhimenko","sequence":"additional","affiliation":[]},{"given":"Vijay Janapa","family":"Reddi","sequence":"additional","affiliation":[]},{"given":"G. Anthony","family":"Reina","sequence":"additional","affiliation":[]},{"given":"Pablo","family":"Ribalta","sequence":"additional","affiliation":[]},{"given":"Abhishek","family":"Singh","sequence":"additional","affiliation":[]},{"given":"Jayaraman J.","family":"Thiagarajan","sequence":"additional","affiliation":[]},{"given":"Jacob","family":"Albrecht","sequence":"additional","affiliation":[]},{"given":"Thomas","family":"Wolf","sequence":"additional","affiliation":[]},{"given":"Geralyn","family":"Miller","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9702-5524","authenticated-orcid":false,"given":"Huazhu","family":"Fu","sequence":"additional","affiliation":[]},{"given":"Prashant","family":"Shah","sequence":"additional","affiliation":[]},{"given":"Daguang","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Poonam","family":"Yadav","sequence":"additional","affiliation":[]},{"given":"David","family":"Talby","sequence":"additional","affiliation":[]},{"given":"Mark M.","family":"Awad","sequence":"additional","affiliation":[]},{"given":"Jeremy 
P.","family":"Howard","sequence":"additional","affiliation":[]},{"given":"Michael","family":"Rosenthal","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7336-8071","authenticated-orcid":false,"given":"Luigi","family":"Marchionni","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9674-8379","authenticated-orcid":false,"given":"Massimo","family":"Loda","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8677-6237","authenticated-orcid":false,"given":"Jason M.","family":"Johnson","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8734-6482","authenticated-orcid":false,"given":"Spyridon","family":"Bakas","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5984-238X","authenticated-orcid":false,"given":"Peter","family":"Mattson","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,7,17]]},"reference":[{"key":"652_CR1","doi-asserted-by":"publisher","first-page":"e2233946","DOI":"10.1001\/jamanetworkopen.2022.33946","volume":"5","author":"D Plana","year":"2022","unstructured":"Plana, D. et al. Randomized clinical trials of machine learning interventions in health care: a systematic review. JAMA Netw. Open 5, e2233946 (2022).","journal-title":"JAMA Netw. Open"},{"key":"652_CR2","doi-asserted-by":"crossref","unstructured":"Chowdhury, A., Kassem, H., Padoy, N., Umeton, R. & Karargyris, A. A review of medical federated learning: applications in oncology and cancer research. In Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2021. Lecture Notes in Computer Science, vol 12962 (eds. Crimi, A. & Bakas, S.) 
3\u201324 (Springer, 2022).","DOI":"10.1007\/978-3-031-08999-2_1"},{"key":"652_CR3","doi-asserted-by":"publisher","first-page":"7346","DOI":"10.1038\/s41467-022-33407-5","volume":"13","author":"S Pati","year":"2022","unstructured":"Pati, S. et al. Federated learning enables big data for rare cancer boundary detection. Nat. Commun. 13, 7346 (2022).","journal-title":"Nat. Commun."},{"key":"652_CR4","unstructured":"Digital Health Center of Excellence (US Food and Drug Administration, 2023); https:\/\/www.fda.gov\/medical-devices\/digital-health-center-excellence"},{"key":"652_CR5","unstructured":"Regulatory Science Strategy (European Medicines Agency, 2023); https:\/\/www.ema.europa.eu\/en\/about-us\/how-we-work\/regulatory-science-strategy"},{"key":"652_CR6","unstructured":"Verma, A., Rao, K., Eluri, V. & Sharm, Y. Regulating AI in Public Health: Systems Challenges and Perspectives (ORF, 2020)."},{"key":"652_CR7","doi-asserted-by":"publisher","first-page":"582","DOI":"10.1038\/s41591-021-01312-x","volume":"27","author":"E Wu","year":"2021","unstructured":"Wu, E. et al. How medical AI devices are evaluated: limitations and recommendations from an analysis of FDA approvals. Nat. Med. 27, 582\u2013584 (2021).","journal-title":"Nat. Med."},{"key":"652_CR8","doi-asserted-by":"publisher","first-page":"e337","DOI":"10.1016\/S2589-7500(21)00076-5","volume":"3","author":"KN Vokinger","year":"2021","unstructured":"Vokinger, K. N., Feuerriegel, S. & Kesselheim, A. S. Continual learning in medical devices: FDA\u2019s action plan and beyond. Lancet Digit. Health 3, e337\u2013e338 (2021).","journal-title":"Lancet Digit. Health"},{"key":"652_CR9","doi-asserted-by":"publisher","first-page":"916","DOI":"10.1016\/j.ccell.2021.04.002","volume":"39","author":"BH Kann","year":"2021","unstructured":"Kann, B. H., Hosny, A. & Aerts, H. J. W. L. Artificial intelligence for clinical oncology. 
Cancer Cell 39, 916\u2013927 (2021).","journal-title":"Cancer Cell"},{"key":"652_CR10","unstructured":"Sharing Sensitive Health Data in a Federated Data Consortium Model: An Eight-Step Guide (World Economic Forum, 2020); https:\/\/www.weforum.org\/reports\/sharing-sensitive-health-data-in-a-federated-data-consortium-model-an-eight-step-guide"},{"key":"652_CR11","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-019-0155-4","volume":"2","author":"T Panch","year":"2019","unstructured":"Panch, T., Mattie, H. & Celi, L. A. The \u201cinconvenient truth\u201d about AI in healthcare. npj Digit. Med. 2, 77 (2019).","journal-title":"npj Digit. Med."},{"key":"652_CR12","doi-asserted-by":"publisher","first-page":"1212","DOI":"10.1001\/jama.2020.12067","volume":"324","author":"A Kaushal","year":"2020","unstructured":"Kaushal, A., Altman, R. & Langlotz, C. Geographic distribution of US cohorts used to train deep learning algorithms. J. Am. Med. Assoc. 324, 1212\u20131213 (2020).","journal-title":"J. Am. Med. Assoc."},{"key":"652_CR13","doi-asserted-by":"publisher","first-page":"e1002683","DOI":"10.1371\/journal.pmed.1002683","volume":"15","author":"JR Zech","year":"2018","unstructured":"Zech, J. R. et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 15, e1002683 (2018).","journal-title":"PLoS Med."},{"key":"652_CR14","doi-asserted-by":"publisher","first-page":"447","DOI":"10.1126\/science.aax2342","volume":"366","author":"Z Obermeyer","year":"2019","unstructured":"Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447\u2013453 (2019).","journal-title":"Science"},{"key":"652_CR15","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1001\/jamadermatol.2019.1735","volume":"155","author":"JK Winkler","year":"2019","unstructured":"Winkler, J. K. et al. 
Association between surgical skin markings in dermoscopic images and diagnostic performance of a deep learning convolutional neural network for melanoma recognition. JAMA Dermatol. 155, 1135\u20131141 (2019).","journal-title":"JAMA Dermatol."},{"key":"652_CR16","doi-asserted-by":"publisher","first-page":"1486","DOI":"10.1056\/NEJMlim035027","volume":"348","author":"GJ Annas","year":"2003","unstructured":"Annas, G. J. HIPAA regulations\u2014a new era of medical-record privacy? N. Engl. J. Med. 348, 1486\u20131490 (2003).","journal-title":"N. Engl. J. Med."},{"key":"652_CR17","doi-asserted-by":"crossref","unstructured":"Voigt, P. & von dem Bussche, A. The EU General Data Protection Regulation (GDPR) (Springer, 2017).","DOI":"10.1007\/978-3-319-57959-7"},{"key":"652_CR18","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-020-69250-1","volume":"10","author":"MJ Sheller","year":"2020","unstructured":"Sheller, M. J. et al. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci. Rep. 10, 12598 (2020).","journal-title":"Sci. Rep."},{"key":"652_CR19","first-page":"92","volume":"11383","author":"MJ Sheller","year":"2019","unstructured":"Sheller, M. J., Reina, G. A., Edwards, B., Martin, J. & Bakas, S. Multi-institutional deep learning modeling without sharing patient data: a feasibility study on brain tumor segmentation. Brainlesion 11383, 92\u2013104 (2019).","journal-title":"Brainlesion"},{"key":"652_CR20","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-020-00323-1","volume":"3","author":"N Rieke","year":"2020","unstructured":"Rieke, N. et al. The future of digital health with federated learning. npj Digit. Med. 3, 119 (2020).","journal-title":"npj Digit. Med."},{"key":"652_CR21","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1148\/radiol.2020192536","volume":"295","author":"DB Larson","year":"2020","unstructured":"Larson, D. B., Magnus, D. C., Lungren, M. P., Shah, N. H. & Langlotz, C. P. 
Ethics of using and sharing clinical imaging data for artificial intelligence: a proposed framework. Radiology 295, 675\u2013682 (2020).","journal-title":"Radiology"},{"key":"652_CR22","doi-asserted-by":"crossref","unstructured":"Czempiel, T. et al. TeCNO: surgical phase recognition with multi-stage temporal convolutional networks. In Medical Image Computing and Computer Assisted Intervention\u2014MICCAI 2020. Lecture Notes in Computer Science, vol 12263 (eds. Martel, A. L. et al.) 343\u2013352 (Springer, 2020).","DOI":"10.1007\/978-3-030-59716-0_33"},{"key":"652_CR23","unstructured":"Oldenhof, M. et al. Industry-scale orchestrated federated learning for drug discovery. Preprint at https:\/\/arxiv.org\/abs\/2210.08871 (2022)."},{"key":"652_CR24","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1038\/s41591-022-02155-w","volume":"29","author":"J Ogier du Terrail","year":"2023","unstructured":"Ogier du Terrail, J. et al. Federated learning for predicting histological response to neoadjuvant chemotherapy in triple-negative breast cancer. Nat. Med. 29, 135\u2013146 (2023).","journal-title":"Nat. Med."},{"key":"652_CR25","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-020-77476-2","volume":"10","author":"G Geleijnse","year":"2020","unstructured":"Geleijnse, G. et al. Prognostic factors analysis for oral cavity cancer survival in the Netherlands and Taiwan using a privacy-preserving federated infrastructure. Sci. Rep. 10, 20526 (2020).","journal-title":"Sci. Rep."},{"key":"652_CR26","unstructured":"MedPerf: Clinically Impactful Machine Learning (MedPerf, 2023); https:\/\/www.medperf.org\/"},{"key":"652_CR27","doi-asserted-by":"crossref","unstructured":"Hitaj, B., Ateniese, G. & Perez-Cruz, F. Deep models under the GAN: information leakage from collaborative deep learning. In Proc. 2017 ACM SIGSAC Conference on Computer and Communications Security (eds Thuraisingham, B. et al.) 
603\u2013618 (ACM, 2017).","DOI":"10.1145\/3133956.3134012"},{"key":"652_CR28","doi-asserted-by":"publisher","first-page":"473","DOI":"10.1038\/s42256-021-00337-8","volume":"3","author":"G Kaissis","year":"2021","unstructured":"Kaissis, G. et al. End-to-end privacy preserving deep learning on multi-institutional medical imaging. Nat. Mach. Intell. 3, 473\u2013484 (2021).","journal-title":"Nat. Mach. Intell."},{"key":"652_CR29","unstructured":"Mattson, P. et al. MLPerf training benchmark. Preprint at https:\/\/arxiv.org\/abs\/1910.01500 (2019)."},{"key":"652_CR30","unstructured":"MLPerf Inference Delivers Power Efficiency and Performance Gain (MLCommons, 2023); https:\/\/mlcommons.org\/en\/news\/mlperf-inference-1q2023\/"},{"key":"652_CR31","doi-asserted-by":"publisher","first-page":"214001","DOI":"10.1088\/1361-6560\/ac97d9","volume":"67","author":"P Foley","year":"2022","unstructured":"Foley, P. et al. OpenFL: the open federated learning library. Phys. Med. Biol. 67, 214001 (2022).","journal-title":"Phys. Med. Biol."},{"key":"652_CR32","unstructured":"microsoft\/msrflute (GitHub, 2023); https:\/\/github.com\/microsoft\/msrflute"},{"key":"652_CR33","unstructured":"Bakas, S. et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BraTS challenge. Preprint at https:\/\/arxiv.org\/abs\/1811.02629 (2018)."},{"key":"652_CR34","unstructured":"Pati, S. et al. The Federated Tumor Segmentation (FeTS) challenge. Preprint at https:\/\/arxiv.org\/abs\/2105.05874 (2021)."},{"key":"652_CR35","doi-asserted-by":"publisher","first-page":"vi135","DOI":"10.1093\/neuonc\/noab196.532","volume":"23","author":"U Baid","year":"2021","unstructured":"Baid, U. et al. NIMG-32: the Federated Tumor Segmentation (FeTS) Initiative: the first real-world large-scale data-private collaboration focusing on neuro-oncology. Neuro Oncol. 
23, vi135\u2013vi136 (2021).","journal-title":"Neuro Oncol."},{"key":"652_CR36","doi-asserted-by":"publisher","first-page":"1113","DOI":"10.1038\/s41591-023-02332-5","volume":"29","author":"D Placido","year":"2023","unstructured":"Placido, D. et al. A deep learning algorithm to predict risk of pancreatic cancer from disease trajectories. Nat. Med. 29, 1113\u20131122 (2023).","journal-title":"Nat. Med."},{"key":"652_CR37","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1038\/s41591-021-01506-3","volume":"27","author":"I Dayan","year":"2021","unstructured":"Dayan, I. et al. Federated learning for predicting clinical outcomes in patients with COVID-19. Nat. Med. 27, 1735\u20131743 (2021).","journal-title":"Nat. Med."},{"key":"652_CR38","unstructured":"Federated Tumor Segmentation Challenge (Synapse, 2022); https:\/\/miccai2022.fets.ai\/"},{"key":"652_CR39","unstructured":"MedPerf Technical Documentation (MedPerf, 2023); https:\/\/docs.medperf.org\/"},{"key":"652_CR40","unstructured":"MedPerf Issue Tracker (GitHub, 2023); https:\/\/github.com\/mlcommons\/medperf\/issues"},{"key":"652_CR41","unstructured":"Synapse (Sage Bionetworks, 2023); https:\/\/www.synapse.org\/"},{"key":"652_CR42","unstructured":"Dream Challenges (Sage Bionetworks, 2023); https:\/\/dreamchallenges.org\/."},{"key":"652_CR43","doi-asserted-by":"publisher","DOI":"10.1186\/s13059-019-1794-0","volume":"20","author":"K Ellrott","year":"2019","unstructured":"Ellrott, K. et al. Reproducible biomedical benchmarking in the cloud: lessons from crowd-sourced data challenges. Genome Biol. 
20, 195 (2019).","journal-title":"Genome Biol."},{"key":"652_CR44","unstructured":"The Digital Mammography DREAM Challenge (Synapse, 2018); https:\/\/www.synapse.org\/#!Synapse:syn4224222\/wiki\/401743"},{"key":"652_CR45","unstructured":"Hugging Face Hub Documentation (Hugging Face, 2023); https:\/\/huggingface.co\/docs\/hub\/index"},{"key":"652_CR46","unstructured":"PubMed Summarization Task: Leaderboards (Hugging Face, 2023); https:\/\/huggingface.co\/spaces\/autoevaluate\/leaderboards?dataset=Blaise-g%2FSumPubmed&only_verified=0&task=-any-&config=Blaise-g--SumPubmed&split=test&metric=loss"},{"key":"652_CR47","unstructured":"Lhoest, Q. et al. Datasets: a community library for natural language processing. In Proc. 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (eds Adel, H. & Shi, S.) 175\u2013184 (Association for Computational Linguistics, 2021)."},{"key":"652_CR48","unstructured":"Wolf, T. et al. Transformers: state-of-the-art natural language processing. In Proc. 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (eds Liu, Q. & Schlangen, D.) 38\u201345 (Association for Computational Linguistics, 2020)."},{"key":"652_CR49","doi-asserted-by":"crossref","unstructured":"von Werra, L. et al. Evaluate & evaluation on the hub: better best practices for data and model measurements. 
Preprint at https:\/\/arxiv.org\/abs\/2210.01970 (2022).","DOI":"10.18653\/v1\/2022.emnlp-demos.13"},{"key":"652_CR50","unstructured":"MONAI (MONAI, 2023); http:\/\/monai.io"},{"key":"652_CR51","unstructured":"Lobe (Lobe, 2021); https:\/\/www.lobe.ai\/"},{"key":"652_CR52","unstructured":"KNIME (KNIME, 2023); https:\/\/www.knime.com\/"},{"key":"652_CR53","unstructured":"fast.ai\u2014Making Neural Nets Uncool Again (fast.ai, 2023); http:\/\/fast.ai"},{"key":"652_CR54","unstructured":"GPT-4 (OpenAI, 2023); https:\/\/openai.com\/research\/gpt-4"},{"key":"652_CR55","unstructured":"Inference Endpoints (Hugging Face, 2023); https:\/\/huggingface.co\/inference-endpoints"},{"key":"652_CR56","unstructured":"MedPerf examples; http:\/\/medperf.org\/examples"},{"key":"652_CR57","doi-asserted-by":"publisher","first-page":"202","DOI":"10.1158\/1541-7786.MCR-21-0665","volume":"20","author":"J Rosenthal","year":"2022","unstructured":"Rosenthal, J. et al. Building tools for machine learning and artificial intelligence in cancer research: best practices and a case study with the PathML toolkit for computational pathology. Mol. Cancer Res. 20, 202\u2013206 (2022).","journal-title":"Mol. Cancer Res."},{"key":"652_CR58","unstructured":"Slideflow Documentation (Slideflow, 2022); http:\/\/slideflow.dev"},{"key":"652_CR59","doi-asserted-by":"crossref","unstructured":"Kocaman, V. & Talby, D. Spark NLP: natural language understanding at scale. Software Impacts 8, 100058 (2021).","DOI":"10.1016\/j.simpa.2021.100058"},{"key":"652_CR60","doi-asserted-by":"crossref","unstructured":"Kocaman, V. & Talby, D. Accurate clinical and biomedical Named entity recognition at scale. Software Impacts 13, 100373 (2022).","DOI":"10.1016\/j.simpa.2022.100373"},{"key":"652_CR61","unstructured":"Ul Haq, H., Kocaman, V. & Talby, D. Deeper clinical document understanding using relation extraction. In Proc. Workshop on Scientific Document Understanding (eds Veyseh, A. P. B. et al.) Vol. 
3164 (CEUR-WS, 2022)."},{"key":"652_CR62","doi-asserted-by":"crossref","unstructured":"Ul Haq, H., Kocaman, V. & Talby, D. in Multimodal AI in Healthcare: A Paradigm Shift in Health Intelligence (eds Shaban-Nejad, A. et al.) 361\u2013375 (Springer, 2022).","DOI":"10.1007\/978-3-031-14771-5_26"},{"key":"652_CR63","unstructured":"SIG for Challenges (MICCAI, 2023); http:\/\/www.miccai.org\/special-interest-groups\/challenges\/"},{"key":"652_CR64","unstructured":"Reinke, A. et al. Common limitations of image processing metrics: a picture story. Preprint at https:\/\/arxiv.org\/abs\/2104.05642 (2021)."},{"key":"652_CR65","doi-asserted-by":"crossref","unstructured":"Reinke, A. et al. How to exploit weaknesses in biomedical challenge design and organization. In Medical Image Computing and Computer Assisted Intervention\u2014MICCAI 2018. Lecture Notes in Computer Science, vol 11073 (eds. Frangi, A. F. et al.) 388\u2013395 (Springer, 2018).","DOI":"10.1007\/978-3-030-00937-3_45"},{"key":"652_CR66","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-018-07619-7","volume":"9","author":"L Maier-Hein","year":"2018","unstructured":"Maier-Hein, L. et al. Why rankings of biomedical image analysis competitions should be interpreted with care. Nat. Commun. 9, 5217 (2018).","journal-title":"Nat. Commun."},{"key":"652_CR67","unstructured":"du Terrail, J. O. et al. FLamby: datasets and benchmarks for cross-silo federated learning in realistic healthcare settings. In Proc. Thirty-Sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (eds Koyejo, S. et al.) 
5315\u20135334 (Curran Associates, Inc., 2022)."},{"key":"652_CR68","unstructured":"SPEC\u2019s Benchmarks and Tools (SPEC, 2022); https:\/\/www.spec.org\/benchmarks.html"},{"key":"652_CR69","unstructured":"MLFlow (MLFlow, 2023); https:\/\/mlflow.org"},{"key":"652_CR70","unstructured":"Kubeflow: The Machine Learning Toolkit for Kubernetes (Kubeflow, 2023); https:\/\/www.kubeflow.org\/"},{"key":"652_CR71","unstructured":"Substra Documentation (Substra, 2023); https:\/\/docs.substra.org\/"},{"key":"652_CR72","unstructured":"Fed-BioMed: Federated Learning in Healthcare (Fed-Biomed, 2022); https:\/\/fedbiomed.gitlabpages.inria.fr\/"},{"key":"652_CR73","doi-asserted-by":"publisher","first-page":"1027","DOI":"10.1200\/CCI.20.00045","volume":"4","author":"J Scherer","year":"2020","unstructured":"Scherer, J. et al. Joint imaging platform for federated clinical data analytics. JCO Clin. Cancer Inform. 4, 1027\u20131038 (2020).","journal-title":"JCO Clin. Cancer Inform."},{"key":"652_CR74","doi-asserted-by":"crossref","unstructured":"Pati, S. et al. GaNDLF: the generally nuanced deep learning framework for scalable end-to-end clinical workflows. Comms. Eng. 2, 23 (2023).","DOI":"10.1038\/s44172-023-00066-3"},{"key":"652_CR75","unstructured":"mlcommons\/GaNDLF (GitHub, 2023); https:\/\/github.com\/mlcommons\/GaNDLF"},{"key":"652_CR76","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1016\/S0024-6301(97)90262-4","volume":"30","author":"SAW Drew","year":"1997","unstructured":"Drew, S. A. W. From knowledge to action: the impact of benchmarking on organizational performance. Long Range Plann. 30, 427\u2013441 (1997).","journal-title":"Long Range Plann."},{"key":"652_CR77","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1109\/MM.2020.2974843","volume":"40","author":"P Mattson","year":"2020","unstructured":"Mattson, P. et al. MLPerf: an industry standard benchmark suite for machine learning performance. 
IEEE Micro 40, 8\u201316 (2020).","journal-title":"IEEE Micro"},{"key":"652_CR78","doi-asserted-by":"publisher","first-page":"lsab023","DOI":"10.1093\/jlb\/lsab023","volume":"8","author":"K Liddell","year":"2021","unstructured":"Liddell, K., Simon, D. A. & Lucassen, A. Patient data ownership: who owns your health? J. Law Biosci. 8, lsab023 (2021).","journal-title":"J. Law Biosci."},{"key":"652_CR79","unstructured":"Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People (US White House, 2023); https:\/\/www.whitehouse.gov\/ostp\/ai-bill-of-rights\/"},{"key":"652_CR80","first-page":"574","volume":"216","author":"G Hripcsak","year":"2015","unstructured":"Hripcsak, G. et al. Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers. Stud. Health Technol. Inform. 216, 574\u2013578 (2015).","journal-title":"Stud. Health Technol. Inform."},{"key":"652_CR81","unstructured":"Standardized Data: The OMOP Common Data Model (OHDSI, 2023); https:\/\/www.ohdsi.org\/data-standardization\/the-common-data-model\/"},{"key":"652_CR82","doi-asserted-by":"publisher","first-page":"1773","DOI":"10.1038\/s41591-022-01981-2","volume":"28","author":"JN Acosta","year":"2022","unstructured":"Acosta, J. N., Falcone, G. J., Rajpurkar, P. & Topol, E. J. Multimodal biomedical AI. Nat. Med. 28, 1773\u20131784 (2022).","journal-title":"Nat. Med."},{"key":"652_CR83","unstructured":"medperf\/server\/sql\/ (GitHub, 2023); https:\/\/github.com\/mlcommons\/MedPerf\/tree\/main\/server\/sql"},{"key":"652_CR84","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s10278-018-0142-3","volume":"32","author":"C Sirota-Cohen","year":"2019","unstructured":"Sirota-Cohen, C., Rosipko, B., Forsberg, D. & Sunshine, J. L. Implementation and benefits of a vendor-neutral archive and enterprise-imaging management system in an integrated delivery network. J. Digit. Imaging 32, 211\u2013220 (2019).","journal-title":"J. Digit. 
Imaging"},{"key":"652_CR85","doi-asserted-by":"publisher","first-page":"40","DOI":"10.4103\/jpi.jpi_69_18","volume":"9","author":"L Pantanowitz","year":"2018","unstructured":"Pantanowitz, L. et al. Twenty years of digital pathology: an overview of the road travelled, what is on the horizon, and the emergence of vendor-neutral archives. J. Pathol. Inform. 9, 40 (2018).","journal-title":"J. Pathol. Inform."},{"key":"652_CR86","unstructured":"Cox, R. W. et al. A (sort of) new image data format standard: NIfTI-1 National Institutes of Health https:\/\/nifti.nimh.nih.gov\/nifti-1\/documentation\/hbm_nifti_2004.pdf (2004)."},{"key":"652_CR87","unstructured":"Janeway, K. A. The PRISSMM Data Model. NCCR Cancer Center Supplemental Data Summit (2021); https:\/\/events.cancer.gov\/sites\/default\/files\/assets\/dccps\/dccps-nccrsummit\/08_Katie-Janeway_2021_02_08_PRISSMM.pdf"},{"key":"652_CR88","doi-asserted-by":"publisher","first-page":"103188","DOI":"10.1016\/j.jbi.2019.103188","volume":"94","author":"R Saripalle","year":"2019","unstructured":"Saripalle, R., Runyan, C. & Russell, M. Using HL7 FHIR to achieve interoperability in patient health record. J. Biomed. Inform. 94, 103188 (2019).","journal-title":"J. Biomed. 
Inform."}],"container-title":["Nature Machine Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00652-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00652-2","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00652-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,17]],"date-time":"2023-12-17T06:31:35Z","timestamp":1702794695000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00652-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,17]]},"references-count":88,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2023,7]]}},"alternative-id":["652"],"URL":"https:\/\/doi.org\/10.1038\/s42256-023-00652-2","relation":{},"ISSN":["2522-5839"],"issn-type":[{"value":"2522-5839","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,17]]},"assertion":[{"value":"30 October 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 April 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 July 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"These authors declare the following competing interests: B.R. is on the Regeneron advisory board. M.M.A. 
receives consulting fees from Genentech, Bristol-Myers Squibb, Merck, AstraZeneca, Maverick, Blueprint Medicine, Mirati, Amgen, Novartis, EMD Serono and Gritstone and research funding (to the Dana\u2013Farber Cancer Institute) from AstraZeneca, Lilly, Genentech, Bristol-Myers Squibb and Amgen. N.P. is a scientific advisor to Caresyntax. V.N. is employed by Google and owns stock as part of a standard compensation package. The other authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}]}}