{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T09:20:15Z","timestamp":1777368015347,"version":"3.51.4"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,9,7]],"date-time":"2021-09-07T00:00:00Z","timestamp":1630972800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,9,7]],"date-time":"2021-09-07T00:00:00Z","timestamp":1630972800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title>\n<jats:p>Privacy protection is paramount in conducting health research. However, studies often rely on data stored in a centralized repository, where analysis is done with full access to the sensitive underlying content. Recent advances in federated learning enable building complex machine-learned models that are trained in a distributed fashion. These techniques facilitate the calculation of research study endpoints such that private data never leaves a given device or healthcare system. We show\u2014on a diverse set of single and multi-site health studies\u2014that federated models can achieve similar accuracy, precision, and generalizability, and lead to the same interpretation as standard centralized statistical models while achieving considerably stronger privacy protections and without significantly raising computational costs. 
This work is the first to apply modern and general federated learning methods that explicitly incorporate differential privacy to clinical and epidemiological research\u2014across a spectrum of units of federation, model architectures, complexity of learning tasks and diseases. As a result, it enables health research participants to remain in control of their data and still contribute to advancing science\u2014aspects that used to be at odds with each other.<\/jats:p>","DOI":"10.1038\/s41746-021-00489-2","type":"journal-article","created":{"date-parts":[[2021,9,7]],"date-time":"2021-09-07T06:03:12Z","timestamp":1630994592000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":167,"title":["Privacy-first health research with federated learning"],"prefix":"10.1038","volume":"4","author":[{"given":"Adam","family":"Sadilek","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2704-4030","authenticated-orcid":false,"given":"Luyang","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Dung","family":"Nguyen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8680-7061","authenticated-orcid":false,"given":"Methun","family":"Kamruzzaman","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2477-6060","authenticated-orcid":false,"given":"Stylianos","family":"Serghiou","sequence":"additional","affiliation":[]},{"given":"Benjamin","family":"Rader","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3155-4319","authenticated-orcid":false,"given":"Alex","family":"Ingerman","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6526-6913","authenticated-orcid":false,"given":"Stefan","family":"Mellem","sequence":"additional","affiliation":[]},{"given":"Peter","family":"Kairouz","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001
-9170-8714","authenticated-orcid":false,"given":"Elaine O.","family":"Nsoesie","sequence":"additional","affiliation":[]},{"given":"Jamie","family":"MacFarlane","sequence":"additional","affiliation":[]},{"given":"Anil","family":"Vullikanti","sequence":"additional","affiliation":[]},{"given":"Madhav","family":"Marathe","sequence":"additional","affiliation":[]},{"given":"Paul","family":"Eastham","sequence":"additional","affiliation":[]},{"given":"John S.","family":"Brownstein","sequence":"additional","affiliation":[]},{"given":"Blaise Aguera y.","family":"Arcas","sequence":"additional","affiliation":[]},{"given":"Michael D.","family":"Howell","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3313-430X","authenticated-orcid":false,"given":"John","family":"Hernandez","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,9,7]]},"reference":[{"key":"489_CR1","first-page":"3837","volume":"108","author":"W Zhu","year":"2020","unstructured":"Zhu, W., Kairouz, P., Sun, H., McMahan, B. & Li, W. Federated heavy hitters with differential privacy. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics. PMLR 108, 3837\u20133847 (2020).","journal-title":"PMLR"},{"key":"489_CR2","unstructured":"Hanzely, Filip, et al. Lower Bounds and Optimal Algorithms for Personalized Federated Learning. Advances in Neural Information Processing Systems 33 (2020)."},{"key":"489_CR3","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-020-69250-1","volume":"10","author":"MJ Sheller","year":"2020","unstructured":"Sheller, M. J. et al. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci. Rep. 10, 12598 (2020).","journal-title":"Sci. Rep."},{"key":"489_CR4","unstructured":"Vaid, Akhil, et al. Federated Learning of Electronic Health Records Improves Mortality Prediction in Patients. 
Ethnicity 52.77.6: 0-001."},{"key":"489_CR5","first-page":"313","volume":"2019","author":"O Choudhury","year":"2020","unstructured":"Choudhury, O. et al. Predicting adverse drug reactions on distributed health data using federated learning. AMIA Annu. Symp. Proc. 2019, 313\u2013322 (2020). eCollection 2019.","journal-title":"AMIA Annu. Symp. Proc."},{"key":"489_CR6","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1016\/j.ijmedinf.2018.01.007","volume":"112","author":"TS Brisimi","year":"2018","unstructured":"Brisimi, T. S. et al. Federated learning of predictive models from federated Electronic Health Records. Int. J. Med. Inform. 112, 59\u201367 (2018).","journal-title":"Int. J. Med. Inform."},{"key":"489_CR7","first-page":"574","volume":"216","author":"G Hripcsak","year":"2015","unstructured":"Hripcsak, G. et al. Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers. Stud. Health Technol. Inform. 216, 574\u2013578 (2015).","journal-title":"Stud. Health Technol. Inform."},{"key":"489_CR8","unstructured":"ClinicalTrials.gov: National Library of Medicine (US). NCT04663776, wide scale monitoring for acute respiratory infection using a mobile-based study platform. (2020). https:\/\/clinicaltrials.gov\/ct2\/show\/NCT04663776."},{"key":"489_CR9","doi-asserted-by":"publisher","unstructured":"Sweeney L., Abu A. & Winn J. Identifying participants in the personal genome project by name. SSRN Electron J (2013). published online May. https:\/\/doi.org\/10.2139\/ssrn.2257732.","DOI":"10.2139\/ssrn.2257732"},{"key":"489_CR10","unstructured":"Personal Data for the Public Good: New Opportunities to Enrich Understanding of Individual and Population Health (2014). Health Data Exploration Project. 
http:\/\/hdexplore.calit2.net\/wp-content\/uploads\/2015\/08\/hdx_final_report_small.pdf (Accessed Dec 2020)."},{"key":"489_CR11","doi-asserted-by":"publisher","DOI":"10.1186\/s12911-020-1023-5","volume":"20","author":"D Chicco","year":"2020","unstructured":"Chicco, D. & Jurman, G. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Med. Inf. Decis. Mak. 20, 16 (2020).","journal-title":"BMC Med. Inf. Decis. Mak."},{"key":"489_CR12","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2016.35","volume":"3","author":"AEW Johnson","year":"2016","unstructured":"Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035 (2016).","journal-title":"Sci. Data"},{"key":"489_CR13","doi-asserted-by":"publisher","first-page":"784","DOI":"10.1038\/s43018-020-0104-9","volume":"1","author":"M Rugge","year":"2020","unstructured":"Rugge, M., Zorzi, M. & Guzzinati, S. SARS-CoV-2 infection in the Italian Veneto region: adverse outcomes in patients with cancer. Nat. Cancer 1, 784\u2013788 (2020).","journal-title":"Nat. Cancer"},{"key":"489_CR14","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-020-00323-1","volume":"3","author":"N Rieke","year":"2020","unstructured":"Rieke, N. et al. The future of digital health with federated learning. npj Digital Med. 3, 119 (2020).","journal-title":"npj Digital Med."},{"key":"489_CR15","doi-asserted-by":"publisher","first-page":"e20","DOI":"10.2196\/medinform.7744","volume":"6","author":"J Lee","year":"2018","unstructured":"Lee, J. et al. Privacy-preserving patient similarity learning in a federated environment: development and analysis. JMIR Med. Inform. 6, e20 (2018).","journal-title":"JMIR Med. Inform."},{"key":"489_CR16","doi-asserted-by":"publisher","first-page":"103291","DOI":"10.1016\/j.jbi.2019.103291","volume":"99","author":"L Huang","year":"2019","unstructured":"Huang, L. et al. 
Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records. J. Biomed. Inform. 99, 103291 (2019).","journal-title":"J. Biomed. Inform."},{"key":"489_CR17","first-page":"50","volume":"37","author":"T Li","year":"2020","unstructured":"Li, T., Sahu, A. K., Talwalkar, A. & Smith, V. Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag. 37, 50\u201360 (2020).","journal-title":"IEEE Signal Process Mag."},{"key":"489_CR18","doi-asserted-by":"crossref","unstructured":"Hsu, T.M.H., Qi, H. & Brown, M. Federated visual classification with real-world data distribution. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part X 16 (pp. 76-92). Springer International Publishing (2020).","DOI":"10.1007\/978-3-030-58607-2_5"},{"key":"489_CR19","doi-asserted-by":"publisher","first-page":"431","DOI":"10.1109\/TCBB.2016.2515610","volume":"13","author":"Y Gong","year":"2016","unstructured":"Gong, Y., Fang, Y. & Guo, Y. Private data analytics on biomedical sensing data via distributed computation. IEEE\/ACM Trans. Comput Biol. Bioinform. 13, 431\u2013444 (2016).","journal-title":"IEEE\/ACM Trans. Comput Biol. Bioinform."},{"key":"489_CR20","unstructured":"Geyer R. C., Klein T., Nabi M. Differentially private federated learning: a client level perspective. arXiv 2017; published online Dec. http:\/\/arxiv.org\/abs\/1712.07557 (Accessed 23 Nov 2020)."},{"key":"489_CR21","unstructured":"Ramage D. & Mazzocchi S. Federated analytics: collaborative data science without data collection. Google AI Blog. (2020). https:\/\/ai.googleblog.com\/2020\/05\/federated-analytics-collaborative-data.html (Accessed Nov 2020)."},{"key":"489_CR22","unstructured":"Bonawitz K. et al. TensorFlow federated: machine learning on decentralized data. (2020). 
https:\/\/www.tensorflow.org\/federated (accessed Nov 2020)."},{"key":"489_CR23","unstructured":"Zhu L., Liu Z. & Han S. Deep leakage from gradients. arXiv. (2019). published online June. https:\/\/arxiv.org\/abs\/1906.08935v2 (accessed Nov 2020)."},{"key":"489_CR24","doi-asserted-by":"crossref","unstructured":"Thakkar O., Ramaswamy S., Mathews R. & Beaufays F. Understanding unintended memorization in federated learning. arXiv (2020). published online June. http:\/\/arxiv.org\/abs\/2006.07490 (Accessed 23 Nov 2020).","DOI":"10.18653\/v1\/2021.privatenlp-1.1"},{"key":"489_CR25","unstructured":"Carlini N., Liu C., Erlingsson \u00da., Kos J. & Song D. The secret Sharer: evaluating and testing unintended memorization in neural networks. In: Proceedings of the 28th USENIX Security Symposium. 267\u2013284 (2019)."},{"key":"489_CR26","doi-asserted-by":"crossref","unstructured":"Bonawitz K. et al. Practical secure aggregation for privacy preserving machine learning. In Proceedings of ACM Conference on Computer and Communications Security (ACM CCS). (2017).","DOI":"10.1145\/3133956.3133982"},{"key":"489_CR27","unstructured":"Smith J. W., Everhart J. E., Dickson W. C., Knowler W. C. & Johannes R. S. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In: Proceedings of the Annual Symposium on Computer Applications in Medical Care. American Medical Informatics Association. 261\u2013265 (1988)."},{"key":"489_CR28","doi-asserted-by":"publisher","first-page":"19941","DOI":"10.2807\/ese.16.32.19941-en","volume":"16","author":"L Fiebig","year":"2011","unstructured":"Fiebig, L. et al. Avian influenza A(H5N1) in humans: new insights from a line list of World Health Organization confirmed cases, September 2006 to August 2010. 
Eurosurveillance 16, 19941 (2011).","journal-title":"Eurosurveillance"},{"key":"489_CR29","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1186\/s13756-017-0177-0","volume":"6","author":"PNA Harris","year":"2017","unstructured":"Harris, P. N. A. et al. Risk factors for relapse or persistence of bacteraemia caused by Enterobacter spp.: a case\u2013control study. Antimicrob. Resist. Infect. Control 6, 14 (2017).","journal-title":"Antimicrob. Resist. Infect. Control"},{"key":"489_CR30","doi-asserted-by":"publisher","first-page":"e0006950","DOI":"10.1371\/journal.pntd.0006950","volume":"12","author":"CE Oldenburg","year":"2018","unstructured":"Oldenburg, C. E. et al. Safety of azithromycin in infants under six months of age in Niger: a community randomized trial. PLoS Negl. Trop. Dis. 12, e0006950 (2018).","journal-title":"PLoS Negl. Trop. Dis."},{"key":"489_CR31","doi-asserted-by":"publisher","first-page":"e0209650","DOI":"10.1371\/journal.pone.0209650","volume":"14","author":"S-A Ohene","year":"2019","unstructured":"Ohene, S.-A. et al. Extra-pulmonary tuberculosis: a retrospective study of patients in Accra, Ghana. 
PLoS ONE 14, e0209650 (2019).","journal-title":"PLoS ONE"}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00489-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00489-2","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00489-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,3]],"date-time":"2022-12-03T13:58:35Z","timestamp":1670075915000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00489-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,7]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["489"],"URL":"https:\/\/doi.org\/10.1038\/s41746-021-00489-2","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.12.22.20245407","asserted-by":"object"}]},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,7]]},"assertion":[{"value":"8 March 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 July 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 September 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"A.S., L.L., A.I., S.M., P.K., J.M., P.E., M.H., B.A., S.S. and J.H. are employees of Google and own Alphabet stock. 
The remaining authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"132"}}