{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T15:22:09Z","timestamp":1777735329704,"version":"3.51.4"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000272","name":"DH | National Institute for Health Research","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000272","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Robust data privacy regulations hinder the exchange of healthcare data among institutions, crucial for global insights and developing generalised clinical models. Federated learning (FL) is ideal for training global models using datasets from different institutions without compromising privacy. However, disparities in electronic healthcare records (EHRs) lead to inconsistencies in ML-ready data views, making FL challenging without extensive preprocessing and information loss. These differences arise from variations in services, care standards, and record-keeping practices. This paper addresses data view heterogeneity by introducing a knowledge abstraction and filtering-based FL framework that allows FL over heterogeneous data views without manual alignment or information loss. 
The knowledge abstraction and filtering mechanism maps raw input representations to a unified, semantically rich shared space for effective global model training. Experiments on three healthcare datasets demonstrate the framework\u2019s effectiveness in overcoming data view heterogeneity and facilitating information sharing in a federated setup.<\/jats:p>","DOI":"10.1038\/s41746-024-01272-9","type":"journal-article","created":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T10:02:21Z","timestamp":1729072941000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Knowledge abstraction and filtering based federated learning over heterogeneous data views in healthcare"],"prefix":"10.1038","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7006-1947","authenticated-orcid":false,"given":"Anshul","family":"Thakur","sequence":"first","affiliation":[]},{"given":"Soheila","family":"Molaei","sequence":"additional","affiliation":[]},{"given":"Pafue Christy","family":"Nganjimi","sequence":"additional","affiliation":[]},{"given":"Fenglin","family":"Liu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2391-5361","authenticated-orcid":false,"given":"Andrew","family":"Soltan","sequence":"additional","affiliation":[]},{"given":"Patrick","family":"Schwab","sequence":"additional","affiliation":[]},{"given":"Kim","family":"Branson","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9848-8555","authenticated-orcid":false,"given":"David A.","family":"Clifton","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,10,16]]},"reference":[{"key":"1272_CR1","doi-asserted-by":"publisher","first-page":"360","DOI":"10.1093\/jamiaopen\/ooaa044","volume":"3","author":"JM Butler","year":"2020","unstructured":"Butler, J. M. et al. 
Patient-centered care and the electronic health record: exploring functionality and gaps. Jamia Open 3, 360\u2013368 (2020).","journal-title":"Jamia Open"},{"key":"1272_CR2","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1038\/s41591-021-01580-7","volume":"28","author":"H Ko","year":"2022","unstructured":"Ko, H. Pseudonymization of healthcare data in South Korea. Nat. Med. 28, 15\u201316 (2022).","journal-title":"Nat. Med."},{"key":"1272_CR3","unstructured":"Data Protection Act 2018. United Kingdom Legislation. c. 12 (2018). http:\/\/www.legislation.gov.uk\/ukpga\/2018\/12\/contents."},{"key":"1272_CR4","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-022-33407-5","volume":"13","author":"S Pati","year":"2022","unstructured":"Pati, S. et al. Federated learning enables big data for rare cancer boundary detection. Nat. Commun. 13, 7346 (2022).","journal-title":"Nat. Commun."},{"key":"1272_CR5","doi-asserted-by":"publisher","first-page":"1761","DOI":"10.1109\/JBHI.2021.3134835","volume":"26","author":"A Thakur","year":"2021","unstructured":"Thakur, A., Sharma, P. & Clifton, D. A. Dynamic neural graphs based federated reptile for semi-supervised multi-tasking in healthcare applications. IEEE J. Biomed. Health Inf. 26, 1761\u20131772 (2021).","journal-title":"IEEE J. Biomed. Health Inf."},{"key":"1272_CR6","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-020-00323-1","volume":"3","author":"N Rieke","year":"2020","unstructured":"Rieke, N. et al. The future of digital health with federated learning. NPJ Digital Med. 3, 119 (2020).","journal-title":"NPJ Digital Med."},{"key":"1272_CR7","unstructured":"McMahan, B., Moore, E., Ramage, D., Hampson, S. & Arcas, B. A. Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273\u20131282 (2017)."},{"key":"1272_CR8","doi-asserted-by":"crossref","unstructured":"Shi, S. et al. 
A distributed synchronous sgd algorithm with global top-k sparsification for low bandwidth networks. In: Proceedings of International Conference on Distributed Computing Systems (ICDCS), pp.2238\u20132247 (2019).","DOI":"10.1109\/ICDCS.2019.00220"},{"key":"1272_CR9","first-page":"49","volume":"9","author":"DCG Orach","year":"2009","unstructured":"Orach, D. C. G. Health equity: challenges in low income countries. Af. Health Sci. 9, 49\u201351 (2009).","journal-title":"Af. Health Sci."},{"key":"1272_CR10","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1196\/annals.1425.011","volume":"1136","author":"DH Peters","year":"2008","unstructured":"Peters, D. H. et al. Poverty and access to health care in developing countries. Ann. NY Acad. Sci. 1136, 161\u2013171 (2008).","journal-title":"Ann. NY Acad. Sci."},{"key":"1272_CR11","doi-asserted-by":"publisher","first-page":"9587","DOI":"10.1109\/TNNLS.2022.3160699","volume":"34","author":"AZ Tan","year":"2023","unstructured":"Tan, A. Z., Yu, H., Cui, L. & Yang, Q. Towards personalized federated learning. IEEE Trans. Neural Netw. Learn. Syst. 34, 9587\u20139603 (2023).","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"1272_CR12","doi-asserted-by":"crossref","unstructured":"Soltan, A. A. et al. Scalable federated learning for emergency care using low cost microcomputing: Real-world, privacy preserving development and evaluation of a covid-19 screening test in uk hospitals. medRxiv, 2023\u201305 (2023).","DOI":"10.1101\/2023.05.05.23289554"},{"key":"1272_CR13","doi-asserted-by":"publisher","first-page":"827","DOI":"10.1002\/cpt.1577","volume":"107","author":"S Schneeweiss","year":"2020","unstructured":"Schneeweiss, S., Brown, J. S., Bate, A., Trifir\u00f2, G. & Bartels, D. B. Choosing among common data models for real-world data analyses fit for making decisions about the effectiveness of medical products. Clin. Pharmacol. Therap. 107, 827\u2013833 (2020).","journal-title":"Clin. Pharmacol. 
Therap."},{"key":"1272_CR14","doi-asserted-by":"publisher","first-page":"30970","DOI":"10.2196\/30970","volume":"9","author":"N Paris","year":"2021","unstructured":"Paris, N., Lamer, A. & Parrot, A. Transformation and evaluation of the mimic database in the omop common data model: development and usability study. JMIR Medical Informatics 9, 30970 (2021).","journal-title":"JMIR Medical Informatics"},{"key":"1272_CR15","doi-asserted-by":"publisher","first-page":"104002","DOI":"10.1016\/j.jbi.2022.104002","volume":"127","author":"Y Yu","year":"2022","unstructured":"Yu, Y. et al. Developing an ETL tool for converting the PCORnet CDM into the OMOP CDM to facilitate the COVID-19 data integration. Journal of Biomedical Informatics 127, 104002 (2022).","journal-title":"Journal of Biomedical Informatics"},{"key":"1272_CR16","doi-asserted-by":"crossref","unstructured":"Hallinan, C. M. et al. Seamless EMR data access: Integrated governance, digital health and the OMOP-CDM. BMJ Health & Care Informatics 31 (2024).","DOI":"10.1136\/bmjhci-2023-100953"},{"key":"1272_CR17","first-page":"1","volume":"56","author":"M Ye","year":"2023","unstructured":"Ye, M., Fang, X., Du, B., Yuen, P. C. & Tao, D. Heterogeneous federated learning: State-of-the-art and research challenges. ACM Comput. Surv. 56, 1\u201344 (2023).","journal-title":"ACM Comput. Surv."},{"key":"1272_CR18","doi-asserted-by":"crossref","unstructured":"Nie, J., Xiao, D., Yang, L. & Wu, W. Fedcme: Client matching and classifier exchanging to handle data heterogeneity in federated learning. arXiv preprint arXiv:2307.08574 (2023).","DOI":"10.1109\/MSN60784.2023.00083"},{"key":"1272_CR19","first-page":"2351","volume":"33","author":"T Lin","year":"2020","unstructured":"Lin, T., Kong, L., Stich, S. U. & Jaggi, M. Ensemble distillation for robust model fusion in federated learning. Adv. Neural Inf. Processing Syst. 33, 2351\u20132363 (2020).","journal-title":"Adv. Neural Inf. 
Processing Syst."},{"key":"1272_CR20","unstructured":"Liang, P. P., Liu, T., Ziyin, L., Salakhutdinov, R. & Morency, L.-P. Think locally, act globally: Federated learning with local and global representations. arXiv preprint arXiv:2001.01523 (2020)."},{"key":"1272_CR21","doi-asserted-by":"crossref","unstructured":"Fu, Y. et al. Partial feature selection and alignment for multi-source domain adaptation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 16654\u201316663 (2021).","DOI":"10.1109\/CVPR46437.2021.01638"},{"key":"1272_CR22","doi-asserted-by":"crossref","unstructured":"Li, S. et al. Simultaneous semantic alignment network for heterogeneous domain adaptation. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 3866\u20133874 (2020).","DOI":"10.1145\/3394171.3413995"},{"key":"1272_CR23","unstructured":"Shamsian, A., Navon, A., Fetaya, E. & Chechik, G. Personalized federated learning using hypernetworks. In: International Conference on Machine Learning, pp. 9489\u20139502 (2021)."},{"key":"1272_CR24","unstructured":"Molaei, S. et al. Federated learning for heterogeneous electronic health records utilising augmented temporal graph attention networks. In: International Conference on Artificial Intelligence and Statistics, pp. 1342\u20131350 (2024)."},{"key":"1272_CR25","unstructured":"Maaten, L. & Hinton, G. Visualizing data using t-sne. J. Mach. Learning Res. 9 (2008)."},{"key":"1272_CR26","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1016\/S2589-7500(20)30274-0","volume":"3","author":"AA Soltan","year":"2021","unstructured":"Soltan, A. A. et al. Rapid triage for covid-19 using routine clinical data for patients attending hospital: development and prospective validation of an artificial intelligence screening test. 
The Lancet Digital Health 3, 78\u201387 (2021).","journal-title":"The Lancet Digital Health"},{"key":"1272_CR27","doi-asserted-by":"publisher","first-page":"266","DOI":"10.1016\/S2589-7500(21)00272-7","volume":"4","author":"AA Soltan","year":"2022","unstructured":"Soltan, A. A. et al. Real-world evaluation of rapid and laboratory-free covid-19 triage for emergency care: external validation and pilot deployment of artificial intelligence driven screening. The Lancet Digital Health 4, 266\u2013278 (2022).","journal-title":"The Lancet Digital Health"},{"key":"1272_CR28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2018.178","volume":"5","author":"TJ Pollard","year":"2018","unstructured":"Pollard, T. J. et al. The eicu collaborative research database, a freely available multi-center database for critical care research. Sci. Data 5, 1\u201313 (2018).","journal-title":"Sci. Data"},{"key":"1272_CR29","doi-asserted-by":"publisher","first-page":"1921","DOI":"10.1093\/jamia\/ocaa139","volume":"27","author":"S Tang","year":"2020","unstructured":"Tang, S. et al. Democratizing ehr analyses with fiddle: a flexible data-driven preprocessing pipeline for structured clinical data. J. Am. Med. Inf. Assoc. 27, 1921\u20131934 (2020).","journal-title":"J. Am. Med. Inf. Assoc."},{"key":"1272_CR30","doi-asserted-by":"crossref","unstructured":"Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3 (2016).","DOI":"10.1038\/sdata.2016.35"},{"key":"1272_CR31","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41597-019-0103-9","volume":"6","author":"H Harutyunyan","year":"2019","unstructured":"Harutyunyan, H., Khachatrian, H., Kale, D. C., Steeg, G. V. & Galstyan, A. Multitask learning and benchmarking with clinical time series data. Sci. Data 6, 1\u201318 (2019).","journal-title":"Sci. 
Data"},{"key":"1272_CR32","doi-asserted-by":"publisher","first-page":"3454","DOI":"10.1109\/TIFS.2020.2988575","volume":"15","author":"K Wei","year":"2020","unstructured":"Wei, K. et al. Federated learning with differential privacy: Algorithms and performance analysis. IEEE Trans. Inf. Forens. Sec. 15, 3454\u20133469 (2020).","journal-title":"IEEE Trans. Inf. Forens. Sec."},{"key":"1272_CR33","unstructured":"Jagielski, M., Ullman, J. & Oprea, A. Auditing differentially private machine learning: How private is private SGD? In: Advances in Neural Information Processing Systems, vol. 33, pp. 22205\u201322216. Curran Associates, Inc., Red Hook, NY, USA (2020)."},{"key":"1272_CR34","doi-asserted-by":"crossref","unstructured":"Yu, S. & Cui, L. Secure multi-party computation in federated learning. In: Security and Privacy in Federated Learning, pp. 89\u201398. Springer, Singapore (2022)","DOI":"10.1007\/978-981-19-8692-5_6"},{"key":"1272_CR35","doi-asserted-by":"publisher","first-page":"1089","DOI":"10.1109\/TNNLS.2021.3104901","volume":"34","author":"S Molaei","year":"2023","unstructured":"Molaei, S., Bousejin, N. G., Zare, H., Jalili, M. & Pan, S. Learning graph representations with maximal cliques. IEEE Transactions on Neural Networks and Learning Systems 34, 1089\u20131096 (2023).","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"1272_CR36","unstructured":"Zheng, X. et al. GNNEvaluator: Evaluating GNN performance on unseen graphs without labels. In: Thirty-seventh Conference on Neural Information Processing Systems (2023). 
https:\/\/openreview.net\/forum?id=ihlT8yvQ2I."}],"updated-by":[{"DOI":"10.1038\/s41746-024-01342-y","type":"correction","label":"Correction","source":"publisher","updated":{"date-parts":[[2024,11,21]],"date-time":"2024-11-21T00:00:00Z","timestamp":1732147200000}}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-024-01272-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-024-01272-9","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-024-01272-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,21]],"date-time":"2024-11-21T07:10:29Z","timestamp":1732173029000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-024-01272-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,16]]},"references-count":36,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["1272"],"URL":"https:\/\/doi.org\/10.1038\/s41746-024-01272-9","relation":{"correction":[{"id-type":"doi","id":"10.1038\/s41746-024-01342-y","asserted-by":"object"}]},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,16]]},"assertion":[{"value":"9 February 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 September 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 October 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 November 
2024","order":4,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Correction","order":5,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"A Correction to this paper has been published:","order":6,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"https:\/\/doi.org\/10.1038\/s41746-024-01342-y","URL":"https:\/\/doi.org\/10.1038\/s41746-024-01342-y","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"283"}}