{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,30]],"date-time":"2026-03-30T11:08:25Z","timestamp":1774868905077,"version":"3.50.1"},"reference-count":22,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T00:00:00Z","timestamp":1756771200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T00:00:00Z","timestamp":1756771200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001652","name":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001652","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J CARS"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Purpose<\/jats:title>\n                    <jats:p>Federated Learning helps training deep learning networks with diverse data from different locations, particularly in restricted clinical settings. However, label distributions overlapping only partially across clients, due to different demographics, may significantly harm the global training, and thus local model performance. Investigating such effects before rolling out large-scale Federated Learning setups requires proper sampling of the expected label distributions.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>We present a sampling algorithm to build data subsets according to desired mean and standard deviations from an initial global distribution. To this end, we incorporate the chi-squared and Gini impurity measures to numerically optimize label distributions for multiple groups in an efficient fashion.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Using a real-world application scenario, we sample train and test groups according to region-specific distributions for 3D camera-based weight and height estimation in a clinical context, comparing a hard data split serving as a baseline with our proposed sampling technique. We train a baseline model on all data for comparison and use Federated Averaging to combine the training of our data subsets, demonstrating a realistic deterioration of 25.3\u00a0% on weight and 28.7\u00a0% on height estimations by the global model.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>Realistically client-biased label distribution can notably harm the training in a federated context. Our sampling algorithm for simulating realistic data distributions opens up an efficient way for prior analysis of this effect. The technique is agnostic to the chosen network architecture and target scenario and can be adapted to any feature or label problem with non-IID subpopulations.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1007\/s11548-025-03504-z","type":"journal-article","created":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T11:40:22Z","timestamp":1756813222000},"page":"495-506","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A robust sampling technique for realistic distribution simulation in federated learning"],"prefix":"10.1007","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-3786-7516","authenticated-orcid":false,"given":"Robin","family":"Hoepp","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9589-3774","authenticated-orcid":false,"given":"Leonhard","family":"Rist","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1958-1549","authenticated-orcid":false,"given":"Alexander","family":"Katzmann","sequence":"additional","affiliation":[]},{"given":"Raghavan","family":"Ashok","sequence":"additional","affiliation":[]},{"given":"Andreas","family":"Wimmer","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1060-3759","authenticated-orcid":false,"given":"Michael","family":"S\u00fchling","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9550-5284","authenticated-orcid":false,"given":"Andreas","family":"Maier","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,9,2]]},"reference":[{"key":"3504_CR1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2024.102422","volume":"109","author":"H Kheddar","year":"2024","unstructured":"Kheddar H, Hemis M, Himeur Y (2024) Automatic speech recognition using advanced deep learning approaches: a survey. Inf Fus 109:102422. https:\/\/doi.org\/10.1016\/j.inffus.2024.102422","journal-title":"Inf Fus"},{"issue":"16","key":"3504_CR2","doi-asserted-by":"publisher","first-page":"4172","DOI":"10.3390\/cancers15164172","volume":"15","author":"AB Abdusalomov","year":"2023","unstructured":"Abdusalomov AB, Mukhiddinov M, Whangbo TK (2023) Brain tumor detection based on deep learning approaches and magnetic resonance imaging. Cancers 15(16):4172. https:\/\/doi.org\/10.3390\/cancers15164172","journal-title":"Cancers"},{"issue":"2","key":"3504_CR3","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","volume":"18","author":"F Isensee","year":"2021","unstructured":"Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH (2021) nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 18(2):203\u2013211. https:\/\/doi.org\/10.1038\/s41592-020-01008-z","journal-title":"Nat Methods"},{"key":"3504_CR4","doi-asserted-by":"publisher","unstructured":"Liu X, Wang Z (2024) Deep learning in medical image classification from mri-based brain tumor images. In: 2024 IEEE 6th international conference on power, intelligent computing and systems (ICPICS). IEEE, Shenyang, pp 840\u2013844. https:\/\/doi.org\/10.1109\/ICPICS62053.2024.10796108","DOI":"10.1109\/ICPICS62053.2024.10796108"},{"key":"3504_CR5","doi-asserted-by":"publisher","first-page":"244","DOI":"10.1016\/j.future.2022.05.003","volume":"135","author":"X Ma","year":"2022","unstructured":"Ma X, Zhu J, Lin Z, Chen S, Qin Y (2022) A state-of-the-art survey on solving non-iid data in federated learning. Futur Gener Comput Syst 135:244\u2013258. https:\/\/doi.org\/10.1016\/j.future.2022.05.003","journal-title":"Futur Gener Comput Syst"},{"key":"3504_CR6","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1016\/j.neucom.2021.07.098","volume":"465","author":"H Zhu","year":"2021","unstructured":"Zhu H, Xu J, Liu S, Jin Y (2021) Federated learning on non-iid data: a survey. Neurocomputing 465:371\u2013390. https:\/\/doi.org\/10.1016\/j.neucom.2021.07.098","journal-title":"Neurocomputing"},{"key":"3504_CR7","doi-asserted-by":"publisher","unstructured":"Mora A, Fantini D, Bellavista P (2022) Federated learning algorithms with heterogeneous data distributions: an empirical evaluation. In: 2022 IEEE\/ACM 7th symposium on edge computing (SEC), pp 336\u2013341. https:\/\/doi.org\/10.1109\/SEC54971.2022.00049","DOI":"10.1109\/SEC54971.2022.00049"},{"key":"3504_CR8","doi-asserted-by":"publisher","unstructured":"Tounsi A, Salem O, Mehaoua A (2024) Robust federated learning against data poisoning: a split learning-based approach evaluated on various aggregation techniques. In: 2024 IEEE international conference on E-health networking, application and services (HealthCom), pp 1\u20136. https:\/\/doi.org\/10.1109\/HealthCom60970.2024.10880725","DOI":"10.1109\/HealthCom60970.2024.10880725"},{"key":"3504_CR9","doi-asserted-by":"publisher","unstructured":"Wang Y, Tong Y, Zhou Z, Zhang R, Pan SJ, Fan L, Yang Q (2023) Distribution-regularized federated learning on non-iid data. In: 2023 IEEE 39th international conference on data engineering (ICDE), pp 2113\u20132125. https:\/\/doi.org\/10.1109\/ICDE55515.2023.00164","DOI":"10.1109\/ICDE55515.2023.00164"},{"key":"3504_CR10","doi-asserted-by":"publisher","unstructured":"Yeganeh Y, Farshad A, Navab N, Albarqouni S (2020) Inverse distance aggregation for federated learning with non-iid data. In: Domain adaptation and representation transfer, and distributed and collaborative learning, pp 150\u2013159. https:\/\/doi.org\/10.1007\/978-3-030-60548-3_15","DOI":"10.1007\/978-3-030-60548-3_15"},{"key":"3504_CR11","doi-asserted-by":"publisher","unstructured":"Duan J-h, Li W, Zou D, Li R, Lu S (2023) Federated learning with data-agnostic distribution fusion. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 8074\u20138083. https:\/\/doi.org\/10.1109\/CVPR52729.2023.00780","DOI":"10.1109\/CVPR52729.2023.00780"},{"issue":"13","key":"3504_CR12","doi-asserted-by":"publisher","first-page":"3521","DOI":"10.1073\/pnas.1611835114","volume":"114","author":"J Kirkpatrick","year":"2017","unstructured":"Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A, Hassabis D, Clopath C, Kumaran D, Hadsell R (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521\u20133526. https:\/\/doi.org\/10.1073\/pnas.1611835114","journal-title":"Proc Natl Acad Sci"},{"key":"3504_CR13","unstructured":"WorldData (2024) Average height for men and women worldwide. Accessed 26 Aug 2024. https:\/\/www.worlddata.info\/average-bodyheight.php"},{"key":"3504_CR14","unstructured":"NCD-RisC (2024) NCD risk factor collaboration. Accessed 24 Aug 2024. https:\/\/www.ncdrisc.org\/index.html"},{"key":"3504_CR15","doi-asserted-by":"publisher","first-page":"383","DOI":"10.1007\/978-3-031-72117-5_36","volume":"15010","author":"K Alhamoud","year":"2024","unstructured":"Alhamoud K, Ghunaim Y, Alfarra M, Hartvigsen T, Torr P, Ghanem B, Bibi A, Ghassemi M (2024) FedMedICL: towards holistic evaluation of distribution shifts in federated medical imaging. Med Image Comput Comput Assist Interv 15010:383\u2013393. https:\/\/doi.org\/10.1007\/978-3-031-72117-5_36","journal-title":"Med Image Comput Comput Assist Interv"},{"key":"3504_CR16","doi-asserted-by":"publisher","first-page":"337","DOI":"10.1007\/978-3-031-43987-2_33","volume":"14225","author":"B Tamersoy","year":"2023","unstructured":"Tamersoy B, Pirvan FA, Pai S, Kapoor A (2023) Accurate and robust patient height and weight estimation in clinical imaging using a depth camera. Med Image Comput Comput Assist Interv 14225:337\u2013346. https:\/\/doi.org\/10.1007\/978-3-031-43987-2_33","journal-title":"Med Image Comput Comput Assist Interv"},{"issue":"7","key":"3504_CR17","doi-asserted-by":"publisher","first-page":"2715","DOI":"10.1016\/j.acra.2024.01.029","volume":"31","author":"I Shahzadi","year":"2024","unstructured":"Shahzadi I, Tamersoy B, Frohwein LJ, Subramanian S, Moenninghoff C, Niehoff JH, Kroeger JR, Surov A, Borggrefe J (2024) Automated patient registration in magnetic resonance imaging using deep learning-based height and weight estimation with 3D camera: a feasibility study. Acad Radiol 31(7):2715\u20132724. https:\/\/doi.org\/10.1016\/j.acra.2024.01.029","journal-title":"Acad Radiol"},{"key":"3504_CR18","doi-asserted-by":"publisher","unstructured":"Manava P, Galster M, Ammon J, Singer J, Lell MM, Rieger V (2023) Optimized camera-based patient positioning in CT: impact on radiation exposure. Invest Radiol 58(2):126\u2013130. https:\/\/doi.org\/10.1097\/RLI.0000000000000904","DOI":"10.1097\/RLI.0000000000000904"},{"key":"3504_CR19","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale. https:\/\/doi.org\/10.48550\/arXiv.2010.11929"},{"key":"3504_CR20","unstructured":"McMahan HB, Moore E, Ramage D, Hampson S (2017) Communication-efficient learning of deep networks from decentralized data. In: Proceedings of the 20th international conference on artificial intelligence and statistics, vol 54, pp 1273\u20131282"},{"key":"3504_CR21","first-page":"769","volume":"87","author":"C Gini","year":"1997","unstructured":"Gini C (1997) Concentration and dependency ratios. Rivista di politica economica 87:769\u2013792","journal-title":"Rivista di politica economica"},{"key":"3504_CR22","unstructured":"Bhattacharyya A (1946) On a measure of divergence between two multinomial populations. Sankhya Indian J Stat 4(7):401\u2013406"}],"container-title":["International Journal of Computer Assisted Radiology and Surgery"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11548-025-03504-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11548-025-03504-z","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11548-025-03504-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,30]],"date-time":"2026-03-30T10:21:20Z","timestamp":1774866080000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11548-025-03504-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,2]]},"references-count":22,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2026,3]]}},"alternative-id":["3504"],"URL":"https:\/\/doi.org\/10.1007\/s11548-025-03504-z","relation":{},"ISSN":["1861-6429"],"issn-type":[{"value":"1861-6429","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,2]]},"assertion":[{"value":"13 January 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 August 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 September 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have influenced the work in this paper. All data were collected in a retrospective study which received Institutional Review Board approval. The need for informed consent was waived.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}