{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,20]],"date-time":"2026-04-20T23:48:01Z","timestamp":1776728881872,"version":"3.51.2"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2024,6,1]],"date-time":"2024-06-01T00:00:00Z","timestamp":1717200000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,10]],"date-time":"2024-06-10T00:00:00Z","timestamp":1717977600000},"content-version":"vor","delay-in-days":9,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006359","name":"Blekinge Institute of Technology","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006359","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Prog Artif Intell"],"published-print":{"date-parts":[[2024,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Federated learning (FL) enables edge nodes to collaboratively contribute to constructing a global model without sharing their data. This is accomplished by devices computing local, private model updates that are then aggregated by a server. However, computational resource constraints and network communication can become a severe bottleneck for larger model sizes typical for deep learning (DL) applications. Edge nodes tend to have limited hardware resources (RAM, CPU), and the network bandwidth and reliability at the edge is a concern for scaling federated fleet applications. In this paper, we propose and evaluate a FL strategy inspired by transfer learning in order to reduce resource utilization on devices, as well as the load on the server and network in each global training round. 
For each local model update, we randomly select layers to train, freezing the remaining part of the model. In doing so, we can reduce both server load and communication costs per round by excluding all untrained layer weights from being transferred to the server. The goal of this study is to empirically explore the potential trade-off between resource utilization on devices and global model convergence under the proposed strategy. We implement the approach using the FL framework FEDn. A number of experiments were carried out over different datasets (CIFAR-10, CASA, and IMDB), performing different tasks using different DL model architectures. Our results show that training the model partially can accelerate the training process, utilize on-device resources efficiently, and reduce the data transmission by around 75% and 53% when we train 25% and 50% of the model layers, respectively, without harming the resulting global model accuracy. Furthermore, our results demonstrate a negative correlation between the number of participating clients in the training process and the number of layers that need to be trained on each client\u2019s side. As the number of clients increases, there is a decrease in the required number of layers. 
This observation highlights the potential of the approach, particularly in cross-device use cases.<\/jats:p>","DOI":"10.1007\/s13748-024-00322-3","type":"journal-article","created":{"date-parts":[[2024,6,10]],"date-time":"2024-06-10T18:01:40Z","timestamp":1718042500000},"page":"101-117","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Toward efficient resource utilization at edge nodes in federated learning"],"prefix":"10.1007","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6309-2892","authenticated-orcid":false,"given":"Sadi","family":"Alawadi","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Addi","family":"Ait-Mlouk","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Salman","family":"Toor","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andreas","family":"Hellander","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,6,10]]},"reference":[{"key":"322_CR1","unstructured":"McMahan, B., Moore, E., Ramage, D., Hampson, S., Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273\u20131282 (2017)"},{"key":"322_CR2","unstructured":"Kone\u010dn\u1ef3, J., McMahan, H.B., Ramage, D., Richt\u00e1rik, P.: Federated optimization: Distributed machine learning for on-device intelligence (2016). arXiv:1610.02527"},{"key":"322_CR3","doi-asserted-by":"crossref","unstructured":"Alawadi, S., Kebande, V.R., Dong, Y., Bugeja, J., Persson, J.A., Olsson, C.M.: A federated interactive learning iot-based health monitoring platform. In: European Conference on Advances in Databases and Information Systems, pp. 235\u2013246. 
Springer (2021)","DOI":"10.1007\/978-3-030-85082-1_21"},{"issue":"6","key":"322_CR4","doi-asserted-by":"publisher","first-page":"3130","DOI":"10.3390\/app12063130","volume":"12","author":"A Ait-Mlouk","year":"2022","unstructured":"Ait-Mlouk, A., Alawadi, S.A., Toor, S., Hellander, A.: Fedqas: privacy-aware machine reading comprehension with federated learning. Appl. Sci. 12(6), 3130 (2022)","journal-title":"Appl. Sci."},{"key":"322_CR5","doi-asserted-by":"crossref","unstructured":"Chen, C., Xu, H., Wang, W., Li, B., Li, B., Chen, L., Zhang, G.: Communication-efficient federated learning with adaptive parameter freezing. In: 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), pp. 1\u201311. IEEE (2021)","DOI":"10.1109\/ICDCS51616.2021.00010"},{"key":"322_CR6","unstructured":"Tandon, R., Lei, Q., Dimakis, A.G., Karampatziakis, N.: Gradient coding: Avoiding stragglers in distributed learning. In: International Conference on Machine Learning, pp. 3368\u20133376. PMLR (2017)"},{"issue":"r1","key":"322_CR7","first-page":"2","volume":"2","author":"Z Xu","year":"2019","unstructured":"Xu, Z., Yang, Z., Xiong, J., Yang, J., Chen, X.: Elfish: resource-aware federated learning on heterogeneous edge devices. Ratio 2(r1), 2 (2019)","journal-title":"Ratio"},{"issue":"3","key":"322_CR8","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1109\/MSP.2020.2975749","volume":"37","author":"T Li","year":"2020","unstructured":"Li, T., Sahu, A.K., Talwalkar, A., Smith, V.: Federated learning: challenges, methods, and future directions. IEEE Signal Process. Mag. 37(3), 50\u201360 (2020)","journal-title":"IEEE Signal Process. Mag."},{"key":"322_CR9","doi-asserted-by":"publisher","first-page":"272","DOI":"10.1016\/j.compag.2018.03.032","volume":"161","author":"EC Too","year":"2019","unstructured":"Too, E.C., Yujian, L., Njuki, S., Yingchun, L.: A comparative study of fine-tuning deep learning models for plant disease identification. Comput. Electron. 
Agric. 161, 272\u2013279 (2019)","journal-title":"Comput. Electron. Agric."},{"key":"322_CR10","unstructured":"Liu, Y., Agarwal, S., Venkataraman, S.: Autofreeze: automatically freezing model blocks to accelerate fine-tuning (2021). arXiv:2102.01386"},{"key":"322_CR11","unstructured":"Chen, C.-C., Yang, C.-L., Cheng, H.-Y.: Efficient and robust parallel dnn training through model parallelism on multi-gpu platform (2018). arXiv:1809.02839"},{"issue":"13","key":"322_CR12","doi-asserted-by":"publisher","first-page":"1772","DOI":"10.14778\/2733004.2733082","volume":"7","author":"Y Zou","year":"2014","unstructured":"Zou, Y., Jin, X., Li, Y., Guo, Z., Wang, E., Xiao, B.: Mariana: tencent deep learning platform and its applications. Proc. VLDB Endowment 7(13), 1772\u20131777 (2014)","journal-title":"Proc. VLDB Endowment"},{"key":"322_CR13","unstructured":"Vishnu, A., Siegel, C., Daily, J.: Distributed tensorflow with mpi (2016). arXiv:1603.02339"},{"key":"322_CR14","unstructured":"Hewett, R.J., Grady\u00a0II, T.J.: A linear algebraic approach to model parallelism in deep learning (2020). arXiv:2006.03108"},{"key":"322_CR15","unstructured":"Jia, Z., Zaharia, M., Aiken, A.: Beyond data and model parallelism for deep neural networks (2018). arXiv:1807.05358"},{"key":"322_CR16","unstructured":"Shoeybi, M., Patwary, M., Puri, R., LeGresley, P., Casper, J., Catanzaro, B.: Megatron-lm: Training multi-billion parameter language models using model parallelism (2019). arXiv:1909.08053"},{"key":"322_CR17","doi-asserted-by":"crossref","unstructured":"Xiao, X., Mudiyanselage, T.B., Ji, C., Hu, J., Pan, Y.: Fast deep learning training through intelligently freezing layers. In: 2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 1225\u20131232. 
IEEE (2019)","DOI":"10.1109\/iThings\/GreenCom\/CPSCom\/SmartData.2019.00205"},{"key":"322_CR18","unstructured":"Kone\u010dn\u1ef3, J., McMahan, H.B., Yu, F.X., Richt\u00e1rik, P., Suresh, A.T., Bacon, D.: Federated learning: Strategies for improving communication efficiency (2016). arXiv:1610.05492"},{"key":"322_CR19","doi-asserted-by":"crossref","unstructured":"Alawadi, S., Alkharabsheh, K., Alkhabbas, F., Kebande, V.R., Awaysheh, F.M., Palomba, F., Awad, M.: Fedcsd: A federated learning based approach for code-smell detection. IEEE Access (2024)","DOI":"10.1109\/ACCESS.2024.3380167"},{"key":"322_CR20","unstructured":"Sahu, A.K., Li, T., Sanjabi, M., Zaheer, M., Talwalkar, A., Smith, V.: On the convergence of federated optimization in heterogeneous networks 3, 3 (2018). arXiv:1812.06127"},{"key":"322_CR21","doi-asserted-by":"crossref","unstructured":"Ekmefjord, M., Ait-Mlouk, A., Alawadi, S., \u00c5kesson, M., Singh, P., Spjuth, O., Toor, S., Hellander, A.: Scalable federated machine learning with fedn. In: 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 555\u2013564. IEEE (2022)","DOI":"10.1109\/CCGrid54584.2022.00065"},{"issue":"2","key":"322_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3298981","volume":"10","author":"Q Yang","year":"2019","unstructured":"Yang, Q., Liu, Y., Chen, T., Tong, Y.: Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. (TIST) 10(2), 1\u201319 (2019)","journal-title":"ACM Trans. Intell. Syst. Technol. (TIST)"},{"key":"322_CR23","doi-asserted-by":"crossref","unstructured":"Guo, Y., Shi, H., Kumar, A., Grauman, K., Rosing, T., Feris, R.: Spottune: transfer learning through adaptive fine-tuning. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 
4805\u20134814 (2019)","DOI":"10.1109\/CVPR.2019.00494"},{"key":"322_CR24","doi-asserted-by":"publisher","first-page":"196197","DOI":"10.1109\/ACCESS.2020.3034343","volume":"8","author":"G Vrban\u010di\u010d","year":"2020","unstructured":"Vrban\u010di\u010d, G., Podgorelec, V.: Transfer learning with adaptive fine-tuning. IEEE Access 8, 196197\u2013196211 (2020)","journal-title":"IEEE Access"},{"issue":"8","key":"322_CR25","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1145\/79173.79181","volume":"33","author":"LG Valiant","year":"1990","unstructured":"Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103\u2013111 (1990)","journal-title":"Commun. ACM"},{"key":"322_CR26","unstructured":"Shallue, C.J., Lee, J., Antognini, J., Sohl-Dickstein, J., Frostig, R., Dahl, G.E.: Measuring the effects of data parallelism on neural network training (2018). arXiv:1811.03600"},{"key":"322_CR27","unstructured":"Park, J.H., Yun, G., Chang, M.Y., Nguyen, N.T., Lee, S., Choi, J., Noh, S.H., Choi, Y.-r.: HetPipe: Enabling large DNN training on (whimpy) heterogeneous GPU clusters through integration of pipelined model parallelism and data parallelism. In: 2020 USENIX Annual Technical Conference (USENIX ATC 20), pp. 307\u2013321 (2020)"},{"key":"322_CR28","unstructured":"Shazeer, N., Cheng, Y., Parmar, N., Tran, D., Vaswani, A., Koanantakool, P., Hawkins, P., Lee, H., Hong, M., Young, C., et al.: Mesh-tensorflow: deep learning for supercomputers. Adv. Neural Inform. Process. Syst. 31 (2018)"},{"issue":"1","key":"322_CR29","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1177\/1094342005051521","volume":"19","author":"R Thakur","year":"2005","unstructured":"Thakur, R., Rabenseifner, R., Gropp, W.: Optimization of collective communication operations in mpich. Int. J. High Perform. Comput. Appl. 19(1), 49\u201366 (2005)","journal-title":"Int. J. High Perform. Comput. 
Appl."},{"key":"322_CR30","unstructured":"Dean, J., Corrado, G., Monga, R., Chen, K., Devin, M., Mao, M., Ranzato, M., Senior, A., Tucker, P., Yang, K., et al.: Large scale distributed deep networks. Adv. Neural Inform. Process. Syst. 25 (2012)"},{"key":"322_CR31","unstructured":"Krizhevsky, A.: One weird trick for parallelizing convolutional neural networks (2014). arXiv:1404.5997"},{"issue":"10","key":"322_CR32","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","volume":"22","author":"SJ Pan","year":"2009","unstructured":"Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345\u20131359 (2009)","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"322_CR33","unstructured":"Sergeev, A., Del\u00a0Balso, M.: Horovod: fast and easy distributed deep learning in tensorflow (2018). arXiv:1802.05799"},{"key":"322_CR34","unstructured":"Gaunt, A.L., Johnson, M.A., Riechert, M., Tarlow, D., Tomioka, R., Vytiniotis, D., Webster, S.: Ampnet: Asynchronous model-parallel training for dynamic neural networks (2017). arXiv:1705.09786"},{"key":"322_CR35","unstructured":"Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Adv. Neural Inform. Process. Syst. 25 (2012)"},{"key":"322_CR36","unstructured":"TensorFlow Benchmarks. https:\/\/www.tensorflow.org\/performance\/benchmarks"},{"key":"322_CR37","unstructured":"A New Lightweight, Modular, and Scalable Deep Learning Framework. https:\/\/caffe2.ai\/"},{"key":"322_CR38","unstructured":"Tensors and Dynamic Neural Networks in Python with Strong GPU Acceleration. https:\/\/pytorch.org"},{"key":"322_CR39","unstructured":"Brock, A., Lim, T., Ritchie, J.M., Weston, N.: Freezeout: accelerate training by progressively freezing layers (2017). 
arXiv:1706.04983"},{"key":"322_CR40","doi-asserted-by":"crossref","unstructured":"Xu-hui, C., Haq, E.U., Chengyu, Z.: Notice of violation of IEEE publication principles: efficient technique to accelerate neural network training by freezing hidden layers. In: 2019 IEEE\/ACIS 18th International Conference on Computer and Information Science (ICIS), pp. 542\u2013546. IEEE (2019)","DOI":"10.1109\/ICIS46139.2019.8940213"},{"key":"322_CR41","doi-asserted-by":"crossref","unstructured":"Xiao, X., Mudiyanselage, T.B., Ji, C., Hu, J., Pan, Y.: Fast deep learning training through intelligently freezing layers. In: 2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 1225\u20131232. IEEE (2019)","DOI":"10.1109\/iThings\/GreenCom\/CPSCom\/SmartData.2019.00205"},{"key":"322_CR42","unstructured":"Lee, J., Tang, R., Lin, J.: What would elsa do? Freezing layers during transformer fine-tuning (2019). arXiv:1911.03090"},{"key":"322_CR43","unstructured":"Liu, Y., Agarwal, S., Venkataraman, S.: Autofreeze: Automatically freezing model blocks to accelerate fine-tuning (2021). arXiv:2102.01386"},{"key":"322_CR44","doi-asserted-by":"crossref","unstructured":"Wang, Y., Sun, D., Chen, K., Lai, F., Chowdhury, M.: Efficient dnn training with knowledge-guided layer freezing (2022). arXiv:2201.06227","DOI":"10.1145\/3552326.3587451"},{"key":"322_CR45","unstructured":"Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)"},{"key":"322_CR46","unstructured":"Maas, A.L., Daly, R.E., Pham, P.T., Huang, D., Ng, A.Y., Potts, C.: Learning word vectors for sentiment analysis. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 142\u2013150. Association for Computational Linguistics, Portland, Oregon, USA (2011). 
http:\/\/www.aclweb.org\/anthology\/P11-1015"},{"key":"322_CR47","doi-asserted-by":"crossref","unstructured":"Alkhabbas, F., Alawadi, S., Spalazzese, R., Davidsson, P.: Activity recognition and user preference learning for automated configuration of iot environments. In: Proceedings of the 10th International Conference on the Internet of Things, pp. 1\u20138 (2020)","DOI":"10.1145\/3410992.3411003"},{"key":"322_CR48","doi-asserted-by":"publisher","unstructured":"Toor, S., Lindberg, M., Falman, I., Vallin, A., Mohill, O., Freyhult, P., Nilsson, L., Agback, M., Viklund, L., Zazzik, H., Spjuth, O., Capuccini, M., M\u00f6ller, J., Murtagh, D., Hellander, A.: Snic science cloud (ssc): A national-scale cloud infrastructure for swedish academia. In: 2017 IEEE 13th International Conference on e-Science (e-Science), pp. 219\u2013227 (2017). https:\/\/doi.org\/10.1109\/eScience.2017.35","DOI":"10.1109\/eScience.2017.35"}],"container-title":["Progress in Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13748-024-00322-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13748-024-00322-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13748-024-00322-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T09:36:20Z","timestamp":1720431380000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13748-024-00322-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6]]},"references-count":48,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,6]]}},"alternative-id":["322"],"URL":"https:\/\/doi.org\/10.1007\/
s13748-024-00322-3","relation":{},"ISSN":["2192-6352","2192-6360"],"issn-type":[{"value":"2192-6352","type":"print"},{"value":"2192-6360","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6]]},"assertion":[{"value":"20 December 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 May 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 June 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this manuscript.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"Not applicable.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}