{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,21]],"date-time":"2025-11-21T12:53:00Z","timestamp":1763729580304,"version":"3.45.0"},"reference-count":53,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2025,11,21]],"date-time":"2025-11-21T00:00:00Z","timestamp":1763683200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62172123"],"award-info":[{"award-number":["62172123"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100017366","name":"Key Research and Development Program of Heilongjiang","doi-asserted-by":"publisher","award":["2022ZX01A36"],"award-info":[{"award-number":["2022ZX01A36"]}],"id":[{"id":"10.13039\/100017366","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Federated learning in heterogeneous data scenarios faces two key challenges. First, the conflict between global models and local personalization complicates knowledge transfer and leads to feature misalignment, hindering effective personalization for clients. Second, the lack of dynamic adaptation in standard federated learning makes it difficult to handle highly heterogeneous and changing client data, reducing the global model\u2019s generalization ability. To address these issues, this paper proposes pFedKA, a personalized federated learning framework integrating knowledge distillation and a dual-attention mechanism. On the client-side, a cross-attention module dynamically aligns global and local feature spaces using adaptive temperature coefficients to mitigate feature misalignment. 
On the server-side, a Gated Recurrent Unit-based attention network adaptively adjusts aggregation weights using cross-round historical states, providing more robust aggregation than static averaging in heterogeneous settings. Experimental results on CIFAR-10, CIFAR-100, and Shakespeare datasets demonstrate that pFedKA converges faster and with greater stability in heterogeneous scenarios. Furthermore, it significantly improves personalization accuracy compared to state-of-the-art personalized federated learning methods. Additionally, we demonstrate privacy guarantees by integrating pFedKA with DP-SGD, showing comparable privacy protection to FedAvg while maintaining high personalization accuracy.<\/jats:p>","DOI":"10.3390\/computers14120504","type":"journal-article","created":{"date-parts":[[2025,11,21]],"date-time":"2025-11-21T12:41:58Z","timestamp":1763728918000},"page":"504","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["pFedKA: Personalized Federated Learning via Knowledge Distillation with Dual Attention Mechanism"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-2330-4357","authenticated-orcid":false,"given":"Yuanhao","family":"Jin","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kaiqi","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chao","family":"Ma","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-2899-0508","authenticated-orcid":false,"given":"Xinxin","family":"Cheng","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Luogang","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongguo","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,21]]},"reference":[{"key":"ref_1","unstructured":"Chen, H.Y., and Chao, W.L. (2020). Fedbe: Making bayesian model ensemble applicable to federated learning. arXiv."},{"key":"ref_2","unstructured":"Karimireddy, S.P., Kale, S., Mohri, M., Reddi, S., Stich, S., and Suresh, A.T. (2020, January 12\u201318). Scaffold: Stochastic controlled averaging for federated learning. Proceedings of the 37th International Conference on Machine Learning (ICML 2020), Vienna, Austria."},{"key":"ref_3","unstructured":"Reddi, S., Charles, Z., Zaheer, M., Garrett, Z., Rush, K., Kone\u010dn\u00fd, J., Kumar, S., and McMahan, B. (2020). Adaptive federated optimization. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Jiang, Z., Xu, J., Zhang, S., Shen, T., Li, J., Kuang, K., Cai, H., and Wu, F. (2024). FedCFA: Alleviating Simpson\u2019s Paradox in Model Aggregation with Counterfactual Federated Learning. arXiv.","DOI":"10.1609\/aaai.v39i17.33942"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chen, Z., Li, J., and Shen, C. 
(2024, January 14\u201319). Personalized Federated Learning with Attention-Based Client Selection. Proceedings of the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea.","DOI":"10.1109\/ICASSP48485.2024.10447362"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"13426","DOI":"10.1109\/TNNLS.2023.3269062","article-title":"FedTP: Federated Learning by Transformer Personalization","volume":"35","author":"Li","year":"2023","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_7","unstructured":"Marfoq, O., Neglia, G., Vidal, R., and Kameni, L. (2022, January 17\u201323). Personalized federated learning through local memorization. Proceedings of the 39th International Conference on Machine Learning, PMLR 2022, Baltimore, MD, USA."},{"key":"ref_8","unstructured":"Panchal, K., Choudhary, S., Parikh, N., Zhang, L., and Guan, H. (2023, January 10\u201316). Flow: Per-instance personalized federated learning. Proceedings of the Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, LA, USA."},{"key":"ref_9","unstructured":"Yang, Z., Zhang, Y., Zheng, Y., Tian, X., Peng, H., Liu, T., and Han, B. (2023, January 10\u201316). FedFed: Feature distillation against data heterogeneity in federated learning. Proceedings of the Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, LA, USA."},{"key":"ref_10","unstructured":"Chen, Z., Yang, H.H., Quek, T., and Chong, K.F.E. (2023, January 10\u201316). Spectral co-distillation for personalized federated learning. Proceedings of the Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, LA, USA."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1109\/TPAMI.2024.3470072","article-title":"Medical Federated Model With Mixture of Personalized and Shared Components","volume":"47","author":"Zhao","year":"2024","journal-title":"IEEE Trans. Pattern Anal. 
Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Guo, T., Guo, S., and Wang, J. (May, January 30). Pfedprompt: Learning personalized prompt for vision-language models in federated learning. Proceedings of the ACM Web Conference 2023, Austin, TX, USA.","DOI":"10.1145\/3543507.3583518"},{"key":"ref_13","unstructured":"Wang, J., Yang, X., Cui, S., Che, L., Lyu, L., Xu, D., and Ma, F. (2023, January 10\u201316). Towards personalized federated learning via heterogeneous model reassembly. Proceedings of the Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, LA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"9368","DOI":"10.1109\/TMC.2024.3361876","article-title":"FedCache: A Knowledge Cache-Driven Federated Learning Architecture for Personalized Edge Intelligence","volume":"23","author":"Wu","year":"2024","journal-title":"IEEE Trans. Mob. Comput."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"14787","DOI":"10.1109\/TMC.2024.3446271","article-title":"FedASA: A Personalized Federated Learning With Adaptive Model Aggregation for Heterogeneous Mobile Edge Computing","volume":"23","author":"Deng","year":"2024","journal-title":"IEEE Trans. Mob. Comput."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, W., Shi, Y., and Zhao, J. (2021, January 13\u201315). A robustly optimized BERT pre-training approach with post-training. Proceedings of the 20th China National Conference on Chinese Computational Linguistics, Hohhot, China.","DOI":"10.1007\/978-3-030-84186-7_31"},{"key":"ref_17","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and Jegou, H. (2021, January 18\u201324). Training data-efficient image transformers & distillation through attention. Proceedings of the 38th International Conference on Machine Learning PMLR, Vienna, Austria."},{"key":"ref_18","unstructured":"Hinton, G. (2015). Distilling the Knowledge in a Neural Network. 
arXiv."},{"key":"ref_19","first-page":"1562","article-title":"A Data-Free Personalized Federated Learning Algorithm Based on KD","volume":"24","author":"Chen","year":"2024","journal-title":"Netinf. Secur."},{"key":"ref_20","unstructured":"Li, D., and Wang, J. (2019). Fedmd: Heterogenous federated learning via model distillation. arXiv."},{"key":"ref_21","first-page":"14068","article-title":"Group knowledge transfer: Federated learning of large cnns at the edge","volume":"33","author":"He","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_22","first-page":"2351","article-title":"Ensemble distillation for robust model fusion in federated learning","volume":"33","author":"Lin","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Ji, S., Pan, S., Long, G., Li, X., Jiang, J., and Huang, Z. (2019, January 14\u201319). Learning private neural language modeling with attentive aggregation. Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.","DOI":"10.1109\/IJCNN.2019.8852464"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Huang, Y., Chu, L., Zhou, Z., Wang, L., Liu, J., Pei, J., and Zhan, Y. (2021, January 2\u20139). Personalized cross-silo federated learning on non-iid data. Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada.","DOI":"10.1609\/aaai.v35i9.16960"},{"key":"ref_25","unstructured":"Shamsian, A., Navon, A., Fetaya, E., and Chechik, G. (2021, January 18\u201324). Personalized federated learning using hypernetworks. Proceedings of the 38th International Conference on Machine Learning PMLR, Vienna, Austria."},{"key":"ref_26","unstructured":"Deng, Y., Kamani, M.M., and Mahdavi, M. (2020). Adaptive personalized federated learning. 
arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Abadi, M., Chu, A., Goodfellow, I., McMahan, H.B., Mironov, I., Talwar, K., and Zhang, L. (2016, January 24\u201328). Deep learning with differential privacy. Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria.","DOI":"10.1145\/2976749.2978318"},{"key":"ref_28","unstructured":"Bonawitz, K., Ivanov, V., Kreuter, B., Marcedone, A., McMahan, H.B., Patel, S., Ramage, D., Segal, A., and Seth, K. (November, January 30). Practical secure aggregation for privacy-preserving machine learning. Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA."},{"key":"ref_29","unstructured":"McMahan, B., Moore, E., Ramage, D., Hampson, S., and y Arcas, B.A. (2017, January 20\u201322). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Ft. Lauderdale, FL, USA."},{"key":"ref_30","unstructured":"Li, T., Sahu, A.K., Zaheer, M., Sanjabi, M., Talwalkar, A., and Smith, V. (2020, January 2\u20134). Federated optimization in heterogeneous networks. Proceedings of the Machine Learning and Systems 2 (MLSys 2020), Austin, TX, USA."},{"key":"ref_31","unstructured":"Fallah, A., Mokhtari, A., and Ozdaglar, A. (2020). Personalized federated learning: A meta-learning approach. arXiv."},{"key":"ref_32","unstructured":"Collins, L., Hassani, H., Mokhtari, A., and Shakkottai, S. (2021, January 18\u201324). Exploiting shared representations for personalized federated learning. Proceedings of the 38th International Conference on Machine Learning PMLR, Vienna, Austria."},{"key":"ref_33","unstructured":"Wang, H., Yurochkin, M., Sun, Y., Papailiopoulos, D., and Khazaeni, Y. (2020). Federated learning with matched averaging. arXiv."},{"key":"ref_34","unstructured":"Li, T., Hu, S., Beirami, A., and Smith, V. (2021, January 18\u201324). 
Ditto: Fair and robust federated learning through personalization. Proceedings of the 38th International Conference on Machine Learning PMLR, Vienna, Austria."},{"key":"ref_35","unstructured":"Wang, H., Xu, H., Li, Y., Xu, Y., Li, R., and Zhang, T. (2024, January 7\u201311). Fedcda: Federated learning with cross-rounds divergence-aware aggregation. Proceedings of the Twelfth International Conference on Learning Representations, Vienna, Austria."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Li, Q., He, B., and Song, D. (2021, January 19\u201326). Practical One-Shot Federated Learning for Cross-Silo Setting. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, Montreal, QC, Canada.","DOI":"10.24963\/ijcai.2021\/205"},{"key":"ref_37","unstructured":"Acar, D.A.E., Zhao, Y., Navarro, R.M., Mattina, M., Whatmough, P.N., and Saligrama, V. (2021, January 4\u20138). Federated learning based on dynamic regularization. Proceedings of the 9th International Conference on Learning Representations, Vienna, Austria."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Shen, Y., Zhou, Y., and Yu, L. (2022, January 21\u201324). Cd2-pfed: Cyclic distillation-guided channel decoupling for model personalization in federated learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00980"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Xie, L., Lin, M., Luan, T., Li, C., Fang, Y., Shen, Q., and Wu, Z. (2024). MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis. arXiv.","DOI":"10.1007\/978-3-031-72117-5_50"},{"key":"ref_40","unstructured":"Chen, H.Y., and Chao, W.L. (2021). On bridging generic and personalized federated learning for image classification. 
arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Jiang, Y., Zhao, X., Li, H., and Xue, Y. (2024). A Personalized Federated Learning Method Based on Knowledge Distillation and Differential Privacy. Electronics, 13.","DOI":"10.3390\/electronics13173538"},{"key":"ref_42","first-page":"5","article-title":"KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via KD","volume":"4","author":"Feng","year":"2021","journal-title":"ICML"},{"key":"ref_43","unstructured":"Zhang, M., Sapra, K., Fidler, S., Yeung, S., and Alvarez, J.M. (2020). Personalized federated learning with first order model optimization. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Kalafatelis, A.S., Pitsiakou, A., Nomikos, N., Tsoulakos, N., Syriopoulos, T., and Trakadas, P. (2025). FLUID: Dynamic Model-Agnostic Federated Learning with Pruning and Knowledge Distillation for Maritime Predictive Maintenance. J. Mar. Sci. Eng., 13.","DOI":"10.3390\/jmse13081569"},{"key":"ref_45","first-page":"21394","article-title":"Personalized federated learning with moreau envelopes","volume":"33","author":"Dinh","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_46","unstructured":"Liang, P.P., Liu, T., Ziyin, L., Allen, N.B., Auerbach, R.P., Brent, D., Salakhutdinov, R., and Morency, L.-P. (2020). Think locally, act globally: Federated learning with local and global representations. arXiv."},{"key":"ref_47","unstructured":"Wang, Z., Yan, F., Wang, T., Wang, C., Shu, Y., Cheng, P., and Chen, J. (March, January 25). Fed-DFA: Federated distillation for heterogeneous model fusion through the adversarial lens. Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA."},{"key":"ref_48","unstructured":"Krizhevsky, A., and Hinton, G. (2009). Learning Multiple Layers of Features from Tiny Images, University of Toronto."},{"key":"ref_49","unstructured":"(2024, November 12). Shakespeare. TensorFlow Federated Datasets. 
Available online: https:\/\/www.tensorflow.org\/federated\/api_docs\/python\/tff\/simulation\/datasets\/shakespeare."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Graves, A. (2012). Supervised Sequence Labelling with Recurrent Neural Networks, Springer Nature.","DOI":"10.1007\/978-3-642-24797-2"},{"key":"ref_52","first-page":"10572","article-title":"Fedavg with fine tuning: Local updates lead to representation learning","volume":"35","author":"Collins","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1561\/0400000042","article-title":"The Algorithmic Foundations of Differential Privacy","volume":"9","author":"Dwork","year":"2013","journal-title":"Found. Trends\u00ae Theor. Comput. Sci."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/12\/504\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,21]],"date-time":"2025-11-21T12:48:00Z","timestamp":1763729280000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/12\/504"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,21]]},"references-count":53,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["computers14120504"],"URL":"https:\/\/doi.org\/10.3390\/computers14120504","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,21]]}}}