{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T15:35:31Z","timestamp":1772897731529,"version":"3.50.1"},"reference-count":44,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2025,11,29]],"date-time":"2025-11-29T00:00:00Z","timestamp":1764374400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62476111"],"award-info":[{"award-number":["62476111"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100011789","name":"Department of Science and Technology of Jilin Province","doi-asserted-by":"publisher","award":["20230201086GX"],"award-info":[{"award-number":["20230201086GX"]}],"id":[{"id":"10.13039\/501100011789","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Science and Technology Bureau of Changchun","award":["24GNYZ11"],"award-info":[{"award-number":["24GNYZ11"]}]},{"name":"Industry University Research Innovation Fund of the Ministry of Education project","award":["2022XF017"],"award-info":[{"award-number":["2022XF017"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Federated learning (FL) for skeleton-based action recognition remains underexplored, particularly under strong client heterogeneity where regular FedAvg tends to cause client drift and unstable convergence. We introduce Clustered Federated Spatio-Temporal Graph Attention Networks (CF-STGAT), a clustered FL framework that leverages attention-derived spatio-temporal statistics from local STGAT models to dynamically group clients and perform attention-weighted inter-cluster fusion that gently aligns cluster models. 
Concretely, the server periodically extracts multi-head parameter-based attention descriptors, normalizes and projects them via PCA, and applies K-means to form clusters; a global reference is then computed by attention\u2013similarity weighting and used to regularize each cluster model with a lightweight fusion step. On NTU RGB+D 60\/120 (NTU 60\/120), CF-STGAT consistently outperforms strong FL baselines with the STGAT backbone, yielding absolute top-1 gains of +0.84\/+4.09 (NTU 60, X-Sub\/X-Setup) and +7.98\/+4.18 (NTU 120, X-Sub\/X-Setup) over FedAvg, alongside smoother per-client trajectories and lower terminal test loss. Ablations indicate that attention-guided clustering and inter-cluster fusion are complementary: clustering reduces within-group variance whereas fusion limits cross-cluster divergence. The approach keeps local training unchanged and adds only server-side statistics and clustering.<\/jats:p>","DOI":"10.3390\/s25237277","type":"journal-article","created":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T17:35:17Z","timestamp":1764956117000},"page":"7277","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Clustered Federated Spatio-Temporal Graph Attention Networks for Skeleton-Based Action Recognition"],"prefix":"10.3390","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2523-4404","authenticated-orcid":false,"given":"Tao","family":"Yu","sequence":"first","affiliation":[{"name":"Centro Algoritmi, University do Minho, 4800-058 Guimar\u00e3es, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4580-7484","authenticated-orcid":false,"given":"Sandro","family":"Pinto","sequence":"additional","affiliation":[{"name":"Centro Algoritmi, University do Minho, 4800-058 Guimar\u00e3es, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4071-9015","authenticated-orcid":false,"given":"Tiago","family":"Gomes","sequence":"additional","affiliation":[{"name":"Centro 
Algoritmi, University do Minho, 4800-058 Guimar\u00e3es, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8316-6927","authenticated-orcid":false,"given":"Adriano","family":"Tavares","sequence":"additional","affiliation":[{"name":"Centro Algoritmi, University do Minho, 4800-058 Guimar\u00e3es, Portugal"}]},{"given":"Hao","family":"Xu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"111166","DOI":"10.1016\/j.asoc.2023.111166","article-title":"A light-weight skeleton human action recognition model with knowledge distillation for edge intelligent surveillance applications","volume":"151","author":"Dai","year":"2024","journal-title":"Appl. Soft Comput."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"104523","DOI":"10.1016\/j.robot.2023.104523","article-title":"A general skeleton-based action and gesture recognition framework for human\u2013robot collaboration","volume":"170","author":"Terreran","year":"2023","journal-title":"Robot. Auton. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Yan, S., Xiong, Y., and Lin, D. (2018, January 2\u20137). Spatial temporal graph convolutional networks for skeleton-based action recognition. Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.12328"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Xiao, L., Yang, X., Peng, T., Li, H., and Guo, R. (2024). Skeleton-Based Activity Recognition for Process-Based Quality Control of Concealed Work via Spatial\u2013Temporal Graph Convolutional Networks. Sensors, 24.","DOI":"10.3390\/s24041220"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Shi, L., Zhang, Y., Cheng, J., and Lu, H. (2019, January 15\u201320). 
Two-stream adaptive graph convolutional networks for skeleton-based action recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01230"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chen, Y., Zhang, Z., Yuan, C., Li, B., Deng, Y., and Hu, W. (2021, January 11\u201317). Channel-wise topology refinement graph convolution for skeleton-based action recognition. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.01311"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Liu, Z., Zhang, H., Chen, Z., Wang, Z., and Ouyang, W. (2020, January 13\u201319). Disentangling and unifying graph convolutions for skeleton-based action recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00022"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"120683","DOI":"10.1016\/j.eswa.2023.120683","article-title":"Skeleton-based action recognition with local dynamic spatial\u2013temporal aggregation","volume":"232","author":"Hu","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Li, C., Niu, D., Jiang, B., Zuo, X., and Yang, J. (2021, January 19\u201323). Meta-har: Federated representation learning for human activity recognition. Proceedings of the Web Conference 2021, Ljubljana, Slovenia.","DOI":"10.1145\/3442381.3450006"},{"key":"ref_10","unstructured":"Li, T., Sahu, A.K., Zaheer, M., Sanjabi, M., Talwalkar, A., and Smith, V. (2020, January 2\u20134). Federated optimization in heterogeneous networks. Proceedings of the Machine Learning and Systems, Austin, TX, USA."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Li, Q., He, B., and Song, D. (2021, January 20\u201325). Model-contrastive federated learning. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01057"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Presotto, R., Civitarese, G., and Bettini, C. (2022, January 21\u201325). Fedclar: Federated clustering for personalized sensor-based human activity recognition. Proceedings of the 2022 IEEE International Conference on Pervasive Computing and Communications (PerCom), Pisa, Italy.","DOI":"10.1109\/PerCom53586.2022.9762352"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Kim, T.S., and Reiter, A. (2017, January 21\u201326). Interpretable 3d human action analysis with temporal convolutional networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.207"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Lea, C., Flynn, M.D., Vidal, R., Reiter, A., and Hager, G.D. (2017, January 21\u201326). Temporal convolutional networks for action segmentation and detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.113"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Nguyen, H.C., Nguyen, T.H., Scherer, R., and Le, V.H. (2023). Deep learning for human activity recognition on 3D human skeleton: Survey and comparative study. Sensors, 23.","DOI":"10.3390\/s23115121"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Duan, H., Zhao, Y., Chen, K., Lin, D., and Dai, B. (2022, January 18\u201324). Revisiting skeleton-based action recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00298"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chi, H.g., Ha, M.H., Chi, S., Lee, S.W., Huang, Q., and Ramani, K. (2022, January 18\u201324). 
Infogcn: Representation learning for human skeleton-based action recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01955"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Lee, J., Lee, M., Lee, D., and Lee, S. (2023, January 1\u20136). Hierarchically decomposed graph convolutional networks for skeleton-based action recognition. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00958"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Feng, M., and Meunier, J. (2022). Skeleton graph-neural-network-based human action recognition: A survey. Sensors, 22.","DOI":"10.3390\/s22062091"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Yan, X., Cheng, Z.Q., Yan, Y., Dai, Q., and Hua, X.S. (2024, January 16\u201322). Blockgcn: Redefine topology awareness for skeleton-based action recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00200"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Gao, Z., Wang, P., Lv, P., Jiang, X., Liu, Q., Wang, P., Xu, M., and Li, W. (2022, January 4\u20138). Focal and global spatial-temporal transformer for skeleton-based action recognition. Proceedings of the Asian Conference on Computer Vision, Macau, China.","DOI":"10.1007\/978-3-031-26316-3_10"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Ahn, D., Kim, S., Hong, H., and Ko, B.C. (2023, January 2\u20137). Star-transformer: A spatio-temporal cross attention transformer for human action recognition. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV56688.2023.00333"},{"key":"ref_23","unstructured":"Do, J., and Kim, M. (October, January 29). 
Skateformer: Skeletal-temporal transformer for human action recognition. Proceedings of the European Conference on Computer Vision, Milan, Italy."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Qin, X., Cai, R., Yu, J., He, C., and Zhang, X. (2022). An efficient self-attention network for skeleton-based action recognition. Sci. Rep., 12.","DOI":"10.1038\/s41598-022-08157-5"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1016\/j.neucom.2023.03.001","article-title":"Transformer for skeleton-based action recognition: A review of recent advances","volume":"537","author":"Xin","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Ren, B., Liu, M., Ding, R., and Liu, H. (2024). A survey on 3d skeleton-based action recognition using learning method. Cyborg Bionic Syst., 5.","DOI":"10.34133\/cbsystems.0100"},{"key":"ref_27","unstructured":"McMahan, B., Moore, E., Ramage, D., Hampson, S., and y Arcas, B.A. (2017, January 20\u201322). Communication-efficient learning of deep networks from decentralized data. Proceedings of the Artificial Intelligence and Statistics, PMLR, Fort Lauderdale, FL, USA."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"3347","DOI":"10.1109\/TKDE.2021.3124599","article-title":"A survey on federated learning systems: Vision, hype and reality for data privacy and protection","volume":"35","author":"Li","year":"2021","journal-title":"IEEE Trans. Knowl. 
Data Eng."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"128019","DOI":"10.1016\/j.neucom.2024.128019","article-title":"Recent advances on federated learning: A systematic survey","volume":"597","author":"Liu","year":"2024","journal-title":"Neurocomputing"},{"key":"ref_30","first-page":"1","article-title":"The federation strikes back: A survey of federated learning privacy attacks, defenses, applications, and policy landscape","volume":"57","author":"Zhao","year":"2025","journal-title":"ACM Comput. Surv."},{"key":"ref_31","unstructured":"Karimireddy, S.P., Kale, S., Mohri, M., Reddi, S., Stich, S., and Suresh, A.T. (2020, January 13\u201318). Scaffold: Stochastic controlled averaging for federated learning. Proceedings of the International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_32","unstructured":"Wang, J., Liu, Q., Liang, H., Joshi, G., and Poor, H.V. (2020, January 6\u201312). Tackling the objective inconsistency problem in heterogeneous federated optimization. Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver BC Canada."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"11856","DOI":"10.1109\/TII.2023.3252599","article-title":"A greedy agglomerative framework for clustered federated learning","volume":"19","author":"Mehta","year":"2023","journal-title":"IEEE Trans. Ind. Informatics"},{"key":"ref_34","unstructured":"Ma, J., Zhou, T., Long, G., Jiang, J., and Zhang, C. (2023, January 10\u201316). Structured federated learning through clustered additive modeling. Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"119976","DOI":"10.1016\/j.ins.2023.119976","article-title":"FedGL: Federated graph learning framework with global self-supervision","volume":"657","author":"Chen","year":"2024","journal-title":"Inf. 
Sci."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"41","DOI":"10.14778\/3617838.3617842","article-title":"FedGTA: Topology-Aware Averaging for Federated Graph Learning","volume":"17","author":"Li","year":"2023","journal-title":"VLDB Endow."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Huang, W., Wan, G., Ye, M., and Du, B. (2023, January 19\u201325). Federated graph semantic and structural learning. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, Macao, China.","DOI":"10.24963\/ijcai.2023\/426"},{"key":"ref_38","unstructured":"Liu, Y., Lou, Y., Liu, Y., Cao, Y., and Wang, H. (2024, January 3\u20139). Label leakage in vertical federated learning: A survey. Proceedings of the IJCAI, Jeju, Republic of Korea."},{"key":"ref_39","first-page":"1047","article-title":"Split learning for distributed collaborative training of deep learning models in health informatics","volume":"2023","author":"Li","year":"2024","journal-title":"AMIA Annu. Symp. Proc."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3720539","article-title":"Vertical federated learning for effectiveness, security, applicability: A survey","volume":"57","author":"Ye","year":"2025","journal-title":"ACM Comput. Surv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Shahroudy, A., Liu, J., Ng, T.T., and Wang, G. (2016, January 27\u201330). NTU RGB+D: A large scale dataset for 3D human activity analysis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.115"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"2684","DOI":"10.1109\/TPAMI.2019.2916873","article-title":"NTU RGB+D 120: A large-scale benchmark for 3D human activity understanding","volume":"42","author":"Liu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_43","unstructured":"Arivazhagan, M.G., Aggarwal, V., Singh, A.K., and Choudhary, S. (2019). Federated learning with personalization layers. arXiv Prepr."},{"key":"ref_44","unstructured":"Collins, L., Hassani, H., Mokhtari, A., and Shakkottai, S. (2021, January 18\u201324). Exploiting shared representations for personalized federated learning. Proceedings of the International Conference on Machine Learning, PMLR, Virtual."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/25\/23\/7277\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T05:12:30Z","timestamp":1764997950000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/25\/23\/7277"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,29]]},"references-count":44,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["s25237277"],"URL":"https:\/\/doi.org\/10.3390\/s25237277","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,29]]}}}