{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T22:34:37Z","timestamp":1776378877081,"version":"3.51.2"},"reference-count":63,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,7,20]],"date-time":"2023-07-20T00:00:00Z","timestamp":1689811200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,7,20]],"date-time":"2023-07-20T00:00:00Z","timestamp":1689811200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Postdoctoral Foundation of Zhejiang Normal University","award":["ZC304021941"],"award-info":[{"award-number":["ZC304021941"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>View-invariant action recognition has been widely researched in various applications, such as visual surveillance and human\u2013robot interaction. However, view-invariant human action recognition is challenging due to action occlusions and information loss caused by view changes. Modeling spatiotemporal dynamics of body joints and minimizing representation discrepancy between different views could be a valuable solution for view-invariant human action recognition. Therefore, we propose a <jats:bold>D<\/jats:bold>ual-<jats:bold>A<\/jats:bold>ttention <jats:bold>Net<\/jats:bold>work (DANet) that aims to learn robust video representations for view-invariant action recognition. The DANet is composed of relation-aware spatiotemporal self-attention and spatiotemporal cross-attention modules. The relation-aware spatiotemporal self-attention module learns representative and discriminative action features. 
This module captures local and global long-range dependencies, as well as pairwise relations among human body parts and joints in the spatial and temporal domains. The cross-attention module learns view-invariant attention maps and generates discriminative features for semantic representations of actions in different views. We exhaustively evaluate our proposed approach on the NTU-60, NTU-120, and UESTC large-scale challenging datasets with multi-type evaluation metrics including Cross-Subject, Cross-View, Cross-Set, and Arbitrary-view. The experimental results demonstrate that our proposed approach significantly outperforms state-of-the-art approaches in view-invariant action recognition.<\/jats:p>","DOI":"10.1007\/s40747-023-01171-8","type":"journal-article","created":{"date-parts":[[2023,7,20]],"date-time":"2023-07-20T02:01:59Z","timestamp":1689818519000},"page":"305-321","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Dual-attention Network for View-invariant Action Recognition"],"prefix":"10.1007","volume":"10","author":[{"given":"Gedamu Alemu","family":"Kumie","sequence":"first","affiliation":[]},{"given":"Maregu Assefa","family":"Habtie","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9413-6459","authenticated-orcid":false,"given":"Tewodros Alemu","family":"Ayall","sequence":"additional","affiliation":[]},{"given":"Changjun","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Huawen","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Abegaz Mohammed","family":"Seid","sequence":"additional","affiliation":[]},{"given":"Aiman","family":"Erbad","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,7,20]]},"reference":[{"key":"1171_CR1","doi-asserted-by":"publisher","first-page":"2742","DOI":"10.1109\/TIP.2019.2952088","volume":"29","author":"Y 
Ji","year":"2020","unstructured":"Ji Y, Zhan Y, Yang Y, Xu X, Shen F, Shen HT (2020) A context knowledge map guided coarse-to-fine action recognition. Trans Image Process 29:2742\u20132752. https:\/\/doi.org\/10.1109\/TIP.2019.2952088","journal-title":"Trans Image Process"},{"key":"1171_CR2","doi-asserted-by":"publisher","DOI":"10.1007\/s40747-022-00914-3","author":"T Jun","year":"2022","unstructured":"Jun T, Baodi L, Wenhui G, Yanjiang W (2022) Two-stream temporal enhanced fisher vector encoding for skeleton-based action recognition. Complex Intell Syst. https:\/\/doi.org\/10.1007\/s40747-022-00914-3","journal-title":"Complex Intell Syst"},{"key":"1171_CR3","doi-asserted-by":"publisher","unstructured":"Wang J, Nie X, Xia Y, Wu Y, Zhu S (2014) Cross-view action modeling, learning and recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2649\u20132656. https:\/\/doi.org\/10.1109\/CVPR.2014.339","DOI":"10.1109\/CVPR.2014.339"},{"key":"1171_CR4","doi-asserted-by":"publisher","unstructured":"Ji Y, Yang Y, Shen F, Shen HT, Zheng W (2018) A large-scale varying-view rgb-d action dataset for arbitrary-view human action recognition. In: ACM international conference on multimedia, pp 1510\u20131518. https:\/\/doi.org\/10.1145\/3240508.3240675","DOI":"10.1145\/3240508.3240675"},{"key":"1171_CR5","doi-asserted-by":"publisher","unstructured":"Liu J, Shah M, Kuipers B, Savarese S (2011) Cross-view action recognition via view knowledge transfer. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3209\u20133216. https:\/\/doi.org\/10.1109\/CVPR.2011.5995729","DOI":"10.1109\/CVPR.2011.5995729"},{"issue":"5","key":"1171_CR6","doi-asserted-by":"publisher","first-page":"914","DOI":"10.1109\/TPAMI.2013.198","volume":"36","author":"J Wang","year":"2013","unstructured":"Wang J, Liu Z, Wu Y, Yuan J (2013) Learning actionlet ensemble for 3d human action recognition. 
IEEE Trans Pattern Anal Mach Intell 36(5):914\u2013927. https:\/\/doi.org\/10.1109\/TPAMI.2013.198","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1171_CR7","doi-asserted-by":"publisher","first-page":"108043","DOI":"10.1016\/j.patcog.2021.108043","volume":"118","author":"K Gedamu","year":"2021","unstructured":"Gedamu K, Ji Y, Yang Y, Gao L, Shen HT (2021) Arbitrary-view human action recognition via novel-view action generation. Pattern Recognit 118:108043. https:\/\/doi.org\/10.1016\/j.patcog.2021.108043","journal-title":"Pattern Recognit"},{"key":"1171_CR8","doi-asserted-by":"publisher","unstructured":"Weinland D, Boyer E, Ronfard R (2007) Action recognition from arbitrary views using 3d exemplars. In: Proceedings of the IEEE international conference on computer vision, pp 1\u20137. https:\/\/doi.org\/10.1109\/ICCV.2007.4408849","DOI":"10.1109\/ICCV.2007.4408849"},{"key":"1171_CR9","doi-asserted-by":"publisher","unstructured":"Jing Q, Kun X, Xilun D (2022) Approach to hand posture recognition based on hand shape features for human-robot interaction. In: Complex and intelligent systems, pp 2825\u20132842. https:\/\/doi.org\/10.1007\/s40747-022-00914-3","DOI":"10.1007\/s40747-022-00914-3"},{"key":"1171_CR10","doi-asserted-by":"publisher","unstructured":"Junejo IN, Dexter E, Laptev I, P\u00e9rez P (2008) Cross-view action recognition from temporal self-similarities. In: European conference on computer vision, pp 293\u2013306. https:\/\/doi.org\/10.1007\/978-3-540-88688-4_22","DOI":"10.1007\/978-3-540-88688-4_22"},{"issue":"12","key":"1171_CR11","doi-asserted-by":"publisher","first-page":"2430","DOI":"10.1109\/TPAMI.2016.2533389","volume":"38","author":"H Rahmani","year":"2016","unstructured":"Rahmani H, Mahmood A, Huynh D, Mian A (2016) Histogram of oriented principal components for cross-view action recognition. IEEE Trans Pattern Anal Mach Intell 38(12):2430\u20132443. 
https:\/\/doi.org\/10.1109\/TPAMI.2016.2533389","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1171_CR12","doi-asserted-by":"publisher","unstructured":"Ji Y, Yang Y, Xie N, Shen HT, Harada T (2019) Attention transfer (ant) network for view-invariant action recognition. In: ACM international conference on multimedia, pp 574\u2013582. https:\/\/doi.org\/10.1145\/3343031.3350959","DOI":"10.1145\/3343031.3350959"},{"issue":"10","key":"1171_CR13","doi-asserted-by":"publisher","first-page":"4709","DOI":"10.1109\/TIP.2018.2836323","volume":"27","author":"J Zhang","year":"2018","unstructured":"Zhang J, Shum HP, Han J, Shao L (2018) Action recognition from arbitrary views using transferable dictionary learning. Trans Image Process 27(10):4709\u20134723. https:\/\/doi.org\/10.1109\/TIP.2018.2836323","journal-title":"Trans Image Process"},{"issue":"8","key":"1171_CR14","doi-asserted-by":"publisher","first-page":"1963","DOI":"10.1109\/TPAMI.2019.2896631","volume":"41","author":"P Zhang","year":"2019","unstructured":"Zhang P, Lan C, Xing J, Zeng W, Xue J, Zheng N (2019) View adaptive neural networks for high performance skeleton-based human action recognition. IEEE Trans Pattern Anal Mach Intell 41(8):1963\u20131978. https:\/\/doi.org\/10.1109\/TPAMI.2019.2896631","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1171_CR15","doi-asserted-by":"publisher","first-page":"346","DOI":"10.1016\/j.patcog.2017.02.030","volume":"68","author":"M Liu","year":"2017","unstructured":"Liu M, Liu H, Chen C (2017) Enhanced skeleton visualization for view invariant human action recognition. Pattern Recognit 68:346\u2013362. https:\/\/doi.org\/10.1016\/j.patcog.2017.02.030","journal-title":"Pattern Recognit"},{"key":"1171_CR16","doi-asserted-by":"publisher","unstructured":"Song S, Lan C, Xing J, Zeng W, Liu J (2017) An end-to-end spatio-temporal attention model for human action recognition from skeleton data. 
In: Proceedings of the AAAI conference on artificial intelligence, pp 4263\u20134270. https:\/\/doi.org\/10.1609\/aaai.v31i1.11212","DOI":"10.1609\/aaai.v31i1.11212"},{"key":"1171_CR17","doi-asserted-by":"publisher","unstructured":"Yan S, Xiong Y, Lin D (2018) Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Proceedings of the AAAI conference on artificial intelligence, pp 7444\u20137452. https:\/\/doi.org\/10.1609\/aaai.v32i1.12328","DOI":"10.1609\/aaai.v32i1.12328"},{"key":"1171_CR18","doi-asserted-by":"publisher","unstructured":"Li M, Chen S, Chen X, Zhang Y, Wang Y, Tian Q (2019) Actional-structural graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3590\u20133598. https:\/\/doi.org\/10.1109\/CVPR.2019.00371","DOI":"10.1109\/CVPR.2019.00371"},{"key":"1171_CR19","doi-asserted-by":"publisher","unstructured":"Shi L, Zhang Y, Cheng J, Lu H (2019) Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE conference on computer vision and pattern Recognition, pp 12018\u201312027. https:\/\/doi.org\/10.1109\/CVPR.2019.01230","DOI":"10.1109\/CVPR.2019.01230"},{"key":"1171_CR20","doi-asserted-by":"publisher","unstructured":"Cheng K, Zhang Y, He X, Chen W, Cheng J, Lu H (2020) Skeleton-based action recognition with shift graph convolutional network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 180\u2013189. https:\/\/doi.org\/10.1109\/CVPR42600.2020.00026","DOI":"10.1109\/CVPR42600.2020.00026"},{"key":"1171_CR21","doi-asserted-by":"publisher","unstructured":"Liu Z, Zhang H, Chen Z, Wang Z, Ouyang W (2020) Disentangling and unifying graph convolutions for skeleton-based action recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 140\u2013149. 
https:\/\/doi.org\/10.1109\/CVPR42600.2020.00022","DOI":"10.1109\/CVPR42600.2020.00022"},{"key":"1171_CR22","doi-asserted-by":"publisher","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need. In: Proceedings of advances in neural information processing systems. https:\/\/doi.org\/10.48550\/arXiv.1706.03762","DOI":"10.48550\/arXiv.1706.03762"},{"issue":"2","key":"1171_CR23","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1109\/TMM.2018.2859620","volume":"21","author":"Z Fan","year":"2019","unstructured":"Fan Z, Zhao X, Lin T, Su H (2019) Attention-based multiview re-observation fusion network for skeletal action recognition. IEEE Trans Multimedia 21(2):363\u2013374. https:\/\/doi.org\/10.1109\/TMM.2018.2859620","journal-title":"IEEE Trans Multimedia"},{"key":"1171_CR24","doi-asserted-by":"publisher","unstructured":"Bello I, Zoph B, Vaswani A, Shlens J, Le QV (2019) Attention augmented convolutional networks. In: Proceedings of the IEEE international conference on computer vision. https:\/\/doi.org\/10.1109\/ICCV.2019.00338","DOI":"10.1109\/ICCV.2019.00338"},{"key":"1171_CR25","doi-asserted-by":"publisher","unstructured":"Arnab A, Dehghani M, Heigold G, Sun C, Lu\u010di\u0107 M, Schmid C (2021) Vivit: a video vision transformer. In: Proceedings of the IEEE international conference on computer vision, pp 6816\u20136826. https:\/\/doi.org\/10.1109\/ICCV48922.2021.00676","DOI":"10.1109\/ICCV48922.2021.00676"},{"key":"1171_CR26","doi-asserted-by":"publisher","unstructured":"Wang X, Girshick R, Gupta A, He K (2018) Non-local neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 
https:\/\/doi.org\/10.1109\/CVPR.2018.00813","DOI":"10.1109\/CVPR.2018.00813"},{"key":"1171_CR27","doi-asserted-by":"publisher","first-page":"109455","DOI":"10.1016\/j.patcog.2023.109455","volume":"139","author":"K Gedamu","year":"2023","unstructured":"Gedamu K, Ji Y, Gao L, Yang Y, Shen HT (2023) Relation-mining self-attention network for skeleton-based human action recognition. Pattern Recognit 139:109455. https:\/\/doi.org\/10.1016\/j.patcog.2023.109455","journal-title":"Pattern Recognit"},{"key":"1171_CR28","doi-asserted-by":"publisher","unstructured":"Zhang Z, Wang C, Xiao B, Zhou W, Liu S, Shi C (2013) Cross-view action recognition via a continuous virtual path. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2690\u20132697. https:\/\/doi.org\/10.1109\/CVPR.2013.347","DOI":"10.1109\/CVPR.2013.347"},{"key":"1171_CR29","doi-asserted-by":"publisher","unstructured":"Gedamu K, Yilma G, Assefa M, Ayalew M (2022) Spatio-temporal dual-attention network for view-invariant human action recognition. In: Proceedings of international conference on digital image processing, pp 213\u2013222. https:\/\/doi.org\/10.1117\/12.2643446","DOI":"10.1117\/12.2643446"},{"key":"1171_CR30","doi-asserted-by":"publisher","unstructured":"Wang L, Ding Z, Tao Z, Liu Y, Fu Y (2019) Generative multi-view human action recognition. In: Proceedings of the IEEE international conference on computer vision. https:\/\/doi.org\/10.1109\/ICCV.2019.00631","DOI":"10.1109\/ICCV.2019.00631"},{"key":"1171_CR31","doi-asserted-by":"publisher","unstructured":"Hou R, Chang H, Ma B, Shan S, Chen X (2020) Cross attention network for few-shot classification. In: Proceedings of NeurIPS. https:\/\/doi.org\/10.48550\/arXiv.1910.07677","DOI":"10.48550\/arXiv.1910.07677"},{"key":"1171_CR32","doi-asserted-by":"publisher","unstructured":"Gao L, Ji Y, Yang Y, Shen H (2022) Global-local cross-view fisher discrimination for view-invariant action recognition. 
In: Proceedings of ACM international conference on multimedia, pp 5255\u20135264. https:\/\/doi.org\/10.1145\/3503161.3548280","DOI":"10.1145\/3503161.3548280"},{"key":"1171_CR33","doi-asserted-by":"publisher","first-page":"4503","DOI":"10.1109\/TMM.2021.3119177","volume":"4493","author":"L Gao","year":"2021","unstructured":"Gao L, Ji Y, Kumie GA, Xu X, Zhu X, Shen HT (2021) View-invariant human action recognition via view transformation network. IEEE Trans Multimed 4493:4503. https:\/\/doi.org\/10.1109\/TMM.2021.3119177","journal-title":"IEEE Trans Multimed"},{"key":"1171_CR34","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2023.3267178","author":"M Assefa","year":"2023","unstructured":"Assefa M, Jiang W, Gedamu K, Yilma G, Adhikari D, Ayalew M, Mohammed A, Erbad A (2023) Actor-aware self-supervised learning for semi-supervised video representation learning. IEEE Trans Circuits Syst for Video Technol. https:\/\/doi.org\/10.1109\/TCSVT.2023.3267178","journal-title":"IEEE Trans Circuits Syst for Video Technol"},{"key":"1171_CR35","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2022.3193559","author":"M Assefa","year":"2022","unstructured":"Assefa M, Jiang W, Gedamu K, Yilma G, Kumeda B, Ayalew M (2022) Self-supervised scene-debiasing for video representation learning via background patching. IEEE Trans Multimed. https:\/\/doi.org\/10.1109\/TMM.2022.3193559","journal-title":"IEEE Trans Multimed"},{"key":"1171_CR36","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-58347-1_10","author":"Y Ganin","year":"2016","unstructured":"Ganin Y, Ustinova E, Ajakan H, Germain P, Larochelle H, Laviolette F, Marchand M, Lempitsky V (2016) Domain-adversarial training of neural networks. JMLR. https:\/\/doi.org\/10.1007\/978-3-319-58347-1_10","journal-title":"JMLR"},{"key":"1171_CR37","doi-asserted-by":"publisher","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2019) Bert: pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT. 
https:\/\/doi.org\/10.18653\/v1\/N19-1423","DOI":"10.18653\/v1\/N19-1423"},{"key":"1171_CR38","doi-asserted-by":"publisher","unstructured":"Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: Proceedings of European conference on computer vision. https:\/\/doi.org\/10.1007\/978-3-030-58452-8_13","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"1171_CR39","doi-asserted-by":"publisher","first-page":"103219","DOI":"10.1016\/j.cviu.2021.103219","volume":"208\u2013209","author":"C Plizzari","year":"2021","unstructured":"Plizzari C, Cannici M, Matteucci M (2021) Skeleton-based action recognition via spatial and temporal transformer networks. Comput Vis Image Understand 208\u2013209:103219. https:\/\/doi.org\/10.1016\/j.cviu.2021.103219","journal-title":"Comput Vis Image Understand"},{"key":"1171_CR40","doi-asserted-by":"publisher","unstructured":"Shaw P, Uszkoreit J, Vaswani A (2018) Self-attention with relative position representations. In: Proceedings of the NAACL. https:\/\/doi.org\/10.18653\/v1\/N18-2074","DOI":"10.18653\/v1\/N18-2074"},{"key":"1171_CR41","doi-asserted-by":"publisher","unstructured":"Ramachandran P, Parmar N, Vaswani A, Bello I, Levskaya A, Shlens J (2019) Stand-alone self-attention in vision models. In: Proceedings of the NeurIPS. https:\/\/doi.org\/10.48550\/arXiv.1906.05909","DOI":"10.48550\/arXiv.1906.05909"},{"key":"1171_CR42","doi-asserted-by":"publisher","unstructured":"Cao Y, Xu J, Lin S, Wei F, Hu H (2019) Gcnet: non-local networks meet squeeze-excitation networks and beyond. In: Proceedings of IEEE\/CVF international conference on computer vision workshop. https:\/\/doi.org\/10.1109\/ICCVW.2019.00246","DOI":"10.1109\/ICCVW.2019.00246"},{"key":"1171_CR43","doi-asserted-by":"publisher","unstructured":"Yin M, Yao Z, Cao Y, Li X, Zhang Z, Lin S, Hu H (2020) Disentangled non-local neural networks. In: Proceedings of European conference on computer vision. 
https:\/\/doi.org\/10.1007\/978-3-030-58555-6_12","DOI":"10.1007\/978-3-030-58555-6_12"},{"key":"1171_CR44","doi-asserted-by":"publisher","unstructured":"Wei X, Zhang T, Li Y, Zhang Y, Wu F (2020) Multi-modality cross attention network for image and sentence matching. In: Proceedings of the IEEE conference on computer vision and pattern recognition. https:\/\/doi.org\/10.1109\/CVPR42600.2020.01095","DOI":"10.1109\/CVPR42600.2020.01095"},{"key":"1171_CR45","doi-asserted-by":"publisher","unstructured":"Lee K-H, Chen X, Hua G, Hu H, He X (2018) Stacked cross attention for image-text matching. In: Proceedings of European conference on computer vision, pp 212\u2013228. https:\/\/doi.org\/10.1007\/978-3-030-01225-0_13","DOI":"10.1007\/978-3-030-01225-0_13"},{"issue":"4","key":"1171_CR46","doi-asserted-by":"publisher","first-page":"1586","DOI":"10.1109\/TIP.2017.2785279","volume":"27","author":"J Liu","year":"2018","unstructured":"Liu J, Wang G, Duan L, Abdiyeva K, Kot AC (2018) Skeleton-based human action recognition with global context-aware attention LSTM networks. Trans Image Process 27(4):1586\u20131599. https:\/\/doi.org\/10.1109\/TIP.2017.2785279","journal-title":"Trans Image Process"},{"key":"1171_CR47","doi-asserted-by":"publisher","unstructured":"Shahroudy A, Liu J, Ng T-T, Wang G (2016) Ntu rgb+d: a large scale dataset for 3d human activity analysis. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1010\u20131019. https:\/\/doi.org\/10.1109\/CVPR.2016.115","DOI":"10.1109\/CVPR.2016.115"},{"issue":"10","key":"1171_CR48","doi-asserted-by":"publisher","first-page":"2684","DOI":"10.1109\/TPAMI.2019.2916873","volume":"42","author":"J Liu","year":"2019","unstructured":"Liu J, Shahroudy A, Perez ML, Wang G, Duan L, Chichung A (2019) Ntu rgb+d 120: a large-scale benchmark for 3d human activity understanding. IEEE Trans Pattern Anal Mach Intell 42(10):2684\u20132701. 
https:\/\/doi.org\/10.1109\/TPAMI.2019.2916873","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1171_CR49","doi-asserted-by":"publisher","unstructured":"Yang D, Wang Y, Dantcheva A, Garattoni L, Francesca G, Bremond F (2021) Unik: a unified framework for real-world skeleton-based action recognition. In: BMVC, pp 1\u201313. https:\/\/doi.org\/10.48550\/arXiv.2107.08580","DOI":"10.48550\/arXiv.2107.08580"},{"key":"1171_CR50","doi-asserted-by":"publisher","unstructured":"Zhang P, Lan C, Zeng W, Xing J, Xue J, Zheng N (2020) Semantics-guided neural networks for efficient skeleton-based human action recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1109\u20131118. https:\/\/doi.org\/10.1109\/CVPR42600.2020.00119","DOI":"10.1109\/CVPR42600.2020.00119"},{"key":"1171_CR51","doi-asserted-by":"publisher","unstructured":"Chen Y, Zhang Z, Yuan C, Li B, Deng Y, Hu W (2021) Channel-wise topology refinement graph convolution for skeleton-based action recognition. In: ICCV, pp 13359\u201313368. https:\/\/doi.org\/10.1109\/ICCV48922.2021.01311","DOI":"10.1109\/ICCV48922.2021.01311"},{"key":"1171_CR52","doi-asserted-by":"publisher","unstructured":"Shi L, Zhang Y, Cheng J, Lu H (2019) Skeleton-based action recognition with directed graph neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. https:\/\/doi.org\/10.1109\/CVPR.2019.00371","DOI":"10.1109\/CVPR.2019.00371"},{"key":"1171_CR53","doi-asserted-by":"publisher","unstructured":"Peng W, Hong X, Chen H, Zhao G (2020) Learning graph convolutional network for skeleton-based human action recognition by neural searching. In: Proceedings of the AAAI conference on artificial intelligence. 
https:\/\/doi.org\/10.1609\/AAAI.V34I03.5652","DOI":"10.1609\/AAAI.V34I03.5652"},{"key":"1171_CR54","doi-asserted-by":"publisher","unstructured":"Chen Z, Li S, Yang B, Li Q, Liu H (2021) Multi-scale spatial temporal graph convolutional network for skeleton-based action recognition. In: AAAI, pp 1113\u20131122. https:\/\/doi.org\/10.48550\/arXiv.2206.13028","DOI":"10.48550\/arXiv.2206.13028"},{"key":"1171_CR55","doi-asserted-by":"publisher","unstructured":"Ye F, Pu S, Zhong Q, Li C, Xie D, Tang H (2020) Dynamic GCN: context-enriched topology learning for skeleton-based action recognition. In: ACM MM, pp 55\u201363. https:\/\/doi.org\/10.1145\/3394171.3413941","DOI":"10.1145\/3394171.3413941"},{"key":"1171_CR56","doi-asserted-by":"publisher","unstructured":"Shi L, Zhang Y, Cheng J, Lu H (2021) Adasgn: adapting joint number and model size for efficient skeleton-based action recognition. In: ICCV, pp 13413\u201313422. https:\/\/doi.org\/10.1109\/ICCV48922.2021.01316","DOI":"10.1109\/ICCV48922.2021.01316"},{"key":"1171_CR57","doi-asserted-by":"publisher","first-page":"1474","DOI":"10.1109\/TPAMI.2022.3157033","volume":"45","author":"Y Song","year":"2021","unstructured":"Song Y, Zhang Z, Shan C, Wang L (2021) Constructing stronger and faster baselines for skeleton-based action recognition. IEEE Trans Pattern Anal Mach Intell 45:1474\u20131488. https:\/\/doi.org\/10.1109\/TPAMI.2022.3157033","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1171_CR58","doi-asserted-by":"publisher","unstructured":"Shi L, Zhang Y, Cheng J, Lu H (2020) Decoupled spatial-temporal attention network for skeleton-based action recognition. In: Proceedings of the ACCV. https:\/\/doi.org\/10.1007\/978-3-030-69541-5_3","DOI":"10.1007\/978-3-030-69541-5_3"},{"key":"1171_CR59","doi-asserted-by":"publisher","unstructured":"Zhang Y, Wu B, Li W, Duan L, Gan C (2021) Stst: spatial-temporal specialized transformer for skeleton-based action recognition. In: ACM MM, pp 3229\u20133237. 
https:\/\/doi.org\/10.1145\/3474085.3475473","DOI":"10.1145\/3474085.3475473"},{"key":"1171_CR60","doi-asserted-by":"publisher","unstructured":"Kang M-S, Kang D, Kim H (2023) Efficient skeleton-based action recognition via joint-mapping strategies. In: WACV, pp 3403\u20133412. https:\/\/doi.org\/10.1109\/WACV56688.2023.00340","DOI":"10.1109\/WACV56688.2023.00340"},{"key":"1171_CR61","doi-asserted-by":"publisher","unstructured":"Hang R, Li M (2022) Spatial-temporal adaptive graph convolutional network for skeleton-based action recognition. In: ACCV, pp 1265\u20131281. https:\/\/doi.org\/10.1007\/978-3-031-26316-3_11","DOI":"10.1007\/978-3-031-26316-3_11"},{"key":"1171_CR62","doi-asserted-by":"publisher","first-page":"109231","DOI":"10.48550\/arXiv.2203.16767","volume":"136","author":"L Wu","year":"2023","unstructured":"Wu L, Zhang C, Zou Y (2023) Spatiotemporal focus for skeleton-based action recognition. Pattern Recognit 136:109231. https:\/\/doi.org\/10.48550\/arXiv.2203.16767","journal-title":"Pattern Recognit"},{"key":"1171_CR63","doi-asserted-by":"publisher","unstructured":"Kim TS, Reiter A (2017) Interpretable 3d human action analysis with temporal convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern Recognition Workshop, pp 1623\u20131631. 
https:\/\/doi.org\/10.1109\/CVPRW.2017.207","DOI":"10.1109\/CVPRW.2017.207"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01171-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-023-01171-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01171-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,10]],"date-time":"2024-02-10T22:15:41Z","timestamp":1707603341000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-023-01171-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,20]]},"references-count":63,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,2]]}},"alternative-id":["1171"],"URL":"https:\/\/doi.org\/10.1007\/s40747-023-01171-8","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,20]]},"assertion":[{"value":"28 December 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 June 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 July 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing 
interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}