{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T22:07:44Z","timestamp":1770329264545,"version":"3.49.0"},"reference-count":22,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2024,12,16]],"date-time":"2024-12-16T00:00:00Z","timestamp":1734307200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"MITACS Accelerate","award":["IT29551"],"award-info":[{"award-number":["IT29551"]}]},{"name":"MITACS Accelerate","award":["RGPIN-2020-04307"],"award-info":[{"award-number":["RGPIN-2020-04307"]}]},{"name":"Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery","award":["IT29551"],"award-info":[{"award-number":["IT29551"]}]},{"name":"Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery","award":["RGPIN-2020-04307"],"award-info":[{"award-number":["RGPIN-2020-04307"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>With deep learning approaches, the fundamental assumption of data availability can be severely compromised when a model trained on a source domain is transposed to a target application domain where data are unlabeled, making supervised fine-tuning mostly impossible. To overcome this limitation, the present work introduces an unsupervised temporal-domain adaptation framework for human action recognition from skeleton-based data that combines Contrastive Prototype Learning (CPL) and Temporal Adaptation Modeling (TAM), with the aim of transferring the knowledge learned from a source domain to an unlabeled target domain. The CPL strategy, inspired by recent success in contrastive learning applied to skeleton data, learns a compact temporal representation from the source domain, from which the TAM strategy leverages the capacity for self-training to adapt the representation to a target application domain using pseudo-labels. The research demonstrates that simultaneously solving CPL and TAM effectively enables the training of a generalizable human action recognition model that is adaptive to both domains and overcomes the requirement of a large volume of labeled skeleton data in the target domain. Experiments are conducted on multiple large-scale human action recognition datasets such as NTU RGB+D, PKU MMD, and Northwestern\u2013UCLA to comprehensively evaluate the effectiveness of the proposed method.<\/jats:p>","DOI":"10.3390\/a17120581","type":"journal-article","created":{"date-parts":[[2024,12,16]],"date-time":"2024-12-16T12:09:28Z","timestamp":1734350968000},"page":"581","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Unsupervised Temporal Adaptation in Skeleton-Based Human Action Recognition"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8464-7973","authenticated-orcid":false,"given":"Haitao","family":"Tian","sequence":"first","affiliation":[{"name":"School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3103-9752","authenticated-orcid":false,"given":"Pierre","family":"Payeur","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada"}]}],"member":"1968","published-online":{"date-parts":[[2024,12,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1366","DOI":"10.1007\/s11263-022-01594-9","article-title":"Human action recognition and prediction survey","volume":"130","author":"Kong","year":"2022","journal-title":"Int. J. Comput. Vis."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Lange, B., Chang, C.-Y., Suma, E., Newman, B., Rizzo, A.S., and Bolas, M. (September, January 30). Development and evaluation of low-cost game-based balance rehabilitation tool using the Microsoft Kinect sensor. Proceedings of the 2011 annual international conference of the IEEE engineering in medicine and biology society, Boston, MA, USA.","DOI":"10.1109\/IEMBS.2011.6090521"},{"key":"ref_3","unstructured":"Niu, W., Long, J.D., Han, D., and Wang, Y.-F. (2004, January 27\u201330). Human activity detection and recognition or video surveillance. Proceedings of the 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No. 04TH8763), Taipei, Taiwan."},{"key":"ref_4","first-page":"11","article-title":"Deep learning-based human pose estimation: A survey","volume":"56","author":"Zheng","year":"2023","journal-title":"ACM Comput. Surv."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Yan, S., Xiong, Y., and Lin, D. (2018, January 2\u20137). Spatial temporal graph convolutional networks for skeleton-based action recognition. Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.12328"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chen, Y., Zhang, Z., Yuan, C., Li, B., Deng, Y., and Hu, W. (2021, January 11\u201317). Channel-wise topology refinement graph convolution for skeleton-based action recognition. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.01311"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Chi, H.G., Ha, M.H., Chi, S., Lee, S.W., Huang, Q., and Ramani, K. (2022, January 18\u201324). InfoGCN: Representation learning for human skeleton-based action recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01955"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Shahroudy, A., Jun, L., Tian-Tsong, N., and Gang, W. (2016, January 27\u201330). NTU RGB+D: A large scale dataset for 3d human activity analysis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.115"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Liu, C., Hu, Y., Li, Y., Song, S., and Liu, J. (2017, January 23). PKU-MMD: A large scale benchmark for skeleton-based human action understanding. Proceedings of the Workshop on Visual Analysis in Smart and Connected Communities, Mountain View, CA, USA.","DOI":"10.1145\/3132734.3132739"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1007\/s10994-009-5152-4","article-title":"A theory of learning from different domains","volume":"79","author":"Blitzer","year":"2010","journal-title":"Mach. Learn."},{"key":"ref_11","unstructured":"Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., Maschinot, A., Liu, C., and Krishnan, D. (2020, January 6\u201312). Supervised contrastive learning. Proceedings of the Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Virtual."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Li, M., Chen, S., Chen, X., Zhang, Y., Wang, Y., and Tian, Q. (2019, January 15\u201320). Actional-structural graph convolutional networks for skeleton-based action recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00371"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Choi, J., Sharma, G., Chandraker, M., and Huang, J.-B. (2020, January 1\u20135). Unsupervised and semi-supervised domain adaptation for action recognition from drones. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA.","DOI":"10.1109\/WACV45572.2020.9093511"},{"key":"ref_14","unstructured":"Guo, T., Liu, H., Chen, Z., Liu, M., Wang, T., and Ding, R. (March, January 22). Contrastive learning from extremely augmented skeleton sequences for self-supervised action recognition. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual."},{"key":"ref_15","unstructured":"Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. (2020, January 13\u201318). A simple framework for contrastive learning of visual representations. Proceedings of the International Conference on Machine Learning, Virtual."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"He, K., Fan, H., Wu, Y., Xie, S., and Girshick, R. (2020, January 13\u201319). Momentum contrast for unsupervised visual representation learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"ref_17","unstructured":"Chen, X., Fan, H., Girshick, R., and He, K. (2020). Improved baselines with momentum contrastive learning. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, J., Nie, X., Xia, Y., Wu, Y., and Zhu, S.-C. (2014, January 23\u201328). Cross-view action modeling, learning and recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.339"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zheng, N., Wen, J., Liu, R., Long, L., Dai, J., and Gong, Z. (2018, January 2\u20137). Unsupervised representation learning with long-term dynamics for skeleton based action recognition. Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.11853"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lin, L., Song, S., Yang, W., and Liu, J. (2020, January 12\u201316). Ms2l: Multi-task self-supervised learning for skeleton based action recognition. Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA.","DOI":"10.1145\/3394171.3413548"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Thoker, F.M., Doughty, H., and Snoek, C.G. (2021, January 20\u201324). Skeleton-contrastive 3D action representation learning. Proceedings of the 29th ACM International Conference on Multimedia, Virtual China.","DOI":"10.1145\/3474085.3475307"},{"key":"ref_22","first-page":"2579","article-title":"Visualizing data using t-SNE. J","volume":"9","author":"Hinton","year":"2008","journal-title":"Mach. Learn. Res."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/12\/581\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:53:24Z","timestamp":1760115204000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/12\/581"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,16]]},"references-count":22,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["a17120581"],"URL":"https:\/\/doi.org\/10.3390\/a17120581","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,16]]}}}