{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T15:25:03Z","timestamp":1774365903163,"version":"3.50.1"},"reference-count":51,"publisher":"Springer Science and Business Media LLC","issue":"10","license":[{"start":{"date-parts":[[2025,6,23]],"date-time":"2025-06-23T00:00:00Z","timestamp":1750636800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,6,23]],"date-time":"2025-06-23T00:00:00Z","timestamp":1750636800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100023555","name":"National Centre of Competence in Research Evolving Language","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100023555","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001711","name":"Schweizerischer Nationalfonds zur F\u00f6rderung der Wissenschaftlichen Forschung","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2025,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>This paper addresses the significant challenge of recognizing behaviors in non-human primates, specifically focusing on chimpanzees. Automated behavior recognition is crucial for both conservation efforts and the advancement of behavioral research. However, it is often hindered by the labor-intensive process of manual video annotation. Despite the availability of large-scale animal behavior datasets, effectively applying machine learning models across varied environmental settings remains a critical challenge due to the variability in data collection contexts and the specificity of annotations. In this paper, we introduce <jats:italic>ChimpBehave<\/jats:italic>, a novel dataset comprising over 2\u00a0h and 20\u00a0min of video (approximately 215,000 frames) of zoo-housed chimpanzees, annotated with bounding boxes and fine-grained locomotive behavior labels. Uniquely, <jats:italic>ChimpBehave<\/jats:italic> aligns its behavior classes with those in PanAf, an existing dataset collected in distinct visual environments, enabling the study of cross-dataset generalization - where models are trained on one dataset and tested on another with differing data distributions. We benchmark <jats:italic>ChimpBehave<\/jats:italic> using state-of-the-art video-based and skeleton-based action recognition models, establishing performance baselines for both within-dataset and cross-dataset evaluations. Our results highlight the strengths and limitations of different model architectures, providing insights into the application of automated behavior recognition across diverse visual settings. The dataset, models, and code can be accessed at: <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/MitchFuchs\/ChimpBehave\" ext-link-type=\"uri\">https:\/\/github.com\/MitchFuchs\/ChimpBehave<\/jats:ext-link>\n          <\/jats:p>","DOI":"10.1007\/s11263-025-02484-6","type":"journal-article","created":{"date-parts":[[2025,6,23]],"date-time":"2025-06-23T02:29:03Z","timestamp":1750645743000},"page":"6668-6688","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["From Forest to Zoo: Great Ape Behavior Recognition with ChimpBehave"],"prefix":"10.1007","volume":"133","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8937-1427","authenticated-orcid":false,"given":"Michael","family":"Fuchs","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2461-2442","authenticated-orcid":false,"given":"Emilie","family":"Genty","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6989-8654","authenticated-orcid":false,"given":"Adrian","family":"Bangerter","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8378-088X","authenticated-orcid":false,"given":"Klaus","family":"Zuberb\u00fchler","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9537-9898","authenticated-orcid":false,"given":"Jean-Marc","family":"Odobez","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4103-5467","authenticated-orcid":false,"given":"Paul","family":"Cotofrei","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,6,23]]},"reference":[{"issue":"46","key":"2484_CR1","doi-asserted-by":"publisher","first-page":"4883","DOI":"10.1126\/sciadv.abi4883","volume":"7","author":"M Bain","year":"2021","unstructured":"Bain, M., Nagrani, A., Schofield, D., Berdugo, S., Bessa, J., Owen, J., Hockings, K. J., Matsuzawa, T., Hayashi, M., Biro, D., Carvalho, S., & Zisserman, A. (2021). Automated audiovisual behavior recognition in wild primates. Science Advances, 7(46), 4883.","journal-title":"Science Advances"},{"issue":"1","key":"2484_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-020-18441-5","volume":"11","author":"PC Bala","year":"2020","unstructured":"Bala, P. C., Eisenreich, B. R., Yoo, S. B. M., Hayden, B. Y., Park, H. S., & Zimmermann, J. (2020). Automated markerless pose estimation in freely moving macaques with OpenMonkeyStudio. Nature communications, 11(1), 1\u201312.","journal-title":"Nature communications"},{"key":"2484_CR3","first-page":"4","volume":"2","author":"G Bertasius","year":"2021","unstructured":"Bertasius, G., Wang, H., & Torresani, L. (2021). Is space-time attention all you need for video understanding? ICML, 2, 4.","journal-title":"ICML"},{"key":"2484_CR4","doi-asserted-by":"crossref","unstructured":"Brookes, O., Mirmehdi, M., K\u00fchl, H., & Burghardt, T. (2023). Triple-stream deep metric learning of great ape behavioural actions. arXiv preprint arXiv:2301.02642","DOI":"10.5220\/0011798400003417"},{"key":"2484_CR5","unstructured":"Brookes, O., Mirmehdi, M., Kuhl, H., & Burghardt, T. (2024). ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition. arXiv preprint arXiv:2404.08937"},{"key":"2484_CR6","doi-asserted-by":"crossref","unstructured":"Brookes, O., Mirmehdi, M., Stephens, C., Angedakin, S., Corogenes, K., Dowd, D., Dieguez, P., Hicks, T.C., Jones, S., Lee, K., Leinert, V., Lapuente, J., McCarthy, M. S., Meier, A., Murai, M., Normand, E, Vergnes, V., Wessling, E.G., Wittig, R.M., LangergraberNuria, K., ... Tilo, B. (2024). PanAf20K: A large video dataset for wild ape detection and behaviour recognition. International Journal of Computer Vision, 1\u201317","DOI":"10.1007\/s11263-024-02003-z"},{"key":"2484_CR7","doi-asserted-by":"crossref","unstructured":"Chen, J., Hu, M., Coker, D.J., Berumen, M.L., Costelloe, B., Beery, S., Rohrbach, A., & Elhoseiny, M. (2023). MammalNet: A Large-Scale Video Benchmark for Mammal Recognition and Behavior Understanding. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), (pp. 13052\u201313061).","DOI":"10.1109\/CVPR52729.2023.01254"},{"key":"2484_CR8","doi-asserted-by":"publisher","first-page":"4405","DOI":"10.1007\/s11042-015-3177-1","volume":"76","author":"C Chen","year":"2017","unstructured":"Chen, C., Jafari, R., & Kehtarnavaz, N. (2017). A survey of depth and inertial sensor fusion for human action recognition. Multimedia Tools and Applications, 76, 4405\u20134425.","journal-title":"Multimedia Tools and Applications"},{"key":"2484_CR9","doi-asserted-by":"crossref","unstructured":"Desai, N., Bala, P., Richardson, R., Raper, J., Zimmermann, J., & Hayden, B. (2022). OpenApePose: a database of annotated ape photographs for pose estimation. arXiv preprint arXiv:2212.00741","DOI":"10.7554\/eLife.86873.1"},{"key":"2484_CR10","doi-asserted-by":"crossref","unstructured":"Duan, H., Wang, J., Chen, K., & Lin, D. (2022). Pyskl: Towards good practices for skeleton action recognition. In Proceedings of the 30th ACM international conference on multimedia, (pp. 7351\u20137354).","DOI":"10.1145\/3503161.3548546"},{"key":"2484_CR11","doi-asserted-by":"crossref","unstructured":"Duan, H., Zhao, Y., Chen, K., Lin, D., & Dai, B. (2022). Revisiting skeleton-based action recognition. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 2969\u20132978).","DOI":"10.1109\/CVPR52688.2022.00298"},{"key":"2484_CR12","unstructured":"Evolutionary\u00a0Anthropology, M.P.I.: Pan African programme: The Cultured Chimpanzee. http:\/\/panafrican.eva.mpg.de\/index.php"},{"key":"2484_CR13","doi-asserted-by":"crossref","unstructured":"Feichtenhofer, C. (2020). X3d: Expanding architectures for efficient video recognition. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 203\u2013213).","DOI":"10.1109\/CVPR42600.2020.00028"},{"key":"2484_CR14","doi-asserted-by":"crossref","unstructured":"Feng, L., Zhao, Y., Zhao, W., & Tang, J. (2022). A comparative review of graph convolutional networks for human skeleton-based action recognition. Artificial Intelligence Review, 1\u201331.","DOI":"10.1007\/s10462-021-10107-y"},{"key":"2484_CR15","doi-asserted-by":"publisher","unstructured":"Fuchs, M., Genty, E., Zuberb\u00fchler, K., & Cotofrei, P. (2024). ASBAR: an Animal Skeleton-Based Action Recognition framework. Recognizing great ape behaviors in the wild using pose estimation with domain adaptation. eLife, 13, Article 97962. https:\/\/doi.org\/10.7554\/elife.97962.1","DOI":"10.7554\/elife.97962.1"},{"key":"2484_CR16","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1016\/j.cviu.2017.01.011","volume":"158","author":"F Han","year":"2017","unstructured":"Han, F., Reily, B., Hoff, W., & Zhang, H. (2017). Space-time representation of people based on 3d skeletal data: A review. Computer Vision and Image Understanding, 158, 85\u2013105.","journal-title":"Computer Vision and Image Understanding"},{"key":"2484_CR17","doi-asserted-by":"crossref","unstructured":"Jin, S., Xu, L., Xu, J., Wang, C., Liu, W., Qian, C., Ouyang, W., & Luo, P. (2020). Whole-body human pose estimation in the wild. In Computer vision\u2013ECCV 2020: 16th European conference, glasgow, UK, August 23\u201328, 2020, proceedings, part IX 16, (pp. 196\u2013214). Springer.","DOI":"10.1007\/978-3-030-58545-7_12"},{"key":"2484_CR18","unstructured":"Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., & Zisserman, A. (2017). The kinetics human action video dataset. https:\/\/arxiv.org\/abs\/1705.06950"},{"key":"2484_CR19","unstructured":"Kipf, T.N., & Welling, M. (2017). Semi-Supervised Classification with Graph Convolutional Networks. In Proceedings of the 5th international conference on learning representations. ICLR \u201917. https:\/\/openreview.net\/forum?id=SJU4ayYgl"},{"key":"2484_CR20","doi-asserted-by":"crossref","unstructured":"Labuguen, R., Matsumoto, J., Negrete, S. B., Nishimaru, H., Nishijo, H., Takada, M., Go, Y., Inoue, K.-I., & Shibata, T. (2021). MacaquePose: A novel \u201cin the wild\u201d macaque monkey pose dataset for markerless motion capture. Frontiers in behavioral neuroscience, 14, Article 581154.","DOI":"10.3389\/fnbeh.2020.581154"},{"key":"2484_CR21","doi-asserted-by":"publisher","first-page":"496","DOI":"10.1038\/s41592-022-01443-0","volume":"19","author":"J Lauer","year":"2022","unstructured":"Lauer, J., Zhou, M., Ye, S., Menegas, W., Schneider, S., Nath, T., Rahman, M. M., Santo, V. D., Soberanes, D., Feng, G., Murthy, V. N., Lauder, G., Dulac, C., Mathis, M., & Mathis, A. (2022). Multi-animal pose estimation, identification and tracking with DeepLabCut. Nature Methods, 19, 496\u2013504.","journal-title":"Nature Methods"},{"key":"2484_CR22","doi-asserted-by":"crossref","unstructured":"Li, Y., Wu, C.-Y., Fan, H., Mangalam, K., Xiong, B., Malik, J., & Feichtenhofer, C.(2022). MViTV2: Improved multiscale vision transformers for classification and detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 4804\u20134814).","DOI":"10.1109\/CVPR52688.2022.00476"},{"key":"2484_CR23","doi-asserted-by":"crossref","unstructured":"Liu, D., Hou, J., Huang, S., Liu, J., He, Y., Zheng, B., Ning, J., & Zhang, J. (2023). LoTE-Animal: A Long Time-span Dataset for Endangered Animal Behavior Understanding. In Proceedings of the IEEE\/CVF international conference on computer vision (ICCV), (pp. 20064\u201320075).","DOI":"10.1109\/ICCV51070.2023.01836"},{"key":"2484_CR24","doi-asserted-by":"crossref","unstructured":"Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., & Dong, L. (2022). Swin transformer v2: Scaling up capacity and resolution. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 12009\u201312019).","DOI":"10.1109\/CVPR52688.2022.01170"},{"issue":"5","key":"2484_CR25","doi-asserted-by":"publisher","first-page":"967","DOI":"10.24272\/j.issn.2095-8137.2022.449","volume":"44","author":"C Li","year":"2023","unstructured":"Li, C., Xiao, Z., Li, Y., Chen, Z., Ji, X., Liu, Y., Feng, S., Zhang, Z., Zhang, K., & Feng, J. (2023). Deep learning-based activity recognition and fine motor identification using 2D skeletons of cynomolgus monkeys. Zoological Research, 44(5), 967.","journal-title":"Zoological Research"},{"key":"2484_CR26","unstructured":"Ma, X., Kaufhold, S., Su, J., Zhu, W., Terwilliger, J., Meza, A., Zhu, Y., Rossano, F., Wang, Y.: ChimpACT: A Longitudinal Dataset for Understanding Chimpanzee Behaviors. In: Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., & Levine, S. (2023). (eds.) Advances in neural information processing systems, vol. 36, (pp. 27501\u201327531). Curran Associates, Inc."},{"issue":"4","key":"2484_CR27","doi-asserted-by":"publisher","first-page":"331","DOI":"10.1038\/s42256-022-00477-5","volume":"4","author":"M Marks","year":"2022","unstructured":"Marks, M., Jin, Q., Sturman, O., Ziegler, L., Kollmorgen, S., Behrens, W., Mante, V., Bohacek, J., & Yanik, M. F. (2022). Deep-learning-based identification, tracking, pose estimation and behaviour classification of interacting primates and mice in complex environments. Nature machine intelligence, 4(4), 331\u2013340.","journal-title":"Nature machine intelligence"},{"key":"2484_CR28","doi-asserted-by":"publisher","unstructured":"Martini, L.M., Bogn\u00e1r, A., Vogels, R., & Giese, M.A. (2024). MacAction: Realistic 3D macaque body animation based on multi-camera markerless motion capture. bioRxiv. https:\/\/doi.org\/10.1101\/2024.01.29.577734","DOI":"10.1101\/2024.01.29.577734"},{"issue":"9","key":"2484_CR29","doi-asserted-by":"publisher","first-page":"1281","DOI":"10.1038\/s41593-018-0209-y","volume":"21","author":"A Mathis","year":"2018","unstructured":"Mathis, A., Mamidanna, P., Cury, K. M., Abe, T., Murthy, V. N., Mathis, M. W., & Bethge, M. (2018). DeepLabCut: Markerless pose estimation of user-defined body parts with deep learning. Nature neuroscience, 21(9), 1281\u20131289.","journal-title":"Nature neuroscience"},{"key":"2484_CR30","unstructured":"MMAction2 Contributors: OpenMMLab\u2019s Next Generation Video Understanding Toolbox and Benchmark. https:\/\/github.com\/open-mmlab\/mmaction2"},{"key":"2484_CR31","doi-asserted-by":"crossref","unstructured":"Ng, X.L., Ong, K.E., Zheng, Q., Ni, Y., Yeo, S.Y., & Liu, J. (2022). Animal kingdom: A large and diverse dataset for animal behavior understanding. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 19023\u201319034).","DOI":"10.1109\/CVPR52688.2022.01844"},{"issue":"2","key":"2484_CR32","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1537\/ase.107.141","volume":"107","author":"T Nishida","year":"1999","unstructured":"Nishida, T., Kano, T., Goodall, J., McGrew, W. C., & Nakamura, M. (1999). Ethogram and ethnography of Mahale chimpanzees. Anthropological Science, 107(2), 141\u2013188.","journal-title":"Anthropological Science"},{"key":"2484_CR33","doi-asserted-by":"crossref","unstructured":"Pang, J., Qiu, L., Li, X., Chen, H., Li, Q., Darrell, T., & Yu, F. (2021). Quasi-dense similarity learning for multiple object tracking. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 164\u2013173).","DOI":"10.1109\/CVPR46437.2021.00023"},{"key":"2484_CR34","unstructured":"Ravi, N., Gabeur, V., Hu, Y.-T., Hu, R., Ryali, C., Ma, T., Khedr, H., R\u00e4dle, R., Rolland, C., Gustafson, L. et\u00a0al. (2024). Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714 (2024)"},{"key":"2484_CR35","unstructured":"Sakib, F., & Burghardt, T. Visual Recognition of Great Ape Behaviours in the Wild. (2021). International conference on pattern recognition (ICPR) workshop on visual observation and analysis of vertebrate and insect behavior, VAIB; conference date: 10-01-2021 Through 15-01-2021"},{"key":"2484_CR36","doi-asserted-by":"crossref","unstructured":"Sanakoyeu, A., Khalidov, V., McCarthy, M.S., Vedaldi, A., & Neverova, N. (2020). Transferring dense pose to proximal animal classes. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 5233\u20135242).","DOI":"10.1109\/CVPR42600.2020.00528"},{"key":"2484_CR37","doi-asserted-by":"publisher","unstructured":"Shahroudy, A., Liu, J., Ng, T.-T., & Wang, G. (2016). NTU RGB$$+$$d: A large scale dataset for 3D human activity analysis. In 2016 IEEE conference on computer vision and pattern recognition (CVPR). IEEE. https:\/\/doi.org\/10.1109\/cvpr.2016.115.","DOI":"10.1109\/cvpr.2016.115"},{"key":"2484_CR38","doi-asserted-by":"crossref","unstructured":"Shao, D., Zhao, Y., Dai, B., & Lin, D. (2020). FineGym: A hierarchical video dataset for fine-grained action understanding. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 2616\u20132625).","DOI":"10.1109\/CVPR42600.2020.00269"},{"key":"2484_CR39","doi-asserted-by":"crossref","unstructured":"Shi, L., Zhang, Y., Cheng, J., & Lu, H. (2019). Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR)","DOI":"10.1109\/CVPR42600.2020.01434"},{"key":"2484_CR40","unstructured":"Simonyan, K., & Zisserman, A. (2014). Two-Stream Convolutional Networks for Action Recognition in Videos. In Proceedings of the 27th international conference on neural information processing systems - Volume 1. NIPS\u201914, (pp. 568\u2013576). MIT Press, Cambridge, MA, USA"},{"key":"2484_CR41","doi-asserted-by":"crossref","unstructured":"Sun, K., Xiao, B., Liu, D., & Wang, J. (2019). Deep high-resolution representation learning for human pose estimation. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 5693\u20135703).","DOI":"10.1109\/CVPR.2019.00584"},{"key":"2484_CR42","doi-asserted-by":"crossref","unstructured":"Sun, J.J., Zhou, H., Zhao, L., Yuan, L., Seybold, B., Hendon, D., Schroff, F., Ross, D.A., Adam, H., Hu, B., & Liu, T. (2024). Video Foundation Models for Animal Behavior Analysis. bioRxiv, 2024\u201307.","DOI":"10.1101\/2024.07.30.605655"},{"key":"2484_CR43","doi-asserted-by":"crossref","unstructured":"Sun, Z., Zhu, X., Lei, Z., & Ma, X. (2022). Caged monkey dataset: A new benchmark for caged monkey pose estimation. In S. Yu, Z. Zhang, P. C. Yuen, J. Han, T. Tan, Y. Guo, J. Lai, & J. Zhang (Eds.), Pattern recognition and computer vision (pp. 694\u2013706). Springer.","DOI":"10.1007\/978-3-031-18916-6_55"},{"key":"2484_CR44","doi-asserted-by":"crossref","unstructured":"Wang, L., Huang, B., Zhao, Z., Tong, Z., He, Y., Wang, Y., Wang, Y., & Qiao, Y. (2023). VideoMAE v2: Scaling video masked autoencoders with dual masking. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 14549\u201314560).","DOI":"10.1109\/CVPR52729.2023.01398"},{"key":"2484_CR45","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1016\/j.neucom.2023.03.001","volume":"537","author":"W Xin","year":"2023","unstructured":"Xin, W., Liu, R., Liu, Y., Chen, Y., Yu, W., & Miao, Q. (2023). Transformer for skeleton-based action recognition: A review of recent advances. Neurocomputing, 537, 164\u2013186. https:\/\/doi.org\/10.1016\/j.neucom.2023.03.001","journal-title":"Neurocomputing"},{"key":"2484_CR46","unstructured":"Yang, Y., Deng, Y., Xu, Y., & Zhang, J. (2023). APTv2: Benchmarking animal pose estimation and tracking with a large-scale dataset and beyond"},{"issue":"1","key":"2484_CR47","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1007\/s11263-022-01698-2","volume":"131","author":"Y Yao","year":"2023","unstructured":"Yao, Y., Bala, P., Mohan, A., Bliss-Moreau, E., Coleman, K., Freeman, S. M., Machado, C. J., Raper, J., Zimmermann, J., & Hayden, B. Y. (2023). OpenMonkeyChallenge: Dataset and benchmark challenges for pose estimation of non-human primates. International Journal of Computer Vision, 131(1), 243\u2013258.","journal-title":"International Journal of Computer Vision"},{"key":"2484_CR48","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1016\/j.neucom.2022.09.071","volume":"512","author":"R Yue","year":"2022","unstructured":"Yue, R., Tian, Z., & Du, S. (2022). Action recognition based on RGB and skeleton data sets: A survey. Neurocomputing, 512, 287\u2013306. https:\/\/doi.org\/10.1016\/j.neucom.2022.09.071","journal-title":"Neurocomputing"},{"key":"2484_CR49","doi-asserted-by":"crossref","unstructured":"Zhai, X., Kolesnikov, A., Houlsby, N., & Beyer, L. (2022). Scaling vision transformers. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 12104\u201312113).","DOI":"10.1109\/CVPR52688.2022.01179"},{"key":"2484_CR50","doi-asserted-by":"crossref","unstructured":"Zhang, J., Huang, J., Jin, S., & Lu, S.(2024). Vision-language models for vision tasks: A survey. IEEE transactions on pattern analysis and machine intelligence","DOI":"10.1109\/TPAMI.2024.3369699"},{"key":"2484_CR51","doi-asserted-by":"crossref","unstructured":"Zhang, F., Zhu, X., Dai, H., Ye, M., & Zhu, C. (2020). Distribution-aware coordinate representation for human pose estimation. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 7093\u20137102).","DOI":"10.1109\/CVPR42600.2020.00712"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-025-02484-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-025-02484-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-025-02484-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T08:49:42Z","timestamp":1760086182000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-025-02484-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,23]]},"references-count":51,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,10]]}},"alternative-id":["2484"],"URL":"https:\/\/doi.org\/10.1007\/s11263-025-02484-6","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,23]]},"assertion":[{"value":"21 September 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 May 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 June 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"We received ethical agreement for this study from the Commission d\u2019Ethique de la Recherche of the University of Neuch\u00e2tel (agreement number: 01-FS-2017) and the Kantonales Veterin\u00e4ramt BS at Basel Zoo.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical Approval"}}]}}