{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T20:20:18Z","timestamp":1773260418345,"version":"3.50.1"},"reference-count":54,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,11,23]],"date-time":"2022-11-23T00:00:00Z","timestamp":1669161600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,11,23]],"date-time":"2022-11-23T00:00:00Z","timestamp":1669161600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000006","name":"Office of Naval Research","doi-asserted-by":"publisher","award":["N00014-19-1-2571"],"award-info":[{"award-number":["N00014-19-1-2571"]}],"id":[{"id":"10.13039\/100000006","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2023,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Accurate tracking of the 3D pose of animals from video recordings is critical for many behavioral studies, yet there is a dearth of publicly available datasets that the computer vision community could use for model development. We here introduce the Rodent3D dataset that records animals exploring their environment and\/or interacting with each other with multiple cameras and modalities (RGB, depth, thermal infrared). Rodent3D consists of 200\u00a0min of multimodal video recordings from up to three thermal and three RGB-D synchronized cameras (approximately 4 million frames). For the task of optimizing estimates of pose sequences provided by existing pose estimation methods, we provide a baseline model called <jats:italic>OptiPose<\/jats:italic>. 
While deep-learned attention mechanisms have been used for pose estimation in the past, with <jats:italic>OptiPose<\/jats:italic>, we propose a different way by representing 3D poses as tokens for which deep-learned context models pay attention to both spatial and temporal keypoint patterns. Our experiments show how <jats:italic>OptiPose<\/jats:italic> is highly robust to noise and occlusion and can be used to optimize pose sequences provided by state-of-the-art models for animal pose estimation.<\/jats:p>","DOI":"10.1007\/s11263-022-01714-5","type":"journal-article","created":{"date-parts":[[2022,11,24]],"date-time":"2022-11-24T17:33:52Z","timestamp":1669311232000},"page":"514-530","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Animal Pose Tracking: 3D Multimodal Dataset and Token-based Pose Optimization"],"prefix":"10.1007","volume":"131","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3370-4595","authenticated-orcid":false,"given":"Mahir","family":"Patel","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2437-0343","authenticated-orcid":false,"given":"Yiwen","family":"Gu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2747-0275","authenticated-orcid":false,"given":"Lucas C.","family":"Carstensen","sequence":"additional","affiliation":[]},{"given":"Michael E.","family":"Hasselmo","sequence":"additional","affiliation":[]},{"given":"Margrit","family":"Betke","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,11,23]]},"reference":[{"issue":"8","key":"1714_CR1","doi-asserted-by":"publisher","first-page":"eaaz2322","DOI":"10.1126\/sciadv.aaz2322","volume":"6","author":"AS Alexander","year":"2020","unstructured":"Alexander, A. S., Carstensen, L. C., Hinman, J. R., Raudies, F., Chapman, G. W., & Hasselmo, M. E. (2020). 
Egocentric boundary vector tuning of the retrosplenial cortex. Science Advances, 6(8), eaaz2322.","journal-title":"Science Advances"},{"key":"1714_CR2","doi-asserted-by":"crossref","unstructured":"Biggs, B., Boyne, O., Charles, J., Fitzgibbon, A., & Cipolla, R. (2020). Who left the dogs out: 3D animal reconstruction with expectation maximization in the loop. In 16th European conference on computer vision, Glasgow UK August 23 to 28, 2020, Proceedings Part XI","DOI":"10.1007\/978-3-030-58621-8_12"},{"key":"1714_CR3","doi-asserted-by":"crossref","unstructured":"Breslav, M., Hedrick, T. L., Sclaroff, S., & Betke, M. (2016). Discovering useful parts for pose estimation in sparsely annotated datasets. In Proceedings of the IEEE winter conference on applications of computer vision (WACV), Lake Placid, NY","DOI":"10.1109\/WACV.2016.7477670"},{"key":"1714_CR4","doi-asserted-by":"crossref","unstructured":"Carstensen, L. C., Alexander, A. S., Chapman, G. W., Lee, A. J., & Hasselmo, M. E. (2021). Neural responses in retrosplenial cortex associated with environmental alterations. iScience p. 103377","DOI":"10.2139\/ssrn.3859659"},{"key":"1714_CR5","doi-asserted-by":"crossref","unstructured":"Cheng, Y., Yan, B., Wang, B., & Tan, R. T. (2020). 3D human pose estimation using spatio-temporal networks with explicit occlusion training. In The thirty-fourth AAAI conference on artificial intelligence (AAAI-20), (pp. 10631\u201310638)","DOI":"10.1609\/aaai.v34i07.6689"},{"key":"1714_CR6","doi-asserted-by":"publisher","DOI":"10.7554\/eLife.62500","volume":"9","author":"H Dannenberg","year":"2020","unstructured":"Dannenberg, H., Lazaro, H., Nambiar, P., Hoyland, A., & Hasselmo, M. E. (2020). Effects of visual inputs on neural dynamics for coding of location and running speed in medial entorhinal cortex. 
Elife, 9, e62500.","journal-title":"Elife"},{"issue":"5","key":"1714_CR7","doi-asserted-by":"publisher","first-page":"564","DOI":"10.1038\/s41592-021-01106-6","volume":"18","author":"TW Dunn","year":"2021","unstructured":"Dunn, T. W., Marshall, J. D., Severson, K. S., Aldarondo, D. E., Hildebrand, D. G., Chettih, S. N., Wang, W. L., Gellis, A. J., Carlson, D. E., Aronov, D., et al. (2021). Geometric deep learning enables 3D kinematic profiling across species and environments. Nature methods, 18(5), 564\u2013573.","journal-title":"Nature methods"},{"key":"1714_CR8","doi-asserted-by":"crossref","unstructured":"Gong, K., Zhang, J., & Feng, J. (2021). PoseAug: A differentiable pose augmentation framework for 3D human pose estimation. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), (pp. 8575\u20138584)","DOI":"10.1109\/CVPR46437.2021.00847"},{"key":"1714_CR9","doi-asserted-by":"crossref","unstructured":"Gosztolai, A., G\u00fcnel, S., R\u00edos, V. L., Abrate, M. P., Morales, D., Rhodin, H., Fua, P., & Ramdya, P.: LiftPose3D, a deep learning-based approach for transforming 2D to 3D pose in laboratory animals. bioRxiv (2021), https:\/\/www.biorxiv.org\/content\/early\/2021\/04\/12\/2020.09.18.292680","DOI":"10.1101\/2020.09.18.292680"},{"key":"1714_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.7554\/eLife.47994","volume":"8","author":"JM Graving","year":"2019","unstructured":"Graving, J. M., Chae, D., Naik, H., Li, L., Koger, B., Costelloe, B. R., & Couzin, I. D. (2019). DeepPoseKit, a software toolkit for fast and robust animal pose estimation using deep learning. eLife, 8, 1\u201342. https:\/\/doi.org\/10.7554\/eLife.47994","journal-title":"eLife"},{"key":"1714_CR11","doi-asserted-by":"publisher","DOI":"10.7554\/eLife.48571","volume":"8","author":"S G\u00fcnel","year":"2019","unstructured":"G\u00fcnel, S., Rhodin, H., Morales, D., Campagnolo, J., Ramdya, P., & Fua, P. (2019). 
DeepFly3D, a deep learning-based approach for 3D limb and appendage tracking in tethered, adult drosophila. Elife, 8, e48571.","journal-title":"Elife"},{"issue":"7752","key":"1714_CR12","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1038\/s41586-019-1077-7","volume":"568","author":"\u00d8A H\u00f8ydal","year":"2019","unstructured":"H\u00f8ydal, \u00d8. A., Skyt\u00f8en, E. R., Andersson, S. O., Moser, M. B., & Moser, E. I. (2019). Object-vector coding in the medial entorhinal cortex. Nature, 568(7752), 400\u2013404.","journal-title":"Nature"},{"key":"1714_CR13","unstructured":"Hu, B., Seybold, B., Yang, S., Ross, D. A., Sud, A., Ruby, G., & Liu, Y. (2021). Optical Mouse: 3D mouse pose from single-view video. https:\/\/arxiv.org\/abs\/2106.09251"},{"issue":"7","key":"1714_CR14","doi-asserted-by":"publisher","first-page":"1325","DOI":"10.1109\/TPAMI.2013.248","volume":"36","author":"C Ionescu","year":"2014","unstructured":"Ionescu, C., Papava, D., Olaru, V., & Sminchisescu, C. (2014). Human3.6M: Large scale datasets and predictive methods for 3D human sensing in natural environments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(7), 1325\u20131339.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"1714_CR15","doi-asserted-by":"crossref","unstructured":"Iskakov, K., Burkov, E., Lempitsky, V., Malkov, Y. (2019). Learnable triangulation of human pose. In Proceedings of the IEEE\/CVF international conference on computer vision, (pp. 7718\u20137727)","DOI":"10.1109\/ICCV.2019.00781"},{"key":"1714_CR16","doi-asserted-by":"crossref","unstructured":"Joska, D., Clark, L., Muramatsu, N., Jericevich, R., Nicolls, F., Mathis, A., Mathis, M. W., & Patel, A. (2021). AcinoSet: A 3D pose estimation dataset and baseline models for cheetahs in the wild. arXiv: 2103.13282","DOI":"10.1109\/ICRA48506.2021.9561338"},{"key":"1714_CR17","doi-asserted-by":"crossref","unstructured":"Karashchuk, P., Rupp, K. 
L., Dickinson, E. S., Azim, E., Brunton, B. W., & Tuthill, J. C. (2021). Anipose: A toolkit for robust markerless 3D pose estimation. Cell Reports 36(13)","DOI":"10.1016\/j.celrep.2021.109730"},{"key":"1714_CR18","doi-asserted-by":"crossref","unstructured":"Kearney, S., Li, W., Parsons, M., Kim, K., & Cosker, D. (2020). RGBD-Dog: Predicting canine pose from RGBD sensors. In 2020 IEEE\/CVF conference on computer vision and pattern recognition (CVPR), (pp. 8333\u20138342), https:\/\/doi.ieeecomputersociety.org\/10.1109\/CVPR42600.2020.00836","DOI":"10.1109\/CVPR42600.2020.00836"},{"key":"1714_CR19","unstructured":"Kingma, D., & Ba, J. (2015). Adam: A method for stochastic optimization. In International conference on learning representations (ICLR)"},{"key":"1714_CR20","doi-asserted-by":"crossref","unstructured":"Lauer, J., Zhou, M., Ye, S., Menegas, W., Nath, T., Rahman, M. M., Di\u00a0Santo, V., Soberanes, D., Feng, G., Murthy, V. N., Lauder, G., Dulac, C., Mathis, M. W., & Mathis, A. (2021). Multi-animal pose estimation and tracking with DeepLabCut. bioRxiv , https:\/\/www.biorxiv.org\/content\/early\/2021\/04\/30\/2021.04.30.442096","DOI":"10.1101\/2021.04.30.442096"},{"key":"1714_CR21","doi-asserted-by":"crossref","unstructured":"Li, C., & Lee, G. H. (2021). From synthetic to real: Unsupervised domain adaptation for animal pose estimation. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), (pp. 1482\u20131491)","DOI":"10.1109\/CVPR46437.2021.00153"},{"key":"1714_CR22","doi-asserted-by":"crossref","unstructured":"Li, S., Gunel, S., Ostrek, M., Ramdya, P., Fua, P., & Rhodin, H. (2020). Deformation-aware unpaired image translation for pose estimation on laboratory animals. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), (pp. 
13158\u201313168)","DOI":"10.1109\/CVPR42600.2020.01317"},{"key":"1714_CR23","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2022.3141231","author":"W Li","year":"2022","unstructured":"Li, W., Liu, H., Ding, R., Liu, M., Wang, P., & Yang, W. (2022). Exploiting temporal contexts with strided transformer for 3d human pose estimation. IEEE Transactions on Multimedia. https:\/\/doi.org\/10.1109\/TMM.2022.3141231","journal-title":"IEEE Transactions on Multimedia"},{"key":"1714_CR24","doi-asserted-by":"crossref","unstructured":"Lin, K., Wang, L., & Liu, Z. (2021). End-to-end human pose and mesh reconstruction with transformers. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, (pp. 1954\u20131963)","DOI":"10.1109\/CVPR46437.2021.00199"},{"key":"1714_CR25","doi-asserted-by":"crossref","unstructured":"Liu, X., Yu, S. y., Flierman, N., Loyola, S., Kamermans, M., Hoogland, T. M., & De\u00a0Zeeuw, C. I. (2020). OptiFlex: Video-based animal pose estimation using deep learning enhanced by optical flow. BioRxiv","DOI":"10.1101\/2020.04.04.025494"},{"key":"1714_CR26","unstructured":"Marshall, J. D., Aldarondo, D., Wang, W. P., \u00d6lveczky, B., & Dunn, T. (2021). Rat 7m, https:\/\/doi.org\/10.6084\/m9.figshare.c.5295370.v3"},{"key":"1714_CR27","doi-asserted-by":"crossref","unstructured":"Marshall, J. D., Klibaite, U., Gellis, A. J., Aldarondo, D. E., Olveczky, B. P., & Dunn, T. W. (2021). The pair-r24m dataset for multi-animal 3d pose estimation. bioRxiv","DOI":"10.1101\/2021.11.23.469743"},{"key":"1714_CR28","doi-asserted-by":"crossref","unstructured":"Martinez, J., Hossain, R., Romero, J., & Little, J. J. (2017). A simple yet effective baseline for 3d human pose estimation. In Proceedings of the IEEE international conference on computer vision, (pp. 2640\u20132649)","DOI":"10.1109\/ICCV.2017.288"},{"key":"1714_CR29","doi-asserted-by":"crossref","unstructured":"Mathis, A., Mamidanna, P., Cury, K. M., Abe, T., Murthy, V. N., Mathis, M. 
W., & Bethge, M. (2018). DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nature Neuroscience 21, 1281\u20131289, http:\/\/www.nature.com\/articles\/s41593-018-0209-y","DOI":"10.1038\/s41593-018-0209-y"},{"issue":"1","key":"1714_CR30","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1016\/j.neuron.2020.09.017","volume":"108","author":"A Mathis","year":"2020","unstructured":"Mathis, A., Schneider, S., Lauer, J., & Mathis, M. W. (2020). A primer on motion capture with deep learning: Principles, pitfalls, and perspectives. Neuron, 108(1), 44\u201365.","journal-title":"Neuron"},{"key":"1714_CR31","doi-asserted-by":"publisher","unstructured":"Mehta, D., Rhodin, H., Casas, D., Fua, P., Sotnychenko, O., Xu, W., Theobalt, C.: Monocular 3d human pose estimation in the wild using improved cnn supervision. In 3D Vision (3DV), 2017 fifth international conference on. IEEE (2017). https:\/\/doi.org\/10.1109\/3dv.2017.00064, http:\/\/gvv.mpi-inf.mpg.de\/3dhp_dataset","DOI":"10.1109\/3dv.2017.00064"},{"key":"1714_CR32","doi-asserted-by":"crossref","unstructured":"Monsees, A., Voit, K. M., Wallace, D. J., Sawinski, J., Leks, E., Scheffler, K., Macke, J. H., & Kerr, J. N. (2021). Anatomically-based skeleton kinetics and pose estimation in freely-moving rodents. bioRxiv","DOI":"10.1101\/2021.11.03.466906"},{"key":"1714_CR33","doi-asserted-by":"crossref","unstructured":"Moreno-Noguer, F. (2017). 3D human pose estimation from a single image via distance matrix regression. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), (pp. 2823\u20132832)","DOI":"10.1109\/CVPR.2017.170"},{"key":"1714_CR34","doi-asserted-by":"crossref","unstructured":"Mu, J., Qiu, W., Hager, G., & Yuille, A.L. (2020). Learning from synthetic animals. In 2020 IEEE\/CVF conference on computer vision and pattern recognition (CVPR), (pp. 
12383\u201312392)","DOI":"10.1109\/CVPR42600.2020.01240"},{"key":"1714_CR35","doi-asserted-by":"publisher","unstructured":"Nath, T., Mathis, A., Chen, A. C., Patel, A., Bethge, M., & Mathis, M. W. (2018). Using DeepLabCut for 3D markerless pose estimation across species and behaviors. bioRxiv . https:\/\/doi.org\/10.1101\/476531, https:\/\/www.biorxiv.org\/content\/early\/2018\/11\/24\/476531","DOI":"10.1101\/476531"},{"issue":"7","key":"1714_CR36","doi-asserted-by":"publisher","first-page":"853","DOI":"10.1002\/hipo.20115","volume":"15","author":"J O\u2019Keefe","year":"2005","unstructured":"O\u2019Keefe, J., & Burgess, N. (2005). Dual phase and rate coding in hippocampal place cells: Theoretical significance and relationship to entorhinal grid cells. Hippocampus, 15(7), 853\u2013866.","journal-title":"Hippocampus"},{"key":"1714_CR37","doi-asserted-by":"crossref","unstructured":"Pavlakos, G., Zhou, X., Derpanis, K. G., & Daniilidis, K. (2017). Harvesting multiple views for marker-less 3d human pose annotations. In Proceedings of the IEEE conference on computer vision and pattern recognition, (pp. 6988\u20136997)","DOI":"10.1109\/CVPR.2017.138"},{"issue":"1","key":"1714_CR38","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1038\/s41592-018-0234-5","volume":"16","author":"TD Pereira","year":"2019","unstructured":"Pereira, T. D., Aldarondo, D. E., Willmore, L., Kislin, M., Wang, S. S. H., Murthy, M., & Shaevitz, J. W. (2019). Fast animal pose estimation using deep neural networks. Nature Methods, 16(1), 117\u2013125.","journal-title":"Nature Methods"},{"key":"1714_CR39","doi-asserted-by":"publisher","unstructured":"Ramdya, P.P. (2019). aDN-GAL4 Control. https:\/\/doi.org\/10.7910\/DVN\/PKKXOE","DOI":"10.7910\/DVN\/PKKXOE"},{"key":"1714_CR40","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1016\/j.brainres.2014.10.053","volume":"1621","author":"F Raudies","year":"2015","unstructured":"Raudies, F., Brandon, M. P., Chapman, G. W., & Hasselmo, M. E. 
(2015). Head direction is coded more strongly than movement direction in a population of entorhinal neurons. Brain Research, 1621, 355\u2013367.","journal-title":"Brain Research"},{"key":"1714_CR41","doi-asserted-by":"crossref","unstructured":"Rempe, D., Birdal, T., Hertzmann, A., Yang, J., Sridhar, S., & Guibas, L. J. (2021). Humor: 3d human motion model for robust pose estimation. In International conference on computer vision (ICCV)","DOI":"10.1109\/ICCV48922.2021.01129"},{"key":"1714_CR42","doi-asserted-by":"publisher","unstructured":"Sherstinsky, A. (2020). Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network. Physica D: Nonlinear Phenomena, 404, 132306. https:\/\/doi.org\/10.1016\/j.physd.2019.132306, www.sciencedirect.com\/science\/article\/pii\/S0167278919305974","DOI":"10.1016\/j.physd.2019.132306"},{"key":"1714_CR43","doi-asserted-by":"publisher","unstructured":"Shorten, C., & Khoshgoftaar, T. M. (2019). A survey on image data augmentation for deep learning. Journal of Big Data 6(60), https:\/\/doi.org\/10.1186\/s40537-019-0197-0","DOI":"10.1186\/s40537-019-0197-0"},{"key":"1714_CR44","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2022.3188716","author":"H Shuai","year":"2022","unstructured":"Shuai, H., Wu, L., & Liu, Q. (2022). Adaptive multi-view and temporal fusing transformer for 3d human pose estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence. https:\/\/doi.org\/10.1109\/TPAMI.2022.3188716","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"1714_CR45","doi-asserted-by":"crossref","unstructured":"Theriault, D. H., Fuller, N. W., Jackson, B. E., Bluhm, E., Evangelista, D., Wu, Z., Betke, M., & Hedrick, T. L. (2014). A protocol and calibration method for accurate multi-camera field videography. 
The Journal of Experimental Biology 217, 1843\u20131848, open access online, http:\/\/jeb.biologists.org\/content\/early\/2014\/02\/20\/jeb.100529.abstract.html?papetoc","DOI":"10.1242\/jeb.100529"},{"key":"1714_CR46","doi-asserted-by":"crossref","unstructured":"Tome, D., Russell, C., & Agapito, L. (2017). Lifting from the deep: Convolutional 3d pose estimation from a single image. In Proceedings of the IEEE conference on computer vision and pattern recognition, (pp. 2500\u20132509)","DOI":"10.1109\/CVPR.2017.603"},{"key":"1714_CR47","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, \u0141., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems. pp. 5998\u20136008"},{"key":"1714_CR48","unstructured":"Wah, C., Branson, S., Welinder, P., Perona, P., & Belongie, S. (2011). The Caltech-UCSD birds-200-2011 dataset. Technical Report CNS-TR-2011-001, California Institute of Technology"},{"key":"1714_CR49","doi-asserted-by":"crossref","unstructured":"Wu, Z., Kunz, T. H., & Betke, M. (2011). Efficient track linking methods for track graphs using network-flow and set-cover techniques. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), (pp. 1185\u20131192). Colorado Springs , http:\/\/www.cs.bu.edu\/fac\/betke\/papers\/WuKunzBetke-CVPR2011.pdf","DOI":"10.1109\/CVPR.2011.5995515"},{"key":"1714_CR50","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1016\/j.cviu.2015.10.006","volume":"143","author":"Z Wu","year":"2016","unstructured":"Wu, Z., & Betke, M. (2016). Global optimization for coupled detection and data association in multiple object tracking. Computer Vision and Image Understanding, 143, 25\u201337.","journal-title":"Computer Vision and Image Understanding"},{"key":"1714_CR51","doi-asserted-by":"crossref","unstructured":"Yuan, Y., Wei, S. E., Simon, T., Kitani, K., & Saragih, J. (2021). 
SimPoE: Simulated character control for 3D human pose estimation. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR). (pp. 7159\u20137169)","DOI":"10.1109\/CVPR46437.2021.00708"},{"key":"1714_CR52","unstructured":"Zhang, L., Dunn, T., Marshall, J., Olveczky, B., & Linderman, S. (2021). Animal pose estimation from video data with a hierarchical von Mises-Fisher-Gaussian model. In: Banerjee, A., Fukumizu, K. (eds.) Proceedings of The 24th international conference on artificial intelligence and statistics proceedings of machine learning research, (vol.\u00a0130, pp. 2800\u20132808.) PMLR , https:\/\/proceedings.mlr.press\/v130\/zhang21h.html"},{"key":"1714_CR53","doi-asserted-by":"crossref","unstructured":"Zheng, C., Zhu, S., Mendieta, M., Yang, T., Chen, C., & Ding, Z. (2021). 3D human pose estimation with spatial and temporal transformers. In Proceedings of the IEEE\/CVF international conference on computer vision (ICCV), (pp. 11656\u201311665)","DOI":"10.1109\/ICCV48922.2021.01145"},{"key":"1714_CR54","doi-asserted-by":"publisher","unstructured":"Zuffi, S., Kanazawa, A., Jacobs, D. W. & Black, M. J. (2017). 3D Menagerie: Modeling the 3D shape and pose of animals. In 2017 IEEE conference on computer vision and pattern recognition (CVPR), (pp. 5524\u20135532). IEEE Computer Society, Los Alamitos, CA, USA. 
https:\/\/doi.org\/10.1109\/CVPR.2017.586, https:\/\/doi.ieeecomputersociety.org\/10.1109\/CVPR.2017.586","DOI":"10.1109\/CVPR.2017.586"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-022-01714-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-022-01714-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-022-01714-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,12]],"date-time":"2023-01-12T04:30:38Z","timestamp":1673497838000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-022-01714-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,23]]},"references-count":54,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,2]]}},"alternative-id":["1714"],"URL":"https:\/\/doi.org\/10.1007\/s11263-022-01714-5","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,23]]},"assertion":[{"value":"29 April 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 November 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 November 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}