{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T20:03:22Z","timestamp":1772309002483,"version":"3.50.1"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,8,29]],"date-time":"2023-08-29T00:00:00Z","timestamp":1693267200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,8,29]],"date-time":"2023-08-29T00:00:00Z","timestamp":1693267200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis. Intell."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Human motion prediction is a challenging task due to the diversity and randomness of future poses. Due to the inherent topology of pose data, most recent work has used graph convolution networks (GCNs) to accomplish the task of human motion prediction. However, the GCN-based method has a major shortcoming in that the modeling of temporal information is insufficient. In this paper, we propose a simple approach that combines recurrent neural networks (RNNs) and attention mechanism for motion prediction, which considers both spatial relations between different joints and temporal correlation. The query, key and value of the attention mechanism allow us to select the information we need for the subsequent prediction. To solve the difficult problem of RNN training, we utilize uncertainty and strong short-term constraints to optimize the training process. We evaluate our method on several standard benchmark datasets for human motion prediction, i.e., the Human3.6M dataset and the CMU MoCap dataset. The experimental results show that our approach outperforms previous approaches.<\/jats:p>","DOI":"10.1007\/s44267-023-00020-z","type":"journal-article","created":{"date-parts":[[2023,8,29]],"date-time":"2023-08-29T07:02:07Z","timestamp":1693292527000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Predicting human poses via recurrent attention network"],"prefix":"10.1007","volume":"1","author":[{"given":"Jianwei","family":"Tang","sequence":"first","affiliation":[]},{"given":"Jieming","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Jian-Fang","family":"Hu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,8,29]]},"reference":[{"key":"20_CR1","first-page":"1314","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"A. M. Lehrmann","year":"2014","unstructured":"Lehrmann, A. M., Gehler, P. V., & Nowozin, S. (2014). Efficient nonlinear Markov models for human motion. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1314\u20131321). Los Alamitos: IEEE."},{"key":"20_CR2","first-page":"1441","volume-title":"Advances in Neural Information Processing Systems","author":"J. Wang","year":"2005","unstructured":"Wang, J., Hertzmann, A., & Fleet, D. J. (2005). Gaussian process dynamical models. In Advances in Neural Information Processing Systems (Vol. 18, pp. 1441\u20131448). Red Hook: Curran Associates."},{"key":"20_CR3","first-page":"1345","volume-title":"Advances in Neural Information Processing Systems","author":"G. W. Taylor","year":"2006","unstructured":"Taylor, G. W., Hinton, G. E., & Roweis, S. (2006). Modeling human motion using binary latent variables. In Y. Weiss, B. Sch\u00f6lkopf, & J. Platt (Eds.), Advances in Neural Information Processing Systems (Vol. 19, pp. 1345\u20131352). Red Hook: Curran Associates."},{"key":"20_CR4","first-page":"6158","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"J. Butepage","year":"2017","unstructured":"Butepage, J., Black, M. J., Kragic, D., & Kjellstrom, H. (2017). Deep representation learning for human motion prediction and classification. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 6158\u20136166). Los Alamitos: IEEE."},{"key":"20_CR5","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1016\/j.ins.2020.08.123","volume":"545","author":"Q. Cui","year":"2021","unstructured":"Cui, Q., Sun, H., Kong, Y., Zhang, X., & Li, Y. (2021). Efficient human motion prediction using temporal convolutional generative adversarial network. Information Sciences, 545, 427\u2013447.","journal-title":"Information Sciences"},{"key":"20_CR6","first-page":"5226","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"C. Li","year":"2018","unstructured":"Li, C., Zhang, Z., Lee, W. S., & Lee, G. H. (2018). Convolutional sequence to sequence model for human dynamics. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5226\u20135234). Los Alamitos: IEEE."},{"issue":"6","key":"20_CR7","doi-asserted-by":"publisher","first-page":"3300","DOI":"10.1109\/TPAMI.2021.3050918","volume":"44","author":"X. Shu","year":"2021","unstructured":"Shu, X., Zhang, L., Qi, G.-J., Liu, W., & Tang, J. (2021). Spatiotemporal co-attention recurrent neural networks for human-skeleton motion prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(6), 3300\u20133315.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"20_CR8","first-page":"7144","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"E. Aksan","year":"2019","unstructured":"Aksan, E., Kaufmann, M., & Hilliges, O. (2019). Structured prediction helps 3d human motion modelling. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 7144\u20137153). Los Alamitos: IEEE."},{"key":"20_CR9","first-page":"4801","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"Q. Cui","year":"2021","unstructured":"Cui, Q., & Sun, H. (2021). Towards accurate 3d human motion prediction from incomplete observations. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4801\u20134810). Los Alamitos: IEEE."},{"key":"20_CR10","first-page":"6519","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"Q. Cui","year":"2020","unstructured":"Cui, Q., Sun, H., & Yang, F. (2020). Learning dynamic relationships for 3d human motion prediction. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 6519\u20136527). Los Alamitos: IEEE."},{"key":"20_CR11","first-page":"11467","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"L. Dang","year":"2021","unstructured":"Dang, L., Nie, Y., Long, C., Zhang, Q., & Li, G. (2021). MSR-GCN: multi-scale residual graph convolution networks for human motion prediction. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 11467\u201311476). Los Alamitos: IEEE."},{"key":"20_CR12","volume-title":"Proceedings of the Asian conference on computer vision","author":"T. Lebailly","year":"2020","unstructured":"Lebailly, T., Kiciroglu, S., Salzmann, M., Fua, P., & Wang, W. (2020). Motion prediction using temporal inception module. In H. Ishikawa, C.-L. Liu, T. Pajdla, et al. (Eds.), Proceedings of the Asian conference on computer vision. Berlin: Springer."},{"key":"20_CR13","doi-asserted-by":"publisher","first-page":"2562","DOI":"10.1109\/TIP.2020.3038362","volume":"30","author":"B. Li","year":"2020","unstructured":"Li, B., Tian, J., Zhang, Z., Feng, H., & Li, X. (2020). Multitask non-autoregressive model for human motion prediction. IEEE Transactions on Image Processing, 30, 2562\u20132574.","journal-title":"IEEE Transactions on Image Processing"},{"issue":"6","key":"20_CR14","doi-asserted-by":"publisher","first-page":"3316","DOI":"10.1109\/TPAMI.2021.3053765","volume":"44","author":"M. Li","year":"2021","unstructured":"Li, M., Chen, S., Chen, X., Zhang, Y., Wang, Y., & Tian, Q. (2021). Symbiotic graph neural networks for 3d skeleton-based human action recognition and motion prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(6), 3316\u20133333.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"20_CR15","first-page":"214","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"M. Li","year":"2020","unstructured":"Li, M., Chen, S., Zhao, Y., Zhang, Y., Wang, Y., & Tian, Q. (2020). Dynamic multiscale graph neural networks for 3d skeleton based human motion prediction. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 214\u2013223). Los Alamitos: IEEE."},{"key":"20_CR16","unstructured":"Liu, J., & Yin, J. (2020). Multi-grained trajectory graph convolutional networks for habit-unrelated human motion prediction. arXiv preprint. arXiv:2012.12558."},{"key":"20_CR17","first-page":"474","volume-title":"European conference on computer vision","author":"W. Mao","year":"2020","unstructured":"Mao, W., Liu, M., & Salzmann, M. (2020). History repeats itself: human motion prediction via motion attention. In A. Vedaldi, H. Bischof, T. Brox, et al. (Eds.), European conference on computer vision (pp. 474\u2013489). Berlin: Springer."},{"key":"20_CR18","first-page":"9489","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"W. Mao","year":"2019","unstructured":"Mao, W., Liu, M., Salzmann, M., & Li, H. (2019). Learning trajectory dependencies for human motion prediction. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 9489\u20139497). Los Alamitos: IEEE."},{"key":"20_CR19","doi-asserted-by":"publisher","first-page":"1423","DOI":"10.1109\/WACV.2019.00156","volume-title":"2019 IEEE winter conference on applications of computer vision (WACV)","author":"H. Chiu","year":"2019","unstructured":"Chiu, H., Adeli, E., Wang, B., Huang, D.-A., & Carlos, J. (2019). Niebles. Action-agnostic human pose forecasting. In 2019 IEEE winter conference on applications of computer vision (WACV) (pp. 1423\u20131432). Los Alamitos: IEEE."},{"key":"20_CR20","first-page":"6990","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"E. Corona","year":"2020","unstructured":"Corona, E., Pumarola, A., Alenya, G., & Moreno-Noguer, F. (2020). Context-aware human motion prediction. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 6990\u20136999). Los Alamitos: IEEE."},{"key":"20_CR21","first-page":"4346","volume-title":"Proceedings of the IEEE international conference on computer vision","author":"K. Fragkiadaki","year":"2015","unstructured":"Fragkiadaki, K., Levine, S., Felsen, P., & Malik, J. (2015). Recurrent network models for human dynamics. In Proceedings of the IEEE international conference on computer vision (pp. 4346\u20134354). Los Alamitos: IEEE."},{"key":"20_CR22","doi-asserted-by":"publisher","first-page":"458","DOI":"10.1109\/3DV.2017.00059","volume-title":"2017 international conference on 3D vision (3DV)","author":"P. Ghosh","year":"2017","unstructured":"Ghosh, P., Song, J., Aksan, E., & Hilliges, O. (2017). Learning human motion models for long-term predictions. In 2017 international conference on 3D vision (3DV) (pp. 458\u2013466). Los Alamitos: IEEE."},{"key":"20_CR23","first-page":"12116","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"A. Gopalakrishnan","year":"2019","unstructured":"Gopalakrishnan, A., Mali, A., Kifer, D., Giles, L., & Ororbia, A. G. (2019). A neural temporal model for human motion prediction. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 12116\u201312125). Los Alamitos: IEEE."},{"key":"20_CR24","first-page":"786","volume-title":"Proceedings of the European conference on computer vision (ECCV)","author":"L.-Y. Gui","year":"2018","unstructured":"Gui, L.-Y., Wang, Y.-X., Liang, X., & Moura, J. M. F. (2018). Adversarial geometry-aware human motion prediction. In Proceedings of the European conference on computer vision (ECCV) (pp. 786\u2013803). Berlin: Springer."},{"key":"20_CR25","first-page":"432","volume-title":"Proceedings of the European conference on computer vision (ECCV)","author":"L.-Y. Gui","year":"2018","unstructured":"Gui, L.-Y., Wang, Y.-X., Ramanan, D., & Moura, J. M. F. (2018). Few-shot human motion prediction via meta-learning. In Proceedings of the European conference on computer vision (ECCV) (pp. 432\u2013450). Berlin: Springer."},{"key":"20_CR26","first-page":"2580","volume-title":"Proceedings of the AAAI conference on artificial intelligence","author":"X. Guo","year":"2019","unstructured":"Guo, X., & Choi, J. (2019). Human motion prediction via learning local structure representations and temporal dependencies. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, pp. 2580\u20132587). Menlo Park: AAAI Press."},{"key":"20_CR27","first-page":"5308","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"A. Jain","year":"2016","unstructured":"Jain, A., Zamir, A. R., Savarese, S., & Structural-rnn, A. S. (2016). Deep learning on spatio-temporal graphs. In V. Ferrari, M. Hebert, C. Sminchisescu, et al. (Eds.), Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5308\u20135317). Los Alamitos: IEEE."},{"key":"20_CR28","first-page":"10004","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"Z. Liu","year":"2019","unstructured":"Liu, Z., Wu, S., Jin, S., Liu, Q., Lu, S., Zimmermann, R., & Cheng, L. (2019). Towards natural and accurate future motion prediction of humans and animals. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 10004\u201310012). Los Alamitos: IEEE."},{"key":"20_CR29","first-page":"2891","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"J. Martinez","year":"2017","unstructured":"Martinez, J., Black, M. J., & Romero, J. (2017). On human motion prediction using recurrent neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2891\u20132900). Los Alamitos: IEEE."},{"issue":"4","key":"20_CR30","doi-asserted-by":"publisher","first-page":"855","DOI":"10.1007\/s11263-019-01245-6","volume":"128","author":"D. Pavllo","year":"2020","unstructured":"Pavllo, D., Feichtenhofer, C., Auli, M., & Grangier, D. (2020). Modeling human motion with quaternion-based neural networks. International Journal of Computer Vision, 128(4), 855\u2013872.","journal-title":"International Journal of Computer Vision"},{"issue":"9","key":"20_CR31","doi-asserted-by":"publisher","first-page":"5529","DOI":"10.1007\/s11042-019-08269-7","volume":"79","author":"H.-F. Sang","year":"2020","unstructured":"Sang, H.-F., Chen, Z.-Z., & He, D.-K. (2020). Human motion prediction based on attention mechanism. Multimedia Tools and Applications, 79(9), 5529\u20135544.","journal-title":"Multimedia Tools and Applications"},{"key":"20_CR32","doi-asserted-by":"crossref","unstructured":"Tang, Y., Ma, L., Liu, W., & Zheng, W. (2018). Long-term human motion prediction by modeling motion context and enhancing motion dynamic. arXiv preprint. arXiv:1805.02513.","DOI":"10.24963\/ijcai.2018\/130"},{"key":"20_CR33","first-page":"11189","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"T. Sofianos","year":"2021","unstructured":"Sofianos, T., Sampieri, A., Franco, L., & Galasso, F. (2021). Space-time-separable graph convolutional network for pose forecasting. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 11189\u201311198). Los Alamitos: IEEE."},{"key":"20_CR34","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems","author":"A. Vaswani","year":"2017","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, \u0141., & Polosukhin, I. (2017). Attention is all you need. In I. Guyon, U. Von Luxburg, S. Bengio, et al. (Eds.), Advances in Neural Information Processing Systems (Vol. 30, pp. 5998\u20136008). Red Hook: Curran Associates."},{"key":"20_CR35","doi-asserted-by":"publisher","first-page":"565","DOI":"10.1109\/3DV53792.2021.00066","volume-title":"2021 international conference on 3D vision (3DV)","author":"E. Aksan","year":"2021","unstructured":"Aksan, E., Kaufmann, M., Cao, P., & Hilliges, O. (2021). A spatio-temporal transformer for 3d human motion prediction. In 2021 international conference on 3D vision (3DV) (pp. 565\u2013574). Los Alamitos: IEEE."},{"key":"20_CR36","first-page":"226","volume-title":"European conference on computer vision","author":"Y. Cai","year":"2020","unstructured":"Cai, Y., Huang, L., Wang, Y., Cham, T.-J., Cai, J., Yuan, J., Liu, J., Yang, X., Zhu, Y., Shen, X., et al. (2020). Learning progressive joint propagation for human motion prediction. In A. Vedaldi, H. Bischof, T. Brox, et al. (Eds.), European conference on computer vision (pp. 226\u2013242). Berlin: Springer."},{"issue":"8","key":"20_CR37","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S. Hochreiter","year":"1997","unstructured":"Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735\u20131780.","journal-title":"Neural Computation"},{"key":"20_CR38","unstructured":"Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint. arXiv:1409.0473."},{"key":"20_CR39","first-page":"5574","volume-title":"Advances in Neural Information Processing Systems","author":"A. Kendall","year":"2017","unstructured":"Kendall, A., & Gal, Y. (2017). What uncertainties do we need in bayesian deep learning for computer vision?. In I. Guyon, U. Von Luxburg, S. Bengio, et al. (Eds.), Advances in Neural Information Processing Systems (Vol. 30, pp. 5574\u20135584). Red Hook: Curran Associates."},{"issue":"7","key":"20_CR40","doi-asserted-by":"publisher","first-page":"1325","DOI":"10.1109\/TPAMI.2013.248","volume":"36","author":"C. Ionescu","year":"2013","unstructured":"Ionescu, C., Papava, D., Olaru, V., & Sminchisescu, C. (2013). Human3. 6m: large scale datasets and predictive methods for 3D human sensing in natural environments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(7), 1325\u20131339.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"}],"container-title":["Visual Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-023-00020-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44267-023-00020-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-023-00020-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,20]],"date-time":"2023-11-20T05:03:02Z","timestamp":1700456582000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44267-023-00020-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,29]]},"references-count":40,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["20"],"URL":"https:\/\/doi.org\/10.1007\/s44267-023-00020-z","relation":{},"ISSN":["2731-9008"],"issn-type":[{"value":"2731-9008","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,29]]},"assertion":[{"value":"4 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 April 2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 May 2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 August 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"18"}}