{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:19:06Z","timestamp":1750220346482,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":75,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,24]],"date-time":"2021-08-24T00:00:00Z","timestamp":1629763200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,24]]},"DOI":"10.1145\/3460426.3463605","type":"proceedings-article","created":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T22:50:28Z","timestamp":1630536628000},"page":"144-154","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["HPOF:3D Human Pose Recovery from Monocular Video with Optical Flow"],"prefix":"10.1145","author":[{"given":"Bin","family":"Ji","sequence":"first","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chen","family":"Yang","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yao","family":"Shunyu","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ye","family":"Pan","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,9]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015","author":"Akhter 
Ijaz","year":"2015","unstructured":"Ijaz Akhter and Michael J. Black. 2015. Pose-conditioned joint angle limits for 3D human pose reconstruction. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7--12, 2015. IEEE Computer Society, 1446--1455. https:\/\/doi.org\/10.1109\/CVPR.2015.7298751"},{"key":"e_1_3_2_1_2_1","volume-title":"PoseTrack: A Benchmark for Human Pose Estimation and Tracking. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018","author":"Andriluka Mykhaylo","year":"2018","unstructured":"Mykhaylo Andriluka, Umar Iqbal, Eldar Insafutdinov, Leonid Pishchulin, Anton Milan, Juergen Gall, and Bernt Schiele. 2018. PoseTrack: A Benchmark for Human Pose Estimation and Tracking. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18--22, 2018. 5167--5176. 
https:\/\/doi.org\/10.1109\/CVPR.2018.00542"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1186822.1073207"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00351"},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings, Part V (Lecture Notes in Computer Science","volume":"578","author":"Bogo Federica","unstructured":"Federica Bogo, Angjoo Kanazawa, Christoph Lassner, Peter V. Gehler, Javier Romero, and Michael J. Black. 2016. Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image. In Computer Vision - ECCV 2016 - 14th European Conference, Amsterdam, The Netherlands, October 11--14, 2016, Proceedings, Part V (Lecture Notes in Computer Science, Vol. 9909), Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). 561--578. https:\/\/doi.org\/10.1007\/978-3-319-46454-1_34"},{"key":"e_1_3_2_1_6_1","volume-title":"9th European Conference on Computer Vision","volume":"111","author":"Brox Thomas","year":"2006","unstructured":"Thomas Brox, Bodo Rosenhahn, Daniel Cremers, and Hans-Peter Seidel. 2006. High Accuracy Optical Flow Serves 3-D Pose Tracking: Exploiting Contour and Flow Based Constraints. In Computer Vision - ECCV 2006, 9th European Conference on Computer Vision, Graz, Austria, May 7--13, 2006, Proceedings, Part II (Lecture Notes in Computer Science, Vol. 3952), Ales Leonardis, Horst Bischof, and Axel Pinz (Eds.). 98--111. 
https:\/\/doi.org\/10.1007\/11744047_8"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2929257"},{"key":"e_1_3_2_1_8_1","volume-title":"2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019","author":"Cheng Yu","year":"2019","unstructured":"Yu Cheng, Bo Yang, Bo Wang, Yan Wending, and Robby T. Tan. 2019. Occlusion-Aware Networks for 3D Human Pose Estimation in Video. In 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019. 723--732. https:\/\/doi.org\/10.1109\/ICCV.2019.00081"},{"key":"e_1_3_2_1_9_1","volume-title":"Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio.","author":"Cho Kyunghyun","year":"2014","unstructured":"Kyunghyun Cho, Bart van Merrienboer, \u00c7aglar G\u00fcl\u00e7ehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. 
In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25--29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, Alessandro Moschitti, Bo Pang, and Walter Daelemans (Eds.). 1724--1734. https:\/\/doi.org\/10.3115\/v1\/d14-1179"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58607-2_2"},{"key":"e_1_3_2_1_11_1","volume-title":"Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019","author":"Doersch Carl","year":"2019","unstructured":"Carl Doersch and Andrew Zisserman. 2019. Sim2real transfer learning for 3D human pose estimation: motion to the rescue. 
In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8--14 December 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.). 12929--12941. http:\/\/papers.nips.cc\/paper\/9454-sim2real-transfer-learning-for-3d-human-pose-estimation-motion-to-the-rescue"},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18)","author":"Fang Haoshu","year":"2018","unstructured":"Haoshu Fang, Yuanlu Xu, Wenguan Wang, Xiaobai Liu, and Song-Chun Zhu. 2018. Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2--7, 2018, Sheila A. McIlraith and Kilian Q. Weinberger (Eds.). 6821--6828. 
https:\/\/www.aaai.org\/ocs\/index.php\/AAAI\/AAAI18\/paper\/view\/16471"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.268"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2009.5459300"},{"key":"e_1_3_2_1_15_1","volume-title":"Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27--30, 2016. 770--778. https:\/\/doi.org\/10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings, Part XXVIII (Lecture Notes in Computer Science","volume":"786","author":"Hofinger Markus","year":"2020","unstructured":"Markus Hofinger, Samuel Rota Bul\u00f2, Lorenzo Porzi, Arno Knapitsch, Thomas Pock, and Peter Kontschieder. 2020. Improving Optical Flow on a Pyramid Level. 
In Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK, August 23--28, 2020, Proceedings, Part XXVIII (Lecture Notes in Computer Science, Vol. 12373), Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.). 770--786. https:\/\/doi.org\/10.1007\/978-3-030-58604-1_46"},{"key":"e_1_3_2_1_18_1","volume-title":"LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018","author":"Hui Tak-Wai","year":"2018","unstructured":"Tak-Wai Hui, Xiaoou Tang, and Chen Change Loy. 2018. LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18--22, 2018. 8981--8989. https:\/\/doi.org\/10.1109\/CVPR.2018.00936"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.179"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.248"},{"key":"e_1_3_2_1_21_1","volume-title":"Exemplar Fine-Tuning for 3D Human Pose Fitting Towards In-the-Wild 3D Human Pose Estimation. CoRR","author":"Joo Hanbyul","year":"2020","unstructured":"Hanbyul Joo, Natalia Neverova, and Andrea Vedaldi. 2020. Exemplar Fine-Tuning for 3D Human Pose Fitting Towards In-the-Wild 3D Human Pose Estimation. 
CoRR, Vol. abs\/2004.03686 (2020). arxiv: 2004.03686 https:\/\/arxiv.org\/abs\/2004.03686"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2782743"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00868"},{"key":"e_1_3_2_1_24_1","volume-title":"End-to-End Recovery of Human Shape and Pose. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018","author":"Kanazawa Angjoo","year":"2018","unstructured":"Angjoo Kanazawa, Michael J. Black, David W. Jacobs, and Jitendra Malik. 2018. End-to-End Recovery of Human Shape and Pose. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18--22, 2018. 7122--7131. https:\/\/doi.org\/10.1109\/CVPR.2018.00744"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00576"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00576"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00411"},{"key":"e_1_3_2_1_28_1","volume-title":"VIBE: Video Inference for Human Body Pose and Shape Estimation. In 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020","author":"Kocabas Muhammed","year":"2020","unstructured":"Muhammed Kocabas, Nikos Athanasiou, and Michael J. Black. 2020. VIBE: Video Inference for Human Body Pose and Shape Estimation. In 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13--19, 2020. 5252--5262. 
https:\/\/doi.org\/10.1109\/CVPR42600.2020.00530"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00234"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00463"},{"key":"e_1_3_2_1_31_1","volume-title":"Convolutional Mesh Regression for Single-Image Human Shape Reconstruction. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019","author":"Kolotouros Nikos","year":"2019","unstructured":"Nikos Kolotouros, Georgios Pavlakos, and Kostas Daniilidis. 2019b. Convolutional Mesh Regression for Single-Image Human Shape Reconstruction. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16--20, 2019. 4501--4510. https:\/\/doi.org\/10.1109\/CVPR.2019.00463"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.500"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.500"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818013"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818013"},{"key":"e_1_3_2_1_36_1","volume-title":"AMASS: Archive of Motion Capture As Surface Shapes. 
In 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019","author":"Mahmood Naureen","year":"2019","unstructured":"Naureen Mahmood, Nima Ghorbani, Nikolaus F. Troje, Gerard Pons-Moll, and Michael J. Black. 2019. AMASS: Archive of Motion Capture As Surface Shapes. In 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019. 5441--5450. https:\/\/doi.org\/10.1109\/ICCV.2019.00554"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.288"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2017.00064"},{"key":"e_1_3_2_1_39_1","volume-title":"I2L-MeshNet: Image-to-lixel prediction network for accurate 3D human pose and mesh estimation from a single RGB image. arXiv preprint arXiv:2008.03713","author":"Moon Gyeongsik","year":"2020","unstructured":"Gyeongsik Moon and Kyoung Mu Lee. 2020. I2L-MeshNet: Image-to-lixel prediction network for accurate 3D human pose and mesh estimation from a single RGB image. arXiv preprint arXiv:2008.03713 (2020)."},{"key":"e_1_3_2_1_40_1","volume-title":"Proceedings, Part VIII (Lecture Notes in Computer Science","volume":"499","author":"Newell Alejandro","year":"2016","unstructured":"Alejandro Newell, Kaiyu Yang, and Jia Deng. 2016. Stacked Hourglass Networks for Human Pose Estimation. 
In Computer Vision - ECCV 2016 - 14th European Conference, Amsterdam, The Netherlands, October 11--14, 2016, Proceedings, Part VIII (Lecture Notes in Computer Science, Vol. 9912), Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). 483--499. https:\/\/doi.org\/10.1007\/978-3-319-46484-8_29"},{"key":"e_1_3_2_1_41_1","volume-title":"Proceedings, Part V (Lecture Notes in Computer Science","volume":"720","author":"Nie Xuecheng","year":"2018","unstructured":"Xuecheng Nie, Jiashi Feng, Junliang Xing, and Shuicheng Yan. 2018. Pose Partition Networks for Multi-person Pose Estimation. In Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8--14, 2018, Proceedings, Part V (Lecture Notes in Computer Science, Vol. 11209), Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, and Yair Weiss (Eds.). 705--720. https:\/\/doi.org\/10.1007\/978-3-030-01228-1_42"},{"key":"e_1_3_2_1_42_1","volume-title":"Single-Stage Multi-Person Pose Machines. 
In 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019","author":"Nie Xuecheng","year":"2019","unstructured":"Xuecheng Nie, Jiashi Feng, Jianfeng Zhang, and Shuicheng Yan. 2019. Single-Stage Multi-Person Pose Machines. In 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019. 6950--6959. https:\/\/doi.org\/10.1109\/ICCV.2019.00705"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2018.00062"},{"key":"e_1_3_2_1_44_1","volume-title":"Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation. In 2018 International Conference on 3D Vision, 3DV 2018","author":"Omran Mohamed","year":"2018","unstructured":"Mohamed Omran, Christoph Lassner, Gerard Pons-Moll, Peter V. Gehler, and Bernt Schiele. 2018b. Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation. In 2018 International Conference on 3D Vision, 3DV 2018, Verona, Italy, September 5--8, 2018. 484--494. 
https:\/\/doi.org\/10.1109\/3DV.2018.00062"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-49409-8_15"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01123"},{"key":"e_1_3_2_1_47_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019","author":"Pavlakos Georgios","year":"2019","unstructured":"Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black. 2019b. Expressive Body Capture: 3D Hands, Face, and Body From a Single Image. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16--20, 2019. 10975--10985. https:\/\/doi.org\/10.1109\/CVPR.2019.01123"},{"key":"e_1_3_2_1_48_1","volume-title":"Human Mesh Recovery from Multiple Shots. CoRR","author":"Pavlakos Georgios","year":"2020","unstructured":"Georgios Pavlakos, Jitendra Malik, and Angjoo Kanazawa. 2020. Human Mesh Recovery from Multiple Shots. CoRR, Vol. abs\/2012.09843 (2020). 
arxiv: 2012.09843 https:\/\/arxiv.org\/abs\/2012.09843"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00763"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.139"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00055"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00055"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00794"},{"key":"e_1_3_2_1_54_1","article-title":"MotioNet: 3D Human Motion Reconstruction from Monocular Video with Skeleton Consistency","volume":"40","author":"Shi Mingyi","year":"2020","unstructured":"Mingyi Shi, Kfir Aberman, Andreas Aristidou, Taku Komura, Dani Lischinski, Daniel Cohen-Or, and Baoquan Chen. 2020. MotioNet: 3D Human Motion Reconstruction from Monocular Video with Skeleton Consistency. ACM Trans. Graph., Vol. 40, 1 (2020), 1:1--1:15. https:\/\/doi.org\/10.1145\/3407659","journal-title":"ACM Trans. Graph."},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995316"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00931"},{"key":"e_1_3_2_1_57_1","volume-title":"Deep High-Resolution Representation Learning for Human Pose Estimation. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019","author":"Sun Ke","year":"2019","unstructured":"Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. 2019a. Deep High-Resolution Representation Learning for Human Pose Estimation. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16--20, 2019. 
5693--5703. https:\/\/doi.org\/10.1109\/CVPR.2019.00584"},{"key":"e_1_3_2_1_58_1","volume-title":"Compositional Human Pose Regression. In IEEE International Conference on Computer Vision, ICCV 2017","author":"Sun Xiao","year":"2017","unstructured":"Xiao Sun, Jiaxiang Shang, Shuang Liang, and Yichen Wei. 2017. Compositional Human Pose Regression. In IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22--29, 2017. 2621--2630. https:\/\/doi.org\/10.1109\/ICCV.2017.284"},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00545"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.31.15"},{"key":"e_1_3_2_1_61_1","volume-title":"Proceedings, Part II (Lecture Notes in Computer Science","volume":"419","author":"Teed Zachary","year":"2020","unstructured":"Zachary Teed and Jia Deng. 2020. RAFT: Recurrent All-Pairs Field Transforms for Optical Flow. 
In Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK, August 23--28, 2020, Proceedings, Part II (Lecture Notes in Computer Science, Vol. 12347), Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.). 402--419. https:\/\/doi.org\/10.1007\/978-3-030-58536-5_24"},{"key":"e_1_3_2_1_62_1","volume-title":"Self-supervised Learning of Motion Capture. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017","author":"Tung Hsiao-Yu","year":"2017","unstructured":"Hsiao-Yu Tung, Hsiao-Wei Tung, Ersin Yumer, and Katerina Fragkiadaki. 2017b. Self-supervised Learning of Motion Capture. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4--9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5236--5246. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/ab452534c5ce28c4fbb0e102d4a4fb2e-Abstract.html"},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.467"},{"key":"e_1_3_2_1_64_1","volume-title":"Self-supervised learning of motion capture. arXiv preprint arXiv:1712.01337","author":"Fish Tung Hsiao-Yu","year":"2017","unstructured":"Hsiao-Yu Fish Tung, Hsiao-Wei Tung, Ersin Yumer, and Katerina Fragkiadaki. 2017c. 
Self-supervised learning of motion capture. arXiv preprint arXiv:1712.01337 (2017)."},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01234-2_2"},{"key":"e_1_3_2_1_66_1","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4--9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5998--6008. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html"},{"key":"e_1_3_2_1_67_1","volume-title":"Proceedings, Part X (Lecture Notes in Computer Science","volume":"11214","author":"von Marcard Timo","year":"2018","unstructured":"Timo von Marcard, Roberto Henschel, Michael J. Black, Bodo Rosenhahn, and Gerard Pons-Moll. 2018. 
Recovering Accurate 3D Human Pose in the Wild Using IMUs and a Moving Camera. In Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8--14, 2018, Proceedings, Part X (Lecture Notes in Computer Science, Vol. 11214), Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, and Yair Weiss (Eds.). 614--631. https:\/\/doi.org\/10.1007\/978-3-030-01249-6_37"},{"key":"e_1_3_2_1_68_1","volume-title":"Convolutional Pose Machines. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016","author":"Wei Shih-En","year":"2016","unstructured":"Shih-En Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh. 2016. Convolutional Pose Machines. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27--30, 2016. 4724--4732. https:\/\/doi.org\/10.1109\/CVPR.2016.511"},{"key":"e_1_3_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01122"},{"key":"e_1_3_2_1_70_1","volume-title":"Volumetric Correspondence Networks for Optical Flow. 
In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019","author":"Yang Gengshan","year":"2019","unstructured":"Gengshan Yang and Deva Ramanan. 2019. Volumetric Correspondence Networks for Optical Flow. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8--14, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.). 793--803. 
https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/bbf94b34eb32268ada57a3be5062fe7d-Abstract.html"},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00229"},{"key":"e_1_3_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.280"},{"key":"e_1_3_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00244"},{"key":"e_1_3_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.51"},{"key":"e_1_3_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.537"}],"event":{"name":"ICMR '21: International Conference on Multimedia Retrieval","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Taipei Taiwan","acronym":"ICMR '21"},"container-title":["Proceedings of the 2021 International Conference on Multimedia Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3460426.3463605","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3460426.3463605","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:03Z","timestamp":1750191423000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3460426.3463605"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,24]]},"references-count":75,"alternative-id":["10.1145\/3460426.3463605","10.1145\/3460426"],"URL":"https:\/\/doi.org\/10.1145\/3460426.3463605","relation":{},"subject":[],"published":{"date-parts":[[2021,8,24]]},"assertion":[{"value":"2021-09-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}