{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T03:22:55Z","timestamp":1774495375740,"version":"3.50.1"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2019,11,30]],"date-time":"2019-11-30T00:00:00Z","timestamp":1575072000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Shenzhen Science and Technology Foundation","award":["JCYJ20170816093943197"],"award-info":[{"award-number":["JCYJ20170816093943197"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2019,11,30]]},"abstract":"<jats:p>\n            Although Deep Convolutional Neural Networks (DCNNs) facilitate the evolution of 3D human pose estimation, ambiguity remains the most challenging problem in such tasks. Inspired by the Human Perception Mechanism (HPM), we propose an image-to-pose coding method to fill the gap between image cues and 3D poses, thereby alleviating the ambiguity of 3D human pose estimation. First, in 3D pose space, we divide the whole 3D pose space into multiple subregions named\n            <jats:italic>pose codes<\/jats:italic>\n            , turning a disambiguation problem into a classification problem. The proposed coding mechanism covers multiple camera views and provides a complete description for 3D pose space. Second, it is noteworthy that the articulated structure of the human body lies on a sophisticated product manifold and the error accumulation in the chain structure will undoubtedly affect the coding performance. Therefore, in image space, we extract the image cues from independent local image patches rather than the whole image. 
The mapping relationship between image cues and 3D pose codes is established by a set of DCNNs. The image-to-pose coding method transforms the implicit image cues into explicit constraints. Finally, the image-to-pose coding method is integrated into a linear matching mechanism to construct a 3D pose estimation method that effectively alleviates the ambiguity. We conduct extensive experiments on widely used public benchmarks. The experimental results show that our method effectively alleviates the ambiguity in 3D pose recovery and is robust to the variations of view.\n          <\/jats:p>","DOI":"10.1145\/3368066","type":"journal-article","created":{"date-parts":[[2019,12,16]],"date-time":"2019-12-16T13:12:30Z","timestamp":1576501950000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["An Image Cues Coding Approach for 3D Human Pose Estimation"],"prefix":"10.1145","volume":"15","author":[{"given":"Meng","family":"Xing","sequence":"first","affiliation":[{"name":"Tianjin University, Tianjin, China"}]},{"given":"Zhiyong","family":"Feng","sequence":"additional","affiliation":[{"name":"Tianjin University, Tianjin, China"}]},{"given":"Yong","family":"Su","sequence":"additional","affiliation":[{"name":"Tianjin University, Tianjin, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0330-6908","authenticated-orcid":false,"given":"Jianhai","family":"Zhang","sequence":"additional","affiliation":[{"name":"Tianjin University, Tianjin, China"}]}],"member":"320","published-online":{"date-parts":[[2019,12,16]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1446--1455","author":"Akhter Ijaz","unstructured":"Ijaz Akhter and Michael J. Black . 2015. Pose-conditioned joint angle limits for 3D human pose reconstruction . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
IEEE, 1446--1455."},{"key":"e_1_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Sikandar Amin, Mykhaylo Andriluka, Marcus Rohrbach, and Bernt Schiele. 2013. Multi-view pictorial structures for 3D human pose estimation. In BMVC. Citeseer.","DOI":"10.5244\/C.27.45"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.471"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5540156"},{"key":"e_1_2_1_5_1","volume-title":"Black","author":"Bogo Federica","year":"2016","unstructured":"Federica Bogo, Angjoo Kanazawa, Christoph Lassner, Peter Gehler, Javier Romero, and Michael J. Black. 2016. Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image. In European Conference on Computer Vision. Springer, 561--578."},{"key":"e_1_2_1_6_1","volume-title":"Convex Optimization","author":"Boyd Stephen","unstructured":"Stephen Boyd and Lieven Vandenberghe. 2004. Convex Optimization. Cambridge University Press."},{"key":"e_1_2_1_7_1","volume-title":"Conference on Computer Vision and Pattern Recognition. IEEE, 7035--7043","author":"Chen Ching-Hang","year":"2017","unstructured":"Ching-Hang Chen and Deva Ramanan. 2017. 3D human pose estimation = 2D pose estimation + matching. 
In Conference on Computer Vision and Pattern Recognition. IEEE, 7035--7043."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/APSIPA.2015.7415430"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10590-1_12"},{"key":"e_1_2_1_10_1","volume-title":"The National Conference on Artificial Intelligence. AAAI, 6821--6828","author":"Fang Haoshu","year":"2018","unstructured":"Haoshu Fang, Yuanlu Xu, Wenguan Wang, Xiaobai Liu, and Song-Chun Zhu. 2018. Learning pose grammar to encode human body configuration for 3D pose estimation. In The National Conference on Artificial Intelligence. AAAI, 6821--6828."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_10"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-011-0451-1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.248"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.24.12"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.28.80"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/0734-189X(85)90094-5"},{"key":"e_1_2_1_17_1","volume-title":"Chan","author":"Li Sijin","year":"2014","unstructured":"Sijin Li and Antoni B. Chan. 2014. 3D human pose estimation from monocular images with deep convolutional neural network. 
In Asian Conference on Computer Vision. Springer, 332--347."},{"key":"e_1_2_1_18_1","volume-title":"Transactions on Pattern Analysis and Machine Intelligence","author":"Liu Jun","unstructured":"Jun Liu, Henghui Ding, Amir Shahroudy, Ling-Yu Duan, Xudong Jiang, Gang Wang, and Alex Kot Chichung. 2019. Feature boosting network for 3D pose estimation. In Transactions on Pattern Analysis and Machine Intelligence. IEEE, 1--11."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818013"},{"key":"e_1_2_1_20_1","volume-title":"International Conference on Computer Vision. IEEE, 2640--2649","author":"Martinez Julieta","unstructured":"Julieta Martinez, Rayat Hossain, Javier Romero, and James J. Little. 2017. A simple yet effective baseline for 3D human pose estimation. In International Conference on Computer Vision. 
IEEE, 2640--2649."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2017.00064"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.170"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.373"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2004.1315139"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.139"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33765-9_41"},{"key":"e_1_2_1_28_1","volume-title":"Advances in Neural Information Processing Systems","author":"Rogez Gr\u00e9gory","unstructured":"Gr\u00e9gory Rogez and Cordelia Schmid. 2016. Mocap-guided data augmentation for 3D pose estimation in the wild. In Advances in Neural Information Processing Systems. MIT Press, 3108--3116."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-009-0293-2"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-011-0493-4"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2012.6247988"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180420"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1006\/cviu.2000.0878"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.113"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.603"},{"key":"e_1_2_1_36_1","volume-title":"IEEE 12th International Conference on Computer Vision. IEEE","author":"Xiaolin","unstructured":"Xiaolin K. Wei and Jinxiang Chai. 2009. Modeling 3D human poses from uncalibrated monocular images. 
In IEEE 12th International Conference on Computer Vision. IEEE, 1873--1880."},{"key":"e_1_2_1_37_1","volume-title":"Freeman","author":"Wu Jiajun","year":"2016","unstructured":"Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, and William T. Freeman. 2016. Single image 3D interpreter network. In European Conference on Computer Vision. Springer, 365--382."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00551"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.535"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.51"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.537"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2816031"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and 
Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3368066","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3368066","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:44:39Z","timestamp":1750203879000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3368066"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,30]]},"references-count":42,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,11,30]]}},"alternative-id":["10.1145\/3368066"],"URL":"https:\/\/doi.org\/10.1145\/3368066","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,11,30]]},"assertion":[{"value":"2018-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-12-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}