{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:17:59Z","timestamp":1760242679023,"version":"build-2065373602"},"reference-count":52,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2016,2,20]],"date-time":"2016-02-20T00:00:00Z","timestamp":1455926400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Pedestrian detection and human pose estimation are instructive for reconstructing a three-dimensional scenario and for robot navigation, particularly when large amounts of vision data are captured using various data-recording techniques. Using an unrestricted capture scheme, which produces occlusions or breezing, the information describing each part of a human body and the relationship between each part or even different pedestrians must be present in a still image. Using this framework, a multi-layered, spatial, virtual, human pose reconstruction framework is presented in this study to recover any deficient information in planar images. In this framework, a hierarchical parts-based deep model is used to detect body parts by using the available restricted information in a still image and is then combined with spatial Markov random fields to re-estimate the accurate joint positions in the deep network. Then, the planar estimation results are mapped onto a virtual three-dimensional space using multiple constraints to recover any deficient spatial information. The proposed approach can be viewed as a general pre-processing method to guide the generation of continuous, three-dimensional motion data. 
The experiment results of this study are used to describe the effectiveness and usability of the proposed approach.<\/jats:p>","DOI":"10.3390\/s16020263","type":"journal-article","created":{"date-parts":[[2016,2,22]],"date-time":"2016-02-22T10:24:17Z","timestamp":1456136657000},"page":"263","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["A Layered Approach for Robust Spatial Virtual Human Pose Reconstruction Using a Still Image"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8462-8636","authenticated-orcid":false,"given":"Chengyu","family":"Guo","sequence":"first","affiliation":[{"name":"State Key Lab of Virtual Reality Technology and Systems, Beihang University, Xueyuan Road No.37, Haidian District, Beijing 100000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Songsong","family":"Ruan","sequence":"additional","affiliation":[{"name":"State Key Lab of Virtual Reality Technology and Systems, Beihang University, Xueyuan Road No.37, Haidian District, Beijing 100000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaohui","family":"Liang","sequence":"additional","affiliation":[{"name":"State Key Lab of Virtual Reality Technology and Systems, Beihang University, Xueyuan Road No.37, Haidian District, Beijing 100000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qinping","family":"Zhao","sequence":"additional","affiliation":[{"name":"State Key Lab of Virtual Reality Technology and Systems, Beihang University, Xueyuan Road No.37, Haidian District, Beijing 100000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2016,2,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ramanan, D., Forsyth, D.A., and Zisserman, A. (2007). Tracking People by Learning their Appearance. IEEE Trans. Pattern Anal. Mach. 
Intell.","DOI":"10.1109\/TPAMI.2007.250600"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Sminchisescu, C., Kanaujia, A., and Metaxas, D. (2007). Discriminative Density Propagation for Visual Tracking. IEEE Trans. Pattern Anal. Mach. Intell.","DOI":"10.1109\/TPAMI.2007.1111"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_4","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Johnson, S., and Everingham, M. (2009, January 28\u201328). Combining discriminative appearance and segmentation cues for articulated human pose estimation. Proceedings of the 2nd IEEE international workshop on machine learning for vision-based motion analysis, Kyoto, Japan.","DOI":"10.1109\/ICCVW.2009.5457673"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Ramanan, D. (2011, January 20\u201325). Articulated pose estimation with flexible mixtures-of-parts. Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995741"},{"key":"ref_7","unstructured":"Pishchulin, L., Andriluka, M., Gehler, P., and Schiele, B. (2013, January 1\u20138). Strong appearance and expressive spatial models for human pose estimation. 
Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","article-title":"Object detection with discriminatively trained part-based models","volume":"32","author":"Felzenszwalb","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","unstructured":"Bourdev, L.D., and Malik, J. (October, January 27). Poselets: Body part detectors trained using 3D human pose annotations. Proceedings of the IEEE 12th International Conference on Computer Vision, Kyoto, Japan."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1145\/1015706.1015720","article-title":"GrabCut: Interactive Foreground Extraction Using Iterated Graph Cuts","volume":"23","author":"Rother","year":"2004","journal-title":"ACM Trans. Graph."},{"key":"ref_11","unstructured":"Agarwal, A., and Triggs, B. (July, January 27). 3D human pose from silhouettes by relevance vector regression. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA."},{"key":"ref_12","unstructured":"Gupta, A., Chen, T., Chen, F., Kimber, D., and Davis, L.S. (2008, January 23\u201328). Context and observation driven latent variable model for human pose estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"4189","DOI":"10.3390\/s140304189","article-title":"A Survey on Model Based Approaches for 2D and 3D Visual Human Pose Recovery","volume":"14","author":"Escalera","year":"2014","journal-title":"Sensors"},{"key":"ref_14","unstructured":"Perez-Sala, X., Escalera, S., and Angulo, C. (2012, January 24\u201326). Survey on spatio-temporal view invariant human pose recovery. 
Proceedings of the 15th International Conference of the Catalan Association of Artificial Intelligence, Alicante, Spain."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1582","DOI":"10.1109\/TPAMI.2009.154","article-title":"Evaluating color descriptors for object and scene recognition","volume":"32","author":"Sande","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/BF01420984","article-title":"Performance of optical flow techniques","volume":"12","author":"Barron","year":"1994","journal-title":"Int. J. Comput. Vis."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Simo-Serra, E., Quattoni, A., Torras, C., and Moreno-Noguer, F. (2013, January 25\u201327). A joint model for 2D and 3D pose estimation from a single image. Proceedings of the IEEE Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.466"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Dantone, M., Gall, J., Leistner, C., and van Gool, L. (2013, January 25\u201327). Human pose estimation using body parts dependent joint regressors. Proceedings of the IEEE Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.391"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Belagiannis, V., Amin, S., Andriluka, M., Schiele, B., Navab, N., and Ilic, S. (2014, January 23\u201328). 3D pictorial structures for multiple human pose estimation. Proceedings of the IEEE Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.216"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Toshev, A., and Szegedy, C. (2014, January 23\u201328). Deep pose: Human pose estimation via deep neural networks. Proceedings of the IEEE Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.214"},{"key":"ref_21","unstructured":"Chen, X., and Yuille, A. 
(2014, January 8\u201313). Articulated pose estimation by a graphical model with image dependent pairwise relations. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_22","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20138). Imagenet classification with deep convolutional neural networks. Proceedings of the Neural Information Processing Systems, Stateline, NV, USA."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1935","DOI":"10.1145\/2629500","article-title":"Real-time continuous pose recovery of human hands using convolutional networks","volume":"33","author":"Tompson","year":"2014","journal-title":"ACM Trans. Graph."},{"key":"ref_24","unstructured":"Erhan, D., Bengio, Y., Courville, A., and Vincent, P. (2009, January 14\u201318). Visualizing higher-layer features of a deep network. Proceedings of the ICML, Montreal, QC, Canada."},{"key":"ref_25","unstructured":"Jain, A., Tompson, J., Andriluka, M., Taylor, G.W., and Bregler, C. (2014, January 24\u201327). Learning human pose estimation features with convolutional networks. Proceedings of the Computer Vision and Pattern Recognition, Columbus, OH, USA."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1610","DOI":"10.1109\/TNN.2010.2066286","article-title":"Human tracking using convolutional neural networks","volume":"21","author":"Fan","year":"2010","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wang, K., Wang, X., Lin, L., Wang, M., and Zuo, W. (2015, January 23\u201326). 3D human activity recognition with reconfigurable convolutional neural networks. 
Proceedings of the ACM International Conference on Multimedia, Shanghai, China.","DOI":"10.1145\/2647868.2654912"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1109\/TPAMI.2012.59","article-title":"3D convolutional neural networks for human action recognition","volume":"35","author":"Shuiwang","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1007\/s11263-013-0634-z","article-title":"Rotation-invariant hog descriptors using Fourier analysis in polar and spherical coordinates","volume":"106","author":"Liu","year":"2014","journal-title":"Int. J. Comput. Vis."},{"key":"ref_30","unstructured":"Giusti, A., Ciresan, D.C., Masci, J., Gambardella, L.M., and Schmidhuber, J. Fast image scanning with deep max-pooling convolutional neural networks. Available online: http:\/\/arxiv.org\/abs\/1302.1700v1."},{"key":"ref_31","unstructured":"Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., and Lecun, Y. (2013, January 25\u201327). Overfeat: Integrated recognition, localization and detection using convolutional networks. Proceedings of the Computer Vision and Pattern Recognition, Portland, OR, USA."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Ouyang, W., Zeng, X., and Wang, X. (2015). Partial occlusion handling in pedestrian detection with a deep model. IEEE Trans. Circuits Syst. Video Technol.","DOI":"10.1109\/TCSVT.2015.2501940"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Ouyang, W., and Wang, X. (2013, January 1\u20138). Joint deep learning for pedestrian detection. Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia.","DOI":"10.1109\/ICCV.2013.257"},{"key":"ref_34","unstructured":"Mikolajczyk, K., Leibe, B., and Schiele, B. (2006, January 17\u201322). Multiple object class detection with a generative model. 
Proceedings of the Computer Vision and Pattern Recognition, New York, NY, USA."},{"key":"ref_35","unstructured":"Sigal, L., and Black, M.J. Humaneva: Synchronized video and motion capture dataset for evaluation of articulated human motion. Available online: http:\/\/humaneva.is.tue.mpg.de\/."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, C., Wang, Y., Lin, Z., Yuille, A.L., and Gao, W. (2014, January 23\u201328). Robust estimation of 3D human poses from a single image. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.303"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1145\/1015706.1015754","article-title":"Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces","volume":"23","author":"Safonova","year":"2004","journal-title":"ACM Trans. Graph."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1002\/cav.1534","article-title":"Flexible editing of human motion by three-way decomposition","volume":"25","author":"He","year":"2014","journal-title":"Comput. Anim. Virtual Worlds"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Guo, C., Ruan, S., and Liang, X. (2015, January 17\u201318). Synthesis and editing of human motion with generative human motion model. Proceedings of the International Conference on Virtual Reality and Visualization, Xiamen, China.","DOI":"10.1109\/ICVRV.2015.42"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"743","DOI":"10.1109\/TPAMI.2011.155","article-title":"Pedestrian detection: an evaluation of the state of the art","volume":"34","author":"Dollar","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Ramanan, D. (2006, January December). Learning to parse images of articulated bodies. 
Proceedings of the Twentieth Annual Conference on Neural Information Processing Systems, Vancouver, BC, Canada.","DOI":"10.7551\/mitpress\/7503.003.0146"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Sapp, B., and Taskar, B. (2013, January 25\u201327). MODEC: Multimodal decomposable models for human pose estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.471"},{"key":"ref_43","unstructured":"CMU Human Motion Capture Database. Available online: http:\/\/mocap.cs.cmu.edu\/search.html."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1023\/B:VISI.0000013087.49260.fb","article-title":"Robust Real-Time Face Detection","volume":"57","author":"Viola","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Walk, S., Majer, N., Schindler, K., and Schiele, B. (2010, January 13\u201318). New features and insights for pedestrian detection. Proceedings of the Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5540102"},{"key":"ref_46","unstructured":"Ouyang, W., and Wang, X. (2012, January 16\u201321). A Discriminative deep model for pedestrian detection with occlusion handling. Proceedings of the Computer Vision and Pattern Recognition, Providence, RI, USA."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Ouyang, W., Zeng, X., and Wang, X. (2013, January 25\u201327). Modeling mutual visibility relationship with a deep model in pedestrian detection. Proceedings of the Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.414"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Doll\u00e1r, P., Tu, Z., Perona, P., and Belongie, S. (2009, January 7\u201310). Integral channel features. 
Proceedings of the British Machine Vision Conference, London, UK.","DOI":"10.5244\/C.23.91"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Doll\u00e1r, P., Appel, R., Belongie, S., and Perona, P. (2014). Fast feature pyramids for object detection. IEEE Trans. Pattern Anal. Mach. Intell.","DOI":"10.1109\/TPAMI.2014.2300479"},{"key":"ref_50","unstructured":"Tompson, J., Jain, A., Lecun, Y., and Bregler, C. (2014, January 23\u201328). Joint training of a convolutional network and a graphical model for human pose estimation. Proceedings of the Computer Vision and Pattern Recognition, Columbus, OH, USA."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Daubney, B., and Xie, X. (2011, January 20\u201325). Tracking 3D human pose with large root node uncertainty. Proceedings of the Computer Vision and Pattern Recognition, Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995502"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Simo-Serra, E., Ramisa, A., Aleny\u00e0, G., Torras, C., and Moreno-Noguer, F. (2012, January 16\u201321). Single image 3D human pose estimation from noisy observations. 
Proceedings of the Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6247988"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/16\/2\/263\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:19:26Z","timestamp":1760210366000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/16\/2\/263"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,2,20]]},"references-count":52,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2016,2]]}},"alternative-id":["s16020263"],"URL":"https:\/\/doi.org\/10.3390\/s16020263","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2016,2,20]]}}}