{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:56:55Z","timestamp":1760151415357,"version":"build-2065373602"},"reference-count":39,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2022,3,4]],"date-time":"2022-03-04T00:00:00Z","timestamp":1646352000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Science Foundation of Shandong Province","award":["ZR2020MF005"],"award-info":[{"award-number":["ZR2020MF005"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>The occlusion problem is one of the fundamental problems of computer vision, especially in the case of non-rigid objects with variable shapes and complex backgrounds, such as humans. With the rise of computer vision in recent years, the problem of occlusion has also become increasingly visible in branches such as human pose estimation, where the object of study is a human being. In this paper, we propose a two-stage framework that solves the human de-occlusion problem. The first stage is the amodal completion stage, where a new network structure is designed based on the hourglass network, and a large amount of prior information is obtained from the training set to constrain the model to predict in the correct direction. The second phase is the content recovery phase, where visible guided attention (VGA) is added to the U-Net with a symmetric U-shaped network structure to derive relationships between visible and invisible regions and to capture information between contexts across scales. As a whole, the first stage is the encoding stage, and the second stage is the decoding stage, and the network structure of each stage also consists of encoding and decoding, which is symmetrical overall and locally. To evaluate the proposed approach, we provided a dataset, the human occlusion dataset, which has occluded objects from drilling scenes and synthetic images that are close to reality. Experiments show that the method has high performance in terms of quality and diversity compared to existing methods. It is able to remove occlusions in complex scenes and can be extended to human pose estimation.<\/jats:p>","DOI":"10.3390\/sym14030531","type":"journal-article","created":{"date-parts":[[2022,3,6]],"date-time":"2022-03-06T20:40:02Z","timestamp":1646599202000},"page":"531","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Removal and Recovery of the Human Invisible Region"],"prefix":"10.3390","volume":"14","author":[{"given":"Qian","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China"}]},{"given":"Qiyao","family":"Liang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China"}]},{"given":"Hong","family":"Liang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China"}]},{"given":"Ying","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,3,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_2","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_3","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_5","first-page":"91","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., and Zhang, L. (2020, January 14\u201320). Higherhrnet: Scale-aware representation learning for bottom-up human pose estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00543"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Cao, Z., Simon, T., Wei, S.E., and Sheikh, Y. (2017, January 21\u201326). Realtime multi-person 2d pose estimation using part affinity fields. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.143"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Newell, A., Yang, K., and Deng, J. (2016, January 11\u201314). Stacked hourglass networks for human pose estimation. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Liu, Z., Chen, H., Feng, R., Wu, S., Ji, S., Yang, B., and Wang, X. (2021, January 20\u201325). Deep Dual Consecutive Network for Human Pose Estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00059"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Artacho, B., and Savakis, A. (2020, January 14\u201320). Unipose: Unified human pose estimation in single images and videos. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00706"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Kuznetsova, A., Maleva, T., and Soloviev, V. (2020). Using YOLOv3 algorithm with pre-and post-processing for apple detection in fruit-harvesting robot. Agronomy, 10.","DOI":"10.3390\/agronomy10071016"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"8577","DOI":"10.1109\/ACCESS.2022.3143524","article-title":"Artificial Neural Networks and Computer Vision\u2019s-Based Phytoindication Systems for Variable Rate Irrigation Improving","volume":"10","author":"Kamyshova","year":"2022","journal-title":"IEEE Access"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Korchagin, S.A., Gataullin, S.T., Osipov, A.V., Smirnov, M.V., Suvorov, S.V., Serdechnyi, D.V., and Bublikov, K.V. (2021). Development of an Optimal Algorithm for Detecting Damaged and Diseased Potato Tubers Moving along a Conveyor Belt Using Computer Vision Systems. Agronomy, 11.","DOI":"10.3390\/agronomy11101980"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Andriyanov, N., Khasanshin, I., Utkin, D., Gataullin, T., Ignar, S., Shumaev, V., and Soloviev, V. (2022). Intelligent System for Estimation of the Spatial Position of Apples Based on YOLOv3 and Real Sense Depth Camera D415. Symmetry, 14.","DOI":"10.3390\/sym14010148"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Chu, X., Zheng, A., Zhang, X., and Sun, J. (2020, January 14\u201320). Detection in crowded scenes: One proposal, multiple predictions. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01223"},{"key":"ref_16","unstructured":"Dai, H., Zhou, L., Zhang, F., Zhang, Z., Hu, H., Zhu, X., and Ye, M. (2020). Joint COCO and Mapillary Workshop at ICCV 2019 Keypoint Detection Challenge Track Technical Report: Distribution\u2014Aware Coordinate Representation for Human Pose Estimation. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1145\/1531326.1531330","article-title":"PatchMatch: A randomized correspondence algorithm for structural image editing","volume":"28","author":"Barnes","year":"2009","journal-title":"ACM Trans. Graph."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., and Efros, A.A. (2016, January 27\u201330). Context encoders: Feature learning by inpainting. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.278"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Ehsani, K., Roozbeh, M., and Ali, F. (2018, January 18\u201323). Segan: Segmenting and generating the invisible. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00643"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhan, X., Pan, X., Dai, B., Liu, Z., Lin, D., and Loy, C.C. (2020, January 14\u201320). Self-supervised scene de-occlusion. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00384"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yan, X., Wang, F., Liu, W., Yu, Y., He, S., and Pan, J. (2019, January 27\u201328). Visualizing the invisible: Occluded vehicle segmentation and recovery. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00771"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Li, K., and Malik, J. (2016, January 8\u201316). Amodal instance segmentation. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46475-6_42"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Xiao, Y., Xu, Y., Zhong, Z., Luo, W., Li, J., and Gao, S. (2020). Amodal Segmentation Based on Visible Region Segmentation and Shape Prior. arXiv.","DOI":"10.1609\/aaai.v35i4.16407"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Yang, C., Lu, X., Lin, Z., Shechtman, E., Wang, O., and Li, H. (2017, January 21\u201326). High-resolution image inpainting using multi-scale neural patch synthesis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.434"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3072959.3073659","article-title":"Globally and locally consistent image completion","volume":"36","author":"Iizuka","year":"2017","journal-title":"ACM Trans. Graph."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Liu, G., Reda, F.A., Shih, K.J., Wang, T.C., Tao, A., and Catanzaro, B. (2018, January 8\u201314). Image inpainting for irregular holes using partial convolutions. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01252-6_6"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liu, H., Wan, Z., Huang, W., Song, Y., Han, X., and Liao, J. (2021, January 20\u201325). PD-GAN: Probabilistic Diverse GAN for Image Inpainting. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00925"},{"key":"ref_28","unstructured":"Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., and Ebrahimi, M. (2019). Edgeconnect: Generative image inpainting with adversarial edge learning. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-image translation with conditional adversarial networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zeng, Y., Fu, J., Chao, H., and Guo, B. (2019, January 15\u201320). Learning pyramid-context encoder network for high-quality image inpainting. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00158"},{"key":"ref_31","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yamaguchi, K., Kiapour, M.H., Ortiz, L.E., and Berg, T.L. (2012, January 16\u201321). Parsing clothing in fashion photographs. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248101"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Bolya, D., Zhou, C., Xiao, F., and Lee, Y.J. (2019, January 27\u201328). Yolact: Real-time instance segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00925"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_36","unstructured":"Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lerer, A., Lin, Z., Desmaison, A., and Antiga, L. (2017, January 9). Automatic differentiation in pytorch. Proceedings of the NIPS 2017 Autodiff Workshop, Long Beach, CA, USA."},{"key":"ref_37","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_38","unstructured":"Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., and Hochreiter, S. (2017). Gans trained by a two time-scale update rule converge to a local nash equilibrium. Adv. Neural Inf. Process. Syst., 30."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Fang, H.S., Lu, G., Fang, X., Xie, J., Tai, Y.W., and Lu, C. (2018). Weakly and semi supervised human body part parsing via pose-guided knowledge transfer. arXiv.","DOI":"10.1109\/CVPR.2018.00015"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/3\/531\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:32:22Z","timestamp":1760135542000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/3\/531"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,4]]},"references-count":39,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,3]]}},"alternative-id":["sym14030531"],"URL":"https:\/\/doi.org\/10.3390\/sym14030531","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2022,3,4]]}}}