{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,15]],"date-time":"2025-04-15T13:36:58Z","timestamp":1744724218987,"version":"3.37.3"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,8,31]],"date-time":"2020-08-31T00:00:00Z","timestamp":1598832000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,8,31]],"date-time":"2020-08-31T00:00:00Z","timestamp":1598832000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["IPSJ T Comput Vis Appl"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We address a 3D human pose estimation for equirectangular images taken by a wearable omnidirectional camera. The equirectangular image is distorted because the omnidirectional camera is attached closely in front of a person\u2019s neck. Furthermore, some parts of the body are disconnected on the image; for instance, when a hand goes out to an edge of the image, the hand comes in from another edge. The distortion and disconnection of images make 3D pose estimation challenging. To overcome this difficulty, we introduce the location-maps method proposed by Mehta et al.; however, the method was used to estimate 3D human poses only for regular images without distortion and disconnection. We focus on a characteristic of the location-maps that can extend 2D joint locations to 3D positions with respect to 2D-3D consistency without considering kinematic model restrictions and optical properties. In addition, we collect a new dataset that is composed of equirectangular images and synchronized 3D joint positions for training and evaluation. We validate the location-maps\u2019 capability to estimate 3D human poses for distorted and disconnected images. We propose a new location-maps-based model by replacing the backbone network with a state-of-the-art 2D human pose estimation model (HRNet). Our model is a simpler architecture than the reference model proposed by Mehta et al. Nevertheless, our model indicates better performance with respect to accuracy and computation complexity. Finally, we analyze the location-maps method from two perspectives: the map variance and the map scale. Therefore, some location-maps characteristics are revealed that (1) the map variance affects robustness to extend 2D joint locations to 3D positions for the 2D estimation error, and (2) the 3D position accuracy is related to the 2D locations relative accuracy to the map scale.<\/jats:p>","DOI":"10.1186\/s41074-020-00066-8","type":"journal-article","created":{"date-parts":[[2020,8,31]],"date-time":"2020-08-31T10:03:10Z","timestamp":1598868190000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["3D human pose estimation model using location-maps for distorted and disconnected images by a wearable omnidirectional camera"],"prefix":"10.1186","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9628-1414","authenticated-orcid":false,"given":"Teppei","family":"Miura","sequence":"first","affiliation":[]},{"given":"Shinji","family":"Sako","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,8,31]]},"reference":[{"key":"66_CR1","first-page":"4160","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"J Pu","year":"2019","unstructured":"Pu J, Zhou W, Li H (2019) Iterative alignment network for continuous sign language recognition In: IEEE Conference on Computer Vision and Pattern Recognition, 4160\u20134169.. IEEE, New York."},{"key":"66_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TPAMI.2019.2917037","volume":"42","author":"O Koller","year":"2019","unstructured":"Koller O, Camgoz C, Ney H, Bowden R (2019) Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos. IEEE Trans Pattern Anal Mach Intell 42:1\u20131.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"66_CR3","first-page":"1302","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"Cao Z","year":"2017","unstructured":"Cao Z, Simon T, Wei SE, Sheikh Y (2017) Realtime multi-person 2D pose estimation using part affinity fields In: IEEE Conference on Computer Vision and Pattern Recognition, 1302\u20131310.. IEEE, New York."},{"key":"66_CR4","doi-asserted-by":"crossref","unstructured":"Mehta D, Sridhar S, Sotnychenko O, Rhodin H Shafiei M, Seidel HP, et al. (2017) VNect: real-time 3D human pose estimation with a single RGB camera In: ACM Transactions on Graphics. vol. 36, 1\u201314.. ACM, New York, NY, USA.","DOI":"10.1145\/3072959.3073596"},{"issue":"5","key":"66_CR5","doi-asserted-by":"publisher","first-page":"2093","DOI":"10.1109\/TVCG.2019.2898650","volume":"25","author":"W Xu","year":"2019","unstructured":"Xu W, Chatterjee A, Zollh\u00f6fer M, Rhodin H, Fua P, Seidel HP, et al. (2019) Mo2Cap2: real-time mobile 3D motion capture with a cap-mounted fisheye camera. IEEE Trans Vis Comput Graph 25(5):2093\u20132101.","journal-title":"IEEE Trans Vis Comput Graph"},{"key":"66_CR6","first-page":"5686","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"K Sun","year":"2019","unstructured":"Sun K, Xiao B, Liu D, Wang J (2019) Deep high-resolution representation learning for human pose estimation In: IEEE Conference on Computer Vision and Pattern Recognition, 5686\u20135696.. IEEE, New York."},{"key":"66_CR7","first-page":"2345","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"G Pons-Moll","year":"2014","unstructured":"Pons-Moll G, Fleet DJ, Rosenhahn B (2014) Posebits for monocular human pose estimation In: IEEE Conference on Computer Vision and Pattern Recognition, 2345\u20132352.. IEEE, New York."},{"key":"66_CR8","doi-asserted-by":"crossref","unstructured":"Ionescu C, Carreira J, Sminchisescu C (2014) Iterated second-order label sensitive pooling for 3D human pose estimation In: IEEE Conference on Computer Vision and Pattern Recognition, 1661\u20131668.","DOI":"10.1109\/CVPR.2014.215"},{"key":"66_CR9","doi-asserted-by":"crossref","unstructured":"Tekin B, Rozantsev A, Lepetit V, Fua P (2016) Direct prediction of 3D body poses from motion compensated sequences In: IEEE Conference on Computer Vision and Pattern Recognition, 991\u20131000.. IEEE, New York.","DOI":"10.1109\/CVPR.2016.113"},{"key":"66_CR10","doi-asserted-by":"crossref","unstructured":"Tekin B, M\u00e1rquez-Neila P, Salzmann M, Fua P (2016) Fusing 2D uncertainty and 3D cues for monocular body pose estimation. ArXiv 1611.05708 abs\/1611.05708.","DOI":"10.1109\/ICCV.2017.425"},{"key":"66_CR11","first-page":"20","volume-title":"European Conference on Computer Vision","author":"Y Du","year":"2016","unstructured":"Du Y, Wong Y, Liu Y, Han F, Gui Y, Wang Z, et al. (2016) Marker-less 3D human motion capture with monocular image sequence and height-maps In: European Conference on Computer Vision, 20\u201336.. Springer, New York."},{"key":"66_CR12","first-page":"1263","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"G Pavlakos","year":"2017","unstructured":"Pavlakos G, Zhou X, Derpanis KG, Daniilidis K (2017) Coarse-to-fine volumetric prediction for single-image 3D human pose In: IEEE Conference on Computer Vision and Pattern Recognition, 1263\u20131272.. IEEE, New York."},{"key":"66_CR13","first-page":"332","volume-title":"Asian Conference on Computer Vision","author":"S Li","year":"2015","unstructured":"Li S, Chan AB (2015) 3D human pose estimation from monocular images with deep convolutional neural network In: Asian Conference on Computer Vision, 332\u2013347.. Springer, New York."},{"key":"66_CR14","first-page":"130.1","volume-title":"British Machine Vision Conference","author":"B Tekin","year":"2016","unstructured":"Tekin B, Katircioglu I, Salzmann M, Lepetit V, Fua P (2016) Structured prediction of 3D human pose with deep neural networks In: British Machine Vision Conference, 130.1\u2013130,11.. BMVA Press, Durham."},{"key":"66_CR15","first-page":"186","volume-title":"European Conference on Computer Vision","author":"X Zhou","year":"2016","unstructured":"Zhou X, Sun X, Zhang W, Liang S, Wei Y (2016) Deep kinematic pose regression In: European Conference on Computer Vision, 186\u2013201.. Springer, New York."},{"issue":"1","key":"66_CR16","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1109\/TPAMI.2006.21","volume":"28","author":"A Agarwal","year":"2006","unstructured":"Agarwal A, Triggs B (2006) Recovering 3D human pose from monocular images. IEEE Trans Pattern Anal Mach Intell 28(1):44\u201358.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"7","key":"66_CR17","doi-asserted-by":"publisher","first-page":"1052","DOI":"10.1109\/TPAMI.2006.149","volume":"28","author":"G Mori","year":"2006","unstructured":"Mori G, Malik J (2006) Recovering 3D human body configurations using shape contexts. IEEE Trans Pattern Anal Mach Intell 28(7):1052\u20131062.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"66_CR18","first-page":"509","volume-title":"European Conference on Computer Vision. vol. 9909","author":"H Rhodin","year":"2016","unstructured":"Rhodin H, Robertini N, Casas D, Richardt C, Seidel HP, Theobalt C (2016) General automatic human shape and motion capture using volumetric contour cues In: European Conference on Computer Vision. vol. 9909, 509\u2013526.. Springer, New York."},{"key":"66_CR19","first-page":"506","volume-title":"International Conference on 3D Vision","author":"D Mehta","year":"2017","unstructured":"Mehta D, Rhodin H, Casas D, Fua P, Sotnychenko O, Xu W, et al. (2017) Monocular 3D human pose estimation in the wild using improved CNN supervision In: International Conference on 3D Vision, 506\u2013516.. IEEE, New York."},{"key":"66_CR20","first-page":"2369","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"W Chunyu","year":"2014","unstructured":"Chunyu W, Yizhou W, Zhouchen L, Alan LY, Wen G (2014) Robust estimation of 3D human poses from a single image In: IEEE Conference on Computer Vision and Pattern Recognition, 2369\u20132376.. IEEE, New York."},{"key":"66_CR21","first-page":"3634","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"Edgar SS","year":"2013","unstructured":"Edgar SS, Ariadna Q, Carme T, Francesc MN (2013) A joint model for 2D and 3D pose estimation from a single image In: IEEE Conference on Computer Vision and Pattern Recognition, 3634\u20133641.. IEEE, New York."},{"key":"66_CR22","first-page":"2673","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"SS Edgar","year":"2012","unstructured":"Edgar SS, Arnau R, Guillem A, Carme T, Francesc MN (2012) Single image 3D human pose estimation from noisy observations In: IEEE Conference on Computer Vision and Pattern Recognition, 2673\u20132680.. IEEE, New York."},{"issue":"8","key":"66_CR23","doi-asserted-by":"publisher","first-page":"1648","DOI":"10.1109\/TPAMI.2016.2605097","volume":"39","author":"Z Xiaowei","year":"2017","unstructured":"Xiaowei Z, Menglong Z, Spyridon L, Kostas D (2017) Sparse representation for 3D shape estimation: a convex relaxation approach. IEEE Trans Pattern Anal Mach Intell 39(8):1648\u20131661.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"66_CR24","doi-asserted-by":"crossref","unstructured":"Mehta D, Sotnychenko O, Mueller F, Xu W, Sridhar S, Pons-Moll G, et al. (2018) Single-shot multi-person 3d pose estimation from monocular RGB In: IEEE, 120\u2013130, New York.","DOI":"10.1109\/3DV.2018.00024"},{"key":"66_CR25","first-page":"1799","volume-title":"Advances in Neural Information Processing Systems 27. Curran Associates, Inc","author":"J Tompson","year":"2014","unstructured":"Tompson J, Jain A, LeCun Y, Bregler C (2014) Joint training of a convolutional network and a graphical model for human pose estimation In: Advances in Neural Information Processing Systems 27. Curran Associates, Inc, 1799\u20131807.. MIT Press, Cambridge."},{"key":"66_CR26","first-page":"483","volume-title":"European Conference on Computer Vision","author":"A Newell","year":"2016","unstructured":"Newell A, Yang K, Deng J (2016) Stacked hourglass networks for human pose estimation In: European Conference on Computer Vision, 483\u2013499.. Springer, New York."},{"key":"66_CR27","first-page":"4724","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"SE Wei","year":"2016","unstructured":"Wei SE, Ramakrishna V, Kanade T, Sheikh Y (2016) Convolutional pose machines In: IEEE Conference on Computer Vision and Pattern Recognition, 4724\u20134732.. IEEE, New York."},{"key":"66_CR28","first-page":"472","volume-title":"European Conference on Computer Vision","author":"B Xiao","year":"2018","unstructured":"Xiao B, Wu H, Wei Y (2018) Simple baselines for human pose estimation and tracking In: European Conference on Computer Vision, 472\u2013487.. Springer, New York."},{"key":"66_CR29","doi-asserted-by":"publisher","unstructured":"Freeman TG (2002) Portraits of the Earth: A Mathematician Looks at Maps. American Mathematical Soc. https:\/\/doi.org\/10.1111\/j.1949-8535.2002.tb00041.x.","DOI":"10.1111\/j.1949-8535.2002.tb00041.x"},{"key":"66_CR30","first-page":"770","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition","author":"K He","year":"2016","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition In: IEEE Conference on Computer Vision and Pattern Recognition, 770\u2013778.. Springer, New York."},{"key":"66_CR31","unstructured":"Kingma D, Ba J (2014) Adam: a method for stochastic optimization. ArXiv 1412.6980 abs\/1412.6980."}],"container-title":["IPSJ Transactions on Computer Vision and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s41074-020-00066-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s41074-020-00066-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s41074-020-00066-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,30]],"date-time":"2021-08-30T23:20:32Z","timestamp":1630365632000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1186\/s41074-020-00066-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,31]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["66"],"URL":"https:\/\/doi.org\/10.1186\/s41074-020-00066-8","relation":{},"ISSN":["1882-6695"],"issn-type":[{"type":"electronic","value":"1882-6695"}],"subject":[],"published":{"date-parts":[[2020,8,31]]},"assertion":[{"value":"3 February 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 August 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 August 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare that they have no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"4"}}