{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T03:41:29Z","timestamp":1769917289480,"version":"3.49.0"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2022,5,11]],"date-time":"2022-05-11T00:00:00Z","timestamp":1652227200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,5,11]],"date-time":"2022-05-11T00:00:00Z","timestamp":1652227200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"JSPS KAKENHI","award":["17H06114"],"award-info":[{"award-number":["17H06114"]}]},{"name":"Kayamori Foundation of Information Science Advancement"},{"name":"JST SPRING","award":["JPMJSP2112"],"award-info":[{"award-number":["JPMJSP2112"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2023,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Following the advances in convolutional neural networks and synthetic data generation, 3D egocentric body pose estimations from a mounted fisheye camera have been developed. Previous works estimated 3D joint positions from raw image pixels and intermediate supervision during the process. The mounted fisheye camera captures notably different images that are affected by the optical properties of the lens, angle of views, and setup positions. Therefore, 3D ego-pose estimation from a mounted fisheye camera must be trained for each set of camera optics and setup. We propose a 3D ego-pose estimation from a single mounted omnidirectional camera that captures the entire circumference by back-to-back dual fisheye cameras. 
The omnidirectional camera can capture the user\u2019s body in the 360<jats:sup>\u2218<\/jats:sup> field of view under a wide variety of motions. We also propose a simple feed-forward network model to estimate 3D joint positions from 2D joint locations. The lift-up model can be used in real time yet obtains accuracy comparable to those of previous works on our new dataset. Moreover, our model is trainable with the ground truth 3D joint positions and the unit vectors toward the 3D joint positions, which are easily generated from existing publicly available 3D mocap datasets. This advantage alleviates the data collection and training burden due to changes in the camera optics and setups, although it is limited to the effect after the 2D joint location estimation.<\/jats:p>","DOI":"10.1007\/s10489-022-03417-3","type":"journal-article","created":{"date-parts":[[2022,5,10]],"date-time":"2022-05-10T23:02:36Z","timestamp":1652223756000},"page":"2616-2628","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Simple yet effective 3D ego-pose lift-up based on vector and distance for a mounted omnidirectional camera"],"prefix":"10.1007","volume":"53","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9628-1414","authenticated-orcid":false,"given":"Teppei","family":"Miura","sequence":"first","affiliation":[]},{"given":"Shinji","family":"Sako","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,5,11]]},"reference":[{"key":"3417_CR1","doi-asserted-by":"publisher","unstructured":"Tome D, Toso M, Agapito L, Russell C (2018) Rethinking Pose in 3D: Multi-stage Refinement and Recovery for Markerless Motion Capture. 
International Conference on 3D Vision, pp 474\u2013483, https:\/\/doi.org\/10.1109\/3DV.2018.00061","DOI":"10.1109\/3DV.2018.00061"},{"key":"3417_CR2","doi-asserted-by":"publisher","unstructured":"Qiu H, Wang C, Wang J, Wang N, Zeng W (2019) Cross View Fusion for 3D Human Pose Estimation. IEEE International Conference on Computer Vision, pp 4341\u20134350. https:\/\/doi.org\/10.1109\/ICCV.2019.00444","DOI":"10.1109\/ICCV.2019.00444"},{"key":"3417_CR3","doi-asserted-by":"publisher","unstructured":"Iskakov K, Burkov E, Lempitsky V, Malkov Y (2019) Learnable Triangulation of Human Pose. IEEE International Conference on Computer Vision, pp 7717\u20137726. https:\/\/doi.org\/10.1109\/ICCV.2019.00781","DOI":"10.1109\/ICCV.2019.00781"},{"key":"3417_CR4","doi-asserted-by":"publisher","unstructured":"Tekin B, Katircioglu I, Salzmann M, Lepetit V, Fua P (2016) Structured Prediction of 3D Human Pose with Deep Neural Networks. British Machine Vision Conference, pp 130.1\u2013130.11. https:\/\/doi.org\/10.5244\/C.30.130","DOI":"10.5244\/C.30.130"},{"key":"3417_CR5","doi-asserted-by":"publisher","unstructured":"Tekin B, Rozantsev A, Lepetit V, Fua P (2016) Direct Prediction of 3D Body Poses from Motion Compensated Sequences. IEEE Conference on Computer Vision and Pattern Recognition, pp 991\u20131000. https:\/\/doi.org\/10.1007\/978-3-319-49409-8_17","DOI":"10.1007\/978-3-319-49409-8_17"},{"key":"3417_CR6","doi-asserted-by":"publisher","unstructured":"Pavlakos G, Zhou X, Derpanis KG, Daniilidis K (2017) Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose. IEEE Conference on Computer Vision and Pattern Recognition, pp 1263\u20131272. https:\/\/doi.org\/10.1109\/CVPR.2017.139","DOI":"10.1109\/CVPR.2017.139"},{"key":"3417_CR7","doi-asserted-by":"publisher","unstructured":"Zhou X, Sun X, Zhang W, Liang S, Wei Y (2016) Deep Kinematic Pose Regression. European Conference on Computer Vision, pp 186\u2013201. 
https:\/\/doi.org\/10.1007\/978-3-319-49409-8_17","DOI":"10.1007\/978-3-319-49409-8_17"},{"key":"3417_CR8","doi-asserted-by":"publisher","unstructured":"Mehta D, Rhodin H, Casas D, Fua P, Sotnychenko O, Xu W, Theobalt C (2017) Monocular 3D Human Pose Estimation in the Wild Using Improved CNN Supervision. International Conference on 3D Vision, pp 506\u2013516. https:\/\/doi.org\/10.1109\/3DV.2017.00064","DOI":"10.1109\/3DV.2017.00064"},{"key":"3417_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3072959.3073596","volume":"36","author":"D Mehta","year":"2017","unstructured":"Mehta D, Sridhar S, Sotnychenko O, Rhodin H, Shafiei M, Seidel H-P, Xu W, Casas D, Theobalt C (2017) VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera. ACM Trans Graph 36:1\u201314. https:\/\/doi.org\/10.1145\/3072959.3073596","journal-title":"ACM Trans Graph"},{"key":"3417_CR10","doi-asserted-by":"publisher","unstructured":"Mehta D, Sotnychenko O, Mueller F, Xu W, Sridhar S, Pons-Moll G, Theobalt C (2018) Single-Shot Multi-person 3D Pose Estimation from Monocular RGB. International Conference on 3D Vision, pp 120\u2013130. https:\/\/doi.org\/10.1109\/3DV.2018.00024","DOI":"10.1109\/3DV.2018.00024"},{"key":"3417_CR11","doi-asserted-by":"publisher","unstructured":"Martinez J, Hossain R, Romero J, Little JJ (2017) A Simple Yet Effective Baseline for 3d Human Pose Estimation. IEEE International Conference on Computer Vision, pp 2659\u20132668. https:\/\/doi.org\/10.1109\/ICCV.2017.288","DOI":"10.1109\/ICCV.2017.288"},{"issue":"8","key":"3417_CR12","doi-asserted-by":"publisher","first-page":"1648","DOI":"10.1109\/TPAMI.2016.2605097","volume":"39","author":"Z Xiaowei","year":"2017","unstructured":"Xiaowei Z, Menglong Z, Spyridon L, Kostas D (2017) Sparse Representation for 3D Shape Estimation: A Convex Relaxation Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (8):1648\u20131661. 
https:\/\/doi.org\/10.1109\/TPAMI.2016.2605097","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"3417_CR13","doi-asserted-by":"publisher","unstructured":"Jiang H, Grauman K (2017) Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video. IEEE Conference on Computer Vision and Pattern Recognition, pp 3501\u20133509. https:\/\/doi.org\/10.1109\/CVPR.2017.373","DOI":"10.1109\/CVPR.2017.373"},{"issue":"5","key":"3417_CR14","doi-asserted-by":"publisher","first-page":"2093","DOI":"10.1109\/TVCG.2019.2898650","volume":"25","author":"W Xu","year":"2019","unstructured":"Xu W, Chatterjee A, Zollh\u00f6fer M, Rhodin H, Fua P, Seidel H-P, Theobalt C (2019) Mo2Cap2: Real-time Mobile 3D Motion Capture with a Cap-mounted Fisheye Camera. IEEE Trans Vis Comput Graph 25(5):2093\u20132101. https:\/\/doi.org\/10.1109\/TVCG.2019.2898650","journal-title":"IEEE Trans Vis Comput Graph"},{"key":"3417_CR15","doi-asserted-by":"publisher","unstructured":"Tome D, Peluse P, Agapito L, Badino H (2019) xR-EgoPose: Egocentric 3D Human Pose From an HMD Camera. IEEE International Conference on Computer Vision, pp 7727\u20137737. https:\/\/doi.org\/10.1109\/ICCV.2019.00782","DOI":"10.1109\/ICCV.2019.00782"},{"key":"3417_CR16","doi-asserted-by":"publisher","unstructured":"Tome D, Alldieck T, Peluse P, Pons-Moll G, Agapito L, Badino H, la Torre FD (2020) SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, pp 1\u20131. https:\/\/doi.org\/10.1109\/TPAMI.2020.3029700","DOI":"10.1109\/TPAMI.2020.3029700"},{"key":"3417_CR17","doi-asserted-by":"publisher","unstructured":"Scaramuzza D, Martinelli A, Siegwart R (2006) A Toolbox for Easily Calibrating Omnidirectional Cameras. IEEE\/RSJ International Conference on Intelligent Robots and Systems, pp 5695\u20135701. 
https:\/\/doi.org\/10.1109\/IROS.2006.282372","DOI":"10.1109\/IROS.2006.282372"},{"key":"3417_CR18","doi-asserted-by":"publisher","unstructured":"Wei S-E, Ramakrishna V, Kanade T, Sheikh Y (2016) Convolutional Pose Machines. IEEE Conference on Computer Vision and Pattern Recognition, pp 4724\u20134732. https:\/\/doi.org\/10.1109\/CVPR.2016.511","DOI":"10.1109\/CVPR.2016.511"},{"key":"3417_CR19","doi-asserted-by":"publisher","unstructured":"Newell A, Yang K, Deng J (2016) Stacked Hourglass Networks for Human Pose Estimation. European Conference on Computer Vision, pp 483\u2013499. https:\/\/doi.org\/10.1007\/978-3-319-46484-8_29","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"3417_CR20","doi-asserted-by":"publisher","unstructured":"Xiao B, Wu H, Wei Y (2018) Simple Baselines for Human Pose Estimation and Tracking. European Conference on Computer Vision, pp 472\u2013487. https:\/\/doi.org\/10.1007\/978-3-030-01231-1_29","DOI":"10.1007\/978-3-030-01231-1_29"},{"key":"3417_CR21","doi-asserted-by":"publisher","unstructured":"Sun K, Xiao B, Liu D, Wang J (2019) Deep High-Resolution Representation Learning for Human Pose Estimation. IEEE Conference on Computer Vision and Pattern Recognition, pp 5686\u20135696. https:\/\/doi.org\/10.1109\/CVPR.2019.00584","DOI":"10.1109\/CVPR.2019.00584"},{"key":"3417_CR22","doi-asserted-by":"publisher","unstructured":"Ma M, Fan H, Kitani KM (2016) Going Deeper into First-Person Activity Recognition. IEEE Conference on Computer Vision and Pattern Recognition, pp 1894\u20131903. https:\/\/doi.org\/10.1109\/CVPR.2016.209","DOI":"10.1109\/CVPR.2016.209"},{"key":"3417_CR23","doi-asserted-by":"publisher","unstructured":"Cao C, Zhang Y, Wu Y, Lu H, Cheng J (2017) Egocentric Gesture Recognition Using Recurrent 3D Convolutional Neural Networks with Spatiotemporal Transformer Modules. IEEE International Conference on Computer Vision, pp 3783\u20133791. 
https:\/\/doi.org\/10.1109\/ICCV.2017.406","DOI":"10.1109\/ICCV.2017.406"},{"key":"3417_CR24","doi-asserted-by":"publisher","unstructured":"Ahuja K, Harrison C, Goel M, Xiao R (2019) MeCap: Whole-Body Digitization for Low-Cost VR\/AR Headsets. ACM Symposium on User Interface Software and Technology, pp 453\u2013462. https:\/\/doi.org\/10.1145\/3332165.3347889","DOI":"10.1145\/3332165.3347889"},{"key":"3417_CR25","doi-asserted-by":"publisher","unstructured":"Cao Z, Simon T, Wei S-E, Sheikh Y (2017) Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. IEEE Conference on Computer Vision and Pattern Recognition, pp 1302\u20131310. https:\/\/doi.org\/10.1109\/CVPR.2017.143","DOI":"10.1109\/CVPR.2017.143"},{"key":"3417_CR26","doi-asserted-by":"publisher","unstructured":"Rhodin H, Richardt C, Casas D, Insafutdinov E, Shafiei M, Seidel H-P, Schiele B, Theobalt C (2016) EgoCap: Egocentric Marker-Less Motion Capture with Two Fisheye Cameras. ACM Trans Graph, 35(6). https:\/\/doi.org\/10.1145\/2980179.2980235","DOI":"10.1145\/2980179.2980235"},{"key":"3417_CR27","doi-asserted-by":"publisher","unstructured":"Miura T, Sako S (2020) 3D human pose estimation model using location-maps for distorted and disconnected images by a wearable omnidirectional camera. IPSJ Transactions on Computer Vision and Applications. https:\/\/doi.org\/10.1186\/s41074-020-00066-8","DOI":"10.1186\/s41074-020-00066-8"},{"key":"3417_CR28","doi-asserted-by":"publisher","unstructured":"Zhang Y, You S, Gevers T (2021) Automatic Calibration of the Fisheye Camera for Egocentric 3D Human Pose Estimation From a Single Image. IEEE Winter Conference on Applications of Computer Vision, pp 1771\u20131780. https:\/\/doi.org\/10.1109\/WACV48630.2021.00181","DOI":"10.1109\/WACV48630.2021.00181"},{"key":"3417_CR29","doi-asserted-by":"publisher","unstructured":"Varol G, Romero J, Martin X, Mahmood N, Black MJ, Laptev I, Schmid C (2017) Learning from Synthetic Humans. 
IEEE Conference on Computer Vision and Pattern Recognition, pp 4627\u20134635. https:\/\/doi.org\/10.1109\/CVPR.2017.492","DOI":"10.1109\/CVPR.2017.492"},{"key":"3417_CR30","doi-asserted-by":"publisher","unstructured":"Loper M, Mahmood N, Romero J, Pons-Moll G, Black MJ (2015) SMPL: a skinned multi-person linear model. ACM Trans Graph, 34(6). https:\/\/doi.org\/10.1145\/2816795.2818013","DOI":"10.1145\/2816795.2818013"},{"key":"3417_CR31","doi-asserted-by":"publisher","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep Residual Learning for Image Recognition. IEEE Conference on Computer Vision and Pattern Recognition, pp 770\u2013778. https:\/\/doi.org\/10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"},{"key":"3417_CR32","doi-asserted-by":"publisher","unstructured":"Andriluka M, Pishchulin L, Gehler P, Schiele B (2014) 2D Human Pose Estimation: New Benchmark and State of the Art Analysis. IEEE Conference on Computer Vision and Pattern Recognition, pp 3686\u20133693. https:\/\/doi.org\/10.1109\/CVPR.2014.471","DOI":"10.1109\/CVPR.2014.471"}],"container-title":["Applied 
Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-022-03417-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10489-022-03417-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-022-03417-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,11]],"date-time":"2023-01-11T11:35:04Z","timestamp":1673436904000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10489-022-03417-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,11]]},"references-count":32,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,2]]}},"alternative-id":["3417"],"URL":"https:\/\/doi.org\/10.1007\/s10489-022-03417-3","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"value":"0924-669X","type":"print"},{"value":"1573-7497","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,11]]},"assertion":[{"value":"18 February 2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 May 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interests"}}]}}