{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T18:04:18Z","timestamp":1780596258047,"version":"3.54.1"},"reference-count":41,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2018,12,5]],"date-time":"2018-12-05T00:00:00Z","timestamp":1543968000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This work addresses the problem of automatic head pose estimation and its application in 3D gaze estimation using low quality RGB-D sensors without any subject cooperation or manual intervention. The previous works on 3D head pose estimation using RGB-D sensors require either an offline step for supervised learning or 3D head model construction, which may require manual intervention or subject cooperation for complete head model reconstruction. In this paper, we propose a 3D pose estimator based on low quality depth data, which is not limited by any of the aforementioned steps. Instead, the proposed technique relies on modeling the subject\u2019s face in 3D rather than the complete head, which, in turn, relaxes all of the constraints in the previous works. The proposed method is robust, highly accurate and fully automatic. Moreover, it does not need any offline step. Unlike some of the previous works, the method only uses depth data for pose estimation. The experimental results on the Biwi head pose database confirm the efficiency of our algorithm in handling large pose variations and partial occlusion. We also evaluated the performance of our algorithm on IDIAP database for 3D head pose and eye gaze estimation.<\/jats:p>","DOI":"10.3390\/s18124280","type":"journal-article","created":{"date-parts":[[2018,12,5]],"date-time":"2018-12-05T12:22:00Z","timestamp":1544012520000},"page":"4280","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Highly Accurate and Fully Automatic 3D Head Pose Estimation and Eye Gaze Estimation Using RGB-D Sensors and 3D Morphable Models"],"prefix":"10.3390","volume":"18","author":[{"given":"Reza","family":"Shoja Ghiass","sequence":"first","affiliation":[{"name":"Computer Vision and Systems Laboratory, Laval University, 1665 Rue de l\u2019Universite, Universite Laval, Quebec City, QC G1V 0A6, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9314-194X","authenticated-orcid":false,"given":"Ognjen","family":"Arandjelov\u0107","sequence":"additional","affiliation":[{"name":"School of Computer Science, University of St Andrews, St Andrews, KY16 9SX Scotland, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Denis","family":"Laurendeau","sequence":"additional","affiliation":[{"name":"Computer Vision and Systems Laboratory, Laval University, 1665 Rue de l\u2019Universite, Universite Laval, Quebec City, QC G1V 0A6, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2018,12,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1109\/TPAMI.2008.106","article-title":"Head pose estimation in computer vision: A survey","volume":"31","author":"Trivedi","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Fanelli, G., Weise, T., Gall, J., and Van Gool, L. (2011). Real time head pose estimation from consumer depth cameras. Pattern Recognition, Springer.","DOI":"10.1007\/978-3-642-23123-0_11"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1007\/s11263-012-0549-0","article-title":"Random Forests for Real Time 3D Face Analysis","volume":"101","author":"Fanelli","year":"2013","journal-title":"Int. J. Comput. Vis."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Breitenstein, M.D., Kuettel, D., Weise, T., Van Gool, L., and Pfister, H. (2008, January 23\u201328). Real-time face pose estimation from single range images. Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA.","DOI":"10.1109\/CVPR.2008.4587807"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Fanelli, G., Gall, J., and Van Gool, L. (2011, January 20\u201325). Real time head pose estimation with random regression forests. Proceedings of the CVPR 2011, Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995458"},{"key":"ref_6","unstructured":"Seemann, E., Nickel, K., and Stiefelhagen, R. (2004, January 19). Head pose estimation using stereo vision for human-robot interaction. Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, Seoul, Korea."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Funes Mora, K.A., and Odobez, J. (2012, January 16\u201321). Gaze estimation from multimodal Kinect data. Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA.","DOI":"10.1109\/CVPRW.2012.6239182"},{"key":"ref_8","unstructured":"Rekik, A., Ben-Hamadou, A., and Mahdi, W. (2013, January 21\u201324). 3D Face Pose Tracking using Low Quality Depth Cameras. Proceedings of the International Conference on Computer Vision Theory and Applications\u2014VISAPP 2013, Barcelona, Spain."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Cai, Q., Gallup, D., Zhang, C., and Zhang, Z. (2010, January 5\u201311). 3D Deformable Face Tracking with a Commodity Depth Camera. Proceedings of the 11th European Conference on Computer Vision Conference on Computer Vision: Part III, ECCV\u201910, Heraklion, Crete, Greece.","DOI":"10.1007\/978-3-642-15558-1_17"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Baltrusaitis, T., Robinson, P., and Morency, L.P. (2012, January 16\u201321). 3D Constrained Local Model for rigid and non-rigid facial tracking. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6247980"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Yu, Y., Mora, K.A.F., and Odobez, J.M. (June, January 30). Robust and Accurate 3D Head Pose Estimation through 3DMM and Online Head Model Reconstruction. Proceedings of the 2017 12th IEEE International Conference on Automatic Face Gesture Recognition (FG 2017), Washington, DC, USA.","DOI":"10.1109\/FG.2017.90"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Papazov, C., Marks, T.K., and Jones, M. (2015, January 7\u201312). Real-time 3D head pose and facial landmark estimation from depth images using triangular surface patch features. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299104"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Tulyakov, S., Vieriu, R.L., Semeniuta, S., and Sebe, N. (2014, January 24\u201328). Robust Real-Time Extreme Head Pose Estimation. Proceedings of the 2014 22nd International Conference on Pattern Recognition, Stockholm, Sweden.","DOI":"10.1109\/ICPR.2014.393"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1109\/TPAMI.2009.30","article-title":"In the Eye of the Beholder: A Survey of Models for Eyes and Gaze","volume":"32","author":"Hansen","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1016\/j.imavis.2005.06.001","article-title":"Eye tracking: Pupil orientation geometrical modeling","volume":"24","author":"Villanueva","year":"2006","journal-title":"Image Vis. Comput."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1016\/j.cviu.2004.07.008","article-title":"Estimating the eye gaze from one eye","volume":"98","author":"Wang","year":"2005","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"855","DOI":"10.1142\/S0218001407005697","article-title":"Gaze Tracking System Model Based on Physical Parameters","volume":"21","author":"Villanueva","year":"2007","journal-title":"IJPRAI Int. J. Pattern Recognit. Artif. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1162","DOI":"10.1109\/21.247897","article-title":"Spatially dynamic calibration of an eye-tracking system","volume":"23","author":"White","year":"1993","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1124","DOI":"10.1109\/TBME.2005.863952","article-title":"General theory of remote gaze estimation using the pupil center and corneal reflections","volume":"53","author":"Guestrin","year":"2006","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_20","unstructured":"Morimoto, C.H., Amir, A., and Flickner, M. (2002, January 11\u201315). Detecting eye position and gaze from a single camera and 2 light sources. Proceedings of the Object Recognition Supported by User Interaction for Service Robots, Quebec City, QC, Canada."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Blignaut, P. (2013). Mapping the pupil-glint vector to gaze coordinates in a simple video-based eye tracker. J. Eye Mov. Res., 7.","DOI":"10.16910\/jemr.7.1.4"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1029","DOI":"10.1109\/IMTC.2002.1007096","article-title":"An adaptive calibration of an infrared light device used for gaze tracking","volume":"Volume 2","author":"Cherif","year":"2002","journal-title":"Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference, IMTC\/2002"},{"key":"ref_23","unstructured":"Brolly, X.L., and Mulligan, J.B. (July, January 27). Implicit calibration of a remote gaze tracker. Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop, CVPRW\u201904, Washington, DC, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Cerrolaza, J.J., Villanueva, A., and Cabeza, R. (2008, January 26\u201328). Taxonomic study of polynomial regressions applied to the calibration of video-oculographic systems. Proceedings of the 2008 Symposium on Eye Tracking Research & Applications, Savannah, GA, USA.","DOI":"10.1145\/1344471.1344530"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1007\/s00138-004-0139-4","article-title":"Eye and gaze tracking for interactive graphic display","volume":"15","author":"Zhu","year":"2004","journal-title":"Mach. Vis. Appl."},{"key":"ref_26","first-page":"617","article-title":"Eye gaze calculation based on nonlinear polynomial and generalized regression neural network","volume":"Volume 3","author":"Chi","year":"2009","journal-title":"Proceedings of the Fifth International Conference on Natural Computation, ICNC\u201909"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"083103","DOI":"10.1117\/1.OE.54.8.083103","article-title":"Robust remote gaze estimation method based on multiple geometric transforms","volume":"54","author":"Ma","year":"2015","journal-title":"Opt. Eng."},{"key":"ref_28","unstructured":"Pomerleau, D., and Baluja, S. (1993, January 22\u201324). Non-intrusive gaze tracking using artificial neural networks. Proceedings of the AAAI Fall Symposium on Machine Learning in Computer Vision, Raleigh, NC, USA."},{"key":"ref_29","unstructured":"Tan, K.H., Kriegman, D.J., and Ahuja, N. (2002, January 4). Appearance-based eye gaze estimation. Proceedings of the Sixth IEEE Workshop on Applications of Computer Vision (WACV 2002), Orlando, FL, USA."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Lu, F., Sugano, Y., Okabe, T., and Sato, Y. (2011, January 6\u201313). Inferring human gaze from appearance via adaptive linear regression. Proceedings of the 2011 IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126237"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1109\/CVPR.2006.285","article-title":"Sparse and Semi-supervised Visual Mapping with the S^ 3GP","volume":"Volume 1","author":"Williams","year":"2006","journal-title":"Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1016\/j.cviu.2010.11.013","article-title":"A wearable gaze tracking system for children in unconstrained environments","volume":"115","author":"Noris","year":"2011","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Choi, D.H., Jang, I.H., Kim, M.H., and Kim, N.C. (2007, January 27\u201330). Color image enhancement based on single-scale retinex with a JND-based nonlinear filter. Proceedings of the IEEE International Symposium on Circuits and Systems, ISCAS 2007, New Orleans, LA, USA.","DOI":"10.1109\/ISCAS.2007.378664"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1007\/s11263-015-0863-4","article-title":"Gaze estimation in the 3d space using rgb-d sensors","volume":"118","author":"Odobez","year":"2016","journal-title":"Int. J. Comput. Vis."},{"key":"ref_35","unstructured":"Viola, P., and Jones, M. (2001, January 8\u201314). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001, Kauai, HI, USA."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Bouaziz, S., and Pauly, M. (2013). Dynamic 2d\/3d Registration for the Kinect, ACM.","DOI":"10.1145\/2504435.2504456"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"LaValle, S.M. (2006). Planning Algorithms, Cambridge University Press. Available online: http:\/\/planning.cs.uiuc.edu\/node103.html.","DOI":"10.1017\/CBO9780511546877"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Bouaziz, S., Tagliasacchi, A., and Pauly, M. (2013, January 3\u20135). Sparse iterative closest point. Proceedings of the Eleventh Eurographics\/ACMSIGGRAPH Symposium on Geometry Processing, Genova, Italy.","DOI":"10.1111\/cgf.12178"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Funes Mora, K.A., Monay, F., and Odobez, J.M. (2014). EYEDIAP Database: Data Description and Gaze Tracking Evaluation Benchmarks, Idiap. Idiap-RR Idiap-RR-08-2014.","DOI":"10.1145\/2578153.2578190"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Pauly, M. (2013, January 6\u20138). Realtime Performance-Based Facial Avatars for Immersive Gameplay. Proceedings of the Motion on Games, Dublin, Ireland.","DOI":"10.1145\/2522628.2541252"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Martin, M., van de Camp, F., and Stiefelhagen, R. (2014, January 8\u201311). Real Time Head Model Creation and Head Pose Estimation on Consumer Depth Cameras. Proceedings of the 2014 2nd International Conference on 3D Vision, Tokyo, Japan.","DOI":"10.1109\/3DV.2014.54"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/18\/12\/4280\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:31:18Z","timestamp":1760196678000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/18\/12\/4280"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,5]]},"references-count":41,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2018,12]]}},"alternative-id":["s18124280"],"URL":"https:\/\/doi.org\/10.3390\/s18124280","relation":{"has-preprint":[{"id-type":"doi","id":"10.20944\/preprints201810.0309.v1","asserted-by":"object"}]},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,12,5]]}}}