{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T04:01:32Z","timestamp":1780459292696,"version":"3.54.1"},"reference-count":38,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2023,12,4]],"date-time":"2023-12-04T00:00:00Z","timestamp":1701648000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Gaze is a significant behavioral characteristic that can be used to reflect a person\u2019s attention. In recent years, there has been a growing interest in estimating gaze from facial videos. However, gaze estimation remains a challenging problem due to variations in appearance and head poses. To address this, a framework for 3D gaze estimation using appearance cues is developed in this study. The framework begins with an end-to-end approach to detect facial landmarks. Subsequently, we employ a normalization method and improve the normalization method using orthogonal matrices and conduct comparative experiments to prove that the improved normalization method has a higher accuracy and a lower computational time in gaze estimation. Finally, we introduce a dual-branch convolutional neural network, named FG-Net, which processes the normalized images and extracts eye and face features through two branches. The extracted multi-features are then integrated and input into a fully connected layer to estimate the 3D gaze vectors. To evaluate the performance of our approach, we conduct ten-fold cross-validation experiments on two public datasets, namely MPIIGaze and EyeDiap, achieving remarkable accuracies of 3.11\u00b0 and 2.75\u00b0, respectively. The results demonstrate the high effectiveness of our proposed framework, showcasing its state-of-the-art performance in 3D gaze estimation.<\/jats:p>","DOI":"10.3390\/s23239604","type":"journal-article","created":{"date-parts":[[2023,12,4]],"date-time":"2023-12-04T05:28:21Z","timestamp":1701667701000},"page":"9604","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["FreeGaze: A Framework for 3D Gaze Estimation Using Appearance Cues from a Facial Video"],"prefix":"10.3390","volume":"23","author":[{"given":"Shang","family":"Tian","sequence":"first","affiliation":[{"name":"College of Electrical Engineering, Sichuan University, Chengdu 610065, China"},{"name":"Key Laboratory of Information and Automation Technology of Sichuan Province, Sichuan University, Chengdu 610065, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7909-465X","authenticated-orcid":false,"given":"Haiyan","family":"Tu","sequence":"additional","affiliation":[{"name":"College of Electrical Engineering, Sichuan University, Chengdu 610065, China"},{"name":"Key Laboratory of Information and Automation Technology of Sichuan Province, Sichuan University, Chengdu 610065, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7168-2737","authenticated-orcid":false,"given":"Ling","family":"He","sequence":"additional","affiliation":[{"name":"College of Biomedical Engineering, Sichuan University, Chengdu 610065, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5480-1741","authenticated-orcid":false,"given":"Yue Ivan","family":"Wu","sequence":"additional","affiliation":[{"name":"College of Computer Science, Sichuan University, Chengdu 610065, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4703-9530","authenticated-orcid":false,"given":"Xiujuan","family":"Zheng","sequence":"additional","affiliation":[{"name":"College of Electrical Engineering, Sichuan University, Chengdu 610065, China"},{"name":"Key Laboratory of Information and Automation Technology of Sichuan Province, Sichuan University, Chengdu 610065, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,12,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Rakovi\u0107, M., Duarte, N.F., Marques, J., Billard, A., and Santos-Victor, J. (2022). The Gaze Dialogue Model: Nonverbal Communication in HHI and HRI. IEEE Trans. Cybern.","DOI":"10.1109\/TCYB.2022.3222077"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3338844","article-title":"Improving user experience of eye tracking-based interaction: Introspecting and adapting interfaces","volume":"26","author":"Menges","year":"2019","journal-title":"ACM Trans. Comput. Hum. Interact."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1007\/s11257-022-09352-9","article-title":"What we see is what we do: A practical Peripheral Vision-Based HMM framework for gaze-enhanced recognition of actions in a medical procedural task","volume":"33","author":"Wang","year":"2023","journal-title":"User Model User-Adap."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"77649","DOI":"10.1109\/ACCESS.2021.3080687","article-title":"Different Eye Movement Behaviors Related to Artificial Visual Field Defects\u2014A Pilot Study of Video-Based Perimetry","volume":"9","author":"Mao","year":"2021","journal-title":"IEEE Access"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Yu, W., Zhao, F., Ren, Z., Jin, D., Yang, X., and Zhang, X. (2023). Mining attention distribution paradigm: Discover gaze patterns and their association rules behind the visual image. Comput. Methods Programs Biomed., 230.","DOI":"10.1016\/j.cmpb.2022.107330"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1071","DOI":"10.1109\/TNSRE.2022.3157768","article-title":"Predicting the Reader\u2019s English Level From Reading Fixation Patterns Using the Siamese Convolutional Neural Network","volume":"30","author":"Fan","year":"2022","journal-title":"IEEE Trans. Neural Syst. Rehabil. Eng."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1109\/TPAMI.2009.30","article-title":"In the Eye of the Beholder: A Survey of Models for Eyes and Gaze","volume":"32","author":"Hansen","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1124","DOI":"10.1109\/TBME.2005.863952","article-title":"General theory of remote gaze estimation using the pupil center and corneal reflections","volume":"53","author":"Guestrin","year":"2006","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Nakazawa, A., and Nitschke, C. (2012, January 7\u201313). Point of gaze estimation through corneal surface reflection in an active illumination environment. Proceedings of the Proceedings Part II, of the 12th European Conference on Computer Vision\u2014ECCV 2012, Florence, Italy.","DOI":"10.1007\/978-3-642-33709-3_12"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Alberto Funes Mora, K., and Odobez, J.M. (2014, January 23\u201328). Geometric generative gaze estimation (g3e) for remote rgb-d cameras. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.229"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1772","DOI":"10.1109\/TMM.2016.2576284","article-title":"Estimating 3D gaze directions using unlabeled eye images via synthetic iris appearance fitting","volume":"18","author":"Lu","year":"2016","journal-title":"IEEE Trans. Multimed."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"802","DOI":"10.1109\/TIP.2011.2162740","article-title":"Combining head pose and eye location information for gaze estimation","volume":"21","author":"Valenti","year":"2011","journal-title":"IEEE Trans. Image Process."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Schneider, T., Schauerte, B., and Stiefelhagen, R. (2014, January 24\u201328). Manifold Alignment for Person Independent Appearance-Based Gaze Estimation. Proceedings of the 22nd International Conference on Pattern Recognition, Stockholm, Sweden.","DOI":"10.1109\/ICPR.2014.210"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Sugano, Y., Matsushita, Y., and Sato, Y. (2014, January 23\u201328). Learning-by-Synthesis for Appearance-Based 3D Gaze Estimation. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.235"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.neucom.2015.09.116","article-title":"Deep learning for visual understanding: A review","volume":"187","author":"Guo","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, X., Sugano, Y., Fritz, M., and Bulling, A. (2017, January 21\u201326). It\u2019s written all over your face: Full-face appearance-based gaze estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.284"},{"key":"ref_17","unstructured":"Palmero, C., Selva, J., Bagheri, M.A., and Escalera, S. (2018). Recurrent cnn for 3d gaze estimation using appearance and shape cues. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhang, X., Sugano, Y., Fritz, M., and Bulling, A. (2015, January 7\u201312). Appearance-based gaze estimation in the wild. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299081"},{"key":"ref_19","unstructured":"Mora, F., Alberto, K., Monay, F., and Odobez, J.M. (2014, January 26\u201328). EYEDIAP: A Database for the Development and Evaluation of Gaze Estimation Algorithms from RGB and RGB-D Cameras. Proceedings of the Symposium on Eye Tracking Research and Applications, Safety Harbor, FL, USA."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Park, S., Spurr, A., and Hilliges, O. (2018, January 8\u201314). Deep pictorial gaze estimation. Proceedings of the Computer Vision\u2014ECCV 2018, Munich, Germany.","DOI":"10.1007\/978-3-030-01261-8_44"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3010","DOI":"10.1109\/TNNLS.2018.2865525","article-title":"Multiview multitask gaze estimation with deep convolutional neural networks","volume":"30","author":"Lian","year":"2018","journal-title":"IEEE Trans. Neural. Netw. Learn. Syst."},{"key":"ref_22","unstructured":"Liu, G., Yu, Y., Mora, K.A.F., and Odobez, J.M. (2018, January 3\u20136). A differential approach for gaze estimation with calibration. Proceedings of the 2018 British Machine Vision Conference, Newcastle, UK."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Huang, L., Li, Y., Wang, X., Wang, H., Bouridane, A., and Chaddad, A. (2022). Gaze Estimation Approach Using Deep Differential Residual Network. Sensors, 22.","DOI":"10.3390\/s22145462"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Yu, Y., and Odobez, J.M. (2020, January 13\u201319). Unsupervised Representation Learning for Gaze Estimation. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00734"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"103369","DOI":"10.1016\/j.jvcir.2021.103369","article-title":"Gaze estimation via bilinear pooling-based attention networks","volume":"81","author":"Ren","year":"2021","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1016\/j.eng.2020.08.027","article-title":"Gaze estimation via a differential eyes\u2019 appearances network with a reference grid","volume":"7","author":"Gu","year":"2021","journal-title":"Engineering"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Krafka, K., Khosla, A., Kellnhofer, P., Kannan, H., Bhandarkar, S., Matusik, W., and Torralba, A. (2016, January 27\u201330). Eye tracking for everyone. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.239"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1016\/j.neucom.2019.04.099","article-title":"Improved itracker combined with bidirectional long short-term memory for 3D gaze estimation using appearance cues","volume":"390","author":"Zhou","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Kellnhofer, P., Recasens, A., Stent, S., Matusik, W., and Torralba, A. (November, January 27). Gaze360: Physically Unconstrained Gaze Estimation in the Wild. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00701"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1174","DOI":"10.1109\/TPAMI.2022.3148386","article-title":"Towards high performance low complexity calibration in appearance based gaze estimation","volume":"45","author":"Chen","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Li, Y., Huang, L., Chen, J., Wang, X., and Tan, B. (2023). Appearance-Based Gaze Estimation Method Using Static Transformer Temporal Differential Network. Mathematics, 11.","DOI":"10.3390\/math11030686"},{"key":"ref_32","unstructured":"Bazarevsky, V., Kartynnik, Y., Vakunov, A., Raveendran, K., and Grundmann, M. (2019). Blazeface: Sub-millisecond neural face detection on mobile gpus. arXiv."},{"key":"ref_33","unstructured":"Grishchenko, I., Ablavatski, A., Kartynnik, Y., Raveendran, K., and Grundmann, M. (2020). Attention mesh: High-fidelity face mesh prediction in real-time. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1007\/s11263-008-0152-6","article-title":"EPnP: An Accurate O(n) Solution to the PnP Problem","volume":"81","author":"Lepetit","year":"2009","journal-title":"Int. J. Comput. Vis."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Chen, Z., and Shi, B.E. (2018, January 2\u20136). Appearance-based gaze estimation using dilated-convolutions. Proceedings of the Computer Vision\u2014ACCV 2018, Perth, Australia.","DOI":"10.1007\/978-3-030-20876-9_20"},{"key":"ref_36","unstructured":"Abdelrahman, A.A., Hempel, T., Khalifa, A., and Al-Hamadi, A. (2022). L2CS-Net: Fine-Grained Gaze Estimation in Unconstrained Environments. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Cheng, Y., Huang, S., Wang, F., Qian, C., and Lu, F. (2020, January 7\u201312). A coarse-to-fine adaptive network for appearance-based gaze estimation. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6636"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"5259","DOI":"10.1109\/TIP.2020.2982828","article-title":"Gaze estimation by exploring two-eye asymmetry","volume":"29","author":"Cheng","year":"2020","journal-title":"IEEE Trans. Image Process."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/23\/9604\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:37:27Z","timestamp":1760132247000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/23\/9604"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,4]]},"references-count":38,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["s23239604"],"URL":"https:\/\/doi.org\/10.3390\/s23239604","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,4]]}}}