{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T05:05:28Z","timestamp":1764133528229,"version":"build-2065373602"},"reference-count":41,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2022,5,26]],"date-time":"2022-05-26T00:00:00Z","timestamp":1653523200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Education","award":["NRF-2020R1F1A1069079"],"award-info":[{"award-number":["NRF-2020R1F1A1069079"]}]},{"name":"Korean government (MSIT)","award":["NRF-2020R1F1A1069079"],"award-info":[{"award-number":["NRF-2020R1F1A1069079"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Gaze is an excellent indicator and has utility in that it can express interest or intention and the condition of an object. Recent deep-learning methods are mainly appearance-based methods that estimate gaze based on a simple regression from entire face and eye images. However, sometimes, this method does not give satisfactory results for gaze estimations in low-resolution and noisy images obtained in unconstrained real-world settings (e.g., places with severe lighting changes). In this study, we propose a method that estimates gaze by detecting eye region landmarks through a single eye image; and this approach is shown to be competitive with recent appearance-based methods. Our approach acquires rich information by extracting more landmarks and including iris and eye edges, similar to the existing feature-based methods. To acquire strong features even at low resolutions, we used the HRNet backbone network to learn representations of images at various resolutions. Furthermore, we used the self-attention module CBAM to obtain a refined feature map with better spatial information, which enhanced the robustness to noisy inputs, thereby yielding a performance of a 3.18% landmark localization error, a 4% improvement over the existing error and A large number of landmarks were acquired and used as inputs for a lightweight neural network to estimate the gaze. 
We conducted a within-dataset evaluation on MPIIGaze, which was collected in a natural environment, and achieved a state-of-the-art performance of 4.32 degrees, a 6% improvement over the existing performance.<\/jats:p>","DOI":"10.3390\/s22114026","type":"journal-article","created":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T02:30:06Z","timestamp":1653964206000},"page":"4026","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Improved Feature-Based Gaze Estimation Using Self-Attention Module and Synthetic Eye Images"],"prefix":"10.3390","volume":"22","author":[{"given":"Jaekwang","family":"Oh","sequence":"first","affiliation":[{"name":"Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Korea"}]},{"given":"Youngkeun","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Korea"}]},{"given":"Jisang","family":"Yoo","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6595-6415","authenticated-orcid":false,"given":"Soonchul","family":"Kwon","sequence":"additional","affiliation":[{"name":"Graduate School of Smart Convergence, Kwangwoon University, Seoul 01897, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2022,5,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Wu, M., Louw, T., Lahijanian, M., Ruan, W., Huang, X., Merat, N., and Kwiatkowska, M. (2019, January 4\u20138). Gaze-based intention anticipation over driving manoeuvres in semi-autonomous vehicles. Proceedings of the 2019 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS) IEEE, Macao, China.","DOI":"10.1109\/IROS40897.2019.8967779"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Ahn, S., and Lee, G. (2019, January 20\u201323). Gaze-assisted typing for smart glasses. Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, New Orleans, LA, USA.","DOI":"10.1145\/3332165.3347883"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Kim, J., Lee, Y., Lee, S., Kim, S., and Kwon, S. (2022). Implementation of Kiosk-Type System Based on Gaze Tracking for Objective Visual Function Examination. Symmetry, 14.","DOI":"10.3390\/sym14030499"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wood, E., Baltru\u0161aitis, T., Morency, L.P., Robinson, P., and Bulling, A. (2016, January 14\u201317). Learning an appearance-based gaze estimator from one million synthesised images. Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications, Charleston, SC, USA.","DOI":"10.1145\/2857491.2857492"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Fischer, T., Chang, H.J., and Demiris, Y. (2018, January 8\u201314). Rt-gene: Real-time eye gaze estimation in natural environments. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01249-6_21"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"5259","DOI":"10.1109\/TIP.2020.2982828","article-title":"Gaze estimation by exploring two-eye asymmetry","volume":"29","author":"Cheng","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_7","unstructured":"Biswas, P. (2021, January 20\u201325). Appearance-Based gaze estimation using attention and difference mechanism. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1109\/TPAMI.2017.2778103","article-title":"Mpiigaze: Real-world dataset and deep appearance-based gaze estimation","volume":"41","author":"Zhang","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Cheng, Y., Lu, F., and Zhang, X. (2018, January 8\u201314). Appearance-based gaze estimation via evaluation-guided asymmetric regression. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_7"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Sugano, Y., Matsushita, Y., and Sato, Y. (2014, January 23\u201328). Learning-by-synthesis for appearance-based 3d gaze estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.235"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zhang, X., Sugano, Y., Fritz, M., and Bulling, A. (2015, January 7\u201312). Appearance-based gaze estimation in the wild. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299081"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Park, S., Spurr, A., and Hilliges, O. (2018, January 8\u201314). Deep pictorial gaze estimation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01261-8_44"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Park, S., Zhang, X., Bulling, A., and Hilliges, O. (2018, January 14\u201317). Learning to find eye region landmarks for remote gaze estimation in unconstrained settings. Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications, Warsaw, Poland.","DOI":"10.1145\/3204493.3204545"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Newell, A., Yang, K., and Deng, J. (2016). Stacked hourglass networks for human pose estimation. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sun, K., Xiao, B., Liu, D., and Wang, J. (2019, January 15\u201320). Deep high-resolution representation learning for human pose estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00584"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"783","DOI":"10.1007\/s11263-019-01283-0","article-title":"A simple and light-weight attention module for convolutional neural networks","volume":"128","author":"Park","year":"2020","journal-title":"Int. J. Comput. Vis."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. 
Proceedings of the European Conference on computer vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1007\/s11263-011-0511-6","article-title":"What are you looking at?","volume":"98","author":"Valenti","year":"2012","journal-title":"Int. J. Comput. Vis."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Manolova, A., Panev, S., and Tonchev, K. (2014, January 23\u201324). Human gaze tracking with an active multi-camera system. Proceedings of the International Workshop on Biometric Authentication, Sofia, Bulgaria.","DOI":"10.1007\/978-3-319-13386-7_14"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1109\/TCSVT.2014.2329362","article-title":"Hybrid method for 3-D gaze tracking using glint and contour features","volume":"25","author":"Lai","year":"2014","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wood, E., Baltrusaitis, T., Zhang, X., Sugano, Y., Robinson, P., and Bulling, A. (2015, January 7\u201313). Rendering of eyes for eye-shape registration and gaze estimation. Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA.","DOI":"10.1109\/ICCV.2015.428"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1023\/B:STCO.0000035301.49549.88","article-title":"A tutorial on support vector regression","volume":"14","author":"Smola","year":"2004","journal-title":"Stat. Comput."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Bernard, V., Wannous, H., and Vandeborre, J.P. (2021, January 28\u201330). Eye-Gaze Estimation using a Deep Capsule-based Regression Network. Proceedings of the 2021 International Conference on Content-Based Multimedia Indexing (CBMI), Lille, France.","DOI":"10.1109\/CBMI50038.2021.9461895"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Toshev, A., and Szegedy, C. (2014, January 23\u201328). Deeppose: Human pose estimation via deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.214"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.cag.2019.09.002","article-title":"Human pose regression by combining indirect part detection and contextual information","volume":"85","author":"Luvizon","year":"2019","journal-title":"Comput. Graph."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wei, S.E., Ramakrishna, V., Kanade, T., and Sheikh, Y. (2016, January 27\u201330). Convolutional pose machines. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.511"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yang, S., Quan, Z., Nie, M., and Yang, W. (2021, January 11\u201317). Transpose: Keypoint localization via transformer. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.01159"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Guo, L., Liu, J., Zhu, X., Yao, P., Lu, S., and Lu, H. (2020, January 13\u201319). Normalized and Geometry-Aware Self-Attention Network for Image Captioning. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01034"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Perreault, H., Bilodeau, G.A., Saunier, N., and H\u00e9ritier, M. (2020, January 13\u201315). Spotnet: Self-attention multi-task network for object detection. Proceedings of the 2020 17th Conference on Computer and Robot Vision (CRV), Ottawa, ON, Canada.","DOI":"10.1109\/CRV50864.2020.00038"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"11488","DOI":"10.1109\/JSEN.2020.3018172","article-title":"Attention! A lightweight 2d hand pose estimation approach","volume":"21","author":"Santavas","year":"2020","journal-title":"IEEE Sensors J."},{"key":"ref_32","unstructured":"Viola, P., and Jones, M. (2001, January 8\u201314). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001, Kauai, HI, USA."},{"key":"ref_33","unstructured":"Cech, J., and Soukupova, T. (2016). Real-Time eye blink detection using facial landmarks. Cent. Mach. Perception, Dep. Cybern. Fac. Electr. Eng. Czech Tech. Univ. Prague, 1\u20138."},{"key":"ref_34","unstructured":"Yu, S. (2022, April 14). Harr Feature Cart-Tree Based Cascade Eye Detector Homepage. Available online: http:\/\/yushiqi.cn\/research\/eyedetection."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Dubey, A.K., and Jain, V. (2019). Comparative study of convolution neural network\u2019s relu and leaky-relu activation functions. Applications of Computing, Automation and Wireless Systems in Electrical Engineering, Springer.","DOI":"10.1007\/978-981-13-6772-4_76"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Sun, X., Xiao, B., Wei, F., Liang, S., and Wei, Y. (2018, January 8\u201314). Integral human pose regression. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01231-1_33"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1145\/358669.358692","article-title":"Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography","volume":"24","author":"Fischler","year":"1981","journal-title":"Commun. ACM"},{"key":"ref_38","first-page":"3","article-title":"Anchorface: An anchor-based facial landmark detector across large poses","volume":"1","author":"Xu","year":"2021","journal-title":"AAAI"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Kumar, A., Marks, T.K., Mou, W., Wang, Y., Jones, M., Cherian, A., Koike-Akino, T., Liu, X., and Feng, C. (2020, January 13\u201319). LUVLi Face Alignment: Estimating Landmarks\u2019 Location, Uncertainty, and Visibility Likelihood. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00826"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Jiang, J., Ji, Y., Wang, X., Liu, Y., Wang, J., and Long, M. (2021, January 20\u201325). Regressive domain adaptation for unsupervised keypoint detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00671"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Shrivastava, A., Pfister, T., Tuzel, O., Susskind, J., Wang, W., and Webb, R. (2017, January 21\u201326). 
Learning From Simulated and Unsupervised Images Through Adversarial Training. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.241"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/11\/4026\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:18:56Z","timestamp":1760138336000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/11\/4026"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,26]]},"references-count":41,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2022,6]]}},"alternative-id":["s22114026"],"URL":"https:\/\/doi.org\/10.3390\/s22114026","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,5,26]]}}}