{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T22:39:40Z","timestamp":1776206380796,"version":"3.50.1"},"reference-count":30,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,3,27]],"date-time":"2025-03-27T00:00:00Z","timestamp":1743033600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"The National Natural Science Foundation of China","award":["62177012"],"award-info":[{"award-number":["62177012"]}]},{"name":"The National Natural Science Foundation of China","award":["2024GXNSFDA010048"],"award-info":[{"award-number":["2024GXNSFDA010048"]}]},{"name":"The National Natural Science Foundation of China","award":["GXKL06240107"],"award-info":[{"award-number":["GXKL06240107"]}]},{"name":"The National Natural Science Foundation of China","award":["YCBZ2024160"],"award-info":[{"award-number":["YCBZ2024160"]}]},{"name":"Guangxi Natural Science Foundation under Grant","award":["62177012"],"award-info":[{"award-number":["62177012"]}]},{"name":"Guangxi Natural Science Foundation under Grant","award":["2024GXNSFDA010048"],"award-info":[{"award-number":["2024GXNSFDA010048"]}]},{"name":"Guangxi Natural Science Foundation under Grant","award":["GXKL06240107"],"award-info":[{"award-number":["GXKL06240107"]}]},{"name":"Guangxi Natural Science Foundation under Grant","award":["YCBZ2024160"],"award-info":[{"award-number":["YCBZ2024160"]}]},{"name":"the Project of Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory","award":["62177012"],"award-info":[{"award-number":["62177012"]}]},{"name":"the Project of Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory","award":["2024GXNSFDA010048"],"award-info":[{"award-number":["2024GXNSFDA010048"]}]},{"name":"the Project of Guangxi Wireless Broadband Communication and Signal Processing Key 
Laboratory","award":["GXKL06240107"],"award-info":[{"award-number":["GXKL06240107"]}]},{"name":"the Project of Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory","award":["YCBZ2024160"],"award-info":[{"award-number":["YCBZ2024160"]}]},{"name":"Innovation Project of Guangxi Graduate Education","award":["62177012"],"award-info":[{"award-number":["62177012"]}]},{"name":"Innovation Project of Guangxi Graduate Education","award":["2024GXNSFDA010048"],"award-info":[{"award-number":["2024GXNSFDA010048"]}]},{"name":"Innovation Project of Guangxi Graduate Education","award":["GXKL06240107"],"award-info":[{"award-number":["GXKL06240107"]}]},{"name":"Innovation Project of Guangxi Graduate Education","award":["YCBZ2024160"],"award-info":[{"award-number":["YCBZ2024160"]}]},{"name":"The National Natural Science Foundation of China","award":["62177012"],"award-info":[{"award-number":["62177012"]}]},{"name":"The National Natural Science Foundation of China","award":["2024GXNSFDA010048"],"award-info":[{"award-number":["2024GXNSFDA010048"]}]},{"name":"The National Natural Science Foundation of China","award":["GXKL06240107"],"award-info":[{"award-number":["GXKL06240107"]}]},{"name":"The National Natural Science Foundation of China","award":["YCBZ2024160"],"award-info":[{"award-number":["YCBZ2024160"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>The complexity of various factors influencing online learning makes it difficult to characterize learning concentration, while accurately estimating students\u2019 gaze points during learning video sessions represents a critical scientific challenge in assessing and enhancing the attentiveness of online learners. 
However, current appearance-based gaze estimation models lack a focus on extracting essential features and fail to effectively model the spatio-temporal relationships among the head, face, and eye regions, which limits their ability to achieve lower angular errors. This paper proposes an appearance-based gaze estimation model (RSP-MCGaze). The model constructs a feature extraction backbone network for gaze estimation (ResNetSC) by integrating ResNet and SCConv; this integration enhances the model\u2019s ability to extract important features while reducing spatial and channel redundancy. Based on the ResNetSC backbone, the method for video gaze estimation was further optimized by jointly locating the head, eyes, and face. The experimental results demonstrate that our model achieves significantly higher performance compared to existing baseline models on public datasets, thereby fully confirming the superiority of our method in the gaze estimation task. The model achieves a detection error of 9.86 on the Gaze360 dataset and a detection error of 7.11 on the detectable face subset of Gaze360.<\/jats:p>","DOI":"10.3390\/jimaging11040099","type":"journal-article","created":{"date-parts":[[2025,3,28]],"date-time":"2025-03-28T10:54:48Z","timestamp":1743159288000},"page":"99","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2429-5782","authenticated-orcid":false,"given":"Zhaoyu","family":"Shou","sequence":"first","affiliation":[{"name":"School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China"},{"name":"Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory, Guilin University of Electronic Technology, Guilin 541004, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanjun","family":"Lin","sequence":"additional","affiliation":[{"name":"School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianwen","family":"Mo","sequence":"additional","affiliation":[{"name":"School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ziyong","family":"Wu","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,3,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"5259","DOI":"10.1109\/TIP.2020.2982828","article-title":"Gaze estimation by exploring two-eye asymmetry","volume":"29","author":"Cheng","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Nonaka, S., Nobuhara, S., and Nishino, K. (2022, January 18\u201324). Dynamic 3d gaze from afar: Deep gaze estimation from temporal eye-head-body coordination. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00223"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Bao, Y., Cheng, Y., Liu, Y., and Lu, F. (2021, January 10\u201315). Adaptive feature fusion network for gaze tracking in mobile tablets. 
Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy.","DOI":"10.1109\/ICPR48806.2021.9412205"},{"key":"ref_4","first-page":"10623","article-title":"A coarse-to-fine adaptive network for appearance-based gaze estimation","volume":"34","author":"Cheng","year":"2020","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3322","DOI":"10.1109\/TIP.2022.3171416","article-title":"An individual-difference-aware model for cross-person gaze estimation","volume":"31","author":"Bao","year":"2022","journal-title":"IEEE Trans. Image Process."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1687","DOI":"10.1109\/LSP.2023.3332569","article-title":"End-to-end video gaze estimation via capturing head-face-eye spatial-temporal interaction context","volume":"30","author":"Guan","year":"2023","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2592","DOI":"10.1109\/TCYB.2023.3312392","article-title":"Gaze estimation by attention-induced hierarchical variational auto-encoder","volume":"54","author":"Huang","year":"2023","journal-title":"IEEE Trans. Cybern."},{"key":"ref_8","first-page":"3027","article-title":"Learning a generalized gaze estimator from gaze-consistent feature","volume":"37","author":"Xu","year":"2023","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Hisadome, Y., Wu, T., Qin, J., and Sugano, Y. (2024, January 3\u20138). Rotation-Constrained Cross-View Feature Fusion for Multi-View Appearance-based Gaze Estimation. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV57701.2024.00588"},{"key":"ref_10","first-page":"6729","article-title":"CLIP-Gaze: Towards General Gaze Estimation via Visual-Linguistic Model","volume":"38","author":"Yin","year":"2024","journal-title":"Proc. 
AAAI Conf. Artif. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Oh, J.O., Chang, H.J., and Choi, S.I. (2022, January 18\u201324). Self-attention with convolution and deconvolution for efficient eye gaze estimation from a full face image. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPRW56347.2022.00547"},{"key":"ref_12","unstructured":"Biswas, P. (2021, January 20\u201325). Appearance-based gaze estimation using attention and difference mechanism. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, Y., Huang, L., Chen, J., and Tan, B. (2023). Appearance-based gaze estimation method using static transformer temporal differential network. Mathematics, 11.","DOI":"10.3390\/math11030686"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"122363","DOI":"10.1016\/j.eswa.2023.122363","article-title":"EG-Net: Appearance-based eye gaze estimation using an efficient gaze network with attention mechanism","volume":"238","author":"Wu","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhang, X., Park, S., Beeler, T., Bradley, D., Tang, S., and Hilliges, O. (2020, January 23\u201328). Eth-xgaze: A large scale dataset for gaze estimation under extreme head pose and gaze variation. Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK. Proceedings, Part V16.","DOI":"10.1007\/978-3-030-58558-7_22"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, M., Liu, Y., and Lu, F. (2022, January 18\u201324). Gazeonce: Realtime multi-person gaze estimation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00416"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Balim, H., Park, S., Wang, X., Zhang, X., and Hilliges, O. (2023, January 17\u201324). Efe: End-to-end frame-to-gaze estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPRW59228.2023.00269"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, J., He, T., Zhuo, W., Ma, L., Ha, S., and Chan, S.-H.G. (2022, January 18\u201324). Tvconv: Efficient translation variant convolution for layout-aware visual processing. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01222"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Chen, Y., Dai, X., Chen, D., Liu, M., Dong, X., and Liu, Z. (2022, January 18\u201324). Mobileformer: Bridging mobilenet and transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00520"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Sun, X., Hassani, A., Wang, Z., Huang, G., and Shi, H. (2022, January 18\u201324). Disparse: Disentangled sparsification for multitask model compression. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01206"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Xia, M., Zhong, Z., and Chen, D. (2022). Structured pruning learn compact and accurate models. arXiv.","DOI":"10.18653\/v1\/2022.acl-long.107"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Li, J., Wen, Y., and He, L. (2023, January 17\u201324). Scconv: Spatial and channel reconstruction convolution for feature redundancy. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00596"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Yang, S., Wang, X., Li, Y., Fang, Y., Fang, J., Liu, W., Zhao, X., and Shan, Y. (2022, January 18\u201324). Temporally efficient vision transformer for video instance segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00290"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J.Y., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized intersection over union: A metric and a loss for bounding box regression. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Cheng, Y., and Lu, F. (2022, January 21\u201325). Gaze estimation using transformer. Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada.","DOI":"10.1109\/ICPR56361.2022.9956687"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Abdelrahman, A.A., Hempel, T., Khalifa, A., and Al-Hamadi, A. (2022). L2csnet: Fine-grained gaze estimation in unconstrained environments. arXiv.","DOI":"10.1109\/ICFSP59764.2023.10372944"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yan, C., Pan, W., Xu, C., Dai, S., and Li, X. (2023). Gaze estimation via strip pooling and multi-criss-cross attention networks. Appl. 
Sci., 13.","DOI":"10.3390\/app13105901"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Nagpure, V., and Okuma, K. (2023, January 17\u201324). Searching efficient neural architecture with multi-resolution fusion transformer for appearance-based gaze estimation. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Vancouver, BC, Canada.","DOI":"10.1109\/WACV56688.2023.00095"},{"key":"ref_30","unstructured":"Kellnhofer, P., Recasens, A., Stent, S., Matusik, W., and Torralba, A. (November, January 27). Gaze360: Physically unconstrained gaze estimation in the wild. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/11\/4\/99\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:03:30Z","timestamp":1760029410000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/11\/4\/99"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,27]]},"references-count":30,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,4]]}},"alternative-id":["jimaging11040099"],"URL":"https:\/\/doi.org\/10.3390\/jimaging11040099","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,27]]}}}