{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T15:27:49Z","timestamp":1772465269078,"version":"3.50.1"},"reference-count":55,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T00:00:00Z","timestamp":1772409600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Current facial expression recognition methods typically extract facial features indiscriminately, incorporating expression-irrelevant information that compromises recognition accuracy. To overcome this, we propose Multi-stage Feature Sparse Constraints (MFSC), a novel model that integrates a Multi-scale Attention-based Sparse Window Selection (MSAWS) mechanism with key region graph learning. Notably, MFSC operates without dependence on pre-extracted facial landmarks, enabling more flexible deployment. The MSAWS mechanism progressively filters redundant features through multi-stage sparse attention, adaptively selecting the most discriminative facial patches. The selected tokens are structured into a dynamic graph to model regional relationships via graph neural networks (GNNs). Critically, our framework further introduces a global-guided fusion module, which effectively integrates fine-grained local features from an IR50 backbone with the global topological features from the GNN through cross-attention. This integration enables complementary strengths, where local details are enhanced by global semantic context. Comprehensive experiments on RAF-DB, FER2013, and AffectNet-7 datasets demonstrate MFSC\u2019s superior performance, achieving state-of-the-art accuracy of 92.31%, 76.21%, and 67.35%, respectively. These results validate the effectiveness of our approach in focusing computational resources on expression-salient regions while maintaining a lightweight and efficient architecture.<\/jats:p>","DOI":"10.3390\/info17030246","type":"journal-article","created":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T14:06:56Z","timestamp":1772460416000},"page":"246","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Facial Expression Recognition Integrating Multi-Stage Feature Sparse Constraints and Key Region Graph Learning"],"prefix":"10.3390","volume":"17","author":[{"given":"Guanghui","family":"Xu","sequence":"first","affiliation":[{"name":"School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430068, China"}]},{"given":"Yan","family":"Hong","sequence":"additional","affiliation":[{"name":"School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430068, China"},{"name":"School of Information Management, Wuhan University, Wuhan 430072, China"}]},{"given":"Wanli","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Information Management, Wuhan University, Wuhan 430072, China"}]},{"given":"Zhongjie","family":"Mao","sequence":"additional","affiliation":[{"name":"School of Big Data and Artificial Intelligence, Chizhou University, Chizhou 247000, China"},{"name":"Wuhan Homelightyear Technology Co., Ltd., Wuhan 430061, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2902-7365","authenticated-orcid":false,"given":"Duantengchuan","family":"Li","sequence":"additional","affiliation":[{"name":"School of Information Management, Wuhan University, Wuhan 430072, China"}]},{"given":"Yue","family":"Li","sequence":"additional","affiliation":[{"name":"School of Information Management, Wuhan University, Wuhan 430072, China"},{"name":"Department of Intelligent Construction, School of Civil Engineering, Wuhan University, Wuhan 430072, China"}]}],"member":"1968","published-online":{"date-parts":[[2026,3,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"526","DOI":"10.1109\/TKDE.2024.3486747","article-title":"TGformer: A Graph Transformer Framework for Knowledge Graph Embedding","volume":"37","author":"Shi","year":"2025","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"103348","DOI":"10.1016\/j.ipm.2023.103348","article-title":"Knowledge graph representation learning with simplifying hierarchical feature propagation","volume":"60","author":"Li","year":"2023","journal-title":"Inf. Process. Manag."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"113220","DOI":"10.1016\/j.asoc.2025.113220","article-title":"Recommender system based on noise enhancement and multi-view graph contrastive learning","volume":"177","author":"Li","year":"2025","journal-title":"Appl. Soft Comput."},{"key":"ref_4","first-page":"12792","article-title":"Federated Recommendation with Explicitly Encoding Item Bias","volume":"39","author":"Wang","year":"2025","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"103680","DOI":"10.1016\/j.ipm.2024.103680","article-title":"Integrating user short-term intentions and long-term preferences in heterogeneous hypergraph networks for sequential recommendation","volume":"61","author":"Liu","year":"2024","journal-title":"Inf. Process. Manag."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"103631","DOI":"10.1016\/j.ipm.2023.103631","article-title":"Joint inter-word and inter-sentence multi-relation modeling for summary-based recommender system","volume":"61","author":"Li","year":"2024","journal-title":"Inf. Process. Manag."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"e70056","DOI":"10.1049\/ipr2.70056","article-title":"EANet: Integrate Edge Features and Attention Mechanisms Multi-Scale Networks for Vessel Segmentation in Retinal Images","volume":"19","author":"Zhang","year":"2025","journal-title":"IET Image Process."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"111536","DOI":"10.1016\/j.patcog.2025.111536","article-title":"ADGaze: Anisotropic Gaussian Label Distribution Learning for fine-grained gaze estimation","volume":"164","author":"Li","year":"2025","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"101869","DOI":"10.1016\/j.jksuci.2023.101869","article-title":"DADL: Double Asymmetric Distribution Learning for head pose estimation in wisdom museum","volume":"36","author":"Zhao","year":"2024","journal-title":"J. King Saud Univ. Comput. Inf. Sci."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"113625","DOI":"10.1016\/j.asoc.2025.113625","article-title":"DAPlanner: Dual-agent framework with multi-modal large language model for autonomous driving motion planning","volume":"183","author":"Zhang","year":"2025","journal-title":"Appl. Soft Comput."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"131988","DOI":"10.1109\/ACCESS.2020.3010018","article-title":"Pyramid with super resolution for in-the-wild facial expression recognition","volume":"8","author":"Vo","year":"2020","journal-title":"IEEE Access"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Xu, H., Kong, J., Kong, X., Li, J., and Wang, J. (2022). MCF-Net: Fusion Network of Facial and Scene Features for Expression Recognition in the Wild. Appl. Sci., 12.","DOI":"10.3390\/app122010251"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"26756","DOI":"10.1109\/ACCESS.2022.3156598","article-title":"Ad-corre: Adaptive correlation-based loss for facial expression recognition in the wild","volume":"10","author":"Fard","year":"2022","journal-title":"IEEE Access"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Xue, F., Wang, Q., and Guo, G. (2021, January 11\u201317). Transfer: Learning relation-aware facial expression representations with transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00358"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zheng, C., Mendieta, M., and Chen, C. (2023, January 4\u20136). Poster: A pyramid cross-fusion transformer network for facial expression recognition. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCVW60793.2023.00339"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"110951","DOI":"10.1016\/j.patcog.2024.110951","article-title":"Poster++: A simpler and stronger facial expression recognition network","volume":"157","author":"Mao","year":"2025","journal-title":"Pattern Recognit."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"59774","DOI":"10.1109\/ACCESS.2023.3286547","article-title":"Facial expression recognition in the wild using face graph and attention","volume":"11","author":"Kim","year":"2023","journal-title":"IEEE Access"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"8693","DOI":"10.1007\/s11760-024-03501-w","article-title":"Attentional visual graph neural network based facial expression recognition method","volume":"18","author":"Dong","year":"2024","journal-title":"Signal Image Video Process."},{"key":"ref_19","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_20","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"5728","DOI":"10.1109\/TCSS.2024.3393247","article-title":"Automatic diagnosis of depression based on facial expression information and deep convolutional neural network","volume":"11","author":"Li","year":"2024","journal-title":"IEEE Trans. Comput. Soc. Syst."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"e10391","DOI":"10.1002\/jdn.10391","article-title":"Improving facial expression recognition for autism with IDenseNet-RCAformer under occlusions","volume":"85","author":"Selvi","year":"2025","journal-title":"Int. J. Dev. Neurosci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"104227","DOI":"10.1016\/j.cviu.2024.104227","article-title":"2S-SGCN: A two-stage stratified graph convolutional network model for facial landmark detection on 3D data","volume":"250","author":"Burger","year":"2025","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Shou, Z., Huang, Y., Li, D., Feng, C., Zhang, H., Lin, Y., and Wu, G. (2024). A student facial expression recognition model based on multi-scale and deep fine-grained feature attention enhancement. Sensors, 24.","DOI":"10.3390\/s24206748"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"103416","DOI":"10.1016\/j.inffus.2025.103416","article-title":"FERmc: Facial expression recognition framework based on multi-branch fusion and depthwise separable convolution","volume":"124","author":"Li","year":"2025","journal-title":"Inf. Fusion"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"103371","DOI":"10.1016\/j.inffus.2025.103371","article-title":"FER-VMamba: A robust facial expression recognition framework with global compact attention and hierarchical feature interaction","volume":"124","author":"Ma","year":"2025","journal-title":"Inf. Fusion"},{"key":"ref_27","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, NeurIPS Proceedings."},{"key":"ref_28","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16 \u00d7 16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Tian, Y., Zhu, J., Yao, H., and Chen, D. (2024). Facial expression recognition based on Vision Transformer with hybrid local attention. Appl. Sci., 14.","DOI":"10.3390\/app14156471"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"104824","DOI":"10.1016\/j.imavis.2023.104824","article-title":"GFFT: Global-local feature fusion transformers for facial expression recognition in the wild","volume":"139","author":"Xu","year":"2023","journal-title":"Image Vis. Comput."},{"key":"ref_31","first-page":"240234","article-title":"Lightweight Swin Transformer combined with multi-scale feature fusion for face expression recognition","volume":"52","author":"Li","year":"2025","journal-title":"Guangdian-Gongcheng\/Opto-Electron. Eng."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Dong, J., Zhang, Y., and Fan, L. (2023). A Multi-view face expression recognition method based on DenseNet and GAN. Electronics, 12.","DOI":"10.3390\/electronics12112527"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"7655","DOI":"10.1109\/TNNLS.2021.3086066","article-title":"Toward region-aware attention learning for scene graph generation","volume":"33","author":"Liu","year":"2021","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_34","first-page":"10","article-title":"Graph attention networks","volume":"1050","author":"Velickovic","year":"2017","journal-title":"Stat"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"101605","DOI":"10.1016\/j.jksuci.2023.101605","article-title":"Gcanet: Geometry cues-aware facial expression recognition based on graph convolutional networks","volume":"35","author":"Wang","year":"2023","journal-title":"J. King Saud Univ. Comput. Inf. Sci."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016). Rethinking the inception architecture for computer vision (Conference Paper). IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Li, S., Deng, W., and Du, J. (2017, January 21\u201326). Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.277"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/TAFFC.2017.2740923","article-title":"Affectnet: A database for facial expression, valence, and arousal computing in the wild","volume":"10","author":"Mollahosseini","year":"2017","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Goodfellow, I.J., Erhan, D., Carrier, P.L., Courville, A., Mirza, M., Hamner, B., Cukierski, W., Tang, Y., Thaler, D., and Lee, D.H. (2013). Challenges in representation learning: A report on three machine learning contests. International Conference on Neural Information Processing, Springer.","DOI":"10.1007\/978-3-642-42051-1_16"},{"key":"ref_40","unstructured":"Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. (2017). Automatic Differentiation in Pytorch, NeurIPS Autodiff Workshop."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"5962","DOI":"10.1109\/TPAMI.2021.3087709","article-title":"ArcFace: Additive Angular Margin Loss for Deep Face Recognition","volume":"44","author":"Deng","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Guo, Y., Zhang, L., Hu, Y., He, X., and Gao, J. (2016). Ms-celeb-1m: A dataset and benchmark for large-scale face recognition. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46487-9_6"},{"key":"ref_43","unstructured":"M\u00fcller, R., Kornblith, S., and Hinton, G.E. (2019). When does label smoothing help?. Advances in Neural Information Processing Systems, NeurIPS Proceedings."},{"key":"ref_44","unstructured":"Shi, J., Zhu, S., and Liang, Z. (2021). Learning to amend facial expression representation via de-albino and affinity. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Wen, Z., Lin, W., Wang, T., and Xu, G. (2023). Distract your attention: Multi-head cross attention network for facial expression recognition. Biomimetics, 8.","DOI":"10.3390\/biomimetics8020199"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"6544","DOI":"10.1109\/TIP.2021.3093397","article-title":"Learning deep global multi-scale and local attention features for facial expression recognition in the wild","volume":"30","author":"Zhao","year":"2021","journal-title":"IEEE Trans. Image Process."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Pecoraro, R., Basile, V., and Bono, V. (2022). Local multi-head channel self-attention for facial expression recognition. Information, 13.","DOI":"10.3390\/info13090419"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"She, J., Hu, Y., Shi, H., Wang, J., Shen, Q., and Mei, T. (2021, January 19\u201325). Dive into ambiguity: Latent distribution mining and pairwise uncertainty estimation for facial expression recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00618"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"El Boudouri, Y., and Bohi, A. (2023). Emonext: An adapted convnext for facial emotion recognition. 2023 IEEE 25th International Workshop on Multimedia Signal Processing (MMSP), IEEE.","DOI":"10.1109\/MMSP59012.2023.10337732"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"9995","DOI":"10.1109\/ACCESS.2023.3237817","article-title":"Fine-tuning swin transformer and multiple weights optimality-seeking for facial expression recognition","volume":"11","author":"Feng","year":"2023","journal-title":"IEEE Access"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Chen, X., and Huang, L. (2024). A lightweight model enhancing facial expression recognition with spatial bias and cosine-harmony loss. Computation, 12.","DOI":"10.20944\/preprints202408.1304.v1"},{"key":"ref_52","unstructured":"Murtada, A., Abdelrhman, O., and Attia, T.A. (2025). Mini-ResEmoteNet: Leveraging Knowledge Distillation for Human-Centered Design. arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"608","DOI":"10.1109\/TCE.2024.3519514","article-title":"Triple-Attribute Perceptron Facial Expression Recognition in Real-World Environments","volume":"71","author":"Hsu","year":"2024","journal-title":"IEEE Trans. Consum. Electron."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1145\/3735559","article-title":"Multi-Attribute Feature-Aware Network for Facial Expression Recognition","volume":"21","author":"Hsu","year":"2025","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1038\/s43586-024-00363-x","article-title":"Uniform manifold approximation and projection","volume":"4","author":"Healy","year":"2024","journal-title":"Nat. Rev. Methods Prim."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/3\/246\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T14:22:54Z","timestamp":1772461374000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/3\/246"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,2]]},"references-count":55,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2026,3]]}},"alternative-id":["info17030246"],"URL":"https:\/\/doi.org\/10.3390\/info17030246","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,2]]}}}