{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T14:31:37Z","timestamp":1774449097989,"version":"3.50.1"},"reference-count":43,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,11,13]],"date-time":"2025-11-13T00:00:00Z","timestamp":1762992000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004735","name":"Hunan Provincial Natural Science Foundation of China","doi-asserted-by":"crossref","award":["2024JJ7428"],"award-info":[{"award-number":["2024JJ7428"]}],"id":[{"id":"10.13039\/501100004735","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Facial expression recognition (FER) serves as a pivotal approach for understanding human affective states and behavioral intentions, forming the fundamental basis for achieving natural interaction in affective computing systems. To address the limitations of convolutional neural networks in capturing global facial expression features, while simultaneously overcoming the challenges of Vision Transformers regarding their substantial parameter requirements, high computational complexity, and difficulties in meeting lightweight deployment demands for practical applications, this paper proposes Agent-Poster, a lightweight multi-scale facial expression recognition model based on Agent Attention. Building upon the POSTER++ framework, the model innovatively integrates Agent Attention, adopts a streamlined dual-stream architecture to minimize redundant interactions, and implements efficient multi-scale feature fusion. 
Experimental results demonstrate that the proposed method achieves superior recognition performance compared to existing approaches, attaining accuracy rates of 92.61% on the RAF-DB dataset and 68.21% on the AffectNet dataset, thereby validating its robustness and accuracy in facial expression recognition tasks.<\/jats:p>","DOI":"10.3390\/info16110982","type":"journal-article","created":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T17:33:21Z","timestamp":1763141601000},"page":"982","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Agent-Poster: A Multi-Scale Feature Fusion Emotion Recognition Model Based on an Agent Attention Mechanism"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-7139-1042","authenticated-orcid":false,"given":"Lin","family":"Fu","sequence":"first","affiliation":[{"name":"School of Computer Science, University of South China, Hengyang 421001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yaping","family":"Wan","sequence":"additional","affiliation":[{"name":"School of Computer Science, University of South China, Hengyang 421001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gang","family":"Zou","sequence":"additional","affiliation":[{"name":"HuNan ZK Help Innovation Intelligent Technology Research Institute, Changsha 410000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4057","DOI":"10.1109\/TIP.2019.2956143","article-title":"Region Attention Networks for Pose and Occlusion Robust Facial Expression Recognition","volume":"29","author":"Wang","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Yang, Y., Jia, B., Zhi, P., and Huang, S. (2024, January 16\u201322). 
PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.01539"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Gao, Z., and Patras, I. (2024, January 16\u201322). Self-Supervised Facial Representation Learning with Facial Region Awareness. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00203"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Liu, Y., Wang, W., Zhan, Y., Feng, S., Liu, K., and Chen, Z. (2023, January 17\u201324). Pose-Disentangled Contrastive Learning for Self-supervised Facial Representation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00937"},{"key":"ref_5","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_6","unstructured":"Tan, M., and Le, Q. (2019, January 9\u201315). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_7","unstructured":"Han, D., Yun, S., Heo, B., and Yoo, Y.J. (2020). REXNet: Diminishing Representational Bottleneck on Convolutional Neural Network. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"110951","DOI":"10.1016\/j.patcog.2024.110951","article-title":"POSTER++: A Simpler and Stronger Facial Expression Recognition Network","volume":"157","author":"Mao","year":"2025","journal-title":"Pattern Recognit."},{"key":"ref_9","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). 
Attention Is All You Need. Advances in Neural Information Processing Systems 30, Curran Associates, Inc."},{"key":"ref_10","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021, January 3\u20137). An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. Proceedings of the 9th International Conference on Learning Representations, Virtual Event."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Mollahosseini, A., Chan, D., and Mahoor, M.H. (2016, January 7\u201310). Going Deeper in Facial Expression Recognition Using Deep Neural Networks. Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, Lake Placid, NY, USA.","DOI":"10.1109\/WACV.2016.7477450"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.neucom.2019.05.005","article-title":"Three Convolutional Neural Network Models for Facial Expression Recognition in the Wild","volume":"355","author":"Shao","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"45543","DOI":"10.1109\/ACCESS.2024.3380847","article-title":"Facial Emotion Recognition (FER) Through Custom Lightweight CNN Model: Performance Evaluation in Public Datasets","volume":"12","author":"Lombardi","year":"2024","journal-title":"IEEE Access"},{"key":"ref_14","unstructured":"Zhong, L., Liu, Q., Yang, P., Liu, B., Huang, J., and Metaxas, D.N. (2012, January 16\u201321). Learning Active Facial Patches for Expression Analysis. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Savchenko, A.V. (2021, January 16\u201318). Facial Expression and Attributes Recognition Based on Multi-Task Learning of Lightweight Neural Networks. 
Proceedings of the 2021 IEEE 19th International Symposium on Intelligent Systems and Informatics, Subotica, Serbia.","DOI":"10.1109\/SISY52375.2021.9582508"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sang, D.V., and Ha, P.T. (2018, January 16\u201317). Discriminative Deep Feature Learning for Facial Emotion Recognition. Proceedings of the 2018 1st International Conference on Multimedia Analysis and Pattern Recognition, Hanoi, Vietnam.","DOI":"10.1109\/MAPR.2018.8337514"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Kim, S., Nam, J., and Ko, B.C. (2022). Facial expression recognition based on squeeze vision transformer. Sensors, 22.","DOI":"10.3390\/s22103729"},{"key":"ref_18","unstructured":"Li, H., Sui, M., Zhu, Z., and Zhao, F. (2021). MFEViT: A Robust Lightweight Transformer-Based Network for Multimodal 2D+3D Facial Expression Recognition. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Han, D., Ye, T., Han, Y., Xia, Z., Song, S., and Huang, G. (2023). Agent Attention: On the Integration of Softmax and Linear Attention. arXiv.","DOI":"10.1007\/978-3-031-72973-7_8"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"915","DOI":"10.1109\/TPAMI.2007.1110","article-title":"Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions","volume":"29","author":"Zhao","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"971","DOI":"10.1109\/TPAMI.2002.1017623","article-title":"Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns","volume":"24","author":"Ojala","year":"2002","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). 
Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"781","DOI":"10.1016\/j.ins.2022.11.068","article-title":"Patch Attention Convolutional Vision Transformer for Facial Expression Recognition with Occlusion","volume":"619","author":"Liu","year":"2023","journal-title":"Inf. Sci."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1016\/j.ins.2023.03.105","article-title":"Self-Supervised Vision Transformer-Based Few-Shot Learning for Facial Expression Recognition","volume":"634","author":"Chen","year":"2023","journal-title":"Inf. Sci."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Deng, J., Guo, J., Xue, N., and Zafeiriou, S. (2019, January 15\u201320). ArcFace: Additive Angular Margin Loss for Deep Face Recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00482"},{"key":"ref_26","unstructured":"Chen, C.J. (2025, October 19). PyTorch Face Landmark: A Fast and Accurate Facial Landmark Detector, Available online: https:\/\/github.com\/cunjian\/pytorch_face_landmark."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., and Dong, L. (2022, January 18\u201324). Swin Transformer V2: Scaling Up Capacity and Resolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01170"},{"key":"ref_28","unstructured":"Wang, Z., Yan, C., and Hu, Z. (2021, January 26\u201328). Lightweight Multi-Scale Network with Attention for Facial Expression Recognition. 
Proceedings of the 4th International Conference on Advanced Electronic Materials, Computers and Software Engineering, Zhuhai, China."},{"key":"ref_29","first-page":"1445","article-title":"A Micro-Expression Recognition Method Based on Multi-Level Information Fusion Network","volume":"50","author":"Chen","year":"2024","journal-title":"Acta Autom. Sin."},{"key":"ref_30","first-page":"240234","article-title":"Lightweight Swin Transformer Combined with Multi-Scale Feature Fusion for Face Expression Recognition","volume":"52","author":"Li","year":"2025","journal-title":"Opt.-Electron. Eng."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zheng, C., Mendieta, M., and Chen, C. (2023, January 2\u20133). POSTER: A Pyramid Cross-Fusion Transformer Network for Facial Expression Recognition. Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops, Paris, France.","DOI":"10.1109\/ICCVW60793.2023.00339"},{"key":"ref_32","unstructured":"Hazirbulan, I., Zafeiriou, S., and Pantic, M. (2017, January 21\u201326). Facial Expression Recognition Using Enhanced Deep 3D Convolutional Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/j.ins.2021.08.043","article-title":"Facial Expression Recognition with Grid-Wise Attention and Visual Transformer","volume":"580","author":"Li","year":"2021","journal-title":"Inf. Sci."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Li, S., Deng, W., and Du, J. (2017, January 21\u201326). Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.277"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/TAFFC.2017.2740923","article-title":"AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild","volume":"10","author":"Mollahosseini","year":"2019","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_36","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A Method for Stochastic Optimization. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Xue, F., Wang, Q., and Guo, G. (2021, January 10\u201317). TransFER: Learning Relation-Aware Facial Expression Representations with Transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00358"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Wang, C., Ling, X., and Deng, W. (2022, January 23\u201327). Learn from All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition. Proceedings of the 17th European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-19809-0_24"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhang, S., Zhang, Y., Zhang, Y., Wang, Y., and Song, Z. (2023). A Dual-Direction Attention Mixed Feature Network for Facial Expression Recognition. Electronics, 12.","DOI":"10.3390\/electronics12173595"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Lee, I., Lee, E., and Yoo, S.B. (2023, January 2\u20133). Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00148"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"624","DOI":"10.1109\/TAFFC.2024.3453443","article-title":"From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos","volume":"16","author":"Chen","year":"2024","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Colares, W.G., Costa, M.G.F., and Costa Filho, C.F.F. (2024, January 15\u201318). Enhancing Emotion Recognition: A Dual-Input Model for Facial Expression Recognition Using Images and Facial Landmarks. Proceedings of the 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Orlando, FL, USA.","DOI":"10.1109\/EMBC53108.2024.10782924"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1109\/TAFFC.2024.3454102","article-title":"FERMixNet: An Occlusion Robust Facial Expression Recognition Model With Facial Mixing Augmentation and Mid-Level Representation Learning","volume":"16","author":"Huang","year":"2025","journal-title":"IEEE Trans. Affect. 
Comput."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/11\/982\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,15]],"date-time":"2025-11-15T05:31:08Z","timestamp":1763184668000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/11\/982"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,13]]},"references-count":43,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["info16110982"],"URL":"https:\/\/doi.org\/10.3390\/info16110982","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,13]]}}}