{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,27]],"date-time":"2026-07-27T14:57:28Z","timestamp":1785164248275,"version":"3.55.0"},"reference-count":50,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T00:00:00Z","timestamp":1764201600000},"content-version":"vor","delay-in-days":330,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/doi.wiley.com\/10.1002\/tdm_license_1.1"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62462053"],"award-info":[{"award-number":["62462053"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["International Journal of Intelligent Systems"],"published-print":{"date-parts":[[2025,1]]},"abstract":"<jats:p>Facial expression recognition (FER) remains a challenging task in computer vision. Recent works have shown excellent performance in overall recognition accuracy, but its accuracy significantly decreases when recognizing similar expressions. This is due to interclass homogeneity and intraclass heterogeneity. To address these issues, we propose a novel dual\u2010stage network called DUAL, inspired by contrastive learning. First, we increase the distance between negative samples while reducing the distance between positive ones. This is achieved by dynamically updating pairs of comparison samples. Second, we introduce a two\u2010stage network architecture. The first stage uses two branches to extract image features and facial keypoint features. These branches interact to learn coarse\u2010grained features through mutual guidance. 
The second stage focuses on fine\u2010grained features using scale\u2010specific residual blocks. This allows the model to identify facial regions that are critical for recognizing expressions. We conducted extensive experiments on multiple datasets. The results show that DUAL surpasses state\u2010of\u2010the\u2010art models in terms of performance. Additionally, the model shows high accuracy even in noisy conditions, highlighting its robustness.<\/jats:p>","DOI":"10.1155\/int\/7401168","type":"journal-article","created":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T09:19:27Z","timestamp":1764235167000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["DUAL: A Dual\u2010Stage Approach for Facial Expression Recognition Based on Contrastive Learning"],"prefix":"10.1155","volume":"2025","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-6422-9010","authenticated-orcid":false,"given":"Anting","family":"Zhu","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7713-3520","authenticated-orcid":false,"given":"Xingxing","family":"Jia","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-9303-2190","authenticated-orcid":false,"given":"Longfei","family":"Yang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1634-9840","authenticated-orcid":false,"given":"Huiyu","family":"Zhou","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7516-1699","authenticated-orcid":false,"given":"Wei","family":"Su","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"311","published-online":{"date-parts":[[2025,11,27]]},"reference":[{"key":"e_1_2_9_1_2"
,"volume-title":"Silent Messages","author":"Mehrabian A.","year":"1971"},{"key":"e_1_2_9_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/taffc.2020.2981446"},{"key":"e_1_2_9_3_2","doi-asserted-by":"crossref","unstructured":"RuanD. YanY. LaiS. ChaiZ. ShenC. andWangH. Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2021 7660\u20137669.","DOI":"10.1109\/CVPR46437.2021.00757"},{"key":"e_1_2_9_4_2","doi-asserted-by":"crossref","unstructured":"SavchenkoA. V. Facial Expression and Attributes Recognition Based on Multi-Task Learning of Lightweight Neural Networks 2021 IEEE 19th International Symposium on Intelligent Systems and Informatics (SISY) 2021 IEEE 119\u2013124.","DOI":"10.1109\/SISY52375.2021.9582508"},{"key":"e_1_2_9_5_2","doi-asserted-by":"crossref","unstructured":"ZhengC. MendietaM. andChenC. Poster: A Pyramid Cross-Fusion Transformer Network for Facial Expression Recognition Proceedings of the IEEE\/CVF International Conference on Computer Vision 2023 3146\u20133155.","DOI":"10.1109\/ICCVW60793.2023.00339"},{"key":"e_1_2_9_6_2","doi-asserted-by":"crossref","unstructured":"WangK. PengX. YangJ. LuS. andQiaoY. Suppressing Uncertainties for Large-Scale Facial Expression Recognition Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2020 6897\u20136906.","DOI":"10.1109\/CVPR42600.2020.00693"},{"key":"e_1_2_9_7_2","doi-asserted-by":"crossref","unstructured":"WengJ. YangY. TanZ. andLeiZ. Attentive Hybrid Feature With Two-Step Fusion for Facial Expression Recognition 2020 25th International Conference on Pattern Recognition (ICPR) 2021 IEEE 6410\u20136416.","DOI":"10.1109\/ICPR48806.2021.9412554"},{"key":"e_1_2_9_8_2","doi-asserted-by":"crossref","unstructured":"FanX. DengZ. WangK. PengX. andQiaoY. 
Learning Discriminative Representation for Facial Expression Recognition From Uncertainties 2020 IEEE International Conference on Image Processing (ICIP) 2020 IEEE 903\u2013907.","DOI":"10.1109\/ICIP40778.2020.9190643"},{"key":"e_1_2_9_9_2","doi-asserted-by":"crossref","unstructured":"ZengD. LinZ. YanX. LiuY. WangF. andTangB. Face2exp: Combating Data Biases for Facial Expression Recognition Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2022 20291\u201320300.","DOI":"10.1109\/CVPR52688.2022.01965"},{"key":"e_1_2_9_10_2","doi-asserted-by":"crossref","unstructured":"ZhangX. WangT. LiX. YangH. andYinL. Weakly-Supervised Text-Driven Contrastive Learning for Facial Behavior Understanding Proceedings of the IEEE\/CVF International Conference on Computer Vision 2023 20751\u201320762.","DOI":"10.1109\/ICCV51070.2023.01897"},{"key":"e_1_2_9_11_2","doi-asserted-by":"crossref","unstructured":"GaoZ.andPatrasI. Self-Supervised Facial Representation Learning with Facial Region Awareness Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2024 2081\u20132092 https:\/\/doi.org\/10.1109\/cvpr52733.2024.00203.","DOI":"10.1109\/CVPR52733.2024.00203"},{"key":"e_1_2_9_12_2","doi-asserted-by":"crossref","unstructured":"HasaniB.andMahoorM. H. Facial Expression Recognition Using Enhanced Deep 3d Convolutional Neural Networks Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2017 30\u201340.","DOI":"10.1109\/CVPRW.2017.282"},{"key":"e_1_2_9_13_2","doi-asserted-by":"crossref","unstructured":"QiuY.andWanY. Facial Expression Recognition Based on Landmarks 2019 IEEE 4th Advanced Information Technology Electronic and Automation Control Conference (IAEAC) 2019 IEEE 1356\u20131360.","DOI":"10.1109\/IAEAC47372.2019.8997580"},{"key":"e_1_2_9_14_2","doi-asserted-by":"crossref","unstructured":"JungH. LeeS. YimJ. ParkS. andKimJ. 
Joint Fine-Tuning in Deep Neural Networks for Facial Expression Recognition 2015 IEEE International Conference on Computer Vision (ICCV) Proceedings of the IEEE International Conference on Computer Vision 2015 2983\u20132991 https:\/\/doi.org\/10.1109\/iccv.2015.341 2-s2.0-84973917824.","DOI":"10.1109\/ICCV.2015.341"},{"key":"e_1_2_9_15_2","doi-asserted-by":"crossref","unstructured":"FarzanehA. H.andQiX. Facial Expression Recognition in the Wild via Deep Attentive Center Loss Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision 2021 2402\u20132411.","DOI":"10.1109\/WACV48630.2021.00245"},{"key":"e_1_2_9_16_2","first-page":"17616","article-title":"Relative Uncertainty Learning for Facial Expression Recognition","volume":"34","author":"Zhang Y.","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_9_17_2","doi-asserted-by":"crossref","unstructured":"XueF. WangQ. andGuoG. Transfer: Learning Relation-Aware Facial Expression Representations With Transformers Proceedings of the IEEE\/CVF International Conference on Computer Vision 2021 3601\u20133610.","DOI":"10.1109\/ICCV48922.2021.00358"},{"key":"e_1_2_9_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2021.3093397"},{"key":"e_1_2_9_19_2","doi-asserted-by":"publisher","DOI":"10.3390\/biomimetics8020199"},{"key":"e_1_2_9_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2019.2956143"},{"key":"e_1_2_9_21_2","unstructured":"DosovitskiyA. BeyerL. KolesnikovA.et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale 2020 https:\/\/arxiv.org\/abs\/2010.11929."},{"key":"e_1_2_9_22_2","unstructured":"LiH. SuiM. ZhaoF. ZhaZ. andWuF. Mvt: Mask Vision Transformer for Facial Expression Recognition in the Wild 2021 https:\/\/arxiv.org\/abs\/2106.04520."},{"key":"e_1_2_9_23_2","unstructured":"Van den OordA. LiY. andVinyalsO. 
Representation Learning with Contrastive Predictive Coding 2018 https:\/\/arxiv.org\/abs\/1807.03748."},{"key":"e_1_2_9_24_2","unstructured":"ChenT. KornblithS. NorouziM. andHintonG. A Simple Framework for Contrastive Learning of Visual Representations International Conference on Machine Learning 2020 PMLR 1597\u20131607."},{"key":"e_1_2_9_25_2","doi-asserted-by":"crossref","unstructured":"ReimersN.andGurevychI. Sentence-Bert: Sentence Embeddings Using Siamese Bert-Networks Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing 2019 3982\u20133992.","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_2_9_26_2","doi-asserted-by":"crossref","unstructured":"MuellerJ.andThyagarajanA. Siamese Recurrent Architectures for Learning Sentence Similarity Proceedings of the AAAI Conference on Artificial Intelligence 2016.","DOI":"10.1609\/aaai.v30i1.10350"},{"key":"e_1_2_9_27_2","doi-asserted-by":"publisher","DOI":"10.1142\/s0129065723500326"},{"key":"e_1_2_9_28_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-023-14803-5"},{"key":"e_1_2_9_29_2","doi-asserted-by":"crossref","unstructured":"RoyS.andEtemadA. Self-Supervised Contrastive Learning of Multi-View Facial Expressions Proceedings of the 2021 International Conference on Multimodal Interaction Proceedings of the 2021 International Conference on Multimodal Interaction 2021 253\u2013257 https:\/\/doi.org\/10.1145\/3462244.3479955.","DOI":"10.1145\/3462244.3479955"},{"key":"e_1_2_9_30_2","doi-asserted-by":"crossref","unstructured":"KimD.andSongB. C. Emotion-Aware Multi-View Contrastive Learning for Facial Emotion Recognition European Conference on Computer Vision 2022 Springer 178\u2013195.","DOI":"10.1007\/978-3-031-19778-9_11"},{"key":"e_1_2_9_31_2","doi-asserted-by":"crossref","unstructured":"HeK. FanH. WuY. XieS. andGirshickR. 
Momentum Contrast for Unsupervised Visual Representation Learning 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2020 CVPR 9726\u20139735.","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"e_1_2_9_32_2","unstructured":"ChenT. KornblithS. NorouziM. andHintonG. A Simple Framework for Contrastive Learning of Visual Representations Proceedings of the 37th International Conference on Machine Learning 2020 JMLR.org."},{"key":"e_1_2_9_33_2","volume-title":"Supervised Contrastive Learning","author":"Khosla P.","year":"2020"},{"key":"e_1_2_9_34_2","doi-asserted-by":"crossref","unstructured":"SongX. HuangL. HuS.et al. Supervised Prototypical Contrastive Learning for Emotion Recognition in Conversation Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022 5197\u20135206.","DOI":"10.18653\/v1\/2022.emnlp-main.347"},{"key":"e_1_2_9_35_2","unstructured":"KalantidisY. SariyildizM. B. PionN. WeinzaepfelP. andLarlusD. Hard Negative Mixing for Contrastive Learning Proceedings of the 34th International Conference on Neural Information Processing Systems 2020 Curran Associates Inc Red Hook NY USA."},{"key":"e_1_2_9_36_2","doi-asserted-by":"crossref","unstructured":"DengJ. GuoJ. XueN. andZafeiriouS. Arcface: Additive Angular Margin Loss for Deep Face Recognition Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2019 4690\u20134699.","DOI":"10.1109\/CVPR.2019.00482"},{"key":"e_1_2_9_37_2","unstructured":"ChenC. Pytorch Face Landmark: A Fast and Accurate Facial Landmark Detector 2021."},{"key":"e_1_2_9_38_2","doi-asserted-by":"crossref","unstructured":"HadsellR. ChopraS. andLeCunY. Dimensionality Reduction by Learning an Invariant Mapping 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201906) 2006 IEEE 1735\u20131742.","DOI":"10.1109\/CVPR.2006.100"},{"key":"e_1_2_9_39_2","doi-asserted-by":"crossref","unstructured":"SchroffF. KalenichenkoD. andPhilbinJ. 
Facenet: A Unified Embedding for Face Recognition and Clustering 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015 815\u2013823 https:\/\/doi.org\/10.1109\/cvpr.2015.7298682 2-s2.0-84946751287.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"e_1_2_9_40_2","doi-asserted-by":"crossref","unstructured":"ZhangY. WangC. LingX. andDengW. Learn From All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition European Conference on Computer Vision 2022 Springer 418\u2013434.","DOI":"10.1007\/978-3-031-19809-0_24"},{"key":"e_1_2_9_41_2","doi-asserted-by":"crossref","unstructured":"SelvarajuR. R. CogswellM. DasA. VedantamR. ParikhD. andBatraD. Grad-Cam: Visual Explanations from Deep Networks via Gradient-Based Localization Proceedings of the IEEE International Conference on Computer Vision 2017 618\u2013626 https:\/\/doi.org\/10.1109\/iccv.2017.74 2-s2.0-85041910265.","DOI":"10.1109\/ICCV.2017.74"},{"key":"e_1_2_9_42_2","doi-asserted-by":"crossref","unstructured":"LiS. DengW. andDuJ. Reliable Crowdsourcing and Deep locality-preserving Learning for Expression Recognition in the Wild Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017 2852\u20132861.","DOI":"10.1109\/CVPR.2017.277"},{"key":"e_1_2_9_43_2","doi-asserted-by":"crossref","unstructured":"BarsoumE. ZhangC. FerrerC. C. andZhangZ. Training Deep Networks for Facial Expression Recognition With Crowd-Sourced Label Distribution Proceedings of the 18th ACM International Conference on Multimodal Interaction 2016 279\u2013283 https:\/\/doi.org\/10.1145\/2993148.2993165 2-s2.0-85016568635.","DOI":"10.1145\/2993148.2993165"},{"key":"e_1_2_9_44_2","unstructured":"GoodfellowI. J. ErhanD. CarrierP. L.et al. 
Challenges in Representation Learning: A Report on Three Machine Learning Contests Neural Information Processing: 20th International Conference ICONIP 2013 Daegu Korea November 3-7 2013. Proceedings Part III 20 2013 Springer 117\u2013124."},{"key":"e_1_2_9_45_2","doi-asserted-by":"crossref","unstructured":"MollahosseiniA. HasaniB. andMahoorM. H. Affectnet: A Database for Facial Expression Valence and Arousal Computing in the Wild IEEE Transactions on Affective Computing 10 2017 no. 1 18\u201331 https:\/\/doi.org\/10.1109\/taffc.2017.2740923 2-s2.0-85028454548.","DOI":"10.1109\/TAFFC.2017.2740923"},{"key":"e_1_2_9_46_2","doi-asserted-by":"crossref","unstructured":"GuoY. ZhangL. HuY. HeX. andGaoJ. Ms-celeb-1m: a Dataset and Benchmark for large-scale Face Recognition Computer Vision\u2013ECCV 2016: 14th European Conference October 2016 Amsterdam The Netherlands Springer 87\u2013102.","DOI":"10.1007\/978-3-319-46487-9_6"},{"key":"e_1_2_9_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/taffc.2024.3382618"},{"key":"e_1_2_9_48_2","doi-asserted-by":"crossref","unstructured":"WagnerN. M\u00e4tzlerF. VossbergS. R. SchneiderH. PavlitskaS. andZ\u00f6llnerJ. M. Cage: Circumplex Affect Guided Expression Inference Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2024 4683\u20134692 https:\/\/doi.org\/10.1109\/cvprw63382.2024.00471.","DOI":"10.1109\/CVPRW63382.2024.00471"},{"key":"e_1_2_9_49_2","doi-asserted-by":"crossref","unstructured":"WuZ.andCuiJ. 
La-net: Landmark-Aware Learning for Reliable Facial Expression Recognition Under Label Noise Proceedings of the IEEE\/CVF International Conference on Computer Vision 2023 20698\u201320707.","DOI":"10.1109\/ICCV51070.2023.01892"},{"key":"e_1_2_9_50_2","article-title":"Visualizing Data Using t-sne","volume":"9","author":"Van der Maaten L.","year":"2008","journal-title":"Journal of Machine Learning Research"}],"container-title":["International Journal of Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/int\/7401168","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1155\/int\/7401168","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/int\/7401168","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T18:14:54Z","timestamp":1772993694000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/int\/7401168"}},"subtitle":[],"editor":[{"given":"Richard","family":"Murray","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2025,1]]},"references-count":50,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1]]}},"alternative-id":["10.1155\/int\/7401168"],"URL":"https:\/\/doi.org\/10.1155\/int\/7401168","archive":["Portico"],"relation":{},"ISSN":["0884-8173","1098-111X"],"issn-type":[{"value":"0884-8173","type":"print"},{"value":"1098-111X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1]]},"assertion":[{"value":"2025-03-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publicatio
n History"}},{"value":"2025-10-08","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-11-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"7401168"}}