{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T16:04:53Z","timestamp":1772553893159,"version":"3.50.1"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2023,3,15]],"date-time":"2023-03-15T00:00:00Z","timestamp":1678838400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["U21A20518, 61976086, and 62272157"],"award-info":[{"award-number":["U21A20518, 61976086, and 62272157"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100013096","name":"State Grid Science and Technology Project","doi-asserted-by":"crossref","award":["5100-202123009A"],"award-info":[{"award-number":["5100-202123009A"]}],"id":[{"id":"10.13039\/501100013096","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Special Project of Foshan Science and Technology Innovation Team","award":["FS0AA-KJ919-4402-0069"],"award-info":[{"award-number":["FS0AA-KJ919-4402-0069"]}]},{"name":"National Natural Science Foundation of Changsha","award":["kq2202177"],"award-info":[{"award-number":["kq2202177"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,5,31]]},"abstract":"<jats:p>Single-label facial expression recognition (FER), which aims to assign a single expression to each facial image, usually suffers from noisy and incomplete labels: the manual annotations of some training images are wrong or incomplete, which degrades performance. 
Although prior work has leveraged external sources or additional manual annotations to handle this problem, doing so usually incurs extra cost. This article explores a simple yet effective three-phase paradigm (\u201cwarm-up,\u201d \u201cselection,\u201d and \u201crelabeling\u201d) for the FER task. First, the warm-up phase builds an initial recognition network on the noisy samples for discriminative feature extraction and facial expression prediction. Then, the selection phase defines several rules to choose high-confidence samples according to their prediction scores, and the relabeling phase assigns two potential labels to those samples and updates the network with a composite two-label loss. Compared with previous studies, the three-phase learning can correct noisy ground-truth labels without extra information and automatically assign two potential labels to single-label samples without manual annotation. As a result, the label information is purified and supplemented at little cost, yielding a significant performance improvement. 
Extensive experiments are conducted on three datasets, and the experimental results demonstrate that our approach is robust to noisy training samples and outperforms several state-of-the-art methods.<\/jats:p>","DOI":"10.1145\/3570329","type":"journal-article","created":{"date-parts":[[2022,11,17]],"date-time":"2022-11-17T15:13:02Z","timestamp":1668697982000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["TP-FER: An Effective Three-phase Noise-tolerant Recognizer for Facial Expression Recognition"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4857-2214","authenticated-orcid":false,"given":"Junjie","family":"Li","sequence":"first","affiliation":[{"name":"Hunan University, Yuelu Qu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9600-7789","authenticated-orcid":false,"given":"Jin","family":"Yuan","sequence":"additional","affiliation":[{"name":"Hunan University, Yuelu Qu,  China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9720-5915","authenticated-orcid":false,"given":"Zhiyong","family":"Li","sequence":"additional","affiliation":[{"name":"Hunan University, Yuelu Qu,  China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,3,15]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"7142","volume-title":"Proceedings of the 25th International Conference on Pattern Recognition (ICPR\u201921)","author":"Algan G\u00f6rkem","year":"2021","unstructured":"G\u00f6rkem Algan and Ilkay Ulusoy. 2021. Meta soft label generation for noisy labels. In Proceedings of the 25th International Conference on Pattern Recognition (ICPR\u201921). 
IEEE, 7142\u20137148."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/2993148.2993165"},{"key":"e_1_3_2_4_2","first-page":"302","volume-title":"Proceedings of the 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG\u201918)","author":"Cai Jie","year":"2018","unstructured":"Jie Cai, Zibo Meng, Ahmed Shehab Khan, Zhiyuan Li, James O\u2019Reilly, and Yan Tong. 2018. Island loss for learning discriminative features in facial expression recognition. In Proceedings of the 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG\u201918). IEEE, 302\u2013309."},{"key":"e_1_3_2_5_2","first-page":"13984","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Chen Shikai","year":"2020","unstructured":"Shikai Chen, Jianfeng Wang, Yuedong Chen, Zhongchao Shi, Xin Geng, and Yong Rui. 2020. Label distribution learning on auxiliary label space graphs for facial expression recognition. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 13984\u201313993."},{"key":"e_1_3_2_6_2","first-page":"423","volume-title":"Proceedings of the ACM on International Conference on Multimodal Interaction","author":"Dhall Abhinav","year":"2015","unstructured":"Abhinav Dhall, O. V. Ramana Murthy, Roland Goecke, Jyoti Joshi, and Tom Gedeon. 2015. Video and image based emotion recognition challenges in the wild: Emotiw 2015. In Proceedings of the ACM on International Conference on Multimodal Interaction. 423\u2013426."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCB48548.2020.9304923"},{"key":"e_1_3_2_8_2","first-page":"1","volume-title":"Proceedings of the 12th Indian Conference on Computer Vision, Graphics and Image Processing","author":"Gera Darshan","year":"2021","unstructured":"Darshan Gera. 2021. Handling ambiguous annotations for facial expression recognition in the wild. 
In Proceedings of the 12th Indian Conference on Computer Vision, Graphics and Image Processing. 1\u20139."},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2021.01.029"},{"key":"e_1_3_2_10_2","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1007\/978-3-642-42051-1_16","volume-title":"Proceedings of the International Conference on Neural Information Processing","author":"Goodfellow Ian J.","year":"2013","unstructured":"Ian J. Goodfellow, Dumitru Erhan, Pierre Luc Carrier, Aaron Courville, Mehdi Mirza, Ben Hamner, Will Cukierski, Yichuan Tang, David Thaler, Dong-Hyun Lee, et\u00a0al. 2013. Challenges in representation learning: A report on three machine learning contests. In Proceedings of the International Conference on Neural Information Processing. Springer, 117\u2013124."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_12_2","first-page":"2712","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Hendrycks Dan","year":"2019","unstructured":"Dan Hendrycks, Kimin Lee, and Mantas Mazeika. 2019. Using pre-training can improve model robustness and uncertainty. In Proceedings of the International Conference on Machine Learning. PMLR, 2712\u20132721."},{"key":"e_1_3_2_13_2","first-page":"11879","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919)","author":"Hu Wei","year":"2019","unstructured":"Wei Hu, Yangyu Huang, Fan Zhang, and Ruirui Li. 2019. Noise-tolerant paradigm for training face recognition CNNs. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919). 
IEEE, 11879\u201311888."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00342"},{"key":"e_1_3_2_15_2","first-page":"460","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshop (ICCVW\u201919)","author":"Jaehwan Lee","year":"2019","unstructured":"Lee Jaehwan, Yoo Donggeun, and Kim Hyo-Eun. 2019. Photometric transformer networks and label adjustment for breast density prediction. In Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshop (ICCVW\u201919). IEEE Computer Society, 460\u2013466."},{"key":"e_1_3_2_16_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Li Junnan","year":"2019","unstructured":"Junnan Li, Richard Socher, and Steven C. H. Hoi. 2019. DivideMix: Learning with noisy labels as semi-supervised learning. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_17_2","first-page":"5051","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Li Junnan","year":"2019","unstructured":"Junnan Li, Yongkang Wong, Qi Zhao, and Mohan S. Kankanhalli. 2019. Learning to learn from noisy labeled data. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 5051\u20135059."},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2868382"},{"key":"e_1_3_2_19_2","first-page":"2852","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Li Shan","year":"2017","unstructured":"Shan Li, Weihong Deng, and JunPing Du. 2017. Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
2852\u20132861."},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2886767"},{"key":"e_1_3_2_21_2","first-page":"3132","volume-title":"Proceedings of the 24th International Conference on Pattern Recognition (ICPR\u201918)","author":"Luo Zimeng","year":"2018","unstructured":"Zimeng Luo, Jiani Hu, and Weihong Deng. 2018. Local subclass constraint for facial expression recognition in the wild. In Proceedings of the 24th International Conference on Pattern Recognition (ICPR\u201918). IEEE, 3132\u20133137."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3474085.3475249"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2017.2740923"},{"key":"e_1_3_2_24_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Nguyen Duc Tam","year":"2019","unstructured":"Duc Tam Nguyen, Chaithanya Kumar Mummadi, Thi Phuong Nhung Ngo, Thi Hoai Phuong Nguyen, Laura Beggel, and Thomas Brox. 2019. SELF: Learning to filter noisy labels with self-ensembling. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_25_2","article-title":"Local multi-head channel self-attention for facial expression recognition","author":"Pecoraro Roberto","year":"2021","unstructured":"Roberto Pecoraro, Valerio Basile, Viviana Bono, and Sara Gallo. 2021. Local multi-head channel self-attention for facial expression recognition. Retrieved from https:\/\/arXiv:2111.07224.","journal-title":"Retrieved from https:\/\/arXiv:2111.07224"},{"key":"e_1_3_2_26_2","first-page":"4513","volume-title":"Proceedings of the 25th International Conference on Pattern Recognition (ICPR\u201921)","author":"Pham Luan","year":"2021","unstructured":"Luan Pham, The Huynh Vu, and Tuan Anh Tran. 2021. Facial expression recognition using residual masking network. In Proceedings of the 25th International Conference on Pattern Recognition (ICPR\u201921). 
IEEE, 4513\u20134519."},{"key":"e_1_3_2_27_2","article-title":"Facial expression recognition using convolutional neural networks: State of the art","author":"Pramerdorfer Christopher","year":"2016","unstructured":"Christopher Pramerdorfer and Martin Kampel. 2016. Facial expression recognition using convolutional neural networks: State of the art. Retrieved from https:\/\/arXiv:1612.02903.","journal-title":"Retrieved from https:\/\/arXiv:1612.02903"},{"key":"e_1_3_2_28_2","first-page":"7656","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201921)","author":"Ruan Delian","year":"2021","unstructured":"Delian Ruan, Yan Yan, Shenqi Lai, Zhenhua Chai, Chunhua Shen, and Hanzi Wang. 2021. Feature decomposition and reconstruction learning for effective facial expression recognition. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201921). IEEE, 7656\u20137665."},{"key":"e_1_3_2_29_2","first-page":"737","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Sharma Karishma","year":"2020","unstructured":"Karishma Sharma, Pinar Donmez, Enming Luo, Yan Liu, and I Zeki Yalniz. 2020. Noiserank: Unsupervised label noise reduction with dependence models. In Proceedings of the European Conference on Computer Vision. Springer, 737\u2013753."},{"key":"e_1_3_2_30_2","first-page":"6248","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"She Jiahui","year":"2021","unstructured":"Jiahui She, Yibo Hu, Hailin Shi, Jun Wang, Qiu Shen, and Tao Mei. 2021. Dive into ambiguity: Latent distribution mining and pairwise uncertainty estimation for facial expression recognition. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
6248\u20136257."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.11.026"},{"key":"e_1_3_2_32_2","article-title":"Very deep convolutional networks for large-scale image recognition","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. Retrieved from https:\/\/arXiv:1409.1556.","journal-title":"Retrieved from https:\/\/arXiv:1409.1556"},{"key":"e_1_3_2_33_2","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201919)","author":"Thulasidasan Sunil","year":"2019","unstructured":"Sunil Thulasidasan, Tanmoy Bhattacharya, Jeff A. Bilmes, Gopinath Chennupati, and Jamal Mohd-Yusof. 2019. Combating label noise in deep learning using abstention. In Proceedings of the International Conference on Machine Learning (ICML\u201919)."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.696"},{"key":"e_1_3_2_35_2","article-title":"AU-guided unsupervised domain adaptive facial expression recognition","author":"Wang Kai","year":"2020","unstructured":"Kai Wang, Yuxin Gu, Xiaojiang Peng, Panpan Zhang, Baigui Sun, and Hao Li. 2020. AU-guided unsupervised domain adaptive facial expression recognition. Retrieved from https:\/\/arXiv:2012.10078.","journal-title":"Retrieved from https:\/\/arXiv:2012.10078"},{"key":"e_1_3_2_36_2","first-page":"6897","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Kai","year":"2020","unstructured":"Kai Wang, Xiaojiang Peng, Jianfei Yang, Shijian Lu, and Yu Qiao. 2020. Suppressing uncertainties for large-scale facial expression recognition. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
6897\u20136906."},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2956143"},{"issue":"7","key":"e_1_3_2_38_2","doi-asserted-by":"crossref","first-page":"3330","DOI":"10.1109\/TCYB.2019.2894498","article-title":"Enhancing sketch-based image retrieval by cnn semantic re-ranking","volume":"50","author":"Wang Luo","year":"2019","unstructured":"Luo Wang, Xueming Qian, Yuting Zhang, Jialie Shen, and Xiaochun Cao. 2019. Enhancing sketch-based image retrieval by cnn semantic re-ranking. IEEE Trans. Cybernet. 50, 7 (2019), 3330\u20133342.","journal-title":"IEEE Trans. Cybernet."},{"key":"e_1_3_2_39_2","article-title":"Improved mean absolute error for learning meaningful patterns from abnormal training data","author":"Wang Xinshao","year":"2019","unstructured":"Xinshao Wang, Elyor Kodirov, Yang Hua, and Neil M. Robertson. 2019. Improved mean absolute error for learning meaningful patterns from abnormal training data. Technical report.","journal-title":"Technical report"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00041"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.03.019"},{"key":"e_1_3_2_42_2","first-page":"757","volume-title":"Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV\u201918)","author":"Yuan Bodi","year":"2018","unstructured":"Bodi Yuan, Jianyu Chen, Weidong Zhang, Hung-Shuo Tai, and Sara McMains. 2018. Iterative cross learning on noisy labels. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV\u201918). IEEE, 757\u2013765."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2022.3145158"},{"key":"e_1_3_2_44_2","first-page":"20291","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Zeng Dan","year":"2022","unstructured":"Dan Zeng, Zhiyuan Lin, Xiao Yan, Yuting Liu, Fei Wang, and Bo Tang. 2022. 
Face2Exp: Combating data biases for facial expression recognition. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 20291\u201320300."},{"key":"e_1_3_2_45_2","first-page":"222","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Zeng Jiabei","year":"2018","unstructured":"Jiabei Zeng, Shiguang Shan, and Xilin Chen. 2018. Facial expression recognition with inconsistently annotated datasets. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 222\u2013237."},{"key":"e_1_3_2_46_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Zhang Hongyi","year":"2018","unstructured":"Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz. 2018. mixup: Beyond empirical risk minimization. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_47_2","article-title":"Relative uncertainty learning for facial expression recognition","volume":"34","author":"Zhang Yuhang","year":"2021","unstructured":"Yuhang Zhang, Chengrui Wang, and Weihong Deng. 2021. Relative uncertainty learning for facial expression recognition. Adv. Neural Info. Process. Syst. 34 (2021).","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_2_48_2","first-page":"3510","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"35","author":"Zhao Zengqun","year":"2021","unstructured":"Zengqun Zhao, Qingshan Liu, and Feng Zhou. 2021. Robust lightweight facial expression recognition network with label distribution training. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 
3510\u20133519."},{"key":"e_1_3_2_49_2","first-page":"11447","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Zheng Songzhu","year":"2020","unstructured":"Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, and Chao Chen. 2020. Error-bounded correction of noisy labels. In Proceedings of the International Conference on Machine Learning. PMLR, 11447\u201311457."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3570329","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3570329","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:38Z","timestamp":1750182578000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3570329"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,15]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,5,31]]}},"alternative-id":["10.1145\/3570329"],"URL":"https:\/\/doi.org\/10.1145\/3570329","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,15]]},"assertion":[{"value":"2022-04-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-10-22","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-03-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}