{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,18]],"date-time":"2026-02-18T23:22:43Z","timestamp":1771456963798,"version":"3.50.1"},"reference-count":50,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2023,8,22]],"date-time":"2023-08-22T00:00:00Z","timestamp":1692662400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the National Key R&D Program of China","award":["2018YFC2001700"],"award-info":[{"award-number":["2018YFC2001700"]}]},{"name":"the National Key R&D Program of China","award":["L192005"],"award-info":[{"award-number":["L192005"]}]},{"DOI":"10.13039\/501100004826","name":"Beijing Natural Science Foundation","doi-asserted-by":"publisher","award":["2018YFC2001700"],"award-info":[{"award-number":["2018YFC2001700"]}],"id":[{"id":"10.13039\/501100004826","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004826","name":"Beijing Natural Science Foundation","doi-asserted-by":"publisher","award":["L192005"],"award-info":[{"award-number":["L192005"]}],"id":[{"id":"10.13039\/501100004826","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The emotional changes in facial micro-expressions are combinations of action units. The researchers have revealed that action units can be used as additional auxiliary data to improve facial micro-expression recognition. Most of the researchers attempt to fuse image features and action unit information. However, these works ignore the impact of action units on the facial image feature extraction process. Therefore, this paper proposes a local detail feature enhancement model based on a multimodal dynamic attention fusion network (MADFN) method for micro-expression recognition. 
This method uses a masked autoencoder based on learnable class tokens to remove local areas with low emotional expression ability in micro-expression images. Then, we utilize the action unit dynamic fusion module to fuse action unit representation to improve the potential representation ability of image features. The state-of-the-art performance of our proposed model is evaluated and verified on SMIC, CASME II, SAMM, and their combined 3DB-Combined datasets. The experimental results demonstrated that the proposed model achieved competitive performance with accuracy rates of 81.71%, 82.11%, and 77.21% on SMIC, CASME II, and SAMM datasets, respectively, that show the MADFN model can help to improve the discrimination of facial image emotional features.<\/jats:p>","DOI":"10.3390\/e25091246","type":"journal-article","created":{"date-parts":[[2023,8,22]],"date-time":"2023-08-22T08:58:54Z","timestamp":1692694734000},"page":"1246","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Multimodal Attention Dynamic Fusion Network for Facial Micro-Expression Recognition"],"prefix":"10.3390","volume":"25","author":[{"given":"Hongling","family":"Yang","sequence":"first","affiliation":[{"name":"Department of Computer Science, Changzhi University, Changzhi 046011, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lun","family":"Xie","sequence":"additional","affiliation":[{"name":"School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hang","family":"Pan","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Changzhi University, Changzhi 046011, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chiqin","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer and Communication Engineering, 
University of Science and Technology Beijing, Beijing 100083, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhiliang","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jialiang","family":"Zhong","sequence":"additional","affiliation":[{"name":"School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,22]]},"reference":[{"key":"ref_1","first-page":"5","article-title":"Lie catching and microexpressions","volume":"1","author":"Ekman","year":"2009","journal-title":"Philos. Decept."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1016\/j.tics.2019.05.006","article-title":"Multimodal language processing in human communication","volume":"23","author":"Holler","year":"2019","journal-title":"Trends Cognit. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1007\/s10979-008-9166-4","article-title":"Police lie detection accuracy: The effect of lie scenario","volume":"33","author":"Frank","year":"2009","journal-title":"Law. Human. Behav."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1016\/j.neucom.2021.01.032","article-title":"Micro-expression action unit detection with spatial and channel attention","volume":"436","author":"Li","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Xie, H.-X., Lo, L., Shuai, H.-H., and Cheng, W.-H. (2020, January 12\u201316). AU-assisted Graph Attention Convolutional Network for Micro-Expression Recognition. 
Proceedings of the ACM International Conference on Multimedia (ACM MM), Seattle, WA, USA.","DOI":"10.1145\/3394171.3414012"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Lei, L., Chen, T., Li, S., and Li, J. (2021, January 20\u201325). Micro-expression recognition based on facial graph representation learning and facial action unit fusion. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPRW53098.2021.00173"},{"key":"ref_7","unstructured":"Zhao, X., Ma, H., and Wang, R. (November, January 29). STA-GCN: Spatio-Temporal AU Graph Convolution Network for Facial Micro-expression Recognition. Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Beijing, China."},{"key":"ref_8","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020, January 26\u201330). An Image is Worth 16 \u00d7 16 Words: Transformers for Image Recognition at Scale. Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia."},{"key":"ref_9","unstructured":"Wang, Y., Huang, R., Song, S., Huang, Z., and Huang, G. (2021, January 6\u201314). Not All Images are Worth 16 \u00d7 16 Words: Dynamic Transformers for Efficient Image Recognition. Proceedings of the Advances Conference on Neural Information Processing Systems (NeurIPS), Virtual."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Lu, X., Cao, G., Yang, Y., Jiao, L., and Liu, F. (2021, January 11\u201317). ViT-YOLO: Transformer-Based YOLO for Object Detection. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00314"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2021, January 20\u201325). Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_12","first-page":"14745","article-title":"Transgan: Two pure transformers can make one strong gan, and that can scale up","volume":"34","author":"Jiang","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, X., Pfister, T., Huang, X., Zhao, G., and Pietik\u00e4inen, M. (2013, January 22\u201326). A spontaneous micro-expression database: Inducement, collection and baseline. Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Shanghai, China.","DOI":"10.1109\/FG.2013.6553717"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Yan, W.J., Li, X., Wang, S.J., Zhao, G., Liu, Y.J., Chen, Y.H., and Fu, X. (2014). CASME II: An improved spontaneous micro-expression database and the baseline evaluation. PLoS ONE, 9.","DOI":"10.1371\/journal.pone.0086041"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1109\/TAFFC.2016.2573832","article-title":"SAMM: A spontaneous micro-facial movement dataset","volume":"9","author":"Davison","year":"2016","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"See, J., Yap, M.H., Li, J., Hong, X., and Wang, S.J. (2019, January 14\u201318). Megc 2019\u2014The second facial micro-expressions grand challenge. 
Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Lille, France.","DOI":"10.1109\/FG.2019.8756611"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Pfister, T., Li, X., Zhao, G., and Pietik\u00e4inen, M. (2011, January 6\u201313). Recognising spontaneous facial micro-expressions. Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126401"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Huang, X., Wang, S.J., Zhao, G., and Piteikainen, M. (2015, January 7\u201313). Facial micro-expression recognition using spatiotemporal local binary pattern with integral projection. Proceedings of the IEEE International Conference on Computer Vision Workshops, Santiago, Chile.","DOI":"10.1109\/ICCVW.2015.10"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Le Ngo, A.C., Liong, S.T., See, J., and Phan, R.C.W. (2015, January 21\u201324). Are subtle expressions too sparse to recognize?. Proceedings of the IEEE International Conference on Digital Signal Processing, Singapore.","DOI":"10.1109\/ICDSP.2015.7252080"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"564","DOI":"10.1016\/j.neucom.2015.10.096","article-title":"Spontaneous facial micro-expression analysis using spatiotemporal completed local quantized patterns","volume":"175","author":"Huang","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1109\/TAFFC.2017.2667642","article-title":"Towards reading hidden emotions: A comparative study of spontaneous micro-expression spotting and recognition methods","volume":"9","author":"Li","year":"2017","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Faisal, M.M., Mohammed, M.S., Abduljabar, A.M., Abdulhussain, S.H., Mahmmod, B.M., Khan, W., and Hussain, A. (2021, January 7\u201310). 
Object Detection and Distance Measurement Using AI. Proceedings of the 2021 14th International Conference on Developments in eSystems Engineering (DeSE), Sharjah, United Arab Emirates.","DOI":"10.1109\/DeSE54285.2021.9719469"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"100969","DOI":"10.1016\/j.rineng.2023.100969","article-title":"Low-cost autonomous car level 2: Design and implementation for conventional vehicles","volume":"17","author":"Mohammed","year":"2023","journal-title":"Results Eng."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wang, S.J., Yan, W.J., Li, X., Zhao, G., and Fu, X. (2014, January 24\u201328). Micro-expression recognition using dynamic textures on tensor independent color space. Proceedings of the International Conference on Pattern Recognition, Stockholm, Sweden.","DOI":"10.1109\/ICPR.2014.800"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1007\/s11760-022-02286-0","article-title":"A novel micro-expression recognition algorithm using dual-stream combining optical flow and dynamic image convolutional neural networks","volume":"17","author":"Tang","year":"2023","journal-title":"Signal Image Video Process."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.ins.2022.11.113","article-title":"Deep3DCANN: A Deep 3DCNN-ANN framework for spontaneous micro-expression recognition","volume":"630","author":"Thuseethan","year":"2023","journal-title":"Inf. Sci."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"122","DOI":"10.1016\/j.patrec.2023.02.003","article-title":"Temporal augmented contrastive learning for micro-expression recognition","volume":"167","author":"Wang","year":"2023","journal-title":"Pattern Recognit. Lett."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kim, D.H., Baddar, W.J., and Ro, Y.M. (2016, January 15\u201319). 
Micro-expression recognition with expression-state constrained spatio-temporal feature representations. Proceedings of the ACM International Conference on Multimedia, Amsterdam, The Netherlands.","DOI":"10.1145\/2964284.2967247"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.image.2019.02.005","article-title":"Off-apexnet on micro-expression recognition system","volume":"74","author":"Gan","year":"2019","journal-title":"Signal Process. Image Commun."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Van Quang, N., Chun, J., and Tokuyama, T. (2019, January 14\u201318). Capsulenet for micro-expression recognition. Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Lille, France.","DOI":"10.1109\/FG.2019.8756544"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhou, L., Mao, Q., and Xue, L. (2019, January 14\u201318). Dual-inception network for cross-database micro-expression recognition. Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Lille, France.","DOI":"10.1109\/FG.2019.8756579"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Liong, S.T., Gan, Y.S., See, J., Khor, H.Q., and Huang, Y.C. (2019, January 14\u201318). Shallow triple stream three-dimensional cnn (ststnet) for micro-expression recognition. Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Lille, France.","DOI":"10.1109\/FG.2019.8756567"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, Y., Du, H., Zheng, L., and Gedeon, T. (2019, January 14\u201318). A neural micro-expression recognizer. 
Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Lille, France.","DOI":"10.1109\/FG.2019.8756583"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"6034","DOI":"10.1109\/TIP.2015.2496314","article-title":"Micro-Expression Recognition Using Color Spaces","volume":"24","author":"Wang","year":"2015","journal-title":"IEEE Trans. Image Process."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Davison, A., Merghani, W., Lansley, C., Ng, C.C., and Yap, M.H. (2018, January 15\u201319). Objective micro-facial movement detection using facs-based regions and baseline evaluation. Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Xi\u2019an, China.","DOI":"10.1109\/FG.2018.00101"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, S.J., Yan, W.J., Zhao, G., Fu, X., and Zhou, C.G. (2014, January 6\u201312). Micro-expression recognition using robust principal component analysis and local spatiotemporal directional features. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-16178-5_23"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1109\/TAFFC.2015.2485205","article-title":"A main directional mean optical flow feature for spontaneous micro-expression recognition","volume":"7","author":"Liu","year":"2015","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1109\/TAFFC.2016.2518162","article-title":"Microexpression identification and categorization using a facial dynamics map","volume":"8","author":"Xu","year":"2017","journal-title":"IEEE Trans. Affect. 
Comput."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"394","DOI":"10.1109\/TAFFC.2017.2723386","article-title":"Fuzzy histogram of optical flow orientations for micro-expression recognition","volume":"10","author":"Happy","year":"2017","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.image.2017.11.006","article-title":"Less is more: Micro-expression recognition from video using apex frame","volume":"62","author":"Liong","year":"2018","journal-title":"Signal Process. Image Commun."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Chen, B., Zhang, Z., Liu, N., Tan, Y., Liu, X., and Chen, T. (2020). Spatiotemporal Convolutional Neural Network with Convolutional Block Attention Module for Micro-Expression Recognition. Information, 11.","DOI":"10.3390\/info11080380"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1109\/TIP.2020.3035042","article-title":"Joint Local and Global Information Learning With Single Apex Frame Detection for Micro-Expression Recognition","volume":"30","author":"Li","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1016\/j.neucom.2020.06.005","article-title":"Micro-attention for micro-expression recognition","volume":"410","author":"Wang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"8590","DOI":"10.1109\/TIP.2020.3018222","article-title":"Revealing the invisible with model and data shrinking for composite-database micro-expression recognition","volume":"29","author":"Xia","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"He, K., Chen, X., Xie, S., Li, Y., Doll\u00e1r, P., and Girshick, R. (2022, January 18\u201324). Masked autoencoders are scalable vision learners. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01553"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_47","first-page":"667","article-title":"Dynamic filter networks","volume":"29","author":"Jia","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_48","first-page":"1307","article-title":"Condconv: Conditionally parameterized convolutions for efficient inference","volume":"32","author":"Yang","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Sun, P., Zhang, R., Jiang, Y., Kong, T., Xu, C., Zhan, W., Tomizuka, M., Li, L., Yuan, Z., and Wang, C. (2021, January 20\u201325). Sparse r-cnn: End-to-end object detection with learnable proposals. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01422"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1111\/2041-210X.13335","article-title":"Thinking like a naturalist: Enhancing computer vision of citizen science images by harnessing contextual data","volume":"11","author":"Terry","year":"2020","journal-title":"Methods Ecol. 
Evol."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/9\/1246\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:39:18Z","timestamp":1760128758000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/9\/1246"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,22]]},"references-count":50,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2023,9]]}},"alternative-id":["e25091246"],"URL":"https:\/\/doi.org\/10.3390\/e25091246","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,22]]}}}