{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T16:04:55Z","timestamp":1772553895706,"version":"3.50.1"},"reference-count":36,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2019,5,2]],"date-time":"2019-05-02T00:00:00Z","timestamp":1556755200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61503143"],"award-info":[{"award-number":["61503143"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Emotion recognition plays an essential role in human\u2013computer interaction. Previous studies have investigated the use of facial expression and electroencephalogram (EEG) signals from single modal for emotion recognition separately, but few have paid attention to a fusion between them. In this paper, we adopted a multimodal emotion recognition framework by combining facial expression and EEG, based on a valence-arousal emotional model. For facial expression detection, we followed a transfer learning approach for multi-task convolutional neural network (CNN) architectures to detect the state of valence and arousal. For EEG detection, two learning targets (valence and arousal) were detected by different support vector machine (SVM) classifiers, separately. Finally, two decision-level fusion methods based on the enumerate weight rule or an adaptive boosting technique were used to combine facial expression and EEG. In the experiment, the subjects were instructed to watch clips designed to elicit an emotional response and then reported their emotional state. 
We used two emotion datasets\u2014the Database for Emotion Analysis using Physiological Signals (DEAP) and the MAHNOB human\u2013computer interface database (MAHNOB-HCI)\u2014to evaluate our method. In addition, we performed an online experiment to further validate the robustness of our method. We experimentally demonstrated that our method produces state-of-the-art results for binary valence\/arousal classification on the DEAP and MAHNOB-HCI datasets. Moreover, in the online experiment, we achieved 69.75% accuracy for the valence space and 70.00% accuracy for the arousal space after fusion, each surpassing the highest-performing single modality (69.28% for the valence space and 64.00% for the arousal space). The results suggest that combining facial expressions and EEG information for emotion recognition compensates for the weaknesses of each as a single information source. The novelty of this work is as follows. To begin with, we combined facial expressions and EEG to improve the performance of emotion recognition. Furthermore, we used transfer learning techniques to tackle the problem of limited data and to achieve higher accuracy for facial expressions. 
Finally, in addition to implementing the widely used fusion method based on enumerating different weights between the two models, we also explored a novel fusion method applying a boosting technique.<\/jats:p>","DOI":"10.3390\/fi11050105","type":"journal-article","created":{"date-parts":[[2019,5,7]],"date-time":"2019-05-07T03:15:46Z","timestamp":1557198946000},"page":"105","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":112,"title":["Combining Facial Expressions and Electroencephalography to Enhance Emotion Recognition"],"prefix":"10.3390","volume":"11","author":[{"given":"Yongrui","family":"Huang","sequence":"first","affiliation":[{"name":"School of Software, South China Normal University, Guangzhou 510641, China"}]},{"given":"Jianhao","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Software, South China Normal University, Guangzhou 510641, China"}]},{"given":"Siyu","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Software, South China Normal University, Guangzhou 510641, China"}]},{"given":"Jiahui","family":"Pan","sequence":"additional","affiliation":[{"name":"School of Software, South China Normal University, Guangzhou 510641, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,5,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/s10458-005-1081-1","article-title":"Evaluating a computational model of emotion","volume":"11","author":"Gratch","year":"2005","journal-title":"Auton. Agents Multi-Agent Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"695","DOI":"10.1177\/0539018405058216","article-title":"What are emotions? And how can they be measured?","volume":"44","author":"Scherer","year":"2005","journal-title":"Soc. Sci. Inf."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Gunes, H., Schuller, B., Pantic, M., and Cowie, R. (2011, January 21\u201325). 
Emotion representation, analysis and synthesis in continuous space: A survey. Proceedings of the IEEE International Conference on Automatic Face & Gesture Recognition and Workshops IEEE, Santa Barbara, CA, USA.","DOI":"10.1109\/FG.2011.5771357"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/0092-6566(77)90037-X","article-title":"Evidence for a three-factor theory of emotions","volume":"11","author":"Russell","year":"1977","journal-title":"J. Res. Personal."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Dhall, A., Goecke, R., Ghosh, S., Joshi, J., Hoey, J., and Gedeon, T. (2017, January 13\u201317). From individual to group-level emotion recognition: EmotiW 5.0. Proceedings of the ACM International Conference on Multimodal Interaction, Glasgow, UK.","DOI":"10.1145\/3136755.3143004"},{"key":"ref_6","first-page":"47","article-title":"Emotion recognition from speech with gaussian mixture models & via boosted gmm","volume":"3","author":"Patel","year":"2017","journal-title":"Int. J. Res. Sci. Eng."},{"key":"ref_7","unstructured":"Zheng, W.-L., Zhu, J.-Y., and Lu, B.-L. (2017). Identifying stable patterns over time for emotion recognition from EEG. IEEE Trans. Affect. Comput., 1."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"26697","DOI":"10.1007\/s11042-018-5885-9","article-title":"EEG-based classification of emotions using empirical mode decomposition and autoregressive model","volume":"77","author":"Zhang","year":"2018","journal-title":"Multimed. Tools Appl."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Xie, J., Xu, X., and Shu, L. (2018, January 20\u201322). WT Feature Based Emotion Recognition from Multi-channel Physiological Signals with Decision Fusion. 
Proceedings of the 2018 First Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia), Beijing, China.","DOI":"10.1109\/ACIIAsia.2018.8470381"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1016\/j.imavis.2012.10.002","article-title":"Fusion of facial expressions and EEG for implicit affective tagging","volume":"31","author":"Koelstra","year":"2013","journal-title":"Image Vis. Comput."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Soleymani, M., Asghariesfeden, S., Pantic, M., and Fu, Y. (2014, January 14\u201318). Continuous emotion detection using EEG signals and facial expressions. Proceedings of the IEEE International Conference on Multimedia and Expo, Chengdu, China.","DOI":"10.1109\/ICME.2014.6890301"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2107451","DOI":"10.1155\/2017\/2107451","article-title":"Fusion of Facial Expressions and EEG for Multimodal Emotion Recognition","volume":"2017","author":"Huang","year":"2017","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1109\/T-AFFC.2011.25","article-title":"A multimodal database for affect recognition and implicit tagging","volume":"3","author":"Soleymani","year":"2012","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/T-AFFC.2011.15","article-title":"Deap: A database for emotion analysis; using physiological signals","volume":"3","author":"Koelstra","year":"2012","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1016\/0005-7916(94)90063-9","article-title":"Measuring emotion: The self-assessment manikin and the semantic differential","volume":"25","author":"Bradley","year":"1994","journal-title":"J. Behav. Ther. Exp. 
Psychiatry"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1023\/B:VISI.0000013087.49260.fb","article-title":"Robust real-time face detection","volume":"57","author":"Viola","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Goodfellow, I.J., Erhan, D., Carrier, P.L., Courville, A., Mirza, M., Hamner, B., Cukierski, W., Tang, Y., Thaler, D., and Lee, D.H. (2013, January 3\u20137). Challenges in representation learning: A report on three machine learning contests. Proceedings of the International Conference on Neural Information Processing, Daegu, Korea.","DOI":"10.1007\/978-3-642-42051-1_16"},{"key":"ref_18","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the Neural Information Processing Systems Conference (NIPS 2012), Lake Tahoe, NV, USA."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1109\/TSMCA.2011.2147307","article-title":"A new fractional random wavelet transform for fingerprint security","volume":"42","author":"Bhatnagar","year":"2012","journal-title":"IEEE Trans. Syst. Man Cybern. Part A Syst. Hum."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1016\/j.neuroimage.2013.11.007","article-title":"Multimodal fusion framework: A multiresolution approach for emotion classification and recognition from physiological signals","volume":"102","author":"Verma","year":"2014","journal-title":"Neuroimage"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1109\/TNB.2005.853657","article-title":"Multiple SVM-RFE for gene selection in cancer classification with expression data","volume":"4","author":"Duan","year":"2005","journal-title":"IEEE Trans. NanoBiosci."},{"key":"ref_22","unstructured":"Freund, Y., and Schapire, R.E. (1996, January 3\u20136). 
Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on International Conference on Machine Learning, Bari, Italy."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Ponti, M.P. (2011, January 28\u201330). Combining classifiers: From the creation of ensembles to the decision fusion. Proceedings of the 2011 24th SIBGRAPI Conference on Graphics, Patterns and Images Tutorials (SIBGRAPI-T), Alagoas, Brazil.","DOI":"10.1109\/SIBGRAPI-T.2011.9"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Gao, Z., and Wang, S. (2015, January 23\u201326). Emotion recognition from EEG signals using hierarchical bayesian network with privileged information. Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, Shanghai, China.","DOI":"10.1145\/2671188.2749364"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Rozgi\u0107, V., Vitaladevuni, S.N., and Prasad, R. (2013, January 26\u201331). Robust EEG emotion classification using segment level decision fusion. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, BC, Canada.","DOI":"10.1109\/ICASSP.2013.6637858"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Chatfield, K., Simonyan, K., Vedaldi, A., and Zisserman, A. (2014). Return of the devil in the details: Delving deep into convolutional nets. arXiv.","DOI":"10.5244\/C.28.6"},{"key":"ref_28","unstructured":"Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., and Darrell, T. (2014, January 16\u201321). Decaf: A deep convolutional activation feature for generic visual recognition. 
Proceedings of the International Conference on Machine Learning, Beijing, China."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 24\u201327). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_30","unstructured":"Yosinski, J., Clune, J., Bengio, Y., and Lipson, H. (2014, January 8\u201313). How transferable are features in deep neural networks?. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhang, C., and Zhang, Z. (2014, January 24\u201326). Improving multiview face detection with multi-task deep convolutional neural networks. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Steamboat Springs, CO, USA.","DOI":"10.1109\/WACV.2014.6835990"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1109\/TPAMI.2017.2781233","article-title":"Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition","volume":"41","author":"Ranjan","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Khorrami, P., Paine, T., and Huang, T. (2015, January 7\u201313). Do deep neural networks learn facial action units when doing expression recognition?. Proceedings of the IEEE International Conference on Computer Vision Workshops (CVPR), Santiago, Chile.","DOI":"10.1109\/ICCVW.2015.12"},{"key":"ref_34","unstructured":"Yosinski, J., Clune, J., Fuchs, T., and Lipson, H. (2015, January 6\u201311). Understanding neural networks through deep visualization. 
Proceedings of the International Conference on Machine Learning (ICML) Workshop on Deep Learning, Lille, France."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"198","DOI":"10.3389\/fnhum.2018.00198","article-title":"Emotion-Related Consciousness Detection in Patients with Disorders of Consciousness through an EEG-Based BCI System","volume":"12","author":"Pan","year":"2018","journal-title":"Front. Hum. Neurosci."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1103\/PhysRevB.85.235149","article-title":"Density functionals for surface science: Exchange-correlation model development with Bayesian error estimation","volume":"85","author":"Wellendorff","year":"2012","journal-title":"Phys. Rev. B"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/11\/5\/105\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:48:45Z","timestamp":1760186925000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/11\/5\/105"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,2]]},"references-count":36,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2019,5]]}},"alternative-id":["fi11050105"],"URL":"https:\/\/doi.org\/10.3390\/fi11050105","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,2]]}}}