{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T20:50:12Z","timestamp":1761598212109,"version":"3.37.3"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2020,3,1]],"date-time":"2020-03-01T00:00:00Z","timestamp":1583020800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/xx","name":"Humanities and Social Sciences Projects of the Ministry of Education","doi-asserted-by":"publisher","award":["18YJC760112"],"award-info":[{"award-number":["18YJC760112"]}],"id":[{"id":"10.13039\/xx","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/xx","name":"Social Science Fund of Jiangsu Province","doi-asserted-by":"publisher","award":["18YSD002"],"award-info":[{"award-number":["18YSD002"]}],"id":[{"id":"10.13039\/xx","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/xx","name":"Fundamental Research Funds for the Central University","doi-asserted-by":"publisher","award":["JUSRP11854","2019JDZD02"],"award-info":[{"award-number":["JUSRP11854","2019JDZD02"]}],"id":[{"id":"10.13039\/xx","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,8,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Human\u2013computer interaction (HCI) has received growing interest in both academic research and the design of information technological applications. Automated facial expression estimation of image is a difficult, yet crucial, problem in the design of HCI system. Although artificial neural network has achieved many remarkable results, few smart wearable devices can benefit from it. Most of these devices are constrained by limited computing and storage capacity. An effective solution is to allow servers to handle multiple tasks simultaneously. Toward this goal, we have been building an Efficient multitask scheme for facial expression estimation (EM-FEE). A multitask neural network is designed to enable the HCI system to accomplish different related tasks at the same time, that is, locating the user\u2019s facial landmarks and estimating facial expressions. Experimental results demonstrate that our proposed scheme outperforms state-of-the-art. Finally, we review the remaining challenges and corresponding opportunities as well as future directions of the design of facial expression estimation systems for smart wearable devices.<\/jats:p>","DOI":"10.1093\/iwcomp\/iwaa011","type":"journal-article","created":{"date-parts":[[2020,6,3]],"date-time":"2020-06-03T13:18:45Z","timestamp":1591190325000},"page":"142-152","source":"Crossref","is-referenced-by-count":4,"title":["EM-FEE: An Efficient Multitask Scheme for Facial Expression Estimation"],"prefix":"10.1093","volume":"32","author":[{"given":"Bin","family":"Yang","sequence":"first","affiliation":[{"name":"School of Design, Jiangnan University, NO.1800, Lihu Road, Wuxi City, Jiangsu Province, 214122, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhenyu","family":"Li","sequence":"additional","affiliation":[{"name":"School of Design, Jiangnan University, NO.1800, Lihu Road, Wuxi City, Jiangsu Province, 214122, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yingtao","family":"Sun","sequence":"additional","affiliation":[{"name":"School of Design, Jiangnan University, NO.1800, Lihu Road, Wuxi City, Jiangsu Province, 214122, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Enguo","family":"Cao","sequence":"additional","affiliation":[{"name":"School of Design, Jiangnan University, NO.1800, Lihu Road, Wuxi City, Jiangsu Province, 214122, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2020,6,3]]},"reference":[{"key":"2020081310163910100_ref1","first-page":"1859","article-title":"Incremental face alignment in the wild","volume-title":"Proc. computer vision and pattern recognition","author":"Asthana","year":"2014"},{"key":"2020081310163910100_ref2","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1017\/S0048577299971664","article-title":"Measuring facial expressions by computer image analysis","volume":"36","author":"Bartlett","year":"2010","journal-title":"Psychophysiology"},{"key":"2020081310163910100_ref3","doi-asserted-by":"crossref","first-page":"14","DOI":"10.3390\/bdcc3010014","article-title":"A review of facial landmark extraction in 2D images and videos using deep learning","volume":"3","author":"Bodini","year":"2019","journal-title":"Big Data Cogn. Comput."},{"key":"2020081310163910100_ref4","first-page":"109","article-title":"Joint cascade face detection and alignment","volume-title":"Proc. European conf. computer vision","author":"Chen","year":"2014"},{"key":"2020081310163910100_ref5","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1007\/s11263-017-0999-5","article-title":"A comprehensive performance evaluation of deformable face tracking \u201cin-the-wild\u201d","volume":"126","author":"Chrysos","year":"2016","journal-title":"Int. J. Comput. Vis."},{"key":"2020081310163910100_ref6","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1109\/ICMI.2002.1167051","article-title":"Head-pose invariant facial expression recognition using convolutional neural networks","volume-title":"Proc. IEEE int. conf. multimodal interfaces","author":"Fasel","year":"2002"},{"key":"2020081310163910100_ref7","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1016\/j.neucom.2017.04.014","article-title":"Multi-task, multi-domain learning: Application to semantic segmentation and pose regression","volume":"251","author":"Fourure","year":"2017","journal-title":"Neurocomputing"},{"key":"2020081310163910100_ref8","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.neunet.2014.09.005","article-title":"Challenges in representation learning: A report on three machine learning contests","volume":"64","author":"Goodfellow","year":"2015","journal-title":"Neural Netw."},{"key":"2020081310163910100_ref9","first-page":"2144","article-title":"Annotated facial landmarks in the wild: a large-scale, real-world database for facial landmark localization","volume-title":"Proc. IEEE int. conf. computer vision workshops","author":"K\u00f6stinger","year":"2012"},{"key":"2020081310163910100_ref10","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2014.241","article-title":"One millisecond face alignment with an ensemble of regression trees","volume-title":"Proc. IEEE conf. computer vision & pattern recognition","author":"Kazemi","year":"2014"},{"key":"2020081310163910100_ref11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s12193-015-0209-0","article-title":"Hierarchical committee of deep convolutional neural networks for robust facial expression recognition","volume":"10","author":"Kim","year":"2016","journal-title":"J. Multimodal User Interf."},{"key":"2020081310163910100_ref12","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1007\/978-3-319-09147-1_47","article-title":"The relationship between human and smart TVs based on emotion recognition in HCI","volume-title":"Computational science and its applications\u2014ICCSA 2014","author":"Lee","year":"2014"},{"key":"2020081310163910100_ref13","doi-asserted-by":"crossref","first-page":"27703","DOI":"10.1007\/s11042-019-07892-8","article-title":"Deep learning face representation by fixed erasing in facial landmarks","author":"Lei","year":"2019","journal-title":"Multimed. Tools Appl."},{"key":"2020081310163910100_ref14","first-page":"793","article-title":"Probabilistic elastic part model for unsupervised face detector adaptation","volume-title":"Proc. IEEE int. conf. computer vision","author":"Li","year":"2014"},{"key":"2020081310163910100_ref15","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2020.2981446","article-title":"Deep facial expression recognition: A survey","volume-title":"IEEE Transactions on Affective Computing","author":"Li","year":"2018"},{"key":"2020081310163910100_ref16","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-73003-5","article-title":"Human\u2013computer interaction (HCI) and user interfaces","volume-title":"Encyclopedia of Biometrics","author":"Li","year":"2009"},{"key":"2020081310163910100_ref17","first-page":"94","article-title":"The extended Cohn\u2013Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression","volume-title":"Proc. computer vision and pattern recognition workshops","author":"Lucey","year":"2010"},{"key":"2020081310163910100_ref18","doi-asserted-by":"crossref","first-page":"555","DOI":"10.1016\/S0893-6080(03)00115-1","article-title":"Subject independent facial expression recognition with robust face detection using a convolutional neural network","volume":"16","author":"Matsugu","year":"2003","journal-title":"Neural Netw."},{"key":"2020081310163910100_ref19","first-page":"558","article-title":"Identity-aware convolutional neural network for facial expression recognition","volume-title":"Proc. IEEE int. conf. automatic face and gesture recognition","author":"Meng","year":"2017"},{"key":"2020081310163910100_ref20","first-page":"807","article-title":"Rectified linear units improve restricted Boltzmann machines","volume-title":"Proc. int. conf. machine learning","author":"Nair","year":"2010"},{"key":"2020081310163910100_ref21","first-page":"585","article-title":"DeepPrior++: improving fast and accurate 3D hand pose estimation","volume-title":"Proc. IEEE int. conf. computer vision workshops","author":"Oberweger","year":"2018"},{"key":"2020081310163910100_ref22","first-page":"1","article-title":"Web-based database for facial expression analysis","volume-title":"Proc. IEEE int. conf. multimedia and expo","author":"Pantic","year":"2005"},{"key":"2020081310163910100_ref23","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/j.intcom.2003.12.001","article-title":"The effects of affective interventions in human-computer interaction","volume":"16","author":"Partala","year":"2004","journal-title":"Interact. Comput."},{"key":"2020081310163910100_ref24","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1016\/j.intcom.2005.05.002","article-title":"Real-time estimation of emotional experiences from facial expressions","volume":"18","author":"Partala","year":"2006","journal-title":"Interact. Comput."},{"key":"2020081310163910100_ref25","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1109\/34.598226","article-title":"Visual interpretation of hand gestures for human-computer interaction: A review","volume":"19","author":"Pavlovic","year":"1997","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"2020081310163910100_ref26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2818740","article-title":"Adaptive body gesture representation for automatic emotion recognition","volume":"6","author":"Piana","year":"2016","journal-title":"ACM Trans. Interact. Intell. Syst."},{"article-title":"Multi-task, multi-label and multi-domain learning with residual convolutional networks for emotion recognition","year":"2018","author":"Pons","key":"2020081310163910100_ref27"},{"key":"2020081310163910100_ref28","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/s12021-008-9015-0","article-title":"Describing different brain computer interface systems through a unique model: A UML implementation","volume":"6","author":"Quitadamo","year":"2008","journal-title":"Neuroinformatics"},{"key":"2020081310163910100_ref29","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1109\/TPAMI.2017.2781233","article-title":"HyperFace: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition","volume":"41","author":"Ranjan","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"2020081310163910100_ref30","first-page":"1138","article-title":"Minimum precision requirements for the SVM-SGD learning algorithm","volume-title":"Proc. IEEE int. conf. acoustics","author":"Sakr","year":"2017"},{"key":"2020081310163910100_ref31","doi-asserted-by":"crossref","first-page":"803","DOI":"10.1016\/j.imavis.2008.08.005","article-title":"Facial expression recognition based on local binary patterns: A comprehensive study","volume":"27","author":"Shan","year":"2009","journal-title":"Image Vision Comput."},{"key":"2020081310163910100_ref32","first-page":"2217","article-title":"Understanding and improving convolutional neural networks via concatenated rectified linear units","author":"Shang","year":"2016"},{"key":"2020081310163910100_ref33","doi-asserted-by":"crossref","first-page":"061407","DOI":"10.1117\/1.JEI.25.6.061407","article-title":"Facial expression recognition in the wild based on multimodal texture features","volume":"25","author":"Sun","year":"2016","journal-title":"J. Electron. Imaging"},{"key":"2020081310163910100_ref34","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1007\/s12559-019-09654-y","article-title":"Facial expression recognition based on a hybrid model combining deep and shallow features","volume":"11","author":"Sun","year":"2019","journal-title":"Cogn. Comput."},{"key":"2020081310163910100_ref35","first-page":"1","article-title":"Going deeper with convolutions","volume-title":"Proc. IEEE conf. computer vision and pattern recognition (CVPR)","author":"Szegedy","year":"2015"},{"key":"2020081310163910100_ref36","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4615-5529-2","volume-title":"Learning to Learn","author":"Thrun","year":"1998"},{"key":"2020081310163910100_ref37","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1109\/34.908962","article-title":"Recognizing action units for facial expression analysis","volume":"23","author":"Tian","year":"2001","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"2020081310163910100_ref38","article-title":"Affect and Emotion in HCI","volume-title":"Automatic Recognition of Emotions from Speech: A Review of the Literature and Recommendations for Practical Realisation","author":"Vogt","year":"2008"},{"key":"2020081310163910100_ref39","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.neucom.2017.05.013","article-title":"Facial feature point detection: A comprehensive survey","volume":"275","author":"Wang","year":"2018","journal-title":"Neurocomputing"},{"key":"2020081310163910100_ref40","first-page":"499","article-title":"A discriminative feature learning approach for deep face recognition","volume-title":"Proc. European conf. computer vision","author":"Wen","year":"2016"},{"key":"2020081310163910100_ref41","first-page":"3631","article-title":"Learning social relation traits from face images","volume-title":"Proc. IEEE int. conf. computer vision","author":"Zhang","year":"2015"}],"container-title":["Interacting with Computers"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/iwc\/article-pdf\/32\/2\/142\/33646070\/iwaa011.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/iwc\/article-pdf\/32\/2\/142\/33646070\/iwaa011.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,8,14]],"date-time":"2020-08-14T04:22:54Z","timestamp":1597378974000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/iwc\/article\/32\/2\/142\/5849512"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3]]},"references-count":41,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2020,6,3]]},"published-print":{"date-parts":[[2020,8,13]]}},"URL":"https:\/\/doi.org\/10.1093\/iwcomp\/iwaa011","relation":{},"ISSN":["0953-5438","1873-7951"],"issn-type":[{"type":"print","value":"0953-5438"},{"type":"electronic","value":"1873-7951"}],"subject":[],"published-other":{"date-parts":[[2020,3]]},"published":{"date-parts":[[2020,3]]}}}