{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T16:03:15Z","timestamp":1772553795717,"version":"3.50.1"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2019,5,31]],"date-time":"2019-05-31T00:00:00Z","timestamp":1559260800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100008982","name":"Qatar National Research Fund","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100008982","id-type":"DOI","asserted-by":"crossref"}]},{"name":"NPRP","award":["10-0205-170346"],"award-info":[{"award-number":["10-0205-170346"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2019,5,31]]},"abstract":"<jats:p>This article presents an image-based real-time facial expression recognition system that is able to recognize the facial expressions of several subjects on a webcam at the same time. Our proposed methodology combines a supervised transfer learning strategy and a joint supervision method with center loss, which is crucial for facial tasks. A newly proposed Convolutional Neural Network (CNN) model, MobileNet, which has both accuracy and speed, is deployed in both offline and in a real-time framework that enables fast and accurate real-time output. Evaluations towards two publicly available datasets, JAFFE and CK+, are carried out respectively. The JAFFE dataset reaches an accuracy of 95.24%, while an accuracy of 96.92% is achieved on the 6-class CK+ dataset, which contains only the last frames of image sequences. At last, the average run-time cost for the recognition of the real-time implementation is around 3.57ms\/frame on a NVIDIA Quadro K4200 GPU.<\/jats:p>","DOI":"10.1145\/3311747","type":"journal-article","created":{"date-parts":[[2019,6,6]],"date-time":"2019-06-06T12:28:42Z","timestamp":1559824122000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":41,"title":["A Deep Learning System for Recognizing Facial Expression in Real-Time"],"prefix":"10.1145","volume":"15","author":[{"given":"Yu","family":"Miao","sequence":"first","affiliation":[{"name":"University of Ottawa, Ottawa, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1437-7805","authenticated-orcid":false,"given":"Haiwei","family":"Dong","sequence":"additional","affiliation":[{"name":"University of Ottawa, Ottawa, Canada"}]},{"given":"Jihad Mohamad Al","family":"Jaam","sequence":"additional","affiliation":[{"name":"Qatar University, Doha, Qatar"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7690-8547","authenticated-orcid":false,"given":"Abdulmotaleb El","family":"Saddik","sequence":"additional","affiliation":[{"name":"University of Ottawa, Ottawa, Canada"}]}],"member":"320","published-online":{"date-parts":[[2019,6,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Ognjen Rudovic Jaeryoung Lee Miles Dai Bjorn Schuller and Rosalind Picard. 2018. Personalized machine learning for robot perception of affect and engagement in autism therapy. Retrieved from arXiv preprint arXiv:1802.01186.  Ognjen Rudovic Jaeryoung Lee Miles Dai Bjorn Schuller and Rosalind Picard. 2018. Personalized machine learning for robot perception of affect and engagement in autism therapy. Retrieved from arXiv preprint arXiv:1802.01186.","DOI":"10.1126\/scirobotics.aao6760"},{"key":"e_1_2_1_2_1","volume-title":"EVM-CNN: Real-time contactless heart rate estimation from facial video","author":"Qiu Ying","year":"2019","unstructured":"Ying Qiu , Yang Liu , Juan Arteaga-Falconi , Haiwei Dong , and Abdulmotaleb El Saddik . 2019. EVM-CNN: Real-time contactless heart rate estimation from facial video . IEEE Trans. Multimedia ( 2019 ). Ying Qiu, Yang Liu, Juan Arteaga-Falconi, Haiwei Dong, and Abdulmotaleb El Saddik. 2019. EVM-CNN: Real-time contactless heart rate estimation from facial video. IEEE Trans. Multimedia (2019)."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMUL.2018.023121167"},{"key":"e_1_2_1_4_1","volume-title":"Communication without words","author":"Mehrabian Albert","unstructured":"Albert Mehrabian . 2008. Communication without words . Communication Theory, C. David Mortensen (Ed.). Transaction Publishers , New Brunswick , 193--200. Albert Mehrabian. 2008. Communication without words. Communication Theory, C. David Mortensen (Ed.). Transaction Publishers, New Brunswick, 193--200."},{"key":"e_1_2_1_5_1","volume-title":"Friesen","author":"Ekman Paul","year":"2003","unstructured":"Paul Ekman and Wallace V . Friesen . 2003 . Unmasking the Face : A Guide to Recognizing Emotions from Facial Clues. ISHK. Paul Ekman and Wallace V. Friesen. 2003. Unmasking the Face: A Guide to Recognizing Emotions from Facial Clues. ISHK."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2011.13"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/520809.796139"},{"key":"e_1_2_1_8_1","first-page":"86","article-title":"A new facial expression recognition method based on local Gabor filter bank and PCA plus LDA","volume":"11","author":"Deng Hong-Bo","year":"2005","unstructured":"Hong-Bo Deng , Lian-Wen Jin , Li-Xin Zhen , Jian-Cheng Huang . 2005 . A new facial expression recognition method based on local Gabor filter bank and PCA plus LDA . Int. J. Inform. Technol. 11 , 11 (2005), 86 -- 96 . Hong-Bo Deng, Lian-Wen Jin, Li-Xin Zhen, Jian-Cheng Huang.2005. A new facial expression recognition method based on local Gabor filter bank and PCA plus LDA. Int. J. Inform. Technol. 11, 11 (2005), 86--96.","journal-title":"Int. J. Inform. Technol."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3176646"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2006.312418"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CAST.2016.7914982"},{"key":"e_1_2_1_12_1","unstructured":"Rahul Islam Karan Ahuja Sandip Karmakar and Ferdous Barbhuiya. 2016. SenTion: A framework for sensing facial expressions. Retrieved from arXiv preprint arXiv:1608.04489.  Rahul Islam Karan Ahuja Sandip Karmakar and Ferdous Barbhuiya. 2016. SenTion: A framework for sensing facial expressions. Retrieved from arXiv preprint arXiv:1608.04489."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3152118"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2016.07.233"},{"key":"e_1_2_1_15_1","volume-title":"Mobilenets: Efficient convolutional neural networks for mobile vision applications. Retrieved from arXiv preprint arXiv:1704.04861.","author":"Howard Andrew G.","year":"2017","unstructured":"Andrew G. Howard , Menglong Zhu , Bo Chen , Dmitry Kalenichenko , Weijun Wang , Tobias Weyand , Marco Andreetto , and Hartwig Adam . 2017 . Mobilenets: Efficient convolutional neural networks for mobile vision applications. Retrieved from arXiv preprint arXiv:1704.04861. Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. Mobilenets: Efficient convolutional neural networks for mobile vision applications. Retrieved from arXiv preprint arXiv:1704.04861."},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 25th International Conference on Neural Information Processing Systems","volume":"1","author":"Krizhevsky Alex","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E. Hinton . 2012. ImageNet classification with deep convolutional neural networks . In Proceedings of the 25th International Conference on Neural Information Processing Systems , Vol. 1 . 1097--1105. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, Vol. 1. 1097--1105."},{"key":"e_1_2_1_17_1","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. Retrieved from arXiv preprint arXiv:1409.1556.  Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. Retrieved from arXiv preprint arXiv:1409.1556."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2010.5543262"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/520809.796143"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2005.1521424"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1080\/02699930903485076"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2876035"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2016.2535302"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12193-015-0209-0"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818346.2830593"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46478-7_31"},{"key":"e_1_2_1_28_1","volume-title":"The Expression of the Emotions in Man and Animals","author":"Darwin Charles","unstructured":"Charles Darwin and Phillip Prodger . 1998. The Expression of the Emotions in Man and Animals . Oxford University Press , USA. Charles Darwin and Phillip Prodger. 1998. The Expression of the Emotions in Man and Animals. Oxford University Press, USA."},{"key":"e_1_2_1_29_1","volume-title":"Rosenberg","author":"Ekman Paul","year":"1997","unstructured":"Paul Ekman and Erika L . Rosenberg . 1997 . What the Face Reveals : Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). Oxford University Press , USA. Paul Ekman and Erika L. Rosenberg. 1997. What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). Oxford University Press, USA."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCC.2011.2118750"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3131345"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818346.2830595"},{"key":"e_1_2_1_33_1","volume-title":"Andreas Dengel, and Marcus Liwicki.","author":"Burkert Peter","year":"2015","unstructured":"Peter Burkert , Felix Trier , Muhammad Zeshan Afzal , Andreas Dengel, and Marcus Liwicki. 2015 . DeXpression: Deep convolutional neural network for expression recognition. Retrieved from arXiv preprint arXiv:1509.05371. Peter Burkert, Felix Trier, Muhammad Zeshan Afzal, Andreas Dengel, and Marcus Liwicki. 2015. DeXpression: Deep convolutional neural network for expression recognition. Retrieved from arXiv preprint arXiv:1509.05371."},{"key":"e_1_2_1_34_1","unstructured":"Yichuan Tang. 2013. Deep learning using linear support vector machines. Retrieved from arXiv preprint arXiv:1306.0239.  Yichuan Tang. 2013. Deep learning using linear support vector machines. Retrieved from arXiv preprint arXiv:1306.0239."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-42051-1_16"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the IEEE Winter Conference on Applications of Computer Vision. 1--10","author":"Mollahosseini Ali","unstructured":"Ali Mollahosseini , David Chan , and Mohammad H. Mahoor . 2016. Going deeper in facial expression recognition using deep neural networks . In Proceedings of the IEEE Winter Conference on Applications of Computer Vision. 1--10 . Ali Mollahosseini, David Chan, and Mohammad H. Mahoor. 2016. Going deeper in facial expression recognition using deep neural networks. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision. 1--10."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000013087.49260.fb"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. 8609--8613","author":"Dahl George E.","unstructured":"George E. Dahl , Tara N. Sainath , and Geoffrey E. Hinton . 2013. Improving deep neural networks for LVCSR using rectified linear units and dropout . In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. 8609--8613 . George E. Dahl, Tara N. Sainath, and Geoffrey E. Hinton. 2013. Improving deep neural networks for LVCSR using rectified linear units and dropout. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. 8609--8613."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2014.2386334"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCECE.2013.6567728"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.233"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218001408006284"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2010.2064176"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2012.33"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/SIBGRAPI.2015.14"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/SKIMA.2014.7083542"},{"key":"e_1_2_1_50_1","volume-title":"Wooi Ping Cheah, and Tee Connie","author":"Al-Shabi Mundher","year":"2016","unstructured":"Mundher Al-Shabi , Wooi Ping Cheah, and Tee Connie . 2016 . Facial expression recognition using a hybrid CNN-SIFT aggregator. Retrieved from arXiv preprint arXiv:1608.02833. Mundher Al-Shabi, Wooi Ping Cheah, and Tee Connie. 2016. Facial expression recognition using a hybrid CNN-SIFT aggregator. Retrieved from arXiv preprint arXiv:1608.02833."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2015.12"},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation. 265--283","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , Manjunath Kudlur , Josh Levenberg , Rajat Monga , Sherry Moore , Derek G. Murray , Benoit Steiner , Paul Tucker , Vijay Vasudevan , Pete Warden , Martin Wicke , Yuan Yu , and Xiaoqiang Zheng . 2016 . TensorFlow: A system for large-scale machine learning . In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation. 265--283 . Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation. 265--283."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3311747","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3311747","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:25:31Z","timestamp":1750206331000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3311747"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,31]]},"references-count":52,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2019,5,31]]}},"alternative-id":["10.1145\/3311747"],"URL":"https:\/\/doi.org\/10.1145\/3311747","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,31]]},"assertion":[{"value":"2018-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-06-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}