{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T22:16:50Z","timestamp":1766269010011,"version":"3.41.2"},"reference-count":70,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61932007 and 62090025"],"award-info":[{"award-number":["61932007 and 62090025"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-2008151"],"award-info":[{"award-number":["CNS-2008151"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2022,3,29]]},"abstract":"<jats:p>Emotion recognition in smart eyewear devices is valuable but challenging. One key limitation of previous work is that expression-related information, such as facial or eye images, is treated as the only evidence of emotion. However, emotional status is not isolated; it is tightly associated with people's visual perceptions, especially those with emotional implications. Yet little work has examined such associations to better illuminate the causes of emotions. In this paper, we study the emotionship analysis problem in eyewear systems, an ambitious task that requires classifying the user's emotions and semantically understanding their potential causes. To this end, we describe EMOShip, a deep-learning-based eyewear system that can automatically detect the wearer's emotional status and simultaneously analyze its associations with semantic-level visual perception. Experimental studies with 20 participants demonstrate that, thanks to its awareness of emotionship, EMOShip achieves superior emotion recognition accuracy compared to existing methods (80.2% vs. 69.4%) and provides a valuable understanding of the causes of emotions. Pilot studies with 20 additional participants further motivate the potential use of EMOShip to empower emotion-aware applications, such as emotionship self-reflection and emotionship life-logging.<\/jats:p>","DOI":"10.1145\/3517250","type":"journal-article","created":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T13:42:46Z","timestamp":1648561366000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Do Smart Glasses Dream of Sentimental Visions?"],"prefix":"10.1145","volume":"6","author":[{"given":"Yingying","family":"Zhao","sequence":"first","affiliation":[{"name":"School of Computer Science, Fudan University, Shanghai, China, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China"}]},{"given":"Yuhu","family":"Chang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University, Shanghai, China, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China"}]},{"given":"Yutian","family":"Lu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University, Shanghai, China, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China"}]},{"given":"Yujiang","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computing, Imperial College London, London, United Kingdom"}]},{"given":"Mingzhi","family":"Dong","sequence":"additional","affiliation":[{"name":"School of 
Computer Science, Fudan University, Shanghai, China, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China"}]},{"given":"Qin","family":"Lv","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Colorado Boulder, Boulder, Colorado, United States"}]},{"given":"Robert P.","family":"Dick","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, United States"}]},{"given":"Fan","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Microelectronics, Fudan University, Shanghai, China"}]},{"given":"Tun","family":"Lu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University, Shanghai, China, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China"}]},{"given":"Ning","family":"Gu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University, Shanghai, China, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China"}]},{"given":"Li","family":"Shang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University, Shanghai, China, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China"}]}],"member":"320","published-online":{"date-parts":[[2022,3,29]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"11th International Workshop on Image Analysis for Multimedia Interactive Services WIAMIS 10","author":"Aifanti Niki","year":"2010","unstructured":"Niki Aifanti, Christos Papachristou, and Anastasios Delopoulos. 2010. The MUG facial expression database. In 11th International Workshop on Image Analysis for Multimedia Interactive Services WIAMIS 10. 
IEEE, 1--4."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00583"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/SMC.2015.460"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jphysparis.2008.03.012"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of ICLR. http:\/\/arxiv.org\/abs\/1409","author":"Bahdanau Dzmitry","year":"2015","unstructured":"Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In Proceedings of ICLR. http:\/\/arxiv.org\/abs\/1409.0473"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMUL.2006.87"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2017.01.011"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3463509"},{"key":"e_1_2_1_9_1","volume-title":"Faisal Ahmed, Zhe Gan, Yu Cheng, and Jingjing Liu.","author":"Chen Yen-Chun","year":"2019","unstructured":"Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, and Jingjing Liu. 2019. Uniter: Learning universal image-text representations. (2019)."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/0028-3932(93)90056-6"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1080\/10926488.2016.1150762"},{"key":"e_1_2_1_12_1","volume-title":"The interface between emotion and attention: A review of evidence from psychology and neuroscience. Behavioral and cognitive neuroscience reviews 2, 2","author":"Compton Rebecca J","year":"2003","unstructured":"Rebecca J Compton. 2003. The interface between emotion and attention: A review of evidence from psychology and neuroscience. 
Behavioral and cognitive neuroscience reviews 2, 2 (2003), 115--129."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2971648.2971752"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/79.911197"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1038\/nrn1432"},{"key":"e_1_2_1_16_1","volume-title":"Learning from experience through reflection. Organizational dynamics 24, 3","author":"Daudelin Marilyn Wood","year":"1996","unstructured":"Marilyn Wood Daudelin. 1996. Learning from experience through reflection. Organizational dynamics 24, 3 (1996), 36--48."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3267305.3267316"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3264913"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818346.2830596"},{"key":"e_1_2_1_20_1","volume-title":"An argument for basic emotions. Cognition & emotion 6, 3-4","author":"Ekman Paul","year":"1992","unstructured":"Paul Ekman. 1992. An argument for basic emotions. Cognition & emotion 6, 3-4 (1992), 169--200."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2459236.2459273"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123818.3123852"},{"key":"e_1_2_1_23_1","volume-title":"Towards Improving Emotion Self-Report Collection Using Self-Reflection. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems. 1--8.","author":"Ghosh Surjya","year":"2020","unstructured":"Surjya Ghosh, Bivas Mitra, and Pradipta De. 2020. Towards Improving Emotion Self-Report Collection Using Self-Reflection. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems. 
1--8."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-42051-1_16"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.670"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2019.00178"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00745"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2638728.2641695"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818346.2830590"},{"key":"e_1_2_1_31_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818346.2830587"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6795"},{"key":"e_1_2_1_34_1","volume-title":"Deep facial expression recognition: A survey","author":"Li Shan","year":"2020","unstructured":"Shan Li and Weihong Deng. 2020. Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing (2020)."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.277"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58577-8_8"},{"key":"e_1_2_1_37_1","volume-title":"Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. arXiv preprint arXiv:1908.02265","author":"Lu Jiasen","year":"2019","unstructured":"Jiasen Lu, Dhruv Batra, Devi Parikh, and Stefan Lee. 2019. Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. 
arXiv preprint arXiv:1908.02265 (2019)."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/d15-1166"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV48630.2021.00290"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1873965"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2800835.2800898"},{"key":"e_1_2_1_42_1","unstructured":"Vincent Le Moign. 2017. Streamline Emoji Free Icons from the Streamline Icons Pack. https:\/\/streamlineicons.com."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-3445.130.3.466"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1523\/JNEUROSCI.0747-13.2014"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2003.817122"},{"key":"e_1_2_1_46_1","volume-title":"Measuring emotions in students' learning and performance: The Achievement Emotions Questionnaire (AEQ). Contemporary educational psychology 36, 1","author":"Pekrun Reinhard","year":"2011","unstructured":"Reinhard Pekrun, Thomas Goetz, Anne C Frenzel, Petra Barchfeld, and Raymond P Perry. 2011. Measuring emotions in students' learning and performance: The Achievement Emotions Questionnaire (AEQ). Contemporary educational psychology 36, 1 (2011), 36--48."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.12.053"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3432223"},{"key":"e_1_2_1_50_1","volume-title":"Assessing the effectiveness of a large database of emotion-eliciting films: A new tool for emotion researchers. Cognition and emotion 24, 7","author":"Schaefer Alexandre","year":"2010","unstructured":"Alexandre Schaefer, Fr\u00e9d\u00e9ric Nils, Xavier Sanchez, and Pierre Philippot. 2010. 
Assessing the effectiveness of a large database of emotion-eliciting films: A new tool for emotion researchers. Cognition and emotion 24, 7 (2010), 1153--1172."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/632716.632878"},{"key":"e_1_2_1_52_1","volume-title":"Lifelogging: Digital self-tracking and Lifelogging-between disruptive technology and cultural transformation","author":"Selke Stefan","year":"2016","unstructured":"Stefan Selke. 2016. Lifelogging: Digital self-tracking and Lifelogging-between disruptive technology and cultural transformation. Springer."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2019.2939744"},{"key":"e_1_2_1_54_1","volume-title":"Lxmert: Learning cross-modality encoder representations from transformers. arXiv preprint arXiv:1908.07490","author":"Tan Hao","year":"2019","unstructured":"Hao Tan and Mohit Bansal. 2019. Lxmert: Learning cross-modality encoder representations from transformers. arXiv preprint arXiv:1908.07490 (2019)."},{"key":"e_1_2_1_55_1","volume-title":"Eye-Tracking Analysis for Emotion Recognition. Computational Intelligence and Neuroscience 2020","author":"Tarnowski Pawe\u0142","year":"2020","unstructured":"Pawe\u0142 Tarnowski, Marcin Ko\u0142odziej, Andrzej Majkowski, and Remigiusz Jan Rak. 2020. Eye-Tracking Analysis for Emotion Recognition. Computational Intelligence and Neuroscience 2020 (2020)."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3432207"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1037\/0022-3514.86.2.320"},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of NIPS. 5998--6008","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Proceedings of NIPS. 5998--6008. 
http:\/\/papers.nips.cc\/paper\/7181-attention-is-all-you-need"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00813"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00699"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3386901.3388917"},{"key":"e_1_2_1_62_1","volume-title":"International conference on machine learning. PMLR","author":"Xu Kelvin","year":"2015","unstructured":"Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, and Yoshua Bengio. 2015. Show, attend and tell: Neural image caption generation with visual attention. In International conference on machine learning. PMLR, 2048--2057."},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00791"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11275"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2013.39"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00553"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/2973750.2973762"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2021.3076612"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.7005"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/503"}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous 
Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3517250","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3517250","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3517250","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,14]],"date-time":"2025-07-14T04:27:39Z","timestamp":1752467259000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3517250"}},"subtitle":["Deep Emotionship Analysis for Eyewear Devices"],"short-title":[],"issued":{"date-parts":[[2022,3,29]]},"references-count":70,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,3,29]]}},"alternative-id":["10.1145\/3517250"],"URL":"https:\/\/doi.org\/10.1145\/3517250","relation":{},"ISSN":["2474-9567"],"issn-type":[{"type":"electronic","value":"2474-9567"}],"subject":[],"published":{"date-parts":[[2022,3,29]]},"assertion":[{"value":"2022-03-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}