{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:30:35Z","timestamp":1766068235032,"version":"build-2065373602"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"7","license":[{"start":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T00:00:00Z","timestamp":1760572800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["2239569"],"award-info":[{"award-number":["2239569"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Hum.-Comput. Interact."],"published-print":{"date-parts":[[2025,10,18]]},"abstract":"<jats:p>Emotional voice communication plays a crucial role in effective daily interactions. Deaf and Hard of Hearing (DHH) individuals, who often have limited use of voice, rely on facial expressions to supplement sign language and convey emotions. However, in American Sign Language (ASL), facial expressions serve not only emotional purposes but also function as linguistic markers that can alter the meaning of signs. This dual role can often confuse non-signers when interpreting a signer's emotional state. In this paper, we present studies that: (1) confirm the challenges non-signers face when interpreting emotions from facial expressions in ASL communication, and (2) demonstrate how integrating emotional voice into translation systems can enhance hearing individuals' understanding of a signer's emotional intent. An online survey with 45 hearing participants (non-ASL signers) revealed frequent misinterpretations of signers' emotions when emotional and linguistic facial expressions were used simultaneously. 
The findings show that incorporating emotional voice into translation systems significantly improves emotion recognition by 32%. Additionally, a follow-up survey with 48 DHH participants highlights design considerations for implementing emotional voice features, emphasizing the importance of emotional voice integration to bridge communication gaps between DHH and hearing communities.<\/jats:p>","DOI":"10.1145\/3757597","type":"journal-article","created":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T17:32:00Z","timestamp":1760635920000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Exploring the Impact of Emotional Voice Integration in Sign-to-Speech Translators for Deaf-to-Hearing Communication"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8397-3534","authenticated-orcid":false,"given":"Hyunchul","family":"Lim","sequence":"first","affiliation":[{"name":"Cornell University, Ithaca, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2302-2408","authenticated-orcid":false,"given":"Minghan","family":"Gao","sequence":"additional","affiliation":[{"name":"Cornell University, Ithaca, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4995-4545","authenticated-orcid":false,"given":"Franklin Mingzhe","family":"Li","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, PA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8080-9268","authenticated-orcid":false,"given":"Nam Anh","family":"Dang","sequence":"additional","affiliation":[{"name":"Cornell University, Ithaca, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-1445-7325","authenticated-orcid":false,"given":"Ianip","family":"Sit","sequence":"additional","affiliation":[{"name":"Rochester Institute of Technology, Rochester, New York, 
USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-8297-1743","authenticated-orcid":false,"given":"Michelle M","family":"Olson","sequence":"additional","affiliation":[{"name":"Rochester Institute of Technology, Rochester, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5079-5927","authenticated-orcid":false,"given":"Cheng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Cornell University, Ithaca, New York, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,10,16]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Sarah Ostadabbas, and Thierry Dutoit.","author":"Adigwe Adaeze","year":"2018","unstructured":"Adaeze Adigwe, No\u00e9 Tits, Kevin El Haddad, Sarah Ostadabbas, and Thierry Dutoit. 2018. The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems. arXiv:1806.09514 [cs.CL] https:\/\/arxiv.org\/abs\/1806.09514"},{"key":"e_1_2_1_2_1","volume-title":"Recognizing emotion from facial expressions: psychological and neurological mechanisms. Behavioral and cognitive neuroscience reviews 1, 1","author":"Adolphs Ralph","year":"2002","unstructured":"Ralph Adolphs. 2002. Recognizing emotion from facial expressions: psychological and neurological mechanisms. Behavioral and cognitive neuroscience reviews 1, 1 (2002), 21-62."},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","first-page":"20220076","DOI":"10.1515\/jisys-2022-0076","article-title":"Intelligent gloves: An IT intervention for deaf-mute people","volume":"32","author":"Babour Amal","year":"2023","unstructured":"Amal Babour, Hind Bitar, Ohoud Alzamzami, Dimah Alahmadi, Amal Barsheed, Amal Alghamdi, and Hanadi Almshjary. 2023. Intelligent gloves: An IT intervention for deaf-mute people. 
Journal of Intelligent Systems 32, 1 (2023), 20220076.","journal-title":"Journal of Intelligent Systems"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3308561.3353774"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1037\/0022-3514.70.2.205"},{"key":"e_1_2_1_6_1","first-page":"2","article-title":"Neuropsychological studies of linguistic and affective facial expressions in deaf signers","volume":"42","author":"Corina David P","year":"1999","unstructured":"David P Corina, Ursula Bellugi, and Judy Reilly. 1999. Neuropsychological studies of linguistic and affective facial expressions in deaf signers. Language and Speech 42, 2-3 (1999), 307-331.","journal-title":"Language and Speech"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.intcom.2007.11.004"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1080\/026999300378824"},{"key":"e_1_2_1_9_1","unstructured":"Philippe Dreuw Hermann Ney Gregorio Martinez Onno A Crasborn Justus Piater J Miguel Moya and Mark Wheatley. 2010. The signspeak project-bridging the gap between signers and speakers. (2010)."},{"key":"e_1_2_1_10_1","volume-title":"Nebraska symposium on motivation","author":"Ekman Paul","year":"1971","unstructured":"Paul Ekman. 1971. Universals and cultural differences in facial expressions of emotion. In Nebraska symposium on motivation. University of Nebraska Press."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0030377"},{"key":"e_1_2_1_12_1","volume-title":"Facial expressions, emotions, and sign languages. Frontiers in psychology 4","author":"Elliott Eeva A","year":"2013","unstructured":"Eeva A Elliott and Arthur M Jacobs. 2013. Facial expressions, emotions, and sign languages. Frontiers in psychology 4 (2013), 115."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 15th ACM conference on embedded network sensor systems. 
1-13","author":"Fang Biyi","year":"2017","unstructured":"Biyi Fang, Jillian Co, and Mi Zhang. 2017. Deepasl: Enabling ubiquitous and non-intrusive word and sentence-level sign language translation. In Proceedings of the 15th ACM conference on embedded network sensor systems. 1-13."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-06079-3"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.26034\/cm.jostrans.2015.323"},{"key":"e_1_2_1_16_1","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1167\/11.3.17","article-title":"Is there a dynamic advantage for facial expressions","volume":"11","author":"Fiorentini Chiara","year":"2011","unstructured":"Chiara Fiorentini and Paolo Viviani. 2011. Is there a dynamic advantage for facial expressions? Journal of Vision 11, 3 (2011), 17-17.","journal-title":"Journal of Vision"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10796-017-9765-z"},{"key":"e_1_2_1_18_1","unstructured":"Elisa Ghia et al. 2012. The impact of translation strategies on subtitle reading. Eye tracking in audiovisual translation (2012) 157-182."},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1167\/13.5.23","article-title":"The efficiency of dynamic and static facial expression recognition","volume":"13","author":"Gold Jason M","year":"2013","unstructured":"Jason M Gold, Jarrett D Barker, Shawn Barr, Jennifer L Bittner, W Drew Bromfield, Nicole Chu, Roy A Goode, Doori Lee, Michael Simmons, and Aparna Srinath. 2013. The efficiency of dynamic and static facial expression recognition. Journal of Vision 13, 5 (2013), 23-23.","journal-title":"Journal of Vision"},{"key":"e_1_2_1_20_1","volume-title":"2021 International Conference on Innovative Practices in Technology and Management (ICIPTM). IEEE, 10-14","author":"Grover Yuvraj","year":"2021","unstructured":"Yuvraj Grover, Riya Aggarwal, Deepak Sharma, and Prashant K Gupta. 2021. 
Sign language translation systems for hearing\/speech impaired people: a review. In 2021 International Conference on Innovative Practices in Technology and Management (ICIPTM). IEEE, 10-14."},{"key":"e_1_2_1_21_1","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1109\/EURCON.2005.1629959","volume-title":"EUROCON 2005-The International Conference on'' Computer as a Tool''","volume":"1","author":"Havasi L\u00e1szl\u00f3","year":"2005","unstructured":"L\u00e1szl\u00f3 Havasi and Helga M Szab\u00f3. 2005. A motion capture system for sign language synthesis: Overview and related issues. In EUROCON 2005-The International Conference on'' Computer as a Tool'', Vol. 1. IEEE, 445-448."},{"key":"e_1_2_1_22_1","volume-title":"Nonverbal communication. Encyclopedia of mental health 2, 3","author":"Hess Ursula","year":"2016","unstructured":"Ursula Hess. 2016. Nonverbal communication. Encyclopedia of mental health 2, 3 (2016), 208-218."},{"key":"e_1_2_1_23_1","volume-title":"Stats and Analysis. Retrieved","year":"2006","unstructured":"https:\/\/azure.microsoft.com\/. 2006. Stats and Analysis. Retrieved June 7, 2006 from http:\/\/www.poker-edge.com\/stats. php"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2049536.2049556"},{"key":"e_1_2_1_25_1","volume-title":"ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 6317-6321","author":"Im Chae-Bin","year":"2022","unstructured":"Chae-Bin Im, Sang-Hoon Lee, Seung-Bin Kim, and Seong-Whan Lee. 2022. Emoq-tts: Emotion intensity quantization for fine-grained controllable emotional text-to-speech. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 6317-6321."},{"key":"e_1_2_1_26_1","volume-title":"Md Saiful Islam, and Enamul Hassan","author":"Islam Khondoker Ittehadul","year":"2022","unstructured":"Khondoker Ittehadul Islam, Tanvir Yuvraz, Md Saiful Islam, and Enamul Hassan. 2022. 
Emonoba: A dataset for analyzing fine-grained emotions on noisy bangla texts. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 128-134."},{"key":"e_1_2_1_27_1","unstructured":"Gonzalo Iturregui-Gallardo. 2019. Audio subtitling: voicing strategies and their effect on emotional activation."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3581641.3584071"},{"key":"e_1_2_1_29_1","first-page":"1","article-title":"SmartASL: '' Point-of-Care'' Comprehensive ASL Interpreter Using Wearables","volume":"7","author":"Jin Yincheng","year":"2023","unstructured":"Yincheng Jin, Shibo Zhang, Yang Gao, Xuhai Xu, Seokmin Choi, Zhengxiong Li, Henry J Adler, and Zhanpeng Jin. 2023. SmartASL: '' Point-of-Care'' Comprehensive ASL Interpreter Using Wearables. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 2 (2023), 1-21.","journal-title":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","first-page":"e32987","DOI":"10.1371\/journal.pone.0032987","article-title":"Context modulation of facial emotion perception differed by individual difference","volume":"7","author":"Lee Tae-Ho","year":"2012","unstructured":"Tae-Ho Lee, June-Seek Choi, and Yang Seok Cho. 2012. Context modulation of facial emotion perception differed by individual difference. PLOS one 7, 3 (2012), e32987.","journal-title":"PLOS one"},{"key":"e_1_2_1_31_1","volume-title":"Reinforcement learning for emotional text-to-speech synthesis with improved emotion discriminability. arXiv preprint arXiv:2104.01408","author":"Liu Rui","year":"2021","unstructured":"Rui Liu, Berrak Sisman, and Haizhou Li. 2021. Reinforcement learning for emotional text-to-speech synthesis with improved emotion discriminability. 
arXiv preprint arXiv:2104.01408 (2021)."},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1016\/j.cognition.2008.11.007","article-title":"Categorical perception of affective and linguistic facial expressions","volume":"110","author":"McCullough Stephen","year":"2009","unstructured":"Stephen McCullough and Karen Emmorey. 2009. Categorical perception of affective and linguistic facial expressions. Cognition 110, 2 (2009), 208-221.","journal-title":"Cognition"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cogbrainres.2004.08.012"},{"key":"e_1_2_1_34_1","volume-title":"Computer-based recognition of facial expressions in ASL: from face tracking to linguistic interpretation. In sign-lang@ LREC","author":"Michael Nicholas","year":"2010","unstructured":"Nicholas Michael, Carol Neidle, and Dimitris Metaxas. 2010. Computer-based recognition of facial expressions in ASL: from face tracking to linguistic interpretation. In sign-lang@ LREC 2010. European Language Resources Association (ELRA), 164-167."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.405558"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1037\/1076-898X.7.3.171"},{"key":"e_1_2_1_37_1","unstructured":"Henrik Nordstr\u00f6m. 2019. Emotional communication in the human voice. Ph.D. Dissertation. Department of Psychology Stockholm University."},{"key":"e_1_2_1_38_1","unstructured":"Common European Framework of Reference for Languages (CEFR). [n.d.]. https:\/\/learntolanguage.com\/cefr-levels\/#: :text=What%20is%20the%20CEFR?of%20CEFR%20is%20more%20common."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00607"},{"key":"e_1_2_1_40_1","volume-title":"How faces come to serve grammar: The development of nonmanual morphology in American Sign Language. Advances in the sign language development of deaf children","author":"Reilly Judy","year":"2006","unstructured":"Judy Reilly. 2006. 
How faces come to serve grammar: The development of nonmanual morphology in American Sign Language. Advances in the sign language development of deaf children (2006), 262-290."},{"key":"e_1_2_1_41_1","volume-title":"Faces: The relationship between language and affect. In From gesture to language in hearing and deaf children","author":"Reilly J Snitzer","year":"1990","unstructured":"J Snitzer Reilly, Marina L McIntire, and Ursula Bellugi. 1990. Faces: The relationship between language and affect. In From gesture to language in hearing and deaf children. Springer, 128-141."},{"key":"e_1_2_1_42_1","volume-title":"The recognition of facial expressions of emotion in deaf and hearing individuals. Heliyon 7, 5","author":"Rodger Helen","year":"2021","unstructured":"Helen Rodger, Junpeng Lao, Chlo\u00e9 Stoll, Anne-Rapha\u00eblle Richoz, Olivier Pascalis, Matthew Dye, and Roberto Caldara. 2021. The recognition of facial expressions of emotion in deaf and hearing individuals. Heliyon 7, 5 (2021)."},{"key":"e_1_2_1_43_1","volume-title":"2024 5th International Conference for Emerging Technology (INCET). IEEE, 1-6.","author":"Roshan Rohit","year":"2024","unstructured":"Rohit Roshan, MH Sohan, SM Sutharsan Raj, VR Badri Prasad, et al. 2024. Sentient Sound waves: Elevating Emotional Communication with AI-Generated Speech Technology. In 2024 5th International Conference for Emerging Technology (INCET). 
IEEE, 1-6."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0077714"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2008.02.001"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3234695.3240986"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACIIAsia.2018.8470350"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.2044-8309.1988.tb00837.x"},{"key":"e_1_2_1_49_1","first-page":"1","volume-title":"Proceedings of the ACM on Human-Computer Interaction 2, CSCW","author":"Wang Emily Q","year":"2018","unstructured":"Emily Q Wang and Anne Marie Piper. 2018. Accessibility in action: Co-located collaboration among deaf and hearing professionals. Proceedings of the ACM on Human-Computer Interaction 2, CSCW (2018), 1-25."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1037\/0022-3514.78.1.105"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9413391"}],"container-title":["Proceedings of the ACM on Human-Computer 
Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3757597","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3757597","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T17:45:03Z","timestamp":1760636703000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3757597"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,16]]},"references-count":51,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2025,10,18]]}},"alternative-id":["10.1145\/3757597"],"URL":"https:\/\/doi.org\/10.1145\/3757597","relation":{},"ISSN":["2573-0142"],"issn-type":[{"value":"2573-0142","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,16]]},"assertion":[{"value":"2025-10-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}