{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,1]],"date-time":"2026-03-01T11:03:54Z","timestamp":1772363034567,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":65,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T00:00:00Z","timestamp":1715385600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,5,11]]},"DOI":"10.1145\/3613904.3642817","type":"proceedings-article","created":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T08:39:12Z","timestamp":1715416752000},"page":"1-14","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Uncovering Human Traits in Determining Real and Spoofed Audio: Insights from Blind and Sighted Individuals"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5275-7142","authenticated-orcid":false,"given":"Chaeeun","family":"Han","sequence":"first","affiliation":[{"name":"College of Information Sciences and Technology, Pennsylvania State University, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7530-9497","authenticated-orcid":false,"given":"Prasenjit","family":"Mitra","sequence":"additional","affiliation":[{"name":"College of Information Sciences and Technology, Pennsylvania State University, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5063-3808","authenticated-orcid":false,"given":"Syed Masum","family":"Billah","sequence":"additional","affiliation":[{"name":"College of Information Sciences and Technology, Pennsylvania State University, United States"}]}],"member":"320","published-online":{"date-parts":[[2024,5,11]]},"reference":[{"key":"e_1_3_3_3_1_1","unstructured":"2023. 
GitHub - ShieldMnt\/invisible-watermark: python library for invisible image watermark (blind image watermark) \u2014 github.com. https:\/\/github.com\/ShieldMnt\/invisible-watermark. [Accessed 11-09-2023]."},{"key":"e_1_3_3_3_2_1","unstructured":"2023. Seamless Communication Models - AI at Meta \u2014 ai.meta.com. https:\/\/ai.meta.com\/resources\/models-and-libraries\/seamless-communication-models\/#safetyandresponsibility. [Accessed 13-12-2023]."},{"key":"e_1_3_3_3_3_1","unstructured":"2023. Transforming the future of music creation \u2014 deepmind.google. https:\/\/deepmind.google\/discover\/blog\/transforming-the-future-of-music-creation\/. [Accessed 13-12-2023]."},{"key":"e_1_3_3_3_4_1","volume-title":"A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information fusion 76","author":"Abdar Moloud","year":"2021","unstructured":"Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Mohammad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, U\u00a0Rajendra Acharya, 2021. A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information fusion 76 (2021), 243\u2013297."},{"key":"e_1_3_3_3_5_1","doi-asserted-by":"crossref","unstructured":"David\u00a0W Addington. 1968. The relationship of selected vocal characteristics to personality perception. (1968).","DOI":"10.1080\/03637756809375599"},{"key":"e_1_3_3_3_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/WIFS49906.2020.9360904"},{"key":"e_1_3_3_3_7_1","volume-title":"Cognitive models of speech processing: Psycholinguistic and computational perspectives","author":"Altmann TM","unstructured":"Gerry\u00a0TM Altmann. 1995. Cognitive models of speech processing: Psycholinguistic and computational perspectives. Mit Press."},{"key":"e_1_3_3_3_8_1","volume-title":"Visual and multisensory processing and plasticity in the human brain","author":"Amedi Amir","unstructured":"Amir Amedi. 2004. 
Visual and multisensory processing and plasticity in the human brain. Hebrew University of Jerusalem."},{"key":"e_1_3_3_3_9_1","volume-title":"Transforming assistive technologies from the ground up for people with vision impairments. Ph.\u00a0D. Dissertation","author":"Billah Syed\u00a0Masum","unstructured":"Syed\u00a0Masum Billah. 2019. Transforming assistive technologies from the ground up for people with vision impairments. Ph.\u00a0D. Dissertation. State University of New York at Stony Brook."},{"key":"e_1_3_3_3_10_1","volume-title":"Benchmarking and challenges in security and privacy for voice biometrics. arXiv preprint arXiv:2109.00281","author":"Bonastre Jean-Francois","year":"2021","unstructured":"Jean-Francois Bonastre, Hector Delgado, Nicholas Evans, Tomi Kinnunen, Kong\u00a0Aik Lee, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noe, Jose Patino, Md Sahidullah, 2021. Benchmarking and challenges in security and privacy for voice biometrics. arXiv preprint arXiv:2109.00281 (2021)."},{"key":"e_1_3_3_3_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173574.3174018"},{"key":"e_1_3_3_3_12_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3461700","article-title":"Expanding a Large Inclusive Study of Human Listening Rates","volume":"14","author":"Bragg Danielle","year":"2021","unstructured":"Danielle Bragg, Katharina Reinecke, and Richard\u00a0E Ladner. 2021. Expanding a Large Inclusive Study of Human Listening Rates. ACM Transactions on Accessible Computing (TACCESS) 14, 3 (2021), 1\u201326.","journal-title":"ACM Transactions on Accessible Computing (TACCESS)"},{"key":"e_1_3_3_3_13_1","unstructured":"A. Bryman and R.G. Burgess. 1994. Analyzing Qualitative Data. Routledge. https:\/\/books.google.com\/books?id=KQkotSd9YWkC"},{"key":"e_1_3_3_3_14_1","volume-title":"Perception and production of fluent speech","author":"Cole A","unstructured":"Ronald\u00a0A Cole. 2016. Perception and production of fluent speech. 
Routledge."},{"key":"e_1_3_3_3_15_1","volume-title":"Man versus machine or man + machine? IEEE Intelligent Systems 29, 5","author":"Cummings Mary\u00a0Missy","year":"2014","unstructured":"Mary\u00a0Missy Cummings. 2014. Man versus machine or man + machine? IEEE Intelligent Systems 29, 5 (2014), 62\u201369."},{"key":"e_1_3_3_3_16_1","doi-asserted-by":"publisher","DOI":"10.1177\/0963721410364726"},{"key":"e_1_3_3_3_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1980.1163420"},{"key":"e_1_3_3_3_18_1","unstructured":"Elevenlabs. 2023. ElevenLabs || Prime Voice AI \u2014 beta.elevenlabs.io. https:\/\/beta.elevenlabs.io. [Accessed 03-May-2023]."},{"key":"e_1_3_3_3_19_1","unstructured":"Hany Farid. 2023. Watermarking ChatGPT DALL-E and other generative AIs could help protect against fraud and misinformation \u2014 theconversation.com. https:\/\/theconversation.com\/watermarking-chatgpt-dall-e-and-other-generative-ais-could-help-protect-against-fraud-and-misinformation-202293. [Accessed 16-08-2023]."},{"key":"e_1_3_3_3_20_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-6494.1939.tb02169.x"},{"key":"e_1_3_3_3_21_1","volume-title":"Wavefake: A data set to facilitate audio deepfake detection. arXiv preprint arXiv:2111.02813","author":"Frank Joel","year":"2021","unstructured":"Joel Frank and Lea Sch\u00f6nherr. 2021. Wavefake: A data set to facilitate audio deepfake detection. arXiv preprint arXiv:2111.02813 (2021)."},{"key":"e_1_3_3_3_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW56347.2022.00015"},{"key":"e_1_3_3_3_23_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.457"},{"key":"e_1_3_3_3_24_1","volume-title":"Hearing loss and aging: new research findings and clinical implications. Journal of Rehabilitation Research & Development 42","author":"Gordon-Salant Sandra","year":"2005","unstructured":"Sandra Gordon-Salant. 2005. 
Hearing loss and aging: new research findings and clinical implications. Journal of Rehabilitation Research & Development 42 (2005)."},{"key":"e_1_3_3_3_25_1","doi-asserted-by":"crossref","unstructured":"Brien\u00a0A Holden. 2007. Blindness and poverty: a tragic combination. 401\u2013403\u00a0pages.","DOI":"10.1111\/j.1444-0938.2007.00217.x"},{"key":"e_1_3_3_3_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173574.3174141"},{"key":"e_1_3_3_3_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3382039"},{"key":"e_1_3_3_3_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.heares.2009.07.012"},{"key":"e_1_3_3_3_29_1","volume-title":"Biometrics: a tool for information security","author":"Jain K","year":"2006","unstructured":"Anil\u00a0K Jain, Arun Ross, and Sharath Pankanti. 2006. Biometrics: a tool for information security. IEEE transactions on information forensics and security 1, 2 (2006), 125\u2013143."},{"key":"e_1_3_3_3_30_1","volume-title":"Evading Watermark based Detection of AI-Generated Content. arXiv preprint arXiv:2305.03807","author":"Jiang Zhengyuan","year":"2023","unstructured":"Zhengyuan Jiang, Jinghuai Zhang, and Neil\u00a0Zhenqiang Gong. 2023. Evading Watermark based Detection of AI-Generated Content. arXiv preprint arXiv:2305.03807 (2023)."},{"key":"e_1_3_3_3_31_1","unstructured":"Makena Kelly. 2023. Meta Google and OpenAI promise the White House they\u2019ll develop AI responsibly \u2014 theverge.com. https:\/\/www.theverge.com\/2023\/7\/21\/23802274\/artificial-intelligence-meta-google-openai-white-house-security-safety. [Accessed 11-09-2023]."},{"key":"e_1_3_3_3_32_1","volume-title":"An overview of text-independent speaker recognition: From features to supervectors. Speech communication 52, 1","author":"Kinnunen Tomi","year":"2010","unstructured":"Tomi Kinnunen and Haizhou Li. 2010. An overview of text-independent speaker recognition: From features to supervectors. 
Speech communication 52, 1 (2010), 12\u201340."},{"key":"e_1_3_3_3_33_1","volume-title":"The accuracy of auditory spatial judgments in the visually impaired is dependent on sound source distance. Scientific reports 10, 1","author":"Kolarik J","year":"2020","unstructured":"Andrew\u00a0J Kolarik, Rajiv Raman, Brian\u00a0CJ Moore, Silvia Cirstea, Sarika Gopalakrishnan, and Shahina Pardhan. 2020. The accuracy of auditory spatial judgments in the visually impaired is dependent on sound source distance. Scientific reports 10, 1 (2020), 7169."},{"key":"e_1_3_3_3_34_1","volume-title":"Melgan: Generative adversarial networks for conditional waveform synthesis. Advances in neural information processing systems 32","author":"Kumar Kundan","year":"2019","unstructured":"Kundan Kumar, Rithesh Kumar, Thibault De\u00a0Boissiere, Lucas Gestin, Wei\u00a0Zhen Teoh, Jose Sotelo, Alexandre De\u00a0Brebisson, Yoshua Bengio, and Aaron\u00a0C Courville. 2019. Melgan: Generative adversarial networks for conditional waveform synthesis. Advances in neural information processing systems 32 (2019)."},{"key":"e_1_3_3_3_35_1","volume-title":"Spoof Detection Using Time-Delay Shallow Neural Network and Feature Switching. 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","author":"Kumar Mari\u00a0Ganesh","year":"2019","unstructured":"Mari\u00a0Ganesh Kumar, Suvidha\u00a0Rupesh Kumar, M.\u00a0S. Saranya, B. Bharathi, and Hema\u00a0A. Murthy. 2019. Spoof Detection Using Time-Delay Shallow Neural Network and Feature Switching. 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (2019), 1011\u20131017."},{"key":"e_1_3_3_3_36_1","volume-title":"ASSERT: Anti-spoofing with squeeze-excitation and residual networks. arXiv preprint arXiv:1904.01120","author":"Lai I","year":"2019","unstructured":"Cheng-I Lai, Nanxin Chen, Jes\u00fas Villalba, and Najim Dehak. 2019. ASSERT: Anti-spoofing with squeeze-excitation and residual networks. 
arXiv preprint arXiv:1904.01120 (2019)."},{"key":"e_1_3_3_3_37_1","volume-title":"ASVspoof 2021: Towards spoofed and deepfake speech detection in the wild. arXiv preprint arXiv:2210.02437","author":"Liu Xuechen","year":"2022","unstructured":"Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, H\u00e9ctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch, 2022. ASVspoof 2021: Towards spoofed and deepfake speech detection in the wild. arXiv preprint arXiv:2210.02437 (2022)."},{"key":"e_1_3_3_3_38_1","volume-title":"Visual influences on speech perception processes. Perception & psychophysics 24, 3","author":"MacDonald John","year":"1978","unstructured":"John MacDonald and Harry McGurk. 1978. Visual influences on speech perception processes. Perception & psychophysics 24, 3 (1978), 253\u2013257."},{"key":"e_1_3_3_3_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(87)90005-9"},{"key":"e_1_3_3_3_40_1","volume-title":"International Conference on Machine Learning. PMLR, 7076\u20137087","author":"Mozannar Hussein","year":"2020","unstructured":"Hussein Mozannar and David Sontag. 2020. Consistent estimators for learning to defer to an expert. In International Conference on Machine Learning. PMLR, 7076\u20137087."},{"key":"e_1_3_3_3_41_1","unstructured":"NYT. 2023. \u2018The Godfather of A.I.\u2019 Leaves Google and Warns of Danger Ahead \u2014 nytimes.com. https:\/\/www.nytimes.com\/2023\/05\/01\/technology\/ai-google-chatbot-engineer-quits-hinton.html. [Accessed 02-May-2023]."},{"key":"e_1_3_3_3_42_1","first-page":"149","article-title":"Comprehension of synthetic and natural speech: Differences among Sighted and visually impaired young adults","volume":"147","author":"Papadopoulos Konstantinos","year":"2015","unstructured":"Konstantinos Papadopoulos and Eleni Koustriava. 2015. Comprehension of synthetic and natural speech: Differences among Sighted and visually impaired young adults. 
Enabling Access for Persons with Visual Impairment 147 (2015), 149\u2013153.","journal-title":"Enabling Access for Persons with Visual Impairment"},{"key":"e_1_3_3_3_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2021.101317"},{"key":"e_1_3_3_3_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/WIFS55849.2022.9975428"},{"key":"e_1_3_3_3_45_1","volume-title":"Communication acoustics: an introduction to speech, audio and psychoacoustics","author":"Pulkki Ville","unstructured":"Ville Pulkki and Matti Karjalainen. 2015. Communication acoustics: an introduction to speech, audio and psychoacoustics. John Wiley & Sons."},{"key":"e_1_3_3_3_46_1","volume-title":"The algorithmic automation problem: Prediction, triage, and human effort. arXiv preprint arXiv:1903.12220","author":"Raghu Maithra","year":"2019","unstructured":"Maithra Raghu, Katy Blumer, Greg Corrado, Jon Kleinberg, Ziad Obermeyer, and Sendhil Mullainathan. 2019. The algorithmic automation problem: Prediction, triage, and human effort. arXiv preprint arXiv:1903.12220 (2019)."},{"key":"e_1_3_3_3_47_1","unstructured":"Resemble AI. 2023. Introducing Neural Speech Watermarker - Resemble AI \u2014 resemble.ai. https:\/\/www.resemble.ai\/neural-speech-watermarker\/. [Accessed 16-08-2023]."},{"key":"e_1_3_3_3_48_1","doi-asserted-by":"publisher","DOI":"10.1046\/j.1460-9568.2002.02147.x"},{"key":"e_1_3_3_3_49_1","doi-asserted-by":"publisher","DOI":"10.3390\/math11143134"},{"key":"e_1_3_3_3_50_1","volume-title":"X-Vectors: Robust DNN Embeddings for Speaker Recognition. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Snyder David","year":"2018","unstructured":"David Snyder, Daniel Garcia-Romero, Gregory Sell, Daniel Povey, and Sanjeev Khudanpur. 2018. X-Vectors: Robust DNN Embeddings for Speaker Recognition. 
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2018), 5329\u20135333."},{"key":"e_1_3_3_3_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijhcs.2023.103126"},{"key":"e_1_3_3_3_52_1","unstructured":"Joanna Stern. 2023. 24-Hour Challenge: Can My AI Voice and Video Clone Replace Me? \u2014 wsj.com. https:\/\/www.wsj.com\/video\/series\/joanna-stern-personal-technology\/24-hour-challenge-can-my-ai-voice-and-video-clone-replace-me\/EC817295-03D0-4031-B40B-694D7BDE2797. [Accessed 03-May-2023]."},{"key":"e_1_3_3_3_53_1","unstructured":"Synthesia. 2023. Ethical Deepfake Maker | Use Deepfakes for Good | Synthesia \u2014 synthesia.io. https:\/\/www.synthesia.io\/tools\/deepfake-video-maker. [Accessed 03-May-2023]."},{"key":"e_1_3_3_3_54_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jphysparis.2004.03.009"},{"key":"e_1_3_3_3_55_1","volume-title":"\u00a0D. Evans","author":"Todisco Massimiliano","year":"2016","unstructured":"Massimiliano Todisco, H\u00e9ctor Delgado, and Nicholas W.\u00a0D. Evans. 2016. A New Feature for Automatic Speaker Verification Anti-Spoofing: Constant Q Cepstral Coefficients. In Odyssey."},{"key":"e_1_3_3_3_56_1","volume-title":"ASVspoof 2019: Future horizons in spoofed and fake audio detection. arXiv preprint arXiv:1904.05441","author":"Todisco Massimiliano","year":"2019","unstructured":"Massimiliano Todisco, Xin Wang, Ville Vestman, Md Sahidullah, H\u00e9ctor Delgado, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, and Kong\u00a0Aik Lee. 2019. ASVspoof 2019: Future horizons in spoofed and fake audio detection. arXiv preprint arXiv:1904.05441 (2019)."},{"key":"e_1_3_3_3_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2006.878256"},{"key":"e_1_3_3_3_58_1","volume-title":"A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection. Interspeech 2021","author":"Wang Xin","year":"2021","unstructured":"Xin Wang and Junichi Yamagishi. 2021. 
A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection. Interspeech 2021 (2021)."},{"key":"e_1_3_3_3_59_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2020.101114"},{"key":"e_1_3_3_3_60_1","doi-asserted-by":"publisher","DOI":"10.1523\/jneurosci.20-07-02664.2000"},{"key":"e_1_3_3_3_61_1","volume-title":"Learning to complement humans. arXiv preprint arXiv:2005.00582","author":"Wilder Bryan","year":"2020","unstructured":"Bryan Wilder, Eric Horvitz, and Ece Kamar. 2020. Learning to complement humans. arXiv preprint arXiv:2005.00582 (2020)."},{"key":"e_1_3_3_3_62_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2014.10.005"},{"key":"e_1_3_3_3_63_1","unstructured":"Sung-Hyun Yoon, Min-Sung Koh, and Ha-Jin Yu. 2020. Phase Spectrum of Time-flipped Speech Signals for Robust Spoofing Detection. In Odyssey."},{"key":"e_1_3_3_3_64_1","volume-title":"Multiple Points Input For Convolutional Neural Networks in Replay Attack Detection. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Yoon Sung-Hyun","year":"2020","unstructured":"Sung-Hyun Yoon and Ha-Jin Yu. 2020. Multiple Points Input For Convolutional Neural Networks in Replay Attack Detection. 
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020), 6444\u20136448."},{"key":"e_1_3_3_3_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2687041"}],"event":{"name":"CHI '24: CHI Conference on Human Factors in Computing Systems","location":"Honolulu HI USA","acronym":"CHI '24","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction","SIGACCESS ACM Special Interest Group on Accessible Computing"]},"container-title":["Proceedings of the CHI Conference on Human Factors in Computing Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3613904.3642817","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3613904.3642817","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T23:44:30Z","timestamp":1750290270000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3613904.3642817"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,11]]},"references-count":65,"alternative-id":["10.1145\/3613904.3642817","10.1145\/3613904"],"URL":"https:\/\/doi.org\/10.1145\/3613904.3642817","relation":{},"subject":[],"published":{"date-parts":[[2024,5,11]]},"assertion":[{"value":"2024-05-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}