{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T18:09:57Z","timestamp":1776276597659,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":43,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,28]],"date-time":"2022-10-28T00:00:00Z","timestamp":1666915200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"JST Moonthos","award":["Grant Number JPMJMS2012"],"award-info":[{"award-number":["Grant Number JPMJMS2012"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,29]]},"DOI":"10.1145\/3526113.3545685","type":"proceedings-article","created":{"date-parts":[[2022,10,28]],"date-time":"2022-10-28T16:37:41Z","timestamp":1666975061000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["DualVoice: Speech Interaction that Discriminates between Normal and Whispered Voice Input"],"prefix":"10.1145","author":[{"given":"Jun","family":"Rekimoto","sequence":"first","affiliation":[{"name":"The University of Tokyo, Japan and Sony CSL Kyoto, Japan"}]}],"member":"320","published-online":{"date-parts":[[2022,10,28]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3271553.3271611"},{"key":"e_1_3_2_1_2_1","volume-title":"wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. arXiv [cs.CL] (June","author":"Baevski Alexei","year":"2020","unstructured":"Alexei Baevski , Henry Zhou , Abdelrahman Mohamed , and Michael Auli . 2020. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. arXiv [cs.CL] (June 2020 ). Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, and Michael Auli. 2020. 
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. arXiv [cs.CL] (June 2020)."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2015.310"},{"key":"e_1_3_2_1_4_1","volume-title":"End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Pseudo Whisper Pre-training. (May","author":"Chang Heng-Jui","year":"2020","unstructured":"Heng-Jui Chang, Alexander\u00a0H Liu, Hung-Yi Lee, and Lin-Shan Lee. 2020. End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Pseudo Whisper Pre-training. (May 2020). arxiv:2005.01972\u00a0[cs.CL]"},{"key":"e_1_3_2_1_5_1","volume-title":"Voice Conversion for Whispered Speech Synthesis. (Dec","author":"Cotescu Marius","year":"2019","unstructured":"Marius Cotescu, Thomas Drugman, Goeric Huybrechts, Jaime Lorenzo-Trueba, and Alexis Moinet. 2019. Voice Conversion for Whispered Speech Synthesis. (Dec. 2019). arxiv:1912.05289\u00a0[cs.SD]"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2009.08.002"},{"key":"e_1_3_2_1_7_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. (Oct.","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. (Oct. 2018). arxiv:1810.04805\u00a0[cs.CL]"},{"key":"e_1_3_2_1_8_1","unstructured":"Dyson. 2022. dyson zone: Air-purifying headphones with active noise cancelling. https:\/\/www.dyson.co.uk\/en."},{"key":"e_1_3_2_1_9_1","volume-title":"An Introduction to Silent Speech Interfaces","author":"Freitas Jo\u00e3o","unstructured":"Jo\u00e3o Freitas, Ant\u00f3nio Teixeira, Miguel\u00a0Sales Dias, and Samuel Silva. 2016. An Introduction to Silent Speech Interfaces (1st ed.). Springer Publishing Company, Incorporated.","edition":"1"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3242587.3242603"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2016.2580944"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.21437\/Eurospeech.1999-60"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.21437\/Eurospeech.2003-387"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143891"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2017.2738559"},{"key":"e_1_3_2_1_16_1","volume-title":"Proc. 29 (jan 2021","author":"Hsu Wei-Ning","year":"2021","unstructured":"Wei-Ning Hsu, Benjamin Bolte, Yao-Hung\u00a0Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, and Abdelrahman Mohamed. 2021. HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units. IEEE\/ACM Trans. Audio, Speech and Lang. Proc. 29 (jan 2021), 3451\u20133460. 
https:\/\/doi.org\/10.1109\/TASLP.2021.3122291"},{"key":"e_1_3_2_1_17_1","unstructured":"Amazon.com Inc.2018. How Alexa keeps getting smarter. https:\/\/www.aboutamazon.com\/devices\/how-alexa-keeps-getting-smarter"},{"key":"e_1_3_2_1_18_1","unstructured":"Google Inc.2020. Google Cloud Speech-to-Text. https:\/\/cloud.google.com\/speech-to-text."},{"key":"e_1_3_2_1_19_1","unstructured":"Philips Inc.2021. Fresh Air Mask Series 6000. https:\/\/www.philips.com.sg\/c-p\/ACM066_01\/fresh-air-mask-series-6000."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2003.10.005"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3172944.3172977"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300376"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2940700"},{"key":"e_1_3_2_1_24_1","volume-title":"Computational differences between whispered and non-whispered speech. Ph.\u00a0D. Dissertation","author":"Lim Boon\u00a0Pang","unstructured":"Boon\u00a0Pang Lim. 2010. 
Computational differences between whispered and non-whispered speech. Ph.\u00a0D. Dissertation. University of Illinois Urbana-Champaign."},{"key":"e_1_3_2_1_25_1","volume-title":"An introduction to tkinter. www.pythonware.com\/library\/tkinter\/introduction\/index.htm","author":"Lundh Fredrik","year":"1999","unstructured":"Fredrik Lundh. 1999. An introduction to tkinter. www.pythonware.com\/library\/tkinter\/introduction\/index.htm (1999)."},{"key":"e_1_3_2_1_26_1","volume-title":"Sai Bharath\u00a0Chandra Gutha, and M\u00a0Ali\u00a0Basha Shaik","author":"Niranjan Abhishek","year":"2020","unstructured":"Abhishek Niranjan, Mukesh Sharma, Sai Bharath\u00a0Chandra Gutha, and M\u00a0Ali\u00a0Basha Shaik. 2020. End-to-End Whisper to Natural Speech Conversion using Modified Transformer Network. https:\/\/doi.org\/10.48550\/ARXIV.2004.09347"},{"key":"e_1_3_2_1_27_1","volume-title":"DualBreath: Input Method Using Nasal and Mouth Breathing. In Augmented Humans Conference 2021","author":"Onishi Ryoya","year":"2021","unstructured":"Ryoya Onishi, Tao Morisaki, Shun Suzuki, Saya Mizutani, Takaaki Kamigaki, Masahiro Fujiwara, Yasutoshi Makino, and Hiroyuki Shinoda. 2021. DualBreath: Input Method Using Nasal and Mouth Breathing. 
In Augmented Humans Conference 2021 (Rovaniemi, Finland) (AHs\u201921). Association for Computing Machinery, New York, NY, USA, 283\u2013285."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178964"},{"key":"#cr-split#-e_1_3_2_1_29_1.1","doi-asserted-by":"crossref","unstructured":"Santiago Pascual, Antonio Bonafonte, Joan Serr\u00e0, and Jose\u00a0A. Gonzalez. 2018. Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks. https:\/\/doi.org\/10.48550\/ARXIV.1808.10687","DOI":"10.21437\/IberSPEECH.2018-25"},{"key":"#cr-split#-e_1_3_2_1_29_1.2","doi-asserted-by":"crossref","unstructured":"Santiago Pascual, Antonio Bonafonte, Joan Serr\u00e0, and Jose\u00a0A. Gonzalez. 2018. Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks. https:\/\/doi.org\/10.48550\/ARXIV.1808.10687","DOI":"10.21437\/IberSPEECH.2018-25"},{"key":"e_1_3_2_1_30_1","unstructured":"Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. (2017)."},{"key":"e_1_3_2_1_31_1","volume-title":"Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings. (April","author":"Pepino Leonardo","year":"2021","unstructured":"Leonardo Pepino, Pablo Riera, and Luciana Ferrer. 2021. Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings. (April 2021). 
arxiv:2104.03502\u00a0[cs.SD]"},{"key":"e_1_3_2_1_32_1","volume-title":"Proceedings of the 28th ACM International Conference on Multimedia","author":"Prajwal K\u00a0R","unstructured":"K\u00a0R Prajwal, Rudrabha Mukhopadhyay, Vinay\u00a0P. Namboodiri, and C.V. Jawahar. 2020. A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild. In Proceedings of the 28th ACM International Conference on Multimedia (Seattle, WA, USA) (MM \u201920). Association for Computing Machinery, New York, NY, USA, 484\u2013492. https:\/\/doi.org\/10.1145\/3394171.3413532"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3411764.3445687"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2971763.2971765"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2634317.2634322"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3242587.3242599"},{"key":"e_1_3_2_1_37_1","first-page":"2579","article-title":"Visualizing Data using t-SNE","volume":"9","author":"van\u00a0der Maaten Laurens","year":"2008","unstructured":"Laurens van\u00a0der Maaten and Geoffrey Hinton. 2008. Visualizing Data using t-SNE. Journal of Machine Learning Research 9, 86 (2008), 2579\u20132605. 
http:\/\/jmlr.org\/papers\/v9\/vandermaaten08a.html","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_38_1","volume-title":"(June","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan\u00a0N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need. (June 2017). arxiv:1706.03762\u00a0[cs.CL]"},{"key":"e_1_3_2_1_39_1","volume-title":"HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. (Oct","author":"Wolf Thomas","year":"2019","unstructured":"Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, R\u00e9mi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le\u00a0Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander\u00a0M Rush. 2019. HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. (Oct. 2019). arxiv:1910.03771\u00a0[cs.CL]"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3332165.3347950"},{"key":"e_1_3_2_1_41_1","unstructured":"Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, and Bo Xu. 2020. 
Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages. (Dec. 2020). arxiv:2012.12121\u00a0[cs.CL]"},{"key":"e_1_3_2_1_42_1","first-page":"2007","volume-title":"Proc. Interspeech","author":"Zhang Chi","year":"2007","unstructured":"Chi Zhang and John H.\u00a0L. Hansen. 2007. Analysis and classification of speech mode: whispered through shouted. In Proc. Interspeech 2007. 2289\u20132292. https:\/\/doi.org\/10.21437\/Interspeech.2007-621"}],"event":{"name":"UIST '22: The 35th Annual ACM Symposium on User Interface Software and Technology","location":"Bend OR USA","acronym":"UIST '22","sponsor":["SIGGRAPH ACM Special Interest Group on Computer Graphics and Interactive Techniques","SIGCHI ACM Special Interest Group on Computer-Human Interaction"]},"container-title":["Proceedings of the 35th Annual ACM Symposium on User Interface Software and 
Technology"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3526113.3545685","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3526113.3545685","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:24Z","timestamp":1750186824000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3526113.3545685"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,28]]},"references-count":43,"alternative-id":["10.1145\/3526113.3545685","10.1145\/3526113"],"URL":"https:\/\/doi.org\/10.1145\/3526113.3545685","relation":{},"subject":[],"published":{"date-parts":[[2022,10,28]]},"assertion":[{"value":"2022-10-28","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}