{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T15:25:12Z","timestamp":1774365912033,"version":"3.50.1"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2021,9,9]],"date-time":"2021-09-09T00:00:00Z","timestamp":1631145600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2021,9,9]]},"abstract":"<jats:p>This paper presents an anti-spoofing design to verify whether a voice command is spoken by one live legal user, which supplements existing speech recognition systems and could enable new application potentials when many crucial voice commands need a higher-standard verification in applications. In the literature, verifying the liveness and legality of the command's speaker has been studied separately. However, to accept a voice command from a live legal user, prior solutions cannot be combined directly due to two reasons. First, previous methods have introduced various sensing channels for the liveness detection, while the safety of a sensing channel itself cannot be guaranteed. Second, a direct combination is also vulnerable when an attacker plays a recorded voice command from the legal user and mimics this user to speak the command simultaneously. In this paper, we introduce an anti-spoofing sensing channel to fulfill the design. More importantly, our design provides a generic interface to form the sensing channel, which is compatible to a variety of widely-used signals, including RFID, Wi-Fi and acoustic signals. This offers a flexibility to balance the system cost and verification requirement. We develop a prototype system with three versions by using these sensing signals. We conduct extensive experiments in six different real-world environments under a variety of settings to examine the effectiveness of our design.<\/jats:p>","DOI":"10.1145\/3478116","type":"journal-article","created":{"date-parts":[[2021,9,14]],"date-time":"2021-09-14T22:48:23Z","timestamp":1631659703000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Anti-Spoofing Voice Commands"],"prefix":"10.1145","volume":"5","author":[{"given":"Cui","family":"Zhao","sequence":"first","affiliation":[{"name":"School of Cyber Science and Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhenjiang","family":"Li","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China, CityU Shenzhen Research Institute, Shen Zhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Han","family":"Ding","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Xi","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ge","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jizhong","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,9,14]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proc. of USENIX Security Symposium. 2685--2702","author":"Ahmed Muhammad Ejaz","year":"2020","unstructured":"Muhammad Ejaz Ahmed , Il-Youp Kwak , Jun Ho Huh , Iljoo Kim , Taekkyung Oh , and Hyoungshick Kim . 2020 . Void: A Fast and Light Voice Liveness Detection System . In Proc. of USENIX Security Symposium. 2685--2702 . Muhammad Ejaz Ahmed, Il-Youp Kwak, Jun Ho Huh, Iljoo Kim, Taekkyung Oh, and Hyoungshick Kim. 2020. Void: A Fast and Light Voice Liveness Detection System. In Proc. of USENIX Security Symposium. 2685--2702."},{"key":"e_1_2_1_2_1","volume-title":"Proc. of ASEE. 1--7.","author":"Bachu RG","year":"2008","unstructured":"RG Bachu , S Kopparthi , B Adapa , and BD Barkana . 2008 . Separation of Voiced and Unvoiced Using Zero Crossing Rate and Energy of the Speech Signal . In Proc. of ASEE. 1--7. RG Bachu, S Kopparthi, B Adapa, and BD Barkana. 2008. Separation of Voiced and Unvoiced Using Zero Crossing Rate and Energy of the Speech Signal. In Proc. of ASEE. 1--7."},{"key":"e_1_2_1_3_1","volume-title":"A Survey on Acoustic Sensing. arXiv preprint arXiv:1901.03450","author":"Cai Chao","year":"2019","unstructured":"Chao Cai , Rong Zheng , and Menglan Hu. 2019. A Survey on Acoustic Sensing. arXiv preprint arXiv:1901.03450 ( 2019 ). Chao Cai, Rong Zheng, and Menglan Hu. 2019. A Survey on Acoustic Sensing. arXiv preprint arXiv:1901.03450 (2019)."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2020.23055"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2018.8486424"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/BTAS.2015.7358783"},{"key":"e_1_2_1_7_1","volume-title":"Shin","author":"Feng Huan","year":"2017","unstructured":"Huan Feng , Kassem Fawaz , and Kang G . Shin . 2017 . Continuous Authentication for Voice Assistants. In In Proc. of ACM Mobicom . Huan Feng, Kassem Fawaz, and Kang G. Shin. 2017. Continuous Authentication for Voice Assistants. In In Proc. of ACM Mobicom."},{"key":"e_1_2_1_8_1","volume-title":"Specification for RFID Air Interface. EPC Radio-Frequency Identity Protocols Class-1 Generation-2 UHF RFID Protocol for Communications 860","author":"Global EPC","year":"2005","unstructured":"EPC Global . 2005. Specification for RFID Air Interface. EPC Radio-Frequency Identity Protocols Class-1 Generation-2 UHF RFID Protocol for Communications 860 ( 2005 ), 1--94. EPC Global. 2005. Specification for RFID Air Interface. EPC Radio-Frequency Identity Protocols Class-1 Generation-2 UHF RFID Protocol for Communications 860 (2005), 1--94."},{"key":"e_1_2_1_9_1","volume-title":"Proc. of IEEE ICSP. 329--332","author":"Zhu Wei-Hong","year":"2000","unstructured":"Dai-fei Guo, Wei-Hong Zhu , Zhen-Ming Gao , and Jian-qiang Zhang. 2000 . A Study of Wavelet Thresholding Denoising . In Proc. of IEEE ICSP. 329--332 . Dai-fei Guo, Wei-Hong Zhu, Zhen-Ming Gao, and Jian-qiang Zhang. 2000. A Study of Wavelet Thresholding Denoising. In Proc. of IEEE ICSP. 329--332."},{"key":"e_1_2_1_10_1","volume-title":"Proc. of USENIX NSDI.","author":"Ha Unsoo","year":"2020","unstructured":"Unsoo Ha , Junshan Leng , Alaa Khaddaj , and Fadel Adib . 2020 . Food and Liquid Sensing in Practical Environments using RFIDs . In Proc. of USENIX NSDI. Unsoo Ha, Junshan Leng, Alaa Khaddaj, and Fadel Adib. 2020. Food and Liquid Sensing in Practical Environments using RFIDs. In Proc. of USENIX NSDI."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/WCNC.2016.7564953"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCOMM.2010.03.080291"},{"key":"e_1_2_1_13_1","volume-title":"Proc. of USENIX NSDI.","author":"Hassanieh Haitham","year":"2015","unstructured":"Haitham Hassanieh , Jue Wang , Dina Katabi , and Tadayoshi Kohno . 2015 . Securing RFIDs by Randomizing the Modulation and Channel . In Proc. of USENIX NSDI. Haitham Hassanieh, Jue Wang, Dina Katabi, and Tadayoshi Kohno. 2015. Securing RFIDs by Randomizing the Modulation and Channel. In Proc. of USENIX NSDI."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241548"},{"key":"e_1_2_1_16_1","volume-title":"Proc. of USENIX NSDI.","author":"Joshi Kiran","year":"2015","unstructured":"Kiran Joshi , Dinesh Bharadia , Manikanta Kotaru , and Sachin Katti . 2015 . WiDeo: Fine-Grained Device-Free Motion Tracing Using RF Backscatter . In Proc. of USENIX NSDI. Kiran Joshi, Dinesh Bharadia, Manikanta Kotaru, and Sachin Katti. 2015. WiDeo: Fine-Grained Device-Free Motion Tracing Using RF Backscatter. In Proc. of USENIX NSDI."},{"key":"e_1_2_1_17_1","volume-title":"Ville Hautam\u00e4ki, Nicholas Evans, and Zheng-Hua Tan.","author":"Kinnunen Tomi","year":"2016","unstructured":"Tomi Kinnunen , Md Sahidullah , Ivan Kukanov , H\u00e9ctor Delgado , Massimiliano Todisco , Achintya Sarkar , Nicolai B\u00e6k Thomsen , Ville Hautam\u00e4ki, Nicholas Evans, and Zheng-Hua Tan. 2016 . Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus . (2016). Tomi Kinnunen, Md Sahidullah, Ivan Kukanov, H\u00e9ctor Delgado, Massimiliano Todisco, Achintya Sarkar, Nicolai B\u00e6k Thomsen, Ville Hautam\u00e4ki, Nicholas Evans, and Zheng-Hua Tan. 2016. Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus. (2016)."},{"key":"e_1_2_1_18_1","volume-title":"1D Convolutional Neural Networks and Applications: A Survey. arXiv preprint arXiv:1905.03554","author":"Kiranyaz Serkan","year":"2019","unstructured":"Serkan Kiranyaz , Onur Avci , Osama Abdeljaber , Turker Ince , Moncef Gabbouj , and Daniel J Inman . 2019. 1D Convolutional Neural Networks and Applications: A Survey. arXiv preprint arXiv:1905.03554 ( 2019 ). Serkan Kiranyaz, Onur Avci, Osama Abdeljaber, Turker Ince, Moncef Gabbouj, and Daniel J Inman. 2019. 1D Convolutional Neural Networks and Applications: A Survey. arXiv preprint arXiv:1905.03554 (2019)."},{"key":"e_1_2_1_19_1","volume-title":"Dynamic Dialects: An Articulatory Web Resource for the Study of Accents.","author":"Lawson Eleanor","year":"2015","unstructured":"Eleanor Lawson , Jane Stuart-Smith , James M Scobbie , Satsuki Nakai , David Beavan , Fiona Edmonds , Iain Edmonds , Alice Turk , Claire Timmins , J Beck , 2015 . Dynamic Dialects: An Articulatory Web Resource for the Study of Accents. (2015). Eleanor Lawson, Jane Stuart-Smith, James M Scobbie, Satsuki Nakai, David Beavan, Fiona Edmonds, Iain Edmonds, Alice Turk, Claire Timmins, J Beck, et al. 2015. Dynamic Dialects: An Articulatory Web Resource for the Study of Accents. (2015)."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2018.8486283"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2973750.2973755"},{"key":"e_1_2_1_22_1","unstructured":"Seshashyama Sameeraj Meduri and Rufus Ananth. 2012. A Survey and Evaluation of Voice Activity Detection Algorithms.  Seshashyama Sameeraj Meduri and Rufus Ananth. 2012. A Survey and Evaluation of Voice Activity Detection Algorithms."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209582.3209591"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/LWC.2015.2475749"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/35.46670"},{"key":"e_1_2_1_26_1","volume-title":"Proc. of IEEE 2011 workshop on ASRU.","author":"Povey Daniel","year":"2011","unstructured":"Daniel Povey , Arnab Ghoshal , Gilles Boulianne , Lukas Burget , Ondrej Glembek , Nagendra Goel , Mirko Hannemann , Petr Motlicek , Yanmin Qian , Petr Schwarz , 2011 . The Kaldi Speech Recognition Toolkit . In Proc. of IEEE 2011 workshop on ASRU. Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, et al. 2011. The Kaldi Speech Recognition Toolkit. In Proc. of IEEE 2011 workshop on ASRU."},{"key":"e_1_2_1_27_1","volume-title":"OFDM for Wireless Communications Systems","author":"Prasad Ramjee","unstructured":"Ramjee Prasad . 2004. OFDM for Wireless Communications Systems . Artech House . Ramjee Prasad. 2004. OFDM for Wireless Communications Systems. Artech House."},{"key":"e_1_2_1_28_1","volume-title":"Proc. of IEEE Aerospace conference.","author":"Prabhu Raghavendra S.","year":"2009","unstructured":"S. Prabhu Raghavendra and Grayver Eugene . 2009 . Active Constellation Modification Techniques for OFDM PAR Reduction . In Proc. of IEEE Aerospace conference. S. Prabhu Raghavendra and Grayver Eugene. 2009. Active Constellation Modification Techniques for OFDM PAR Reduction. In Proc. of IEEE Aerospace conference."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-018-6834-3"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2017.2760243"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3264944"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2010.5495503"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Sayaka Shiota Fernando Villavicencio Junichi Yamagishi Nobutaka Ono Isao Echizen and Tomoko Matsui. 2016. Voice Liveness Detection for Speaker Verification based on a Tandem Single\/Double-channel Pop Noise Detector. In Odyssey. 259--263.  Sayaka Shiota Fernando Villavicencio Junichi Yamagishi Nobutaka Ono Isao Echizen and Tomoko Matsui. 2016. Voice Liveness Detection for Speaker Verification based on a Tandem Single\/Double-channel Pop Noise Detector. In Odyssey. 259--263.","DOI":"10.21437\/Odyssey.2016-37"},{"key":"e_1_2_1_34_1","volume-title":"Hsiao-Chun Wu, Scott C-H Huang, and Hsiao-Hwa Chen.","author":"Shiu Yi-Sheng","year":"2011","unstructured":"Yi-Sheng Shiu , Shih Yu Chang , Hsiao-Chun Wu, Scott C-H Huang, and Hsiao-Hwa Chen. 2011 . Physical Layer Security in Wireless Networks : A Tutorial. IEEE wireless Communications 18, 2 (2011), 66--74. Yi-Sheng Shiu, Shih Yu Chang, Hsiao-Chun Wu, Scott C-H Huang, and Hsiao-Hwa Chen. 2011. Physical Layer Security in Wireless Networks: A Tutorial. IEEE wireless Communications 18, 2 (2011), 66--74."},{"key":"e_1_2_1_35_1","volume-title":"Dynamic Consequences of Differences in Male and Female Vocal Tract Dimensions. The journal of the Acoustical society of America 109, 5","author":"Simpson Adrian P","year":"2001","unstructured":"Adrian P Simpson . 2001. Dynamic Consequences of Differences in Male and Female Vocal Tract Dimensions. The journal of the Acoustical society of America 109, 5 ( 2001 ), 2153--2164. Adrian P Simpson. 2001. Dynamic Consequences of Differences in Male and Female Vocal Tract Dimensions. The journal of the Acoustical society of America 109, 5 (2001), 2153--2164."},{"key":"e_1_2_1_36_1","volume-title":"Juwesh Binong, and Lairenlakpam Joyprakash Singh.","author":"Syiem Bronson","year":"2020","unstructured":"Bronson Syiem , Sushanta Kabir Dutta , Juwesh Binong, and Lairenlakpam Joyprakash Singh. 2020 . Comparison of Khasi Speech Representations with Different Spectral Features and Hidden Markov States. Journal of Electronic Science and Technology ( 2020), 100079. Bronson Syiem, Sushanta Kabir Dutta, Juwesh Binong, and Lairenlakpam Joyprakash Singh. 2020. Comparison of Khasi Speech Representations with Different Spectral Features and Hidden Markov States. Journal of Electronic Science and Technology (2020), 100079."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3264947"},{"key":"e_1_2_1_38_1","volume-title":"International journal on emerging technologies 1, 1","author":"Tiwari Vibha","year":"2010","unstructured":"Vibha Tiwari . 2010. MFCC and its Applications in Speaker Recognition . International journal on emerging technologies 1, 1 ( 2010 ), 19--22. Vibha Tiwari. 2010. MFCC and its Applications in Speaker Recognition. International journal on emerging technologies 1, 1 (2010), 19--22."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01019494"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8462665"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241541"},{"key":"e_1_2_1_42_1","volume-title":"Voicefilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. arXiv preprint arXiv:1810.04826","author":"Wang Quan","year":"2018","unstructured":"Quan Wang , Hannah Muckenhirn , Kevin Wilson , Prashant Sridhar , Zelin Wu , John Hershey , Rif A Saurous , Ron J Weiss , Ye Jia , and Ignacio Lopez Moreno . 2018 . Voicefilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. arXiv preprint arXiv:1810.04826 (2018). Quan Wang, Hannah Muckenhirn, Kevin Wilson, Prashant Sridhar, Zelin Wu, John Hershey, Rif A Saurous, Ron J Weiss, Ye Jia, and Ignacio Lopez Moreno. 2018. Voicefilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. arXiv preprint arXiv:1810.04826 (2018)."},{"key":"e_1_2_1_43_1","unstructured":"Inc Wikimedia Foundation. 2019. \"Voice Frequency\". https:\/\/en.wikipedia.org\/wiki\/Voice_frequency.  Inc Wikimedia Foundation. 2019. \"Voice Frequency\". https:\/\/en.wikipedia.org\/wiki\/Voice_frequency."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2014.10.005"},{"key":"e_1_2_1_45_1","volume-title":"AUTOMATIC SPEECH RECOGNITION","author":"Yu Dong","unstructured":"Dong Yu and Li Deng . 2016. AUTOMATIC SPEECH RECOGNITION . Springer . Dong Yu and Li Deng. 2016. AUTOMATIC SPEECH RECOGNITION. Springer."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00124"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2018.2831456"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3133962"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230579"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3081333.3081363"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241575"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/MWC.2019.1800477"}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3478116","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3478116","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:31:33Z","timestamp":1750188693000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3478116"}},"subtitle":["A Generic Wireless Assisted Design"],"short-title":[],"issued":{"date-parts":[[2021,9,9]]},"references-count":52,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2021,9,9]]}},"alternative-id":["10.1145\/3478116"],"URL":"https:\/\/doi.org\/10.1145\/3478116","relation":{},"ISSN":["2474-9567"],"issn-type":[{"value":"2474-9567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,9]]},"assertion":[{"value":"2021-09-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}