{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T19:46:54Z","timestamp":1764704814095,"version":"3.46.0"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2025,12,2]]},"abstract":"<jats:p>With the rise of voice-enabled technologies, loudspeaker playback has become widespread, posing increasing risks to speech privacy. Traditional eavesdropping methods often require invasive access or line-of-sight, limiting their practicality. In this paper, we present mmSpeech, an end-to-end mmWave-based eavesdropping system that reconstructs intelligible speech solely from vibration signals induced by loudspeaker playback, even through walls and without prior knowledge of the speaker. To achieve this, we reveal an optimal combination of vibrating material and radar sampling rate for capturing high-quality vibrations using narrowband mmWave signals. We then design a deep neural network that reconstructs intelligible speech from the estimated noisy spectrograms. To further support downstream speech understanding, we introduce a synthetic training pipeline and selectively fine-tune the encoder of a pre-trained ASR model. We implement mmSpeech with a commercial mmWave radar and validate its performance through extensive experiments. 
Results show that mmSpeech achieves state-of-the-art speech quality and generalizes well across unseen speakers and various conditions.<\/jats:p>","DOI":"10.1145\/3770708","type":"journal-article","created":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T19:42:32Z","timestamp":1764704552000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["We Can Hear You with mmWave Radar! An End-to-End Eavesdropping System"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-2414-338X","authenticated-orcid":false,"given":"Dachao","family":"Han","sequence":"first","affiliation":[{"name":"Xi'an Jiaotong University, Xi'an, Shaanxi, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-0910-5739","authenticated-orcid":false,"given":"Teng","family":"Huang","sequence":"additional","affiliation":[{"name":"Xi'an Jiaotong University, Xi'an, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5274-7988","authenticated-orcid":false,"given":"Han","family":"Ding","sequence":"additional","affiliation":[{"name":"Xi'an Jiaotong University, Xi'an, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4603-4914","authenticated-orcid":false,"given":"Cui","family":"Zhao","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Xi'an Jiaotong University, Xi'an, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0750-6990","authenticated-orcid":false,"given":"Fei","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Software Engineering, Xi'an Jiaotong University, Xi'an, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3845-1646","authenticated-orcid":false,"given":"Ge","family":"Wang","sequence":"additional","affiliation":[{"name":"Xi'an Jiaotong University, Xi'an, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9348-2982","authenticated-orcid":false,"given":"Wei","family":"Xi","sequence":"additional","affiliation":[{"name":"Xi'an Jiaotong 
University, Xi'an, China"}]}],"member":"320","published-online":{"date-parts":[[2025,12,2]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"d.]. Facts about using Long Range Directional Microphones for Eavesdropping. https:\/\/ampflab.com\/long-range-directional-microphone-research.html","author":"AMPFLAB.","year":"2024","unstructured":"AMPFLAB. [n. d.]. Facts about using Long Range Directional Microphones for Eavesdropping. https:\/\/ampflab.com\/long-range-directional-microphone-research.html. 2024."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2020.24076"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP46214.2022.9833568"},{"key":"e_1_2_1_4_1","volume-title":"d.]. Facts About Speech Intelligibility. https:\/\/www.dpamicrophones.com\/mic-university\/background-knowledge\/facts-about-speech-intelligibility\/","author":"Brixen Eddy B\u00f8gh","year":"2024","unstructured":"Eddy B\u00f8gh Brixen. [n. d.]. Facts About Speech Intelligibility. https:\/\/www.dpamicrophones.com\/mic-university\/background-knowledge\/facts-about-speech-intelligibility\/. 2024."},{"key":"e_1_2_1_5_1","volume-title":"CMGAN: Conformer-based Metric GAN for Speech Enhancement.","author":"Cao Ruizhe","year":"2022","unstructured":"Ruizhe Cao, Sherif Abdulatif, and Bin Yang. 2022. CMGAN: Conformer-based Metric GAN for Speech Enhancement. (2022)."},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of NeurIPS.","author":"Dao Tri","year":"2022","unstructured":"Tri Dao, Dan Fu, Stefano Ermon, Atri Rudra, and Christopher R\u00e9. 2022. FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. In Proceedings of NeurIPS."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601119"},{"key":"e_1_2_1_8_1","volume-title":"HARVARD Speech Corpus-Audio Recording","author":"Demonte Philippa","year":"2019","unstructured":"Philippa Demonte. 2019. HARVARD Speech Corpus-Audio Recording 2019. 
(2019)."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3580861"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3550303"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM53939.2023.10229095"},{"key":"e_1_2_1_12_1","volume-title":"Factors Governing the Intelligibility of Speech Sounds. The journal of the Acoustical society of America 19, 1","author":"French Norman R","year":"1947","unstructured":"Norman R French and John C Steinberg. 1947. Factors Governing the Intelligibility of Speech Sounds. The journal of the Acoustical society of America 19, 1 (1947), 90\u2013119."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2020-2471"},{"key":"e_1_2_1_14_1","volume-title":"What Lessons Can We Learn? https:\/\/iotsecurityfoundation.org\/cia-exploits-of-iot-devices-what-lessons-can-we-learn\/.","author":"Grau Alan","year":"2025","unstructured":"Alan Grau. [n. d.]. CIA Exploits of IoT Devices, What Lessons Can We Learn? https:\/\/iotsecurityfoundation.org\/cia-exploits-of-iot-devices-what-lessons-can-we-learn\/. 2025."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of IEEE S&P.","author":"Hu Pengfei","year":"2023","unstructured":"Pengfei Hu, Wenhao Li, Riccardo Spolaor, and Xiuzhen Cheng. 2023. mmEcho: A mmwave-based Acoustic Eavesdropping Method. In Proceedings of IEEE S&P."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP46214.2022.9833716"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM55648.2025.11044473"},{"key":"e_1_2_1_18_1","volume-title":"HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. Advances in neural information processing systems 33","author":"Kong Jungil","year":"2020","unstructured":"Jungil Kong, Jaehyeon Kim, and Jaekyoung Bae. 2020. HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. 
Advances in neural information processing systems 33 (2020), 17022\u201317033."},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of NeurIPS.","author":"Kumar Rithesh","year":"2023","unstructured":"Rithesh Kumar, Prem Seetharaman, Alejandro Luebs, Ishaan Kumar, and Kundan Kumar. 2023. High-fidelity Audio Compression with Improved RVQGAN. In Proceedings of NeurIPS."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of ICLR.","author":"Ping Wei","year":"2025","unstructured":"Sang-gil Lee, Wei Ping, Boris Ginsburg, Bryan Catanzaro, and Sungroh Yoon. 2025. BigVGAN: A Universal Neural Vocoder with Large-scale Training. In Proceedings of ICLR."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMTT.2022.3183575"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of ICLR.","author":"Li Zhiyuan","year":"2019","unstructured":"Zhiyuan Li and Sanjeev Arora. 2019. An exponential learning rate schedule for deep learning. In Proceedings of ICLR."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TDSC.2023.3322295"},{"key":"e_1_2_1_24_1","volume-title":"Multimedia analysis, processing and communications","author":"Loizou Philipos C","unstructured":"Philipos C Loizou. 2011. Speech Quality Assessment. In Multimedia analysis, processing and communications. Springer, 623\u2013654."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of USENIX Security.","author":"Nassi Ben","year":"2022","unstructured":"Ben Nassi, Yaron Pirutin, Raz Swisa, Adi Shamir, Yuval Elovici, and Boris Zadov. 2022. Lamphone: Passive Sound Recovery from a Desk Lamp's Light Bulb Vibrations. In Proceedings of USENIX Security."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178964"},{"key":"e_1_2_1_27_1","volume-title":"This is What Happens When Your Phone is Spying on You. https:\/\/www.universityofcalifornia.edu\/news\/what-happens-when-your-phone-spying-you","author":"Patringenaru Ioana","year":"2023","unstructured":"Ioana Patringenaru. 
[n.d.]. This is What Happens When Your Phone is Spying on You. https:\/\/www.universityofcalifornia.edu\/news\/what-happens-when-your-phone-spying-you. 2023."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/EuroSP53844.2022.00040"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3534592"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3495243.3560543"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of USENIX Security.","author":"Wang Chao","year":"2024","unstructured":"Chao Wang, Feng Lin, Hao Yan, Tong Wu, Wenyao Xu, and Kui Ren. 2024. VibSpeech: Exploring Practical Wideband Eavesdropping via Bandlimited Signal of Vibration-based Side Channel. In Proceedings of USENIX Security."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3643543"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOMWKSHPS61880.2024.10620704"},{"key":"e_1_2_1_34_1","volume-title":"Your Breath Doesn't Lie: Multi-user Authentication by Sensing Respiration Using mmWave Radar","author":"Wang Yao","unstructured":"Yao Wang, Tao Gu, Tom H Luan, and Yong Yu. 2022. Your Breath Doesn't Lie: Multi-user Authentication by Sensing Respiration Using mmWave Radar. In IEEE SECON."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2789168.2790119"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM52122.2024.10621229"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3380901"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3570361.3613302"},{"key":"e_1_2_1_39_1","first-page":"1108","article-title":"The Feasibility of Injecting Inaudible Voice Commands to Voice Assistants","volume":"18","author":"Yan Chen","year":"2019","unstructured":"Chen Yan, Guoming Zhang, Xiaoyu Ji, Tianchen Zhang, Taimin Zhang, and Wenyuan Xu. 2019. The Feasibility of Injecting Inaudible Voice Commands to Voice Assistants. 
IEEE Transactions on Dependable and Secure Computing 18, 3 (2019), 1108\u20131124.","journal-title":"IEEE Transactions on Dependable and Secure Computing"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2021-41"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/SECON58729.2023.10287427"}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3770708","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T19:42:40Z","timestamp":1764704560000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3770708"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,2]]},"references-count":41,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,12,2]]}},"alternative-id":["10.1145\/3770708"],"URL":"https:\/\/doi.org\/10.1145\/3770708","relation":{},"ISSN":["2474-9567"],"issn-type":[{"value":"2474-9567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,2]]},"assertion":[{"value":"2025-12-02","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}