{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T02:05:51Z","timestamp":1775873151663,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":32,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,9,21]],"date-time":"2023-09-21T00:00:00Z","timestamp":1695254400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,9,21]]},"DOI":"10.1145\/3615834.3615835","type":"proceedings-article","created":{"date-parts":[[2023,10,11]],"date-time":"2023-10-11T22:45:48Z","timestamp":1697064348000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Compressed, Real-Time Voice Activity Detection with Open Source Implementation for Small Devices"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-4874-1109","authenticated-orcid":false,"given":"Lasse R.","family":"Andersen","sequence":"first","affiliation":[{"name":"Aalborg University, Denmark"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-5545-5465","authenticated-orcid":false,"given":"Lukas J.","family":"Jacobsen","sequence":"additional","affiliation":[{"name":"Aalborg University, Denmark"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-3044-6584","authenticated-orcid":false,"given":"David","family":"Campos","sequence":"additional","affiliation":[{"name":"Aalborg University, Denmark"}]}],"member":"320","published-online":{"date-parts":[[2023,10,11]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Pietro Barbiero Giovanni Squillero and Alberto Tonda. 2020. Modeling Generalization in Machine Learning: A Methodological and Computational Study. arxiv:2006.15680\u00a0[cs.LG]"},{"key":"e_1_3_2_1_2_1","volume-title":"Encyclopedia Britannica (May","author":"Berg E.","year":"2023","unstructured":"Richard\u00a0E. Berg. 2023. Sound | Properties, Types, & Facts. Encyclopedia Britannica (May 2023). https:\/\/www.britannica.com\/science\/sound-physics"},{"key":"e_1_3_2_1_3_1","volume-title":"VADLite: an open-source lightweight system for real-time voice activity detection on smartwatches","author":"Boateng George","unstructured":"George Boateng, Prabhakaran Santhanam, Janina L\u00fcscher, Urte Scholz, and Tobias Kowatsch. 2019. VADLite: an open-source lightweight system for real-time voice activity detection on smartwatches. In UbiComp\/ISWC, Robert Harle, Katayoun Farrahi, and Nicholas\u00a0D. Lane (Eds.). ACM, London, United Kingdom, 902\u2013906."},{"key":"e_1_3_2_1_4_1","volume-title":"How to Fix the Vanishing Gradients Problem Using the ReLU - MachineLearningMastery.com. MachineLearningMastery (Aug","author":"Brownlee Jason","year":"2020","unstructured":"Jason Brownlee. 2020. How to Fix the Vanishing Gradients Problem Using the ReLU - MachineLearningMastery.com. MachineLearningMastery (Aug 2020). https:\/\/machinelearningmastery.com\/how-to-fix-vanishing-gradients-using-the-rectified-linear-activation-function"},{"key":"e_1_3_2_1_5_1","unstructured":"Tom B\u00e4ckstr\u00f6m Okko R\u00e4s\u00e4nen Abraham Zewoudie Pablo\u00a0P\u00e9rez Zarazaga Liisa Koivusalo Sneha Das Esteban\u00a0G\u00f3mez Mellado Marieum\u00a0Bouafif Mansali Daniel Ramos Sudarsana Kadiri and Paavo Alku. 2022. Introduction to Speech Processing (2 ed.). https:\/\/speechprocessingbook.aalto.fi"},{"key":"e_1_3_2_1_6_1","volume-title":"Teknik Telekomunikasi, & Teknik Elektronika 9, 4 (Oct.","author":"Faridh Muhammad\u00a0Hilmi","year":"2021","unstructured":"Muhammad\u00a0Hilmi Faridh and Ulil\u00a0Surtia Zulpratita. 2021. HiVAD : A Voice Activity Detection Application Based on Deep Learning. ELKOMIKA: Jurnal Teknik Energi Elektrik, Teknik Telekomunikasi, & Teknik Elektronika 9, 4 (Oct. 2021), 856."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.21437\/Eurospeech.1997-108"},{"key":"e_1_3_2_1_8_1","unstructured":"Google Git. 2015. webRTC VAD. https:\/\/chromium.googlesource.com\/external\/webrtc\/+\/branch-heads\/43\/webrtc\/common_audio\/vad\/."},{"key":"e_1_3_2_1_9_1","unstructured":"Yun-Ning Hung Karn\u00a0N. Watcharasupat Chih-Wei Wu Iroro Orife Kelian Li Pavan Seshadri and Junyoung Lee. 2021. AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence. arxiv:2111.01320"},{"key":"e_1_3_2_1_10_1","unstructured":"M. Huzaifah. 2017. Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks. arxiv:1706.07156\u00a0[cs.CV]"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Fei Jia Somshubra Majumdar and Boris Ginsburg. 2021. MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection. arxiv:2010.13886\u00a0[eess.AS]","DOI":"10.1109\/ICASSP39728.2021.9414470"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8462127"},{"key":"e_1_3_2_1_13_1","unstructured":"Ian Lavery Alireza Kenarsari Reza Rostam and Dilek Karasoy. 2023. Picovoice. https:\/\/picovoice.ai"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.07.045"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2022.11.072"},{"key":"e_1_3_2_1_16_1","volume-title":"Deep Neural Networks for Voice Activity Detection. In 2021 44th International Conference on Telecommunications and Signal Processing (TSP). 191\u2013194","author":"Mihalache Serban","year":"2021","unstructured":"Serban Mihalache, Ioan-Alexandru Ivanov, and Dragos Burileanu. 2021. Deep Neural Networks for Voice Activity Detection. In 2021 44th International Conference on Telecommunications and Signal Processing (TSP). 191\u2013194."},{"key":"e_1_3_2_1_17_1","unstructured":"Rahul Mishra Hari\u00a0Prabhat Gupta and Tanima Dutta. 2020. A Survey on Deep Neural Network Compression: Challenges Overview and Solutions. arxiv:2010.03954\u00a0[cs.LG]"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472772"},{"key":"e_1_3_2_1_19_1","unstructured":"Alan\u00a0V. Oppenheim and Ronald\u00a0W. Schafer. 2013. Discrete-Time Signal Processing. Pearson Education."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_3_2_1_21_1","volume-title":"speech and gender:: Male-female acoustic differences and cross-language variation in English and French speakers. Corela (06","author":"P\u00e9piot Erwan","year":"2012","unstructured":"Erwan P\u00e9piot. 2012. Voice, speech and gender:: Male-female acoustic differences and cross-language variation in English and French speakers. Corela (06 2012)."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/SAMI48414.2020.9108717"},{"key":"e_1_3_2_1_23_1","unstructured":"Abhipray Sahoo. 2020. Voice activity detection for low-resource settings. (2020)."},{"key":"e_1_3_2_1_24_1","volume-title":"5th IEEE International Conference on High Speed Networks and Multimedia Communication (Cat. No.02EX612)","author":"Sangwan A.","unstructured":"A. Sangwan, M.C. Chiranth, H.S. Jamadagni, R. Sah, R. Venkatesha\u00a0Prasad, and V. Gaurav. 2002. VAD techniques for real-time speech transmission on the Internet. In 5th IEEE International Conference on High Speed Networks and Multimedia Communication (Cat. No.02EX612). 46\u201350."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2800728"},{"key":"e_1_3_2_1_26_1","unstructured":"Audacity Team. 2023. Audacity Development Manual. https:\/\/alphamanual.audacityteam.org\/man\/Sample_Format_-_Bit_Depth"},{"key":"e_1_3_2_1_27_1","unstructured":"Silero Team. 2021. Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD) Number Detector and Language Classifier. https:\/\/github.com\/snakers4\/silero-vad."},{"key":"e_1_3_2_1_28_1","volume-title":"One Voice Detector to Rule Them All. https:\/\/thegradient.pub\/one-voice-detector-to-rule-them-all\/. The Gradient","author":"Veysov Alexander","year":"2022","unstructured":"Alexander Veysov and Dimitrii Voronin. 2022. One Voice Detector to Rule Them All. https:\/\/thegradient.pub\/one-voice-detector-to-rule-them-all\/. The Gradient (2022)."},{"key":"e_1_3_2_1_29_1","volume-title":"CoRR abs\/1907.10121","author":"Virtanen Pauli","year":"2019","unstructured":"Pauli Virtanen, Ralf Gommers, Travis\u00a0E. Oliphant, Matt Haberland, 2019. SciPy 1.0-Fundamental Algorithms for Scientific Computing in Python. CoRR abs\/1907.10121 (2019). arXiv:1907.10121"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2364452"},{"key":"e_1_3_2_1_31_1","volume-title":"Spoken Language Processing: A Guide to Theory, Algorithm and System Development","author":"Xuedong\u00a0Huang Wuen\u00a0Hon","unstructured":"Hsiao-Wuen\u00a0Hon Xuedong\u00a0Huang, Alex\u00a0Acero. 2001. Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Prentice Hall PTR."},{"key":"e_1_3_2_1_32_1","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems -","volume":"2","author":"Yosinski Jason","year":"2014","unstructured":"Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. 2014. How Transferable Are Features in Deep Neural Networks?. In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2 (Montreal, Canada) (NIPS\u201914). MIT Press, Cambridge, MA, USA, 3320\u20133328."}],"event":{"name":"iWOAR 2023: 8th international Workshop on Sensor-Based Activity Recognition and Artificial Intelligence","location":"L\u00fcbeck Germany","acronym":"iWOAR 2023"},"container-title":["Proceedings of the 8th international Workshop on Sensor-Based Activity Recognition and Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3615834.3615835","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3615834.3615835","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:10:17Z","timestamp":1750295417000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3615834.3615835"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,21]]},"references-count":32,"alternative-id":["10.1145\/3615834.3615835","10.1145\/3615834"],"URL":"https:\/\/doi.org\/10.1145\/3615834.3615835","relation":{},"subject":[],"published":{"date-parts":[[2023,9,21]]},"assertion":[{"value":"2023-10-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}