{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:19:37Z","timestamp":1760242777725,"version":"build-2065373602"},"reference-count":13,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2016,7,6]],"date-time":"2016-07-06T00:00:00Z","timestamp":1467763200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>In this paper, we propose a robust voice activity detection (VAD) algorithm to effectively distinguish speech from non-speech in various noisy environments. The proposed VAD utilizes power spectral deviation (PSD), using Teager energy (TE) to provide a better representation of the PSD, resulting in improved decision performance for speech segments. In addition, the TE-based likelihood ratio and speech absence probability are derived in each frame to modify the PSD for further VAD. 
We evaluate the performance of the proposed VAD algorithm by objective testing in various environments and obtain better results than those attained by the conventional methods.<\/jats:p>","DOI":"10.3390\/sym8070058","type":"journal-article","created":{"date-parts":[[2016,7,6]],"date-time":"2016-07-06T09:55:55Z","timestamp":1467798955000},"page":"58","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Power Spectral Deviation-Based Voice Activity Detection Incorporating Teager Energy for Speech Enhancement"],"prefix":"10.3390","volume":"8","author":[{"given":"Sang-Kyun","family":"Kim","sequence":"first","affiliation":[{"name":"Department of Electronic Engineering, Inha University, Incheon 402-751, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sang-Ick","family":"Kang","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Inha University, Incheon 402-751, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Young-Jin","family":"Park","sequence":"additional","affiliation":[{"name":"Korea Electrotechnology Research Institute (KERI), 111 Hanggaul ro, Sangrok Gu, An-San shi,  Kyunggi Do 426-170, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sanghyuk","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Electrical and Electronic Engineering, Xi\u2019an Jiaotong-Liverpool University, Suzhou 215000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sangmin","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Inha University, Incheon 402-751, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2016,7,6]]},"reference":[{"key":"ref_1","unstructured":"Karray, L., Mokbel, C., and Monne, J. (1998, January 29\u201330). 
Solutions for robust speech\/non-speech detection in wireless environment. Proceedings of the IEEE 4th Workshop, Interactive Voice Technology for Telecommunications Applications IVITA\u201998, Torino, Italy."},{"key":"ref_2","unstructured":"TIA\/EIA\/IS-127 (Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems, 1996). Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/97.736233","article-title":"A statistical model-based voice activity detection","volume":"6","author":"Sohn","year":"1999","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1109\/97.789604","article-title":"Teager energy based feature parameters for speech recognition in car noise","volume":"6","author":"Jabloun","year":"1999","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1327","DOI":"10.1016\/j.patrec.2006.11.023","article-title":"Robust voice activity detection using perceptual wavelet-packet transform and Teager energy operator","volume":"28","author":"Chen","year":"2007","journal-title":"Pattern Recognit. Lett."},{"key":"ref_6","first-page":"2024","article-title":"Multiband modulation energy tracking for noisy speech detection","volume":"14","author":"Evangelopoulos","year":"2006","journal-title":"IEEE Trans. ASLP"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1109\/TASSP.1980.1163394","article-title":"Speech enhancement using a soft-decision noise suppression filter","volume":"28","author":"McAulay","year":"1980","journal-title":"IEEE Trans. Acoust. 
Speech Signal Process."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1109\/97.841154","article-title":"Spectral enhancement based on global soft decision","volume":"7","author":"Kim","year":"2000","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1109\/TASSP.1984.1164453","article-title":"Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator","volume":"32","author":"Ephraim","year":"1984","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1119","DOI":"10.1109\/TSA.2005.853212","article-title":"An effective subband OSF-based VAD with noise reduction for robust speech recognition","volume":"13","author":"Ramirez","year":"2005","journal-title":"IEEE Trans. Speech Audio Process."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1699","DOI":"10.1016\/j.sigpro.2012.01.005","article-title":"Voice activity detection based on conditional MAP criterion incorporating the spectral gradient","volume":"92","author":"Kim","year":"2012","journal-title":"Signal Process."},{"key":"ref_12","unstructured":"ITU-T (Appendix III: G.729 Annex B Enhancement in Voice-Over-IP Applications-Option 2, 2005). Appendix III: G.729 Annex B Enhancement in Voice-Over-IP Applications-Option 2."},{"key":"ref_13","unstructured":"ITU-T (Recommendation P.862, Perceptual Evaluation of Speech Quality (PESQ), an Objective Method for end-to-end Speech Quality Assessment of Narrowband Telephone Networks and Speech Codecs, 2001). 
Recommendation P.862, Perceptual Evaluation of Speech Quality (PESQ), an Objective Method for end-to-end Speech Quality Assessment of Narrowband Telephone Networks and Speech Codecs."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/8\/7\/58\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:25:32Z","timestamp":1760210732000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/8\/7\/58"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,7,6]]},"references-count":13,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2016,7]]}},"alternative-id":["sym8070058"],"URL":"https:\/\/doi.org\/10.3390\/sym8070058","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2016,7,6]]}}}