{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T23:32:45Z","timestamp":1771025565627,"version":"3.50.1"},"reference-count":37,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2016,8,12]],"date-time":"2016-08-12T00:00:00Z","timestamp":1470960000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>This paper proposes support vector machine (SVM) based voice activity detection using FuzzyEn to improve detection performance under noisy conditions. The proposed voice activity detection (VAD) uses fuzzy entropy (FuzzyEn) as a feature extracted from noise-reduced speech signals to train an SVM model for speech\/non-speech classification. The proposed VAD method was tested by conducting various experiments by adding real background noises of different signal-to-noise ratios (SNR) ranging from \u221210 dB to 10 dB to actual speech signals collected from the TIMIT database. The analysis proves that FuzzyEn feature shows better results in discriminating noise and corrupted noisy speech. The efficacy of the SVM classifier was validated using 10-fold cross validation. Furthermore, the results obtained by the proposed method was compared with those of previous standardized VAD algorithms as well as recently developed methods. Performance comparison suggests that the proposed method is proven to be more efficient in detecting speech under various noisy environments with an accuracy of 93.29%, and the FuzzyEn feature detects speech efficiently even at low SNR levels.<\/jats:p>","DOI":"10.3390\/e18080298","type":"journal-article","created":{"date-parts":[[2016,8,12]],"date-time":"2016-08-12T10:05:06Z","timestamp":1470996306000},"page":"298","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Voice Activity Detection Using Fuzzy Entropy and Support Vector Machine"],"prefix":"10.3390","volume":"18","author":[{"given":"R.","family":"Johny Elton","sequence":"first","affiliation":[{"name":"Department of Electronics and Communication Engineering, K.L.N. College of Information Technology, Madurai 630612, India"}]},{"given":"P.","family":"Vasuki","sequence":"additional","affiliation":[{"name":"Department of Electronics and Communication Engineering, K.L.N. College of Information Technology, Madurai 630612, India"}]},{"given":"J.","family":"Mohanalin","sequence":"additional","affiliation":[{"name":"Department of Electrical and Electronics Engineering, College of Engineering Pathanapuram, Kerala 689696, India"}]}],"member":"1968","published-online":{"date-parts":[[2016,8,12]]},"reference":[{"key":"ref_1","unstructured":"Zhang, L., Gao, Y.-C., Bian, Z.-Z., and Chen, L. (2005, January 23\u201326). Voice activity detection algorithm improvement in multi-rate speech coding of 3GPP. Proceedings of the 2005 International Conference on Wireless Communications, Networking and Mobile Computing, (WCNM 2005), Wuhan, China."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1016\/S0167-6393(02)00066-3","article-title":"Towards improving speech detection robustness for speech recognition in adverse conditions","volume":"40","author":"Karray","year":"2003","journal-title":"Speech Commun."},{"key":"ref_3","unstructured":"Freeman, D.K., Southcott, C.B., Boyd, I., and Cosier, G. (1989, January 23\u201326). A voice activity detector for pan-European digital cellular mobile telephone service. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Glasgow, Scotland."},{"key":"ref_4","unstructured":"Sangwan, A., Chiranth, M.C., Jamadagni, H.S., Sah, R., Venkatesha Prasad, R., and Gaurav, V. (2002, January 3\u20135). VAD techniques for real-time speech transmission on the Internet. Proceedings of the 5th IEEE International Conference on High Speed Networks and Multimedia Communications, Jeju Island, Korea."},{"key":"ref_5","unstructured":"Itoh, K., and Mizushima, M. (1997, January 21\u201324). Environmental noise reduction based on speech\/non-speech identification for hearing aids. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1109\/TASSP.1979.1163209","article-title":"Suppression of acoustic noise in speech using spectral subtraction","volume":"27","author":"Boll","year":"1979","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1109\/TASSP.1979.1163199","article-title":"Adaptive noise spectral shaping and entropy coding in predictive coding of speech","volume":"27","author":"Makhoul","year":"1979","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"810","DOI":"10.1016\/j.specom.2008.08.005","article-title":"Energy-based VAD with grey magnitude spectral subtraction","volume":"51","author":"Hsieh","year":"2009","journal-title":"Speech Commun."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kotnik, B., Kacic, Z., and Horvat, B. (2001, January 3\u20137). A multiconditional robust front-end feature extraction with a noise reduction procedure based on improved spectral subtraction algorithm. Proceedings of the 7th European Conference on Speech Communication and Technology, Aalborg, Denmark.","DOI":"10.21437\/Eurospeech.2001-72"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Shi, W., Zou, Y., and Liu, Y. (2014, January 9\u201313). Long-term auto-correlation statistics based on voice activity detection for strong noisy speech. Proceedings of the 2014 IEEE China Summit & International Conference on Signal and Information Processing, Xi\u2019an, China.","DOI":"10.1109\/ChinaSIP.2014.6889210"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1594","DOI":"10.1093\/ietfec\/e89-a.6.1594","article-title":"Statistical Model-Based VAD algorithm with wavelet transform","volume":"E89-A","author":"Lee","year":"2006","journal-title":"IEICE Trans. Fundam. Electron. Commun. Comput. Sci."},{"key":"ref_12","unstructured":"Haigh, J.A., and Mason, J.S. (1993, January 19\u201321). Robust voice activity detection using cepstral features. Proceedings of the IEEE Region 10 Conference on Computer, Communication, Control and Power Engineering, Beijing, China."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1016\/j.specom.2003.10.002","article-title":"Efficient voice activity algorithms using long-term speech information","volume":"42","author":"Ramirez","year":"2004","journal-title":"Speech Commun."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Kristjansson, T., Deligne, S., and Olsen, P.A. (2005, January 4\u20138). Voicing features for robust speech detection. Proceedings of the Ninth European Conference on Speech Communication and Technology, Lisbon, Portugal.","DOI":"10.21437\/Interspeech.2005-186"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.specom.2009.08.003","article-title":"Noise robust voice activity detection based on periodic to aperiodic component ratio","volume":"52","author":"Ishizuka","year":"2010","journal-title":"Speech Commun."},{"key":"ref_16","unstructured":"G.729: A Silence Compression Scheme for G.729 Optimized for Terminals Conforming to Recommendation V.70. Available online: https:\/\/www.itu.int\/rec\/T-REC-G.729-199610-S!AnnB\/en."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1109\/89.905996","article-title":"Robust voice activity detection using higher-order statistics in the LPC residual domain","volume":"9","author":"Nemer","year":"2001","journal-title":"IEEE Trans. Speech Audio Process."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1109\/TITB.2012.2228211","article-title":"Support Vector Machine Classification Based on Correlation Prototypes Applied to Bone Age Assessment","volume":"17","author":"Harmsen","year":"2012","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1109\/TPAMI.2008.110","article-title":"Supervised and traditional term weighting methods for automatic text categorization","volume":"31","author":"Lan","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_20","first-page":"112","article-title":"Performance Comparison of SVM and k-NN for Oriya Character Recognition","volume":"1","author":"Mohanty","year":"2011","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_21","unstructured":"Vapnik, V.N. (1998). Statistical Learning Theory, Wiley."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1109\/TNSRE.2007.897025","article-title":"Characterization of Surface EMG signal based on Fuzzy Entropy","volume":"15","author":"Chen","year":"2007","journal-title":"IEEE Trans. Neural Syst. Rehabil. Eng."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"H2039","DOI":"10.1152\/ajpheart.2000.278.6.H2039","article-title":"Physiological time-series analysis using approximate entropy and sample entropy","volume":"278","author":"Richman","year":"2000","journal-title":"Am. J. Physiol. Heart Circ. Physiol."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.medengphy.2008.04.005","article-title":"Measuring complexity using FuzzyEn, ApEn and SampEn","volume":"31","author":"Chen","year":"2009","journal-title":"Med. Eng. Phys."},{"key":"ref_25","unstructured":"Holzinger, A., H\u00f6rtenhuber, M., Mayer, C., Bachler, M., Wassertheurer, S., Pinho, A., and Koslicki, D. (2014). Interactive Knowledge Discovery and Data Mining in Biomedical Informatics, Springer."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Mayer, C., Bachler, M., Hortenhuber, M., Stocker, C., Holzinger, A., and Wassertheurer, S. (2014). Selection of entropy-measure parameters for knowledge discovery in heart rate variability data. BMC Bioinform., 15.","DOI":"10.1186\/1471-2105-15-S6-S2"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"129","DOI":"10.3390\/e18040129","article-title":"The Effect of Threshold Values and Weighting Factors on the Association between Entropy Measures and Mortality after Myocardial Infarction in the Cardiac Arrhythmia Suppression Trial (CAST)","volume":"18","author":"Mayer","year":"2016","journal-title":"Entropy"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"952","DOI":"10.1016\/j.dsp.2012.12.003","article-title":"Objective evaluation of speech dysfluencies using wavelet packet transform with sample entropy","volume":"23","author":"Hariharan","year":"2013","journal-title":"Digit. Signal Process."},{"key":"ref_29","first-page":"249","article-title":"Comapatients expression analysis under different lighting using k-NN and LDA","volume":"1","author":"Muhammad","year":"2010","journal-title":"Int. J. Signal Process. Image Process."},{"key":"ref_30","unstructured":"Duda, R.O., Hart, P.E., and Stork, D.G. (2001). Pattern Classification, Wiley. [2nd ed.]."},{"key":"ref_31","unstructured":"Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., and Dahlgren, N.L. TIMIT Acoustic-Phonetic Continuous Speech Corpus Linguistic Data Consortium. Available online: https:\/\/catalog.ldc.upenn.edu\/LDC93S1."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1109\/97.995824","article-title":"Performance Evaluation and Comparison of G.729\/AMR\/ Fuzzy Voice Activity Detectors","volume":"9","author":"Beritelli","year":"2002","journal-title":"IEEE Signal Process. Let."},{"key":"ref_33","unstructured":"Hirsch, H.-G., and Pearce, D. (2000, January 18\u201320). The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions. Proceedings of the ISCA ITRW ASR2000, Paris, France."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ma, Y., and Nishihara, A. (2013). Efficient voice activity detection algorithm using long-term spectral flatness measure. EURASIP J. Audio Speech Music Process., 2013.","DOI":"10.1186\/1687-4722-2013-21"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TASL.2010.2052803","article-title":"Robust Voice Activity Detection Using Long-Term Signal Variability","volume":"19","author":"Ghosh","year":"2011","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Sohn, J., and Kim, N.S. (1999). A statistical model-based voice activity detection. IEEE Signal Process. Lett., 6.","DOI":"10.1109\/97.736233"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1109\/TSA.2005.855842","article-title":"Statistical Voice Activity Detection Using Low-Variance Spectrum Estimation and an Adaptive Threshold","volume":"14","author":"Davis","year":"2006","journal-title":"IEEE Trans. Audio Speech Lang. Process."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/18\/8\/298\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:28:19Z","timestamp":1760210899000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/18\/8\/298"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,8,12]]},"references-count":37,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2016,8]]}},"alternative-id":["e18080298"],"URL":"https:\/\/doi.org\/10.3390\/e18080298","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,8,12]]}}}