{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T21:20:40Z","timestamp":1776115240820,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2020,9,5]],"date-time":"2020-09-05T00:00:00Z","timestamp":1599264000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61801066"],"award-info":[{"award-number":["61801066"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The quality and intelligibility of the speech are usually impaired by the interference of background noise when using internet voice calls. To solve this problem in the context of wearable smart devices, this paper introduces a dual-microphone, bone-conduction (BC) sensor assisted beamformer and a simple recurrent unit (SRU)-based neural network postfilter for real-time speech enhancement. Assisted by the BC sensor, which is insensitive to the environmental noise compared to the regular air-conduction (AC) microphone, the accurate voice activity detection (VAD) can be obtained from the BC signal and incorporated into the adaptive noise canceller (ANC) and adaptive block matrix (ABM). The SRU-based postfilter consists of a recurrent neural network with a small number of parameters, which improves the computational efficiency. The sub-band signal processing is designed to compress the input features of the neural network, and the scale-invariant signal-to-distortion ratio (SI-SDR) is developed as the loss function to minimize the distortion of the desired speech signal. 
Experimental results demonstrate that the proposed real-time speech enhancement system provides significant speech sound quality and intelligibility improvements for all noise types and levels when compared with the AC-only beamformer with a postfiltering algorithm.<\/jats:p>","DOI":"10.3390\/s20185050","type":"journal-article","created":{"date-parts":[[2020,9,6]],"date-time":"2020-09-06T23:12:49Z","timestamp":1599433969000},"page":"5050","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["A Real-Time Dual-Microphone Speech Enhancement Algorithm Assisted by Bone Conduction Sensor"],"prefix":"10.3390","volume":"20","author":[{"given":"Yi","family":"Zhou","sequence":"first","affiliation":[{"name":"School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China"}]},{"given":"Yufan","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China"}]},{"given":"Yongbao","family":"Ma","sequence":"additional","affiliation":[{"name":"Suresense Technology, Chongqing 400065, China"}]},{"given":"Hongqing","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China"}]}],"member":"1968","published-online":{"date-parts":[[2020,9,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Loizou, P.C. (2013). 
Speech Enhancement: Theory and Practice, CRC Press.","DOI":"10.1201\/b14529"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1016\/j.proeng.2013.09.103","article-title":"An improved multi-band spectral subtraction algorithm for enhancing speech in various noise environments","volume":"64","author":"Upadhyay","year":"2013","journal-title":"Procedia Eng."},{"key":"ref_3","first-page":"1182","article-title":"Spectral subtraction based on minimum statistics","volume":"6","author":"Martin","year":"1994","journal-title":"Power"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"167","DOI":"10.2528\/PIERM08061206","article-title":"Speech enhancement using an adaptive wiener filtering approach","volume":"4","author":"Dessouky","year":"2008","journal-title":"Prog. Electromagn. Res."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1109\/97.1001645","article-title":"Optimal speech enhancement under signal presence uncertainty using log-spectral amplitude estimator","volume":"9","author":"Cohen","year":"2002","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1702","DOI":"10.1109\/TASLP.2018.2842159","article-title":"Supervised speech separation based on deep learning: An overview","volume":"26","author":"Wang","year":"2018","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Valin, J.M. (2018, January 29\u201331). A hybrid DSP\/deep learning approach to real-time full-band speech enhancement. Proceedings of the 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP), Vancouver, BC, Canada.","DOI":"10.1109\/MMSP.2018.8547084"},{"key":"ref_8","unstructured":"Benesty, J., Chen, J., and Huang, Y. (2008). 
Microphone Array Signal Processing, Springer Science & Business Media."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1109\/TASL.2009.2024731","article-title":"New insights into the MVDR beamformer in room acoustics","volume":"18","author":"Habets","year":"2009","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1109\/PROC.1972.8817","article-title":"An algorithm for linearly constrained adaptive array processing","volume":"60","author":"Frost","year":"1972","journal-title":"Proc. IEEE"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1109\/TAP.1982.1142739","article-title":"An alternative approach to linearly constrained adaptive beamforming","volume":"30","author":"Griffiths","year":"1982","journal-title":"IEEE Trans. Antennas Propag."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Bitzer, J., Simmer, K.U., and Kammeyer, K.D. (1999, January 15\u201319). Theoretical noise reduction limits of the generalized sidelobe canceller (GSC) for speech enhancement. Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP99 (Cat. No. 99CH36258), Phoenix, AZ, USA.","DOI":"10.1109\/ICASSP.1999.761385"},{"key":"ref_13","first-page":"2735","article-title":"Redundant rule Detection for Software-Defined Networking","volume":"14","author":"Su","year":"2020","journal-title":"KSII Trans. Internet Inf. Syst."},{"key":"ref_14","first-page":"2294","article-title":"Idle Slots Skipped Mechanism based Tag Identification Algorithm with Enhanced Collision Detection","volume":"14","author":"Su","year":"2020","journal-title":"KSII Trans. Internet Inf. Syst. 
TIIS"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"2381","DOI":"10.1109\/TCOMM.2020.2968438","article-title":"From M-Ary query to bit query: A new strategy for efficient large-scale RFID identification","volume":"68","author":"Su","year":"2020","journal-title":"IEEE Trans. Commun."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"998","DOI":"10.1109\/TCOMM.2019.2952126","article-title":"A group-based binary splitting algorithm for UHF RFID anti-collision systems","volume":"68","author":"Su","year":"2019","journal-title":"IEEE Trans. Commun."},{"key":"ref_17","unstructured":"Shin, H.S., Kang, H.G., and Fingscheidt, T. (2012, January 26\u201328). Survey of speech enhancement supported by a bone conduction microphone. Proceedings of the 10th ITG Symposium on Speech Communication, Braunschweig, Germany."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1016\/j.specom.2018.06.002","article-title":"Bone-conducted speech enhancement using deep denoising autoencoder","volume":"104","author":"Liu","year":"2018","journal-title":"Speech Commun."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1321","DOI":"10.1121\/1.4976051","article-title":"In-ear microphone speech quality enhancement via adaptive filtering and artificial bandwidth extension","volume":"141","author":"Bouserhal","year":"2017","journal-title":"J. Acoust. Soc. Am."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2015","DOI":"10.1109\/TASLP.2015.2446202","article-title":"A priori SNR estimation using air-and bone-conduction microphones","volume":"23","author":"Shin","year":"2015","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lee, C.H., Rao, B.D., and Garudadri, H. (2018, January 2\u20136). Bone-Conduction Sensor Assisted Noise Estimation for Improved Speech Enhancement. 
Proceedings of the INTERSPEECH, Hyderabad, India.","DOI":"10.21437\/Interspeech.2018-1046"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"McCowan, I.A., and Bourlard, H. (2002, January 13\u201317). Microphone array post-filter for diffuse noise field. Proceedings of the 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, USA.","DOI":"10.1109\/ICASSP.2002.1005887"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Soni, M.H., Shah, N., and Patil, H.A. (2018, January 15\u201320). Time-frequency masking-based speech enhancement using generative adversarial network. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada.","DOI":"10.1109\/ICASSP.2018.8462068"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zezario, R.E., Huang, J.W., Lu, X., Tsao, Y., Hwang, H.T., and Wang, H.M. (2018, January 12\u201315). Deep denoising autoencoder based post filtering for speech enhancement. Proceedings of the 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Honolulu, HI, USA.","DOI":"10.23919\/APSIPA.2018.8659598"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"546","DOI":"10.1016\/j.apacoust.2007.01.005","article-title":"A hybrid microphone array post-filter in a diffuse noise field","volume":"69","author":"Li","year":"2008","journal-title":"Appl. Acoust."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Kumatani, K., Raj, B., Singh, R., and McDonough, J. (2012, January 9\u201313). Microphone array post-filter based on spatially-correlated noise measurements for distant speech recognition. Proceedings of the Thirteenth Annual Conference of the International Speech Communication Association, Portland, OR, USA.","DOI":"10.21437\/Interspeech.2012-107"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Cohen, I., and Berdugo, B. 
(2002, January 13\u201317). Microphone array post-filtering for non-stationary noise suppression. Proceedings of the 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, USA.","DOI":"10.1109\/ICASSP.2002.1005886"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"4089","DOI":"10.1109\/TVT.2019.2896482","article-title":"An information-theoretic view of WLAN localization error bound in GPS-denied environment","volume":"68","author":"Zhou","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"6125","DOI":"10.1109\/JIOT.2018.2869671","article-title":"Calibrated data simplification for energy-efficient location sensing in internet of things","volume":"6","author":"Zhou","year":"2018","journal-title":"IEEE Internet Things J."},{"key":"ref_30","unstructured":"Lleida, E., Fernandez, J., and Masgrau, E. (1998, January 15). Robust continuous speech recognition system based on a microphone array. Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP\u201998 (Cat. No. 98CH36181), Seattle, WA, USA."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1109\/TASSP.1985.1164550","article-title":"Speech enhancement using a minimum mean-square error log-spectral amplitude estimator","volume":"33","author":"Ephraim","year":"1985","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1383","DOI":"10.1109\/TASL.2011.2180896","article-title":"Unbiased MMSE-based noise power estimation with low complexity and low tracking delay","volume":"20","author":"Gerkmann","year":"2011","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yoon, B.J., Tashev, I., and Acero, A. (2007, January 15\u201320). 
Robust adaptive beamforming algorithm using instantaneous direction of arrival with enhanced noise suppression capability. Proceedings of the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP\u201907, Honolulu, HI, USA.","DOI":"10.1109\/ICASSP.2007.366634"},{"key":"ref_34","unstructured":"Nelke, C.M., and Vary, P. (2014, January 24\u201326). Dual microphone Wind Noise Reduction by Exploiting the Complex Coherence. Proceedings of the 11th ITG Symposium on Speech Communication, Erlangen, Germany."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"M\u00fcller, G., and M\u00f6ser, M. (2012). Handbook of Engineering Acoustics, Springer Science & Business Media.","DOI":"10.1007\/978-3-540-69460-1"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1109\/TASLP.2014.2359159","article-title":"A feature study for classification-based speech separation at low signal-to-noise ratios","volume":"22","author":"Chen","year":"2014","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Doblinger, G. (1995, January 18\u201321). Computationally efficient speech enhancement by spectral minima tracking in subbands. Proceedings of the Fourth European Conference on Speech Communication and Technology, Madrid, Spain.","DOI":"10.21437\/Eurospeech.1995-370"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1109\/TASSP.1980.1163420","article-title":"Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences","volume":"28","author":"Davis","year":"1980","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"2222","DOI":"10.1109\/TNNLS.2016.2582924","article-title":"LSTM: A search space odyssey","volume":"28","author":"Greff","year":"2016","journal-title":"IEEE Trans. Neural Netw. Learn. 
Syst."},{"key":"ref_40","unstructured":"Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv."},{"key":"ref_41","unstructured":"Lei, T., Zhang, Y., and Artzi, Y. (2017). Training RNNs as Fast as CNNs. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Le Roux, J., Wisdom, S., Erdogan, H., and Hershey, J.R. (2019, January 12\u201317). SDR\u2013half-baked or well done?. Proceedings of the ICASSP 2019\u20132019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK.","DOI":"10.1109\/ICASSP.2019.8683855"},{"key":"ref_43","unstructured":"Rix, A.W., Beerends, J.G., Hollier, M.P., and Hekstra, A.P. (2001, January 7\u201311). Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, (Cat. No. 01CH37221), Salt Lake City, UT, USA."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"2125","DOI":"10.1109\/TASL.2011.2114881","article-title":"An algorithm for intelligibility prediction of time\u2013frequency weighted noisy speech","volume":"19","author":"Taal","year":"2011","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Reddy, C.K., Beyrami, E., Pool, J., Cutler, R., Srinivasan, S., and Gehrke, J. (2019). A scalable noisy speech dataset and online subjective test framework. arXiv.","DOI":"10.21437\/Interspeech.2019-3087"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1109\/TSA.2004.834599","article-title":"Speech enhancement based on the general transfer function GSC and postfiltering","volume":"12","author":"Gannot","year":"2004","journal-title":"IEEE Trans. 
Speech Audio Process."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Reddy, C.K., Beyrami, E., Dubey, H., Gopal, V., Cheng, R., Cutler, R., Matusevych, S., Aichner, R., Aazami, A., and Braun, S. (2020). The interspeech 2020 deep noise suppression challenge: Datasets, subjective speech quality and testing framework. arXiv.","DOI":"10.21437\/Interspeech.2020-3038"},{"key":"ref_48","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/18\/5050\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:07:10Z","timestamp":1760177230000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/18\/5050"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,5]]},"references-count":48,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["s20185050"],"URL":"https:\/\/doi.org\/10.3390\/s20185050","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,5]]}}}