{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:13:08Z","timestamp":1760242388766,"version":"build-2065373602"},"reference-count":27,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2017,6,20]],"date-time":"2017-06-20T00:00:00Z","timestamp":1497916800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In speech separation tasks, many separation methods have the limitation that the microphones are closely spaced, which means that these methods are unprevailing for phase wrap-around. In this paper, we present a novel speech separation scheme by using two microphones that does not have this restriction. The technique utilizes the estimation of interaural time difference (ITD) statistics and binary time-frequency mask for the separation of mixed speech sources. The novelties of the paper consist in: (1) the extended application of delay-and-sum beamforming (DSB) and cosine function for ITD calculation; and (2) the clarification of the connection between ideal binary mask and DSB amplitude ratio. 
Our objective quality evaluation experiments demonstrate the effectiveness of the proposed method.<\/jats:p>","DOI":"10.3390\/s17061447","type":"journal-article","created":{"date-parts":[[2017,6,20]],"date-time":"2017-06-20T10:15:38Z","timestamp":1497953738000},"page":"1447","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Dual-Channel Cosine Function Based ITD Estimation for Robust Speech Separation"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1555-1852","authenticated-orcid":false,"given":"Xuliang","family":"Li","sequence":"first","affiliation":[{"name":"Department of Electronic Engineering\/Graduate School at Shenzhen, Tsinghua University, Beijing 100084, China"}]},{"given":"Zhaogui","family":"Ding","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering\/Graduate School at Shenzhen, Tsinghua University, Beijing 100084, China"}]},{"given":"Weifeng","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering\/Graduate School at Shenzhen, Tsinghua University, Beijing 100084, China"}]},{"given":"Qingmin","family":"Liao","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering\/Graduate School at Shenzhen, Tsinghua University, Beijing 100084, China"}]}],"member":"1968","published-online":{"date-parts":[[2017,6,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Kouchaki, S., and Sanei, S. (2003, January 22\u201325). Supervised single channel source separation of EEG signals. 
Proceedings of the 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Southampton, UK.","DOI":"10.1109\/MLSP.2013.6661895"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1109\/TASL.2010.2072500","article-title":"Single-channel source separation using EMD-subband variable regularized sparse features","volume":"19","author":"Gao","year":"2011","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1881","DOI":"10.1109\/TSP.2015.2477059","article-title":"Online noisy single-channel source separation using adaptive spectrum amplitude estimator and masking","volume":"64","author":"Tengtrairat","year":"2016","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1722","DOI":"10.1109\/TNNLS.2013.2258680","article-title":"Single-channel blind separation using pseudo-stereo mixture and complex 2D histogram","volume":"24","author":"Tengtrairat","year":"2013","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Clark, B., and Flint, J.A. (2016). Acoustical direction finding with time-modulated arrays. Sensors, 16.","DOI":"10.3390\/s16122107"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"13781","DOI":"10.3390\/s121013781","article-title":"Source localization with acoustic sensor arrays using generative model based fitting with sparse constraints","volume":"12","author":"Velasco","year":"2012","journal-title":"Sensors"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1763","DOI":"10.1109\/TSMCB.2004.830345","article-title":"Phase-based dual-microphone robust speech enhancement","volume":"34","author":"Aarabi","year":"2004","journal-title":"IEEE Trans. Syst. Man Cybern. Part B Cybern."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Kim, C., Stern, R.M., Eom, K., and Lee, J. (2010, January 26\u201330). 
Automatic selection of thresholds for signal separation algorithms based on interaural delay. Proceedings of the INTERSPEECH 2010, Chiba, Japan.","DOI":"10.21437\/Interspeech.2010-271"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kim, C., Khawand, C., and Stern, R.M. (2012, January 25\u201330). Two-microphone source separation algorithm based on statistical modeling of angle distributions. Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan.","DOI":"10.1109\/ICASSP.2012.6288950"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1109\/TASL.2012.2215594","article-title":"A dual-microphone algorithm that can cope with competing-talker scenarios","volume":"21","author":"Yousefian","year":"2013","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1109\/TNN.2007.911740","article-title":"Two-microphone separation of speech mixtures","volume":"19","author":"Pedersen","year":"2008","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"3428","DOI":"10.1121\/1.4971036","article-title":"Sound source separation and synthesis for audio enhancement based on spectral amplitudes of two-channel stereo signals","volume":"140","author":"Nishiguchi","year":"2016","journal-title":"J. Acoust. Soc. Am."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Wood, S., and Rouat, J. (2016, January 8\u201312). Blind speech separation with GCC-NMF. 
Proceedings of the INTERSPEECH 2016, San Francisco, CA, USA.","DOI":"10.21437\/Interspeech.2016-1449"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1109\/TASLP.2016.2620600","article-title":"Underdetermined Convolutive Source Separation Using GEM-MU With Variational Approximated Optimum Model Order NMF2D","volume":"25","author":"Woo","year":"2017","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process. (TASLP)"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Brandstein, M., and Ward, D. (2001). Microphone Arrays: Signal Processing Techniques and Applications, Springer.","DOI":"10.1007\/978-3-662-04619-7"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1830","DOI":"10.1109\/TSP.2004.828896","article-title":"Blind separation of speech mixtures via time-frequency masking","volume":"52","author":"Yilmaz","year":"2004","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1486","DOI":"10.1016\/j.specom.2006.09.003","article-title":"Binary and ratio time-frequency masks for robust speech recognition","volume":"48","author":"Srinivasan","year":"2006","journal-title":"Speech Commun."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1016\/0167-6393(90)90010-7","article-title":"Speech database development at MIT: TIMIT and beyond","volume":"9","author":"Zue","year":"1990","journal-title":"Speech Commun."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1109\/TASL.2007.911054","article-title":"Evaluation of objective quality measures for speech enhancement","volume":"16","author":"Hu","year":"2008","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Vincent, E., Sawada, H., Bofill, P., Makino, S., and Rosca, J.P. (2007). First stereo audio source separation evaluation campaign: Data, algorithms and results. 
Independent Component Analysis and Signal Separation, Springer.","DOI":"10.1007\/978-3-540-74494-8_69"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1109\/97.957270","article-title":"Analysis and improvement of a statistical model-based voice activity detector","volume":"8","author":"Cho","year":"2001","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1121\/1.382599","article-title":"Image method for efficiently simulating small-room acoustics","volume":"65","author":"Allen","year":"1979","journal-title":"J. Acoust. Soc. Am."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.acha.2013.01.002","article-title":"Phase aliasing correction for robust blind source separation using DUET","volume":"35","author":"Wang","year":"2013","journal-title":"Appl. Comput. Harmonic Anal."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1109\/TASL.2009.2029711","article-title":"Model-based expectation-maximization source separation and localization","volume":"18","author":"Mandel","year":"2010","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Izumi, Y., Ono, N., and Sagayama, S. (2007, January 21\u201324). Sparseness-based 2ch BSS using the EM algorithm in reverberant environment. Proceedings of the 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, USA.","DOI":"10.1109\/ASPAA.2007.4393015"},{"key":"ref_26","first-page":"414","article-title":"The 2010 Signal Separation Evaluation Campaign (SiSEC2010): Audio Source Separation","volume":"6365","author":"Araki","year":"2010","journal-title":"Lect. Notes Comput. 
Sci."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1265","DOI":"10.1109\/TNN.2006.875991","article-title":"Efficient Variant of Algorithm FastICA for Independent Component Analysis Attaining the Cram\u00e9r\u2013Rao Lower Bound","volume":"17","author":"Koldovsky","year":"2006","journal-title":"IEEE Trans. Neural Netw."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/17\/6\/1447\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T18:39:42Z","timestamp":1760207982000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/17\/6\/1447"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,6,20]]},"references-count":27,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2017,6]]}},"alternative-id":["s17061447"],"URL":"https:\/\/doi.org\/10.3390\/s17061447","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2017,6,20]]}}}