{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T19:31:44Z","timestamp":1777663904042,"version":"3.51.4"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2012,3,17]],"date-time":"2012-03-17T00:00:00Z","timestamp":1331942400000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["EURASIP J. Adv. Signal Process."],"published-print":{"date-parts":[[2012,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Computational Auditory Scene Analysis (CASA) has been the focus in recent literature for speech separation from monaural mixtures. The performance of current CASA systems on voiced speech separation strictly depends on the robustness of the algorithm used for pitch frequency estimation. We propose a new system that estimates pitch (frequency) range of a target utterance and separates voiced portions of target speech. The algorithm, first, estimates the pitch range of target speech in each frame of data in the modulation frequency domain, and then, uses the estimated pitch range for segregating the target speech. The method of pitch range estimation is based on an onset and offset algorithm. Speech separation is performed by filtering the mixture signal with a mask extracted from the modulation spectrogram. A systematic evaluation shows that the proposed system extracts the majority of target speech signal with minimal interference and outperforms previous systems in both pitch extraction and voiced speech separation.<\/jats:p>","DOI":"10.1186\/1687-6180-2012-67","type":"journal-article","created":{"date-parts":[[2012,3,17]],"date-time":"2012-03-17T13:13:59Z","timestamp":1331990039000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Single channel speech separation in modulation frequency domain based on a novel pitch range estimation method"],"prefix":"10.1186","volume":"2012","author":[{"given":"Azar","family":"Mahmoodzadeh","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hamid Reza","family":"Abutalebi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hamid","family":"Soltanian-Zadeh","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hamid","family":"Sheikhzadeh","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2012,3,17]]},"reference":[{"key":"202_CR1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/S0167-6393(97)00021-6","volume":"22","author":"RP Lippmann","year":"1997","unstructured":"Lippmann RP: Speech recognition by machines and humans. Speech Commun 1997, 22: 1-16. 10.1016\/S0167-6393(97)00021-6","journal-title":"Speech Commun"},{"key":"202_CR2","doi-asserted-by":"publisher","first-page":"410","DOI":"10.1016\/j.specom.2004.11.009","volume":"45","author":"JJ Sroka","year":"2005","unstructured":"Sroka JJ, Braida LD: Human and machine consonant recognition. Speech Commun 2005, 45: 410-423.","journal-title":"Speech Commun"},{"key":"202_CR3","first-page":"45","volume-title":"Computational Auditory Scene Analysis: Principles, Algorithms, and Applications","author":"A de Cheveigne","year":"2006","unstructured":"de Cheveigne A: Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. Edited by: Wang DL, Brown GJ. Wiley & IEEE, Hoboken, NJ; 2006:45-79."},{"key":"202_CR4","doi-asserted-by":"publisher","first-page":"38412, 11","DOI":"10.1155\/ASP\/2006\/38412","volume":"2006","author":"S Dubnov","year":"2006","unstructured":"Dubnov S, Tabrikian J, Arnon-Targan M: Speech source separation in convolutive environments using space-time-frequency analysis. EURASIP J Appl Signal Process 2006, 2006: 38412, 11.","journal-title":"EURASIP J Appl Signal Process"},{"key":"202_CR5","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/1486.001.0001","volume-title":"Auditory Scene Analysis","author":"AS Bregman","year":"1990","unstructured":"Bregman AS: Auditory Scene Analysis. MIT, Cambridge, MA; 1990."},{"key":"202_CR6","volume-title":"Computational Auditory Scene Analysis: Principles, Algorithms, and Applications","year":"2006","unstructured":"Wang DL, Brown GJ (Eds): Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. Wiley & IEEE, Hoboken, NJ; 2006."},{"key":"202_CR7","doi-asserted-by":"publisher","first-page":"2991","DOI":"10.1155\/ASP.2005.2991","volume":"18","author":"M Buchler","year":"2005","unstructured":"Buchler M, Allegro S, Launer S, Dillier N: Sound classification in hearing aids inspired by auditory scene analysis. EURASIP J Appl Signal Process 2005, 18: 2991-3002.","journal-title":"EURASIP J Appl Signal Process"},{"issue":"8","key":"202_CR8","first-page":"2067","volume":"18","author":"G Hu","year":"2007","unstructured":"Hu G, Wang D: A Tandem algorithm for pitch estimation and voiced speech segregation. IEEE Trans Audio Speech Lang Process 2007, 18(8):2067-2079.","journal-title":"IEEE Trans Audio Speech Lang Process"},{"key":"202_CR9","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1016\/j.csl.2008.03.004","volume":"24","author":"Y Shao","year":"2010","unstructured":"Shao Y, Srinivasan S, Jin Z, Wang D: A computational auditory scene analysis system for speech segregation and robust speech recognition. Comput Speech Lang 2010, 24: 77-93. 10.1016\/j.csl.2008.03.004","journal-title":"Comput Speech Lang"},{"key":"202_CR10","doi-asserted-by":"publisher","first-page":"84186, 15","DOI":"10.1155\/2007\/84186","volume":"2007","author":"MH Radfar","year":"2007","unstructured":"Radfar MH, Dansereau RM, Sayadiyan A: A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation. EURASIP J Audio Speech Music Process 2007, 2007: 84186, 15.","journal-title":"EURASIP J Audio Speech Music Process"},{"key":"202_CR11","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1016\/j.specom.2004.05.002","volume":"45","author":"J Barker","year":"2005","unstructured":"Barker J, Cooke M, Ellis D: Decoding speech in the presence of other sources. Speech Commun 2005, 45: 5-25. 10.1016\/j.specom.2004.05.002","journal-title":"Speech Commun"},{"key":"202_CR12","first-page":"289","volume":"14","author":"Y Shao","year":"2005","unstructured":"Shao Y, Wang DL: Model-based sequential organization in cochannel speech. IEEE Trans Acoust Speech Signal Process 2005, 14: 289-298.","journal-title":"IEEE Trans Acoust Speech Signal Process"},{"key":"202_CR13","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1006\/csla.1994.1016","volume":"8","author":"GJ Brown","year":"1994","unstructured":"Brown GJ, Cooke M: Computational auditory scene analysis. Comput Speech Lang 1994, 8: 297-336. 10.1006\/csla.1994.1016","journal-title":"Comput Speech Lang"},{"key":"202_CR14","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1109\/TNN.2004.832812","volume":"15","author":"G Hu","year":"2004","unstructured":"Hu G, Wang DL: Monaural speech separation based on pitch tracking and amplitude modulation. IEEE Trans Neural Net 2004, 15: 1135-1150. 10.1109\/TNN.2004.832812","journal-title":"IEEE Trans Neural Net"},{"key":"202_CR15","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1109\/TSA.2003.811539","volume":"11","author":"M Wu","year":"2003","unstructured":"Wu M, Wang DL, Brown GJ: A multipitch tracking algorithm for noisy speech. IEEE Trans Speech Audio Process 2003, 11: 229-241. 10.1109\/TSA.2003.811539","journal-title":"IEEE Trans Speech Audio Process"},{"key":"202_CR16","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1109\/TASL.2007.894510","volume":"15","author":"J Le Roux","year":"2007","unstructured":"Le Roux J, Kameoka H, Ono N, de Cheveigne A, Sagayama S: Single and multiple F0 contour estimation through parametric spectrogram modeling of speech in noisy environments. IEEE Trans Audio Speech Lang Process 2007, 15: 1135-1145.","journal-title":"IEEE Trans Audio Speech Lang Process"},{"key":"202_CR17","first-page":"605","volume":"4","author":"SM Schimmel","year":"2007","unstructured":"Schimmel SM, Atlas LE, Nie K: Feasibility of single channel speaker separation based on modulation frequency analysis. Proc IEEE International Conference on Acoustics, Speech and Signal Processing, Hawaii, USA 2007, 4: 605-608.","journal-title":"Proc IEEE International Conference on Acoustics, Speech and Signal Processing, Hawaii, USA"},{"key":"202_CR18","unstructured":"Schimmel SM Dissertation, University of Washington; 2007."},{"issue":"7","key":"202_CR19","doi-asserted-by":"publisher","first-page":"668","DOI":"10.1155\/S1110865703305013","volume":"2003","author":"L Atlas","year":"2003","unstructured":"Atlas L, Shamma SA: Joint acoustic and modulation frequency. EURASIP J Appl Signal Process 2003, 2003(7):668-675. 10.1155\/S1110865703305013","journal-title":"EURASIP J Appl Signal Process"},{"issue":"2","key":"202_CR20","doi-asserted-by":"publisher","first-page":"396","DOI":"10.1109\/TASL.2006.881700","volume":"15","author":"G Hu","year":"2007","unstructured":"Hu G, Wang DL: Auditory segmentation based on onset and offset analysis. IEEE Trans Audio Speech Lang Process 2007, 15(2):396-405.","journal-title":"IEEE Trans Audio Speech Lang Process"},{"key":"202_CR21","doi-asserted-by":"publisher","first-page":"1053","DOI":"10.1121\/1.408467","volume":"95","author":"R Drullman","year":"1994","unstructured":"Drullman R, Festen JM, Plomp R: Effect of temporal envelope smearing on speech reception. J Acoust Soc Am 1994, 95: 1053-1064. 10.1121\/1.408467","journal-title":"J Acoust Soc Am"},{"key":"202_CR22","first-page":"221","volume-title":"Proc IEEE International Conference on Acoustics, Speech and Signal Processing, Pennsylvania, USA","author":"SM Schimmel","year":"2005","unstructured":"Schimmel SM, Atlas LE: Coherent envelope detection for modulation filtering of speech. Proc IEEE International Conference on Acoustics, Speech and Signal Processing, Pennsylvania, USA 2005, 221-224."},{"key":"202_CR23","volume-title":"Blind source separation: audio examples","author":"TW Lee","year":"1998","unstructured":"Lee TW: Blind source separation: audio examples. 1998. . Accessed 4 May 2011 http:\/\/www.snl.salk.edu\/~tewon\/Blind\/blind_audio.html"},{"key":"202_CR24","volume-title":"Modeling Auditory Processing and Organization","author":"MP Cooke","year":"1993","unstructured":"Cooke MP: Modeling Auditory Processing and Organization. Cambridge University Press, Cambridge; 1993."},{"key":"202_CR25","unstructured":"Drake LA Dissertation, University of Northwestern; 2001."},{"key":"202_CR26","doi-asserted-by":"publisher","first-page":"684","DOI":"10.1109\/72.761727","volume":"10","author":"DL Wang","year":"1999","unstructured":"Wang DL, Brown GJ: Separation of speech from interfering sounds based on oscillatory correlation. IEEE Trans Neural Netw 1999, 10: 684-697. 10.1109\/72.761727","journal-title":"IEEE Trans Neural Netw"},{"key":"202_CR27","first-page":"41","volume":"2","author":"Q Li","year":"2003","unstructured":"Li Q, Atlas L: Time-variant least-squares harmonic modeling. Proc IEEE International Conference on Acoustics, Speech and Signal Processing, Hong Kong 2003, 2: 41-44.","journal-title":"Proc IEEE International Conference on Acoustics, Speech and Signal Processing, Hong Kong"},{"key":"202_CR28","first-page":"495","volume-title":"Speech Coding and Synthesis","author":"D Talkin","year":"1995","unstructured":"Talkin D: A robust algorithm for pitch tracking (RAPT). In Speech Coding and Synthesis. Edited by: Klein WB, Paliwal KK. Elsevier, NewYork, NY; 1995:495-518."},{"key":"202_CR29","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1109\/TSA.2003.819950","volume":"12","author":"J Tabrikian","year":"2004","unstructured":"Tabrikian J, Dubnov S, Dickalov Y: Maximum a posterior probability pitch tracking in noisy environments using harmonic model. IEEE Trans Speech Audio Process 2004, 12: 76-87. 10.1109\/TSA.2003.819950","journal-title":"IEEE Trans Speech Audio Process"},{"key":"202_CR30","volume-title":"Spoken Language Processing: A Guide to Theory, Algorithms, and System Development","author":"X Huang","year":"2001","unstructured":"Huang X, Acero A, Hon HW: Spoken Language Processing: A Guide to Theory, Algorithms, and System Development. Prentice Hall PTR, Upper Saddle River, NJ; 2001."},{"key":"202_CR31","unstructured":"Shao Y Dissertation, University of Ohio State; 2007."}],"container-title":["EURASIP Journal on Advances in Signal Processing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/1687-6180-2012-67.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/1687-6180-2012-67\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1687-6180-2012-67.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T19:06:21Z","timestamp":1630523181000},"score":1,"resource":{"primary":{"URL":"https:\/\/asp-eurasipjournals.springeropen.com\/articles\/10.1186\/1687-6180-2012-67"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,3,17]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,12]]}},"alternative-id":["202"],"URL":"https:\/\/doi.org\/10.1186\/1687-6180-2012-67","relation":{},"ISSN":["1687-6180"],"issn-type":[{"value":"1687-6180","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,3,17]]},"assertion":[{"value":"7 May 2011","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2012","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2012","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"67"}}