{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:40:24Z","timestamp":1760132424798},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2017,1,3]],"date-time":"2017-01-03T00:00:00Z","timestamp":1483401600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Circuits Syst Signal Process"],"published-print":{"date-parts":[[2017,9]]},"DOI":"10.1007\/s00034-016-0476-3","type":"journal-article","created":{"date-parts":[[2017,1,3]],"date-time":"2017-01-03T11:06:58Z","timestamp":1483441618000},"page":"3650-3673","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Parameterization of Excitation Signal for Improving the Quality of HMM-Based Speech Synthesis System"],"prefix":"10.1007","volume":"36","author":[{"given":"N. P.","family":"Narendra","sequence":"first","affiliation":[]},{"given":"K. Sreenivasa","family":"Rao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2017,1,3]]},"reference":[{"key":"476_CR1","unstructured":"N. Adiga, S.R.M. Prasanna, Significance of instants of significant excitation for source modeling, in Proceedings of Interspeech (2013), pp.\u00a01677\u20131681"},{"issue":"2\u20133","key":"476_CR2","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/0167-6393(92)90005-R","volume":"11","author":"P Alku","year":"1992","unstructured":"P. Alku, Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering. Speech Commun. 11(2\u20133), 109\u2013118 (1992)","journal-title":"Speech Commun."},{"key":"476_CR3","doi-asserted-by":"crossref","unstructured":"J.P. Cabral, S. Renals, J. Yamagishi, K. 
Richmond, HMM-based speech synthesiser using the LF-model of the glottal source, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2011), pp.\u00a04704\u20134707","DOI":"10.1109\/ICASSP.2011.5947405"},{"key":"476_CR4","unstructured":"J.P. Cabral, Uniform concatenative excitation model for synthesising speech without voiced\/unvoiced classification, in Proceedings of Interspeech (2013) pp.\u00a01082\u20131086"},{"key":"476_CR5","unstructured":"CMU ARCTIC speech synthesis databases (online). http:\/\/festvox.org\/cmu_arctic\/"},{"key":"476_CR6","unstructured":"T.G. Csap\u00f3, G. N\u00e9meth, A novel irregular voice model for HMM-based speech synthesis. in Proceedings of ISCA Speech Synthesis Workshop (2013), pp.\u00a0229\u2013234"},{"issue":"2","key":"476_CR7","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1109\/JSTSP.2013.2292037","volume":"8","author":"TG Csap\u00f3","year":"2014","unstructured":"T.G. Csap\u00f3, G. N\u00e9meth, Modeling irregular voice in statistical parametric speech synthesis with residual codebook based excitation. IEEE J. Sel. Top. Signal Process. 8(2), 209\u2013220 (2014)","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"476_CR8","unstructured":"T. Drugman, A. Moinet, T. Dutoit, G. Wilfart, Using a pitch-synchrounous residual codebook for hybrid HMM\/frame selection speech synthesis, in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2009), pp.\u00a03793\u20133796"},{"key":"476_CR9","unstructured":"T. Drugman, G. Wilfart, T. Dutoit, A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis, in Proceeding of Interspeech (2009), pp.\u00a01779\u20131782"},{"key":"476_CR10","unstructured":"T. Drugman, G. Wilfart, T. 
Dutoit, Eigenresiduals for improved parametric speech synthesis, in Proceedings of European Signal Processing Conference (EUSIPCO) (2009), pp.\u00a02177\u20132180"},{"issue":"3","key":"476_CR11","doi-asserted-by":"crossref","first-page":"968","DOI":"10.1109\/TASL.2011.2169787","volume":"20","author":"T Drugman","year":"2012","unstructured":"T. Drugman, T. Dutoit, The deterministic plus stochastic model of the residual signal and its applications. IEEE Trans. Audio Speech Lang. Process. 20(3), 968\u2013981 (2012)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"476_CR12","doi-asserted-by":"crossref","unstructured":"T. Drugman, T. Raitio, Excitation modeling for HMM-based speech synthesis: breaking down the impact of periodic and aperiodic components, in Proceedings of International Conference on Audio, Speech and Signal Processing (ICASSP) (2014), pp.\u00a0260\u2013264","DOI":"10.1109\/ICASSP.2014.6853598"},{"key":"476_CR13","unstructured":"HMM-based speech synthesis system (HTS) (online). http:\/\/hts.sp.nitech.ac.jp\/"},{"key":"476_CR14","volume-title":"Spoken Language Processing: A Guide to Theory, Algorithm and System Development","author":"X Huang","year":"2001","unstructured":"X. Huang, A. Acero, H.W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development (Prentice Hall, Upper Saddle River, 2001)"},{"key":"476_CR15","unstructured":"ITU-T Draft Recommendation P.862, Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs (2000)"},{"key":"476_CR16","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/S0167-6393(98)00085-5","volume":"27","author":"H Kawahara","year":"1998","unstructured":"H. Kawahara, I. Masuda-Katsuse, A. 
de Cheveigne, Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds. Speech Commun. 27, 187\u2013207 (1998)","journal-title":"Speech Commun."},{"key":"476_CR17","doi-asserted-by":"crossref","unstructured":"H. Kawahara, M. Morise, T. Takahashi, R. Nisimura, T. Irino, H. Banno, Tandem-STRAIGHT: a temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation, in Proceeding of International Conference on Audio, Speech and Signal Processing (ICASSP) (2008), pp.\u00a03933\u20133936","DOI":"10.1109\/ICASSP.2008.4518514"},{"key":"476_CR18","doi-asserted-by":"crossref","first-page":"1384","DOI":"10.1109\/TCE.2006.273160","volume":"52","author":"S Kim","year":"2006","unstructured":"S. Kim, J. Kim, M. Hahn, HMM-based Korean speech synthesis system for hand-held devices. IEEE Trans. Consum. Electron. 52, 1384\u20131390 (2006)","journal-title":"IEEE Trans. Consum. Electron."},{"key":"476_CR19","doi-asserted-by":"crossref","DOI":"10.1201\/9781420015836","volume-title":"Speech Enhancement: Theory and Practice","author":"P Loizou","year":"2007","unstructured":"P. Loizou, Speech Enhancement: Theory and Practice (CRC Press, Boca Raton, 2007)"},{"key":"476_CR20","unstructured":"S.L. Maguer, N. Barbot, O. Boeffard, Evaluation of contextual descriptors for HMM-based speech synthesis in French, in Proceedings of ISCA Speech Synthesis Workshop (2013), pp.\u00a0153\u2013158"},{"key":"476_CR21","unstructured":"R. Maia, T. Toda, H. Zen, Y. Nankaku, K. 
Tokuda, An excitation model for HMM-based speech synthesis based on residual modeling, in Proceeding of International Speech Communication Association Speech Synthesis Workshop 6 (ISCA SW6) (2007), pp.\u00a0131\u2013136"},{"key":"476_CR22","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-66286-7","volume-title":"Linear Prediction of Speech","author":"JD Markel","year":"1976","unstructured":"J.D. Markel, A.H. Gray, Linear Prediction of Speech (Springer, Berlin, 1976)"},{"key":"476_CR23","doi-asserted-by":"crossref","unstructured":"A. McCree, K. Truong, E. George, T. Barnwell, V. Viswanathan, A 2.4 kbit\/s MELP coder candidate for the new U.S. Federal Standard, in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP) (1996), pp.\u00a0200\u2013203","DOI":"10.1109\/ICASSP.1996.540325"},{"key":"476_CR24","doi-asserted-by":"crossref","unstructured":"A. McCree, A 14 kb\/s wideband speech coder with a parametric highband model, in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2000), pp.\u00a01153\u20131156","DOI":"10.1109\/ICASSP.2000.859169"},{"issue":"8","key":"476_CR25","doi-asserted-by":"crossref","first-page":"1602","DOI":"10.1109\/TASL.2008.2004526","volume":"16","author":"KSR Murty","year":"2008","unstructured":"K.S.R. Murty, B. Yegnanarayana, Epoch extraction from speech signals. IEEE Trans. Audio Speech Lang. Process. 16(8), 1602\u20131613 (2008)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"issue":"3","key":"476_CR26","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1007\/s10772-011-9094-4","volume":"14","author":"NP Narendra","year":"2011","unstructured":"N.P. Narendra, K.S. Rao, K. Ghosh, R.R. Vempada, S. Maity, Development of syllable-based text to speech synthesis system in Bengali. Int. J. Speech Technol. 14(3), 167\u2013181 (2011)","journal-title":"Int. J. 
Speech Technol."},{"key":"476_CR27","doi-asserted-by":"crossref","unstructured":"N.P. Narendra, K.S. Rao, K. Ghosh, V.R. Reddy, S. Maity, Development of Bengali screen reader using Festival speech synthesizer, in Proceedings of IEEE India Conference (INDICON) (2011), pp.\u00a01\u20134","DOI":"10.1109\/INDCON.2011.6139376"},{"issue":"8","key":"476_CR28","doi-asserted-by":"crossref","first-page":"2597","DOI":"10.1007\/s00034-015-9977-8","volume":"34","author":"NP Narendra","year":"2015","unstructured":"N.P. Narendra, K.S. Rao, Robust voicing detection and F0 estimation for HMM-based speech synthesis. Circuits Syst. Signal Process. 34(8), 2597\u20132619 (2015)","journal-title":"Circuits Syst. Signal Process."},{"key":"476_CR29","doi-asserted-by":"crossref","unstructured":"N.P. Narendra, K.S. Rao, A deterministic plus noise model of excitation signal using principal component analysis for parametric speech synthesis, in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016), pp.\u00a05635\u20135639","DOI":"10.1109\/ICASSP.2016.7472756"},{"key":"476_CR30","unstructured":"J.J. Odell, The Use of Context in Large Vocabulary Speech Recognition. Ph.D. thesis, Cambridge University, Cambridge (1995)"},{"key":"476_CR31","volume-title":"Speech Coding and Synthesis","author":"K Paliwal","year":"1995","unstructured":"K. Paliwal, W. Kleijn, Quantization of LPC parameters, in Speech Coding and Synthesis, ed. by W. Kleijn, E.K. Paliwal (Elsevier, Amsterdam, 1995)"},{"key":"476_CR32","doi-asserted-by":"crossref","unstructured":"Y. Pantazis, Y. 
Stylianou, Improving the modeling of the noise part in the harmonic plus noise model of speech, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.\u00a04609\u20134612 (2008)","DOI":"10.1109\/ICASSP.2008.4518683"},{"key":"476_CR33","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1016\/j.neucom.2012.10.040","volume":"132","author":"B Picart","year":"2014","unstructured":"B. Picart, T. Drugman, T. Dutoit, HMM-based speech synthesis with various degrees of articulation: a perceptual study. J. Neurocomput. 132, 142\u2013147 (2014)","journal-title":"J. Neurocomput."},{"key":"476_CR34","volume-title":"Discrete-Time Speech Signal Processing: Principles and Practice","author":"TF Quatieri","year":"2002","unstructured":"T.F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice (Prentice Hall, Upper Saddle River, 2002)"},{"key":"476_CR35","doi-asserted-by":"crossref","unstructured":"E.V. Raghavendra, K. Prahallad, A multilingual screen reader in Indian languages, in Proceedings of National Conference on Communications (NCC) (2010), pp.\u00a01\u20135","DOI":"10.1109\/NCC.2010.5430191"},{"issue":"1","key":"476_CR36","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1109\/TASL.2010.2045239","volume":"19","author":"T Raitio","year":"2011","unstructured":"T. Raitio, A. Suni, J. Yamagishi, H. Pulakka, J. Nurminen, M. Vainio, P. Alku, HMM-based speech synthesis utilizing glottal inverse filtering. IEEE Trans. Audio Speech Lang. Process. 19(1), 153\u2013165 (2011)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"476_CR37","doi-asserted-by":"crossref","unstructured":"T. Raitio, A. Suni, H. Pulakka, M. Vainio, P. 
Alku, Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis, in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2011), pp.\u00a04564\u20134567","DOI":"10.1109\/ICASSP.2011.5947370"},{"issue":"2","key":"476_CR38","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1250\/ast.21.79","volume":"21","author":"K Shinoda","year":"2000","unstructured":"K. Shinoda, T. Watanabe, MDL-based context-dependent subword modeling for speech recognition. J. Acoust. Soc. Jpn. (E) 21(2), 79\u201386 (2000)","journal-title":"J. Acoust. Soc. Jpn. (E)"},{"key":"476_CR39","doi-asserted-by":"crossref","unstructured":"F. Soong, B. Juang, Line spectrum pair (LSP) and speech data compression, in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP) (1984) pp.\u00a037\u201340","DOI":"10.1109\/ICASSP.1984.1172448"},{"key":"476_CR40","unstructured":"Y. Stylianou, Harmonic Plus Noise Models for Speech, Combined with Statistical Methods, for Speech and Speaker Modification. Ph.D. thesis, Ecole Nationale Sup\u00e9rieure des T\u00e9l\u00e9communications (1996)"},{"issue":"5","key":"476_CR41","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1093\/ietisy\/e90-d.5.816","volume":"90","author":"T Toda","year":"2007","unstructured":"T. Toda, K. Tokuda, A speech parameter generation algorithm considering global variance for HMM-based speech synthesis. IEICE Trans. Inform. Syst. 90(5), 816\u2013824 (2007)","journal-title":"IEICE Trans. Inform. Syst."},{"key":"476_CR42","unstructured":"K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, T. 
Kitamura, Speech parameter generation algorithms for HMM-based speech synthesis, in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, (ICASSP) (2000), pp.\u00a01315\u20131318"},{"issue":"5","key":"476_CR43","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1109\/JPROC.2013.2251852","volume":"101","author":"K Tokuda","year":"2013","unstructured":"K. Tokuda, Y. Nankaku, T. Toda, H. Zen, J. Yamagishi, K. Oura, Speech synthesis based on hidden Markov models. Proc. IEEE 101(5), 1234\u20131252 (2013)","journal-title":"Proc. IEEE"},{"issue":"3","key":"476_CR44","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1007\/s11265-013-0862-z","volume":"74","author":"Z Wen","year":"2013","unstructured":"Z. Wen, J. Tao, S. Pan, Y. Wang, Pitch-scaled spectrum based excitation model for HMM-based speech synthesis. J. Signal Process. Syst. 74(3), 423\u2013435 (2013)","journal-title":"J. Signal Process. Syst."},{"key":"476_CR45","unstructured":"T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, Mixed-excitation for HMM-based speech synthesis, in Proceedings of Eurospeech (2001), pp.\u00a02259\u20132262"},{"key":"476_CR46","doi-asserted-by":"crossref","unstructured":"E. Yumoto, W. Gould, T. Baer, Harmonics-to-noise ratio as an index of the degree of hoarseness. J. Acoust. Soc. Am. 71(6), 1544\u20131550 (1982)","DOI":"10.1121\/1.387808"},{"key":"476_CR47","doi-asserted-by":"crossref","unstructured":"H. Zen, T. Toda, M. Nakamura, K. Tokuda, Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005. IEICE Trans. Inform. Syst. E90-D, 325\u2013333 (2007)","DOI":"10.1093\/ietisy\/e90-1.1.325"},{"key":"476_CR48","doi-asserted-by":"crossref","unstructured":"H. Zen, T. Toda, K. Tokuda, The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006. IEICE Trans. Inform. Syst. 
E91-D(6), 1764\u20131773 (2008)","DOI":"10.1093\/ietisy\/e91-d.6.1764"}],"container-title":["Circuits, Systems, and Signal Processing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00034-016-0476-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-016-0476-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-016-0476-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,9,17]],"date-time":"2019-09-17T03:42:00Z","timestamp":1568691720000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s00034-016-0476-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,1,3]]},"references-count":48,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2017,9]]}},"alternative-id":["476"],"URL":"https:\/\/doi.org\/10.1007\/s00034-016-0476-3","relation":{},"ISSN":["0278-081X","1531-5878"],"issn-type":[{"value":"0278-081X","type":"print"},{"value":"1531-5878","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,1,3]]}}}