{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T00:40:02Z","timestamp":1721349602679},"reference-count":15,"publisher":"Institute of Electronics, Information and Communications Engineers (IEICE)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IEICE Trans. Inf. & Syst."],"published-print":{"date-parts":[[2019,6,1]]},"DOI":"10.1587\/transinf.2018edl8264","type":"journal-article","created":{"date-parts":[[2019,5,31]],"date-time":"2019-05-31T22:23:00Z","timestamp":1559341380000},"page":"1218-1221","source":"Crossref","is-referenced-by-count":0,"title":["Prosody Correction Preserving Speaker Individuality for Chinese-Accented Japanese HMM-Based Text-to-Speech Synthesis"],"prefix":"10.1587","volume":"E102.D","author":[{"given":"Daiki","family":"SEKIZAWA","sequence":"first","affiliation":[{"name":"University of Tokyo"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shinnosuke","family":"TAKAMICHI","sequence":"additional","affiliation":[{"name":"University of Tokyo"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hiroshi","family":"SARUWATARI","sequence":"additional","affiliation":[{"name":"University of Tokyo"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"532","reference":[{"key":"1","doi-asserted-by":"publisher","unstructured":"[1] K. Tokuda, Y. Nankaku, T. Toda, H. Zen, J. Yamagishi, and K. Oura, \u201cSpeech synthesis based on hidden Markov models,\u201d Proceedings of the IEEE, vol.101, no.5, pp.1234-1252, 2013. 10.1109\/jproc.2013.2251852","DOI":"10.1109\/JPROC.2013.2251852"},{"key":"2","doi-asserted-by":"crossref","unstructured":"[2] H. Zen, A. Senior, and M. Schuster, \u201cStatistical parametric speech synthesis using deep neural networks,\u201d in Proc. ICASSP, May 2013. 
10.1109\/icassp.2013.6639215","DOI":"10.1109\/ICASSP.2013.6639215"},{"key":"3","doi-asserted-by":"crossref","unstructured":"[3] Y. Wang, R. Skerry-Ryan, D. Stanton, Y. Wu, R.J. Weiss, N. Jaitly, Z. Yang, Y. Xiao, Z. Chen, S. Bengio, Q. Le, Y. Agiomyrgiannakis, R. Clark, and R.A. Saurous, \u201cTacotron: Towards end-to-end speech synthesis,\u201d Interspeech 2017, pp.4006-4010, 2017. 10.21437\/interspeech.2017-1452","DOI":"10.21437\/Interspeech.2017-1452"},{"key":"4","doi-asserted-by":"publisher","unstructured":"[4] Y. Oshima, S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, \u201cNon-native text-to-speech preserving speaker individuality based on partial correction of prosodic and phonetic characteristics,\u201d IEICE Trans. Inf. & Syst., vol.E99-D, no.12, pp.3132-3139, 2016. 10.1587\/transinf.2016edp7231","DOI":"10.1587\/transinf.2016EDP7231"},{"key":"5","doi-asserted-by":"crossref","unstructured":"[5] J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, \u201cAnalysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm,\u201d IEEE Trans. Audio, Speech, Language Process., vol.17, no.1, pp.66-83, 2009. 10.1109\/tasl.2008.2006647","DOI":"10.1109\/TASL.2008.2006647"},{"key":"6","doi-asserted-by":"publisher","unstructured":"[6] J. Yamagishi, C. Veaux, S. King, and S. Renals, \u201cSpeech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction,\u201d Acoustical Science and Technology, vol.33, no.1, pp.1-5, 2012. 10.1250\/ast.33.1","DOI":"10.1250\/ast.33.1"},{"key":"7","doi-asserted-by":"publisher","unstructured":"[7] T. Toda, A.W. Black, and K. Tokuda, \u201cVoice conversion based on maximum likelihood estimation of spectral parameter trajectory,\u201d IEEE Trans. Audio, Speech, Language Process., vol.15, no.8, pp.2222-2235, 2007. 10.1109\/tasl.2007.907344","DOI":"10.1109\/TASL.2007.907344"},{"key":"8","unstructured":"[8] R. Sonobe, S. 
Takamichi, and H. Saruwatari, \u201cJSUT corpus: Free large-scale Japanese speech corpus for end-to-end speech synthesis,\u201d vol.abs\/1711.00354, 2017."},{"key":"9","unstructured":"[9] \u201cJapanese speech database read by foreign students (UME-JRF),\u201d http:\/\/research.nii.ac.jp\/src\/UME-JRF.html."},{"key":"10","unstructured":"[10] Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, \u201cMaximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation,\u201d Proc. INTERSPEECH, Pittsburgh, U.S.A., pp.2266-2269, Sep. 2006."},{"key":"11","doi-asserted-by":"publisher","unstructured":"[11] H. Kawahara, I. Masuda-Katsuse, and A.D. Cheveign\u00e9, \u201cRestructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds,\u201d Speech Communication, vol.27, no.3-4, pp.187-207, 1999. 10.1016\/s0167-6393(98)00085-5","DOI":"10.1016\/S0167-6393(98)00085-5"},{"key":"12","unstructured":"[12] H. Kawahara, J. Estill, and O. Fujimura, \u201cAperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT,\u201d MAVEBA, Firenze, Italy, pp.1-6, Sept. 2001."},{"key":"13","doi-asserted-by":"crossref","unstructured":"[13] S. Takamichi, K. Kobayashi, K. Tanaka, T. Toda, and S. Nakamura, \u201cThe NAIST text-to-speech system for the Blizzard Challenge 2015,\u201d Proc. Blizzard Challenge Workshop, Berlin, Germany, Sept. 2015.","DOI":"10.21437\/Blizzard.2015-7"},{"key":"14","doi-asserted-by":"publisher","unstructured":"[14] T. Toda and K. Tokuda, \u201cA speech parameter generation algorithm considering global variance for HMM-based speech synthesis,\u201d IEICE Trans. Inf. & Syst., vol.E90-D, no.5, pp.816-824, 2007. 
10.1093\/ietisy\/e90-d.5.816","DOI":"10.1093\/ietisy\/e90-d.5.816"},{"key":"15","unstructured":"[15] \u201cLancers,\u201d https:\/\/www.lancers.jp\/."}],"container-title":["IEICE Transactions on Information and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transinf\/E102.D\/6\/E102.D_2018EDL8264\/_pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T00:03:28Z","timestamp":1721347408000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transinf\/E102.D\/6\/E102.D_2018EDL8264\/_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,6,1]]},"references-count":15,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2019]]}},"URL":"https:\/\/doi.org\/10.1587\/transinf.2018edl8264","relation":{},"ISSN":["0916-8532","1745-1361"],"issn-type":[{"value":"0916-8532","type":"print"},{"value":"1745-1361","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,6,1]]}}}