{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,3,3]],"date-time":"2024-03-03T09:54:53Z","timestamp":1709459693770},"reference-count":30,"publisher":"Institute of Electronics, Information and Communications Engineers (IEICE)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IEICE Trans. Inf. & Syst."],"published-print":{"date-parts":[[2020,5,1]]},"DOI":"10.1587\/transinf.2019edp7260","type":"journal-article","created":{"date-parts":[[2020,4,30]],"date-time":"2020-04-30T22:14:05Z","timestamp":1588284845000},"page":"1108-1117","source":"Crossref","is-referenced-by-count":3,"title":["Mimicking Lombard Effect: An Analysis and Reconstruction"],"prefix":"10.1587","volume":"E103.D","author":[{"given":"Thuan Van","family":"NGO","sequence":"first","affiliation":[{"name":"Graduate School of Advanced Science and Technology, JAIST"}]},{"given":"Rieko","family":"KUBO","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Science and Technology, JAIST"}]},{"given":"Masato","family":"AKAGI","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Science and Technology, JAIST"}]}],"member":"532","reference":[{"key":"1","unstructured":"[1] E. Lombard, \u201cLe signe de l'\u00e9l\u00e9vation de la voix,\u201d Annales des Maladies de L'Oreille et du Larynx, vol.37, pp.101-119, 1911."},{"key":"2","doi-asserted-by":"publisher","unstructured":"[2] Y. Lu and M. Cooke, \u201cSpeech production modifications produced by competing talkers, babble, and stationary noise,\u201d J. Acoust. Soc. Am., vol.124, no.5, pp.3261-3275, 2008. 10.1121\/1.2990705","DOI":"10.1121\/1.2990705"},{"key":"3","doi-asserted-by":"publisher","unstructured":"[3] M. Cooke and Y. Lu, \u201cSpectral and temporal changes to speech produced in the presence of energetic and informational maskers,\u201d J. Acoust. Soc. Am., vol.128, no.4, pp.2059-2069, 2010. 
10.1121\/1.3478775","DOI":"10.1121\/1.3478775"},{"key":"4","doi-asserted-by":"crossref","unstructured":"[4] A.R. L\u00f3pez, S. Seshadri, L. Juvela, O. R\u00e4s\u00e4nen, and P. Alku, \u201cSpeaking style conversion from normal to Lombard speech using a Glottal vocoder and Bayesian GMMs,\u201d Interspeech, pp.1363-1367, 2017. 10.21437\/interspeech.2017-400","DOI":"10.21437\/Interspeech.2017-400"},{"key":"5","doi-asserted-by":"crossref","unstructured":"[5] B. Bollepalli, M. Airaksinen, and P. Alku, \u201cLombard speech synthesis using long short-term memory recurrent neural networks,\u201d 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5505-5509, 2017. 10.1109\/icassp.2017.7953209","DOI":"10.1109\/ICASSP.2017.7953209"},{"key":"6","doi-asserted-by":"publisher","unstructured":"[6] T.V. Ngo, R. Kubo, D. Morikawa, and M. Akagi, \u201cAcoustical analyses of tendencies of intelligibility in Lombard speech with different background noise levels,\u201d Journal of Signal Processing, vol.21, no.4, pp.171-174, 2017. 10.2299\/jsp.21.171","DOI":"10.2299\/jsp.21.171"},{"key":"7","unstructured":"[7] S. Matsumoto and M. Akagi, \u201cVariation of formant amplitude and frequencies in vowel spectrum uttered under various noisy environments,\u201d 2019 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2019), Research Institute of Signal Processing, Japan, 2019."},{"key":"8","doi-asserted-by":"publisher","unstructured":"[8] Y. Lu and M. Cooke, \u201cThe contribution of changes in F0 and spectral tilt to increased intelligibility of speech produced in noise,\u201d Speech Communication, vol.51, no.12, pp.1253-1262, 2009. 10.1016\/j.specom.2009.07.002","DOI":"10.1016\/j.specom.2009.07.002"},{"key":"9","doi-asserted-by":"publisher","unstructured":"[9] M. Cooke, C. Mayo, and J. Villegas, \u201cThe contribution of durational and spectral changes to the Lombard speech intelligibility benefit,\u201d J. 
Acoust. Soc. Am., vol.135, no.2, pp.874-883, 2014. 10.1121\/1.4861342","DOI":"10.1121\/1.4861342"},{"key":"10","doi-asserted-by":"publisher","unstructured":"[10] J.-C. Junqua, \u201cThe Lombard reflex and its role on human listeners and automatic speech recognizers,\u201d J. Acoust. Soc. Am., vol.93, no.1, pp.510-524, 1993. 10.1121\/1.405631","DOI":"10.1121\/1.405631"},{"key":"11","unstructured":"[11] D.Y. Huang, S. Rahardja, and E.P. Ong, \u201cLombard effect mimicking,\u201d ISCA, 2010."},{"key":"12","doi-asserted-by":"crossref","unstructured":"[12] D.-Y. Huang and E.P. Ong, \u201cLombard speech model for automatic enhancement of speech intelligibility over telephone channel,\u201d 2010 International Conference on Audio, Language and Image Processing, pp.258-263, 2010. 10.1109\/icalip.2010.5684545","DOI":"10.1109\/ICALIP.2010.5684545"},{"key":"13","doi-asserted-by":"crossref","unstructured":"[13] S. Rottschafer, H. Buschmeier, H. Welbergen, and S. Kopp, \u201cOnline Lombard adaptation in incremental speech synthesis,\u201d ISCA, pp.80-84, 2015.","DOI":"10.21437\/Interspeech.2015-31"},{"key":"14","doi-asserted-by":"publisher","unstructured":"[14] P.T. Nghia, L.C. Mai, and M. Akagi, \u201cImproving the naturalness of concatenative Vietnamese speech synthesis under limited data conditions,\u201d Journal of Computer Science and Cybernetics, vol.31, no.1, pp.1-16, 2015. 10.15625\/1813-9663\/31\/1\/5064","DOI":"10.15625\/1813-9663\/31\/1\/5064"},{"key":"15","unstructured":"[15] P.C. Nguyen, T. Ochi, and M. Akagi, \u201cModified restricted temporal decomposition and its application to low rate speech coding,\u201d IEICE Trans. Inf. & Syst., vol.E86-D, no.3, pp.397-405, March 2003."},{"key":"16","doi-asserted-by":"publisher","unstructured":"[16] B.P. Nguyen and M. Akagi, \u201cA flexible spectral modification method based on temporal decomposition and Gaussian mixture model,\u201d Acoustical Science and Technology, vol.30, no.3, pp.170-179, 2009. 
10.1250\/ast.30.170","DOI":"10.1250\/ast.30.170"},{"key":"17","doi-asserted-by":"publisher","unstructured":"[17] H. Kawahara, I. Masuda-Katsuse, and A. De Cheveign\u00e9, \u201cRestructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds,\u201d Speech Communication, vol.27, no.3-4, pp.187-207, 1999. 10.1016\/s0167-6393(98)00085-5","DOI":"10.1016\/S0167-6393(98)00085-5"},{"key":"18","unstructured":"[18] T.N. Phung, M.C. Luong, and M. Akagi, \u201cAn investigation on perceptual line spectral frequency (PLP-LSF) target stability against the vowel neutralization phenomenon,\u201d 2011 3rd International Conference on Signal Acquisition and Processing (ICSAP 2011), Institute of Electrical and Electronics Engineers (IEEE), pp.512-514, 2011."},{"key":"19","unstructured":"[19] T. Kondo, S. Amano, S. Sakamoto, and Y. Suzuki, \u201cDevelopment of familiarity-controlled word-lists (FW07),\u201d IEICE Tech. Rep., vol.107, no.432, pp.43-48, 2008."},{"key":"20","unstructured":"[20] K. Kondo, S. Amano, Y. Suzuki, and S. Sakamoto, \u201cJapanese speech dataset for familiarity-controlled spoken-word intelligibility test (FW07),\u201d NII Speech Resources Consortium, 2007."},{"key":"21","doi-asserted-by":"publisher","unstructured":"[21] D.D. Mehta, D. Rudoy, and P.J. Wolfe, \u201cKalman-based autoregressive moving average modeling and inference for formant and antiformant tracking,\u201d J. Acoust. Soc. Am., vol.132, no.3, pp.1732-1746, 2012. 10.1121\/1.4739462","DOI":"10.1121\/1.4739462"},{"key":"22","doi-asserted-by":"publisher","unstructured":"[22] A.C. Lammert and S.S. Narayanan, \u201cOn short-time estimation of vocal tract length from formant frequencies,\u201d PloS one, vol.10, no.7, e0132193, 2015. 10.1371\/journal.pone.0132193","DOI":"10.1371\/journal.pone.0132193"},{"key":"23","doi-asserted-by":"publisher","unstructured":"[23] P.F. Assmann and T.M. 
Nearey, \u201cRelationship between fundamental and formant frequencies in voice preference,\u201d J. Acoust. Soc. Am., vol.122, no.2, pp.EL35-EL43, 2007. 10.1121\/1.2719045","DOI":"10.1121\/1.2719045"},{"key":"24","doi-asserted-by":"publisher","unstructured":"[24] M. Hodgson, G. Steininger, and Z. Razavi, \u201cMeasurement and prediction of speech and noise levels and the Lombard effect in eating establishments,\u201d J. Acoust. Soc. Am., vol.121, no.4, pp.2023-2033, 2007. 10.1121\/1.2535571","DOI":"10.1121\/1.2535571"},{"key":"25","doi-asserted-by":"crossref","unstructured":"[25] S. Narusawa, N. Minematsu, K. Hirose, and H. Fujisaki, \u201cA method for automatic extraction of model parameters from fundamental frequency contours of speech,\u201d 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.I-509-I-512. 2002. 10.1109\/ICASSP.2002.5743766","DOI":"10.1109\/ICASSP.2002.5743766"},{"key":"26","doi-asserted-by":"publisher","unstructured":"[26] M. Akagi and Y. Tohkura, \u201cSpectrum target prediction model and its application to speech recognition,\u201d Computer Speech & Language, vol.4, no.4, pp.325-344, 1990. 10.1016\/0885-2308(90)90014-w","DOI":"10.1016\/0885-2308(90)90014-W"},{"key":"27","doi-asserted-by":"publisher","unstructured":"[27] Y. Xue, Y. Hamada, and M. Akagi, \u201cVoice conversion for emotional speech: Rule-based synthesis with degree of emotion controllable in dimensional space,\u201d Speech Communication, vol.102, pp.54-67, 2018. 10.1016\/j.specom.2018.06.006","DOI":"10.1016\/j.specom.2018.06.006"},{"key":"28","doi-asserted-by":"crossref","unstructured":"[28] B.O. Bush and A. Kain, \u201cModeling coarticulation in continuous speech,\u201d 15th Annual Conference of the International Speech Communication Association, INTERSPEECH 2014, pp.193-197, 2014.","DOI":"10.21437\/Interspeech.2014-51"},{"key":"29","doi-asserted-by":"crossref","unstructured":"[29] A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. 
Kuwabara, and K. Shikano, \u201cATR Japanese speech database as a tool of speech recognition and synthesis,\u201d Speech Communication, vol.9, no.4, pp.357-363, 1990. 10.1016\/0167-6393(90)90011-w","DOI":"10.1016\/0167-6393(90)90011-W"},{"key":"30","unstructured":"[30] Pink-Noise, \u201cVarious-audio test CD-1-91 test signals for home and laboratory use.\u201d"}],"container-title":["IEICE Transactions on Information and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transinf\/E103.D\/5\/E103.D_2019EDP7260\/_pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,22]],"date-time":"2022-10-22T18:31:42Z","timestamp":1666463502000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transinf\/E103.D\/5\/E103.D_2019EDP7260\/_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,1]]},"references-count":30,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020]]}},"URL":"https:\/\/doi.org\/10.1587\/transinf.2019edp7260","relation":{},"ISSN":["0916-8532","1745-1361"],"issn-type":[{"value":"0916-8532","type":"print"},{"value":"1745-1361","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,1]]}}}