{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,12]],"date-time":"2024-08-12T00:05:07Z","timestamp":1723421107576},"reference-count":40,"publisher":"Institute of Electronics, Information and Communications Engineers (IEICE)","issue":"8","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IEICE Trans. Inf. &amp; Syst."],"published-print":{"date-parts":[[2020,8,1]]},"DOI":"10.1587\/transinf.2019edp7211","type":"journal-article","created":{"date-parts":[[2020,7,31]],"date-time":"2020-07-31T22:15:38Z","timestamp":1596233738000},"page":"1875-1887","source":"Crossref","is-referenced-by-count":3,"title":["Silent Speech Interface Using Ultrasonic Doppler Sonar"],"prefix":"10.1587","volume":"E103.D","author":[{"given":"Ki-Seung","family":"LEE","sequence":"first","affiliation":[{"name":"Department of Electronic Engineering, Konkuk University"}]}],"member":"532","reference":[{"key":"1","doi-asserted-by":"publisher","unstructured":"[1] B. Denby, T. Schultz, K. Honda, T. Hueber, J.M. Gilbert, and J.S. Brumberg, \u201cSilent speech interfaces,\u201d Speech Communication, vol.52, no.4, pp.270-287, 2010. 10.1016\/j.specom.2009.08.002","DOI":"10.1016\/j.specom.2009.08.002"},{"key":"2","doi-asserted-by":"publisher","unstructured":"[2] T.F. Quatieri, K. Brady, D. Messing, J.P. Campbell, W.M. Campbell, M.S. Brandstein, C.J. Weinstein, J.D. Tardelli, and P.D. Gatewood, \u201cExploiting nonacoustic sensors for speech encoding,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.14, no.2, pp.533-544, 2006. 10.1109\/tsa.2005.855838","DOI":"10.1109\/TSA.2005.855838"},{"key":"3","doi-asserted-by":"crossref","unstructured":"[3] M. Jiao, G. Lu, X. Jing, S. Li, Y. Li, and J. Wang, \u201cA novel radar sensor for the non-contact detection of speech signals,\u201d Sensors, vol.10, no.5, pp.4622-4633, 2010. 
10.3390\/s100504622","DOI":"10.3390\/s100504622"},{"key":"4","doi-asserted-by":"publisher","unstructured":"[4] S. Li, J.-Q. Wang, M. Niu, T. Liu, and X.-J. Jing, \u201cThe enhancement of millimeter wave conduct speech based on perceptual weighting,\u201d Progress in Electromagnetics Research B, vol.9, pp.199-214, 2008. 10.2528\/pierb08063001","DOI":"10.2528\/PIERB08063001"},{"key":"5","doi-asserted-by":"crossref","unstructured":"[5] S. Li, Y. Tian, G. Lu, Y. Zhang, H. Lv, X. Yu, H. Xue, H. Zhang, J. Wang, and X. Jing, \u201cA 94-GHz millimeter-wave sensor for speech signal acquisition,\u201d Sensors, vol.13, no.11, pp.14248-14260, 2013. 10.3390\/s131114248","DOI":"10.3390\/s131114248"},{"key":"6","doi-asserted-by":"crossref","unstructured":"[6] S. Li, Y. Tian, G. Lu, Y. Zhang, H. Xue, J.-Q. Wang, and X.-J. Jing, \u201cA new kind of non-acoustic speech acquisition method based on millimeter wave radar,\u201d Progress in Electromagnetics Research B, vol.130, pp.17-40, 2012. 10.2528\/pier12052207","DOI":"10.2528\/PIER12052207"},{"key":"7","doi-asserted-by":"publisher","unstructured":"[7] C.-S. Lin, S.-F. Chang, C.-C. Chang, and C.-C. Lin, \u201cMicrowave human vocal vibration signal detection based on Doppler radar technology,\u201d IEEE Trans. Microw. Theory Techn., vol.58, no.8, pp.2299-2306, 2010. 10.1109\/tmtt.2010.2052968","DOI":"10.1109\/TMTT.2010.2052968"},{"key":"8","doi-asserted-by":"crossref","unstructured":"[8] B. Denby and M. Stone, \u201cSpeech synthesis from real time ultrasound images of the tongue,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.685-688, 2004. 10.1109\/icassp.2004.1326078","DOI":"10.1109\/ICASSP.2004.1326078"},{"key":"9","doi-asserted-by":"crossref","unstructured":"[9] B. Denby, Y. Oussar, G. Dreyfus, and M. Stone, \u201cProspects for a silent speech interface using ultrasound imaging,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.365-368, 2006. 
10.1109\/icassp.2006.1660033","DOI":"10.1109\/ICASSP.2006.1660033"},{"key":"10","doi-asserted-by":"crossref","unstructured":"[10] T. Hueber, G. Aversano, G. Chollet, B. Denby, G. Dreyfus, Y. Oussar, P. Roussel, and M. Stone, \u201cEigentongue feature extraction for an ultrasound-based silent speech interface,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.1245-1248, 2007. 10.1109\/icassp.2007.366140","DOI":"10.1109\/ICASSP.2007.366140"},{"key":"11","doi-asserted-by":"publisher","unstructured":"[11] I. Almajai and B. Milner, \u201cVisually derived Wiener filters for speech enhancement,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.19, no.6, pp.1642-1651, 2011. 10.1109\/tasl.2010.2096212","DOI":"10.1109\/TASL.2010.2096212"},{"key":"12","unstructured":"[12] L. Girin, L. Varin, G. Feng, and J.L. Schwartz, \u201cAudiovisual speech enhancement: New advances using multi-layer perceptrons,\u201d Proc. IEEE 2nd Workshop on Multimedia Signal Processing, pp.77-82, 1998. 10.1109\/mmsp.1998.738916"},{"key":"13","doi-asserted-by":"crossref","unstructured":"[13] S. Deligne, G. Potamianos, and C. Neti, \u201cAudio-visual speech enhancement with AVCDCN (Audio-Visual Codebook Dependent Cepstral Normalization),\u201d Proc. International Conference on Spoken Language Processing, pp.1449-1452, 2002. 10.1109\/sam.2002.1191001","DOI":"10.21437\/ICSLP.2002-421"},{"key":"14","doi-asserted-by":"publisher","unstructured":"[14] G. Potamianos, C. Neti, G. Gravier, A. Garg, and A.W. Senior, \u201cRecent advances in the automatic recognition of audiovisual speech,\u201d Proceedings of the IEEE, vol.91, no.9, pp.1306-1326, 2003. 10.1109\/jproc.2003.817150","DOI":"10.1109\/JPROC.2003.817150"},{"key":"15","doi-asserted-by":"crossref","unstructured":"[15] V.-M. Florescu, L. Crevier-Buchman, B. Denby, T. Hueber, A. Colazo-Simon, C. Pillot-Loiseau, P. Roussel, C. Gendrot, and S. 
Quattrocchi, \u201cSilent vs vocalized articulation for a portable ultrasound-based silent speech interface,\u201d Proc. Interspeech, pp.450-453, 2010.","DOI":"10.21437\/Interspeech.2010-195"},{"key":"16","doi-asserted-by":"publisher","unstructured":"[16] T.L. Cornu and B. Milner, \u201cGenerating intelligible audio speech from visual speech,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.25, no.9, pp.1447-1457, 2017. 10.1109\/taslp.2017.2716178","DOI":"10.1109\/TASLP.2017.2716178"},{"key":"17","doi-asserted-by":"crossref","unstructured":"[17] A.R. Toth, K. Kalgaonkar, B. Raj, and T. Ezzat, \u201cSynthesizing speech from doppler signals,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.4638-4641, 2010. 10.1109\/icassp.2010.5495552","DOI":"10.1109\/ICASSP.2010.5495552"},{"key":"18","doi-asserted-by":"crossref","unstructured":"[18] K. Livescu, B. Zhu, and J. Glass, \u201cOn the phonetic information in ultrasonic microphone signals,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.4621-4624, 2009. 10.1109\/icassp.2009.4960660","DOI":"10.1109\/ICASSP.2009.4960660"},{"key":"19","doi-asserted-by":"publisher","unstructured":"[19] K. Kalgaonkar, R. Hu, and B. Raj, \u201cUltrasonic doppler sensor for voice activity detection,\u201d IEEE Signal Processing Letters, vol.14, no.10, pp.754-757, 2007. 10.1109\/lsp.2007.896450","DOI":"10.1109\/LSP.2007.896450"},{"key":"20","doi-asserted-by":"crossref","unstructured":"[20] K. Kalgaonkar and B. Raj, \u201cUltrasonic doppler sensor for speaker recognition,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.4865-4868, 2008. 10.1109\/icassp.2008.4518747","DOI":"10.1109\/ICASSP.2008.4518747"},{"key":"21","unstructured":"[21] T. Toda and K. Shikano, \u201cNAM-to-speech conversion with Gaussian mixture models,\u201d Proc. INTERSPEECH, pp.1957-1960, 2005."},{"key":"22","doi-asserted-by":"publisher","unstructured":"[22] T. 
Toda, M. Nakagiri, and K. Shikano, \u201cStatistical voice conversion techniques for body-conducted unvoiced speech enhancement,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.20, no.9, pp.2505-2517, 2012. 10.1109\/tasl.2012.2205241","DOI":"10.1109\/TASL.2012.2205241"},{"key":"23","doi-asserted-by":"crossref","unstructured":"[23] M. Janke, M. Wand, K. Nakamura, and T. Schultz, \u201cFurther investigations on EMG-to-speech conversion,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.365-368, 2012. 10.1109\/icassp.2012.6287892","DOI":"10.1109\/ICASSP.2012.6287892"},{"key":"24","doi-asserted-by":"publisher","unstructured":"[24] K.-S. Lee, \u201cPrediction of acoustic feature parameters using myoelectric signals,\u201d IEEE Trans. Biomed. Eng., vol.51, no.7, pp.1587-1595, 2010. 10.1109\/tbme.2010.2041455","DOI":"10.1109\/TBME.2010.2041455"},{"key":"25","doi-asserted-by":"publisher","unstructured":"[25] K.-S. Lee, \u201cEMG-based speech recognition using Hidden Markov Models with global control variables,\u201d IEEE Trans. Biomed. Eng., vol.55, no.3, pp.930-940, 2008. 10.1109\/tbme.2008.915658","DOI":"10.1109\/TBME.2008.915658"},{"key":"26","doi-asserted-by":"publisher","unstructured":"[26] M. Wand, M. Janke, and T. Schultz, \u201cTackling speaking mode varieties in EMG-based speech recognition,\u201d IEEE Trans. Biomed. Eng., vol.61, no.10, pp.2515-2526, 2014. 10.1109\/tbme.2014.2319000","DOI":"10.1109\/TBME.2014.2319000"},{"key":"27","doi-asserted-by":"publisher","unstructured":"[27] M. Janke and L. Diener, \u201cEMG-to-Speech: Direct generation of speech from facial electromyographic signals,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.25, no.12, pp.2375-2385, 2017. 10.1109\/taslp.2017.2738568","DOI":"10.1109\/TASLP.2017.2738568"},{"key":"28","doi-asserted-by":"publisher","unstructured":"[28] B. Raj, K. Kalgaonkar, C. Harrison, and P. 
Dietz, \u201cUltrasonic doppler sensing in HCI,\u201d IEEE Pervasive Computing, vol.11, no.2, pp.24-29, 2012. 10.1109\/mprv.2012.17","DOI":"10.1109\/MPRV.2012.17"},{"key":"29","unstructured":"[29] K. Kalgaonkar and B. Raj, \u201cAcoustic doppler sonar for gait recognition,\u201d Proc. IEEE Conference on Advanced Video and Signal Based Surveillance, pp.27-32, 2007. 10.1109\/avss.2007.4425281"},{"key":"30","doi-asserted-by":"crossref","unstructured":"[30] K. Kalgaonkar and B. Raj, \u201cOne-handed gesture recognition using ultrasonic doppler sonar,\u201d Proc. IEEE International Conference on Acoustic, Speech and Signal Processing, pp.1889-1892, 2009. 10.1109\/icassp.2009.4959977","DOI":"10.1109\/ICASSP.2009.4959977"},{"key":"31","unstructured":"[31] L.R. Rabiner and R.W. Schafer, Digital Processing of Speech Signals, Prentice Hall, 1978."},{"key":"32","doi-asserted-by":"publisher","unstructured":"[32] G. White and R.B. Neely, \u201cSpeech recognition experiments with linear prediction, bandpass filtering, and dynamic programming,\u201d IEEE Trans. Acoustic Speech and Signal Processing, vol.24, no.2, pp.183-188, 1976. 10.1109\/tassp.1976.1162779","DOI":"10.1109\/TASSP.1976.1162779"},{"key":"33","unstructured":"[33] L. Deng, M.L. Seltzer, D. Yu, et al., \u201cBinary coding of speech spectrogram using a deep auto-encoder,\u201d Proc. Interspeech, pp.1692-1695, 2010."},{"key":"34","doi-asserted-by":"publisher","unstructured":"[34] Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, \u201cA regression approach to speech enhancement based on deep neural networks,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.23, no.1, pp.7-19, 2015. 10.1109\/taslp.2014.2364452","DOI":"10.1109\/TASLP.2014.2364452"},{"key":"35","doi-asserted-by":"crossref","unstructured":"[35] D. Griffin and J. Lim, \u201cSignal estimation from the modified short-time Fourier transform,\u201d IEEE Trans. Acoustic Speech and Signal Processing, vol.32, pp.236-243, 1984. 
10.1109\/tassp.1984.1164317","DOI":"10.1109\/TASSP.1984.1164317"},{"key":"36","doi-asserted-by":"publisher","unstructured":"[36] W. Jin, X. Liu, M.S. Scordilis, and L. Han, \u201cSpeech enhancement using harmonic emphasis and adaptive comb filtering,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.18, no.2, pp.356-368, 2010. 10.1109\/tasl.2009.2028916","DOI":"10.1109\/TASL.2009.2028916"},{"key":"37","doi-asserted-by":"publisher","unstructured":"[37] K. Han and D. Wang, \u201cNeural network based pitch tracking in very noisy speech,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.22, no.12, pp.2158-2168, 2014. 10.1109\/taslp.2014.2363410","DOI":"10.1109\/TASLP.2014.2363410"},{"key":"38","doi-asserted-by":"publisher","unstructured":"[38] G.E. Hinton, \u201cTraining products of experts by minimizing contrastive divergence,\u201d Neural Computation, vol.14, no.8, pp.1771-1800, 2002. 10.1162\/089976602760128018","DOI":"10.1162\/089976602760128018"},{"key":"39","unstructured":"[39] ITU-T, Rec. P. 862, Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow band telephone networks and speech codecs, International Telecommunication Union-Telecommunication Standardisation Sector, 2001."},{"key":"40","doi-asserted-by":"publisher","unstructured":"[40] C.H. Taal, R.C. Hendriks, R. Heusdens, and J. Jensen, \u201cAn algorithm for intelligibility prediction of time-frequency weighted noisy speech,\u201d IEEE Trans. Audio, Speech, Lang. Process., vol.19, no.7, pp.2125-2136, 2011. 
10.1109\/tasl.2011.2114881","DOI":"10.1109\/TASL.2011.2114881"}],"container-title":["IEICE Transactions on Information and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transinf\/E103.D\/8\/E103.D_2019EDP7211\/_pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,11]],"date-time":"2024-08-11T05:05:33Z","timestamp":1723352733000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transinf\/E103.D\/8\/E103.D_2019EDP7211\/_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,1]]},"references-count":40,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2020]]}},"URL":"https:\/\/doi.org\/10.1587\/transinf.2019edp7211","relation":{},"ISSN":["0916-8532","1745-1361"],"issn-type":[{"type":"print","value":"0916-8532"},{"type":"electronic","value":"1745-1361"}],"subject":[],"published":{"date-parts":[[2020,8,1]]}}}