{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,23]],"date-time":"2023-10-23T05:06:41Z","timestamp":1698037601675},"reference-count":13,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2007,3,22]],"date-time":"2007-03-22T00:00:00Z","timestamp":1174521600000},"content-version":"vor","delay-in-days":5924,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Systems & Computers in Japan"],"published-print":{"date-parts":[[1991,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>This paper describes a neural approach intended to improve the performance of a voice recognition system for unrestricted speakers using not only voice sound features but also image features of the mouth shape. The FFT power spectrum of acoustic speech was used as the voice feature. In addition, the gray\u2010level image, binary image, and geometrical shape features of the mouth were used as the compensatory information and a comparison made of which kinds of image features are effective for voice recognition by a neural network.<\/jats:p><jats:p>For unrestricted speakers, a vowel recognition rate of about 80 percent was obtained using only voice features. However, this increased to some 92 percent when voice features plus binary images were used. 
This method can be applied not only to the improvement of voice recognition, but also to aid the communication of hearing\u2010impaired people.<\/jats:p>","DOI":"10.1002\/scj.4690220410","type":"journal-article","created":{"date-parts":[[2007,7,7]],"date-time":"2007-07-07T20:53:14Z","timestamp":1183841594000},"page":"100-109","source":"Crossref","is-referenced-by-count":0,"title":["Speaker\u2010independent vowel recognition combining voice features and mouth shape image with neural network"],"prefix":"10.1002","volume":"22","author":[{"given":"Jian\u2010Tong","family":"Wu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shinichi","family":"Tamura","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hiroshi","family":"Mitsumoto","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hideo","family":"Kawai","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kenji","family":"Kurosu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kozo","family":"Okazaki","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2007,3,22]]},"reference":[{"issue":"3","key":"e_1_2_1_2_2","first-page":"235","article-title":"Lip\u2010reader trainer","volume":"3","author":"Hight R. L.","year":"1982","journal-title":"Johns Hopkins APL Technical Digest"},{"key":"e_1_2_1_3_2","doi-asserted-by":"publisher","DOI":"10.1121\/1.395916"},{"key":"e_1_2_1_4_2","volume-title":"Advance in Neural Information Processing Systems 2","author":"Sejnowski T. J.","year":"1989"},{"key":"e_1_2_1_5_2","unstructured":"E. D.Petajan.Automatic lip\u2010reading to enhance speech recognition. Proc. 
IEEE Computer Society Conference on Computer Vision and Pattern Recognition 40\u201347 (1985)."},{"key":"e_1_2_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.1979.4310076"},{"key":"e_1_2_1_7_2","volume-title":"In: Parallel Distributed Processing, Volume 1","author":"Rumelhart D. E.","year":"1987"},{"key":"e_1_2_1_8_2","first-page":"31","article-title":"Vowel recognition by neural network\u2014A study on the ability of the feature extraction","volume":"87","author":"Irino T.","year":"1987","journal-title":"I.E.I.C.E. Technical Report"},{"key":"e_1_2_1_9_2","first-page":"67","article-title":"Speech recognition by image processing of lip movements","volume":"22","author":"Matsuoka K.","year":"1986","journal-title":"Journal of the Society of Instrument and Control Engineers"},{"issue":"12","key":"e_1_2_1_10_2","first-page":"2700","article-title":"Discrimination of Japanese vowels by image analysis","volume":"71","author":"Uchimura K.","year":"1988","journal-title":"Trans. I.E.I.C.E., Japan"},{"issue":"10","key":"e_1_2_1_11_2","first-page":"2181","article-title":"A neural sequence identification network model","volume":"71","author":"Futami R.","year":"1988","journal-title":"Trans. I.E.I.C.E., Japan"},{"key":"e_1_2_1_12_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.84.7.1896"},{"key":"e_1_2_1_13_2","unstructured":"K. Kurosu, T. Furuya, K. Matsuoka, and S. Tamura. Word recognition by mouth shape and voice. 
The First Symposium on Advanced Man\u2010Machine Interface Through Spoken Language Tokyo Japan 205\u2013206(1988)."},{"key":"e_1_2_1_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(88)90048-9"}],"container-title":["Systems and Computers in Japan"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fscj.4690220410","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/scj.4690220410","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,23]],"date-time":"2023-10-23T02:36:53Z","timestamp":1698028613000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/scj.4690220410"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1991,1]]},"references-count":13,"journal-issue":{"issue":"4","published-print":{"date-parts":[[1991,1]]}},"alternative-id":["10.1002\/scj.4690220410"],"URL":"https:\/\/doi.org\/10.1002\/scj.4690220410","archive":["Portico"],"relation":{},"ISSN":["0882-1666","1520-684X"],"issn-type":[{"value":"0882-1666","type":"print"},{"value":"1520-684X","type":"electronic"}],"subject":[],"published":{"date-parts":[[1991,1]]}}}