{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T14:41:54Z","timestamp":1698331314750},"reference-count":17,"publisher":"Wiley","issue":"8","license":[{"start":{"date-parts":[[2007,3,21]],"date-time":"2007-03-21T00:00:00Z","timestamp":1174435200000},"content-version":"vor","delay-in-days":5192,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Systems & Computers in Japan"],"published-print":{"date-parts":[[1993,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>This paper proposes a new training method for phoneme identification neural networks, called \u201cneural fuzzy training.\u201d In the proposed training, nondeterministic (fuzzy) class information is assigned to the training signal, in contrast to the traditional method where deterministic class information is assigned.<\/jats:p><jats:p>This study aims at the realization of a robust neural network, thereby improving the cumulative recognition rate of the phoneme identification and avoiding overtraining. The proposed neural fuzzy training is realized by backpropagation. In the conventional training, deterministic phoneme class information is assigned to the training signal of the neural network as the value 1 or 0. However, in the proposed training, the fuzzy class information is assigned to the training signal for each training sample as a likelihood value between 0 and 1.<\/jats:p><jats:p>In the proposed training method, the likelihood is calculated by a monotonically decreasing function (such as exp(\u2212\u03b1 \u00b7 <jats:italic>d<\/jats:italic><jats:sup>2<\/jats:sup>)) of the distance between the training sample and the closest sample belonging to each phoneme class. 
The proposed neural fuzzy training method has the problem that it requires a large computational cost, since the training signal is determined by calculating the distances to all training samples. To solve this problem, representative samples in each phoneme class are defined, and the likelihoods for the phoneme classes are determined by calculating the distance between the representative sample and the training sample.<\/jats:p><jats:p>By this simplification of the likelihood calculation, the computational cost to determine the training signal is reduced considerably. To demonstrate the usefulness of the neural fuzzy training, experiments are conducted: \/b, d, g, m, n, N\/ identification, 18 consonant identification, and phrase recognition using TDNN\u2010LR. The ATR database is used in the experiments. In the phoneme identification experiment, speech samples extracted using hand labels are used. The TDNN is trained using speech samples uttered in word style, and the evaluation is performed using speech samples uttered in phrase style and in sentence style.<\/jats:p><jats:p>In the phrase recognition experiment using TDNN\u2010LR, the TDNN is trained using speech samples uttered in word style and extracted using hand labels. The evaluation is performed using speech samples uttered in phrase style. In both experiments, an improvement from using the fuzzy training can be observed. In particular, in the phrase recognition experiment using TDNN\u2010LR, the top recognition rate is improved from 71.2 percent to 80.9 percent, and the top 5th recognition rate is improved from 92.8 percent to 96.0 percent. 
Furthermore, the neural fuzzy training also proved to be a high\u2010speed training method.<\/jats:p>","DOI":"10.1002\/scj.4690240808","type":"journal-article","created":{"date-parts":[[2007,7,8]],"date-time":"2007-07-08T01:33:34Z","timestamp":1183858414000},"page":"82-94","source":"Crossref","is-referenced-by-count":3,"title":["A neural fuzzy training approach for improving speech recognition"],"prefix":"10.1002","volume":"24","author":[{"given":"Yasuhiro","family":"Komori","sequence":"first","affiliation":[]},{"given":"Shigeki","family":"Sagayama","sequence":"additional","affiliation":[]},{"given":"Alexander H.","family":"Waibel","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2007,3,21]]},"reference":[{"key":"e_1_2_1_2_2","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/5236.001.0001","volume-title":"Parallel Distributed Processing; Explorations in the Micro\u2010structure of Cognition","author":"Rumelhart D. E.","year":"1986"},{"key":"e_1_2_1_3_2","first-page":"4","volume-title":"An introduction to computing with neural nets","author":"Lippmann R. 
P.","year":"1987"},{"key":"e_1_2_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/29.21701"},{"key":"e_1_2_1_5_2","article-title":"Evaluation criterion for neural network using average curvature, Papers of Technical Group on Neurocomputing","volume":"89","author":"Suzuki S.","year":"1990","journal-title":"I.E.I.C.E., Japan"},{"key":"e_1_2_1_6_2","first-page":"13","volume-title":"Smoothing of TDNN output using neighborhood information of vector in input and hidden layers","author":"Minami Y.","year":"1990"},{"key":"e_1_2_1_7_2","first-page":"2","volume-title":"Phoneme recognition by k\u2010neighborhood interpolation training","author":"Kawabata T.","year":"1990"},{"key":"e_1_2_1_8_2","series-title":"ICSLP '90","first-page":"S165","volume-title":"Phoneme Recognition by Pairwise Discriminant TDNNs","author":"Takami J.","year":"1990"},{"key":"e_1_2_1_9_2","first-page":"2","volume-title":"A phoneme filter using neural network","author":"Nakamura M.","year":"1990"},{"key":"e_1_2_1_10_2","first-page":"15","volume-title":"A fuzzy training method in phoneme identification neural net","author":"Komori Y.","year":"1991"},{"key":"e_1_2_1_11_2","first-page":"13","volume-title":"A fuzzy regression analysis for a fuzzy set \u2014A method by linear programming and neural network","author":"Ishibashi","year":"1989"},{"key":"e_1_2_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1985.4767725"},{"issue":"10","key":"e_1_2_1_13_2","first-page":"747","volume":"44","author":"Purpose J.","year":"1988","journal-title":"Acous. Soc. 
Jap."},{"key":"e_1_2_1_14_2","unstructured":"K.Takeda Y.Sagisaka S.Katagiri M.Abe andH.KuwaharaUser's guide for Japanese speech database for research ATR Technical Report TR\u2010I\u20100028 (1988)."},{"key":"e_1_2_1_15_2","doi-asserted-by":"crossref","unstructured":"P.Haffner A.Waibel H.Sawai andK.ShikanoFast Back\u2010Propagation Learning Methods for Neural Networks in Speech ATR Technical Report TR\u2010I\u20100058 (1988).","DOI":"10.21437\/Eurospeech.1989-95"},{"key":"e_1_2_1_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-1885-0"},{"key":"e_1_2_1_17_2","series-title":"ICSLP '90 S31.4","first-page":"1349","volume-title":"The TDNN\u2010LR Large\u2010Vocabulary and Continuous Speech Recognition System","author":"Sawai H.","year":"1990"},{"key":"e_1_2_1_18_2","series-title":"IEEE, ICASSP '90","first-page":"S810","volume-title":"Integrated Training for Spotting Japanese Phonemes Using Large Phonemic Time\u2010Delay Neural Networks","author":"Miyatake M.","year":"1990"}],"container-title":["Systems and Computers in 
Japan"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fscj.4690240808","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/scj.4690240808","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,25]],"date-time":"2023-10-25T04:56:20Z","timestamp":1698209780000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/scj.4690240808"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1993,1]]},"references-count":17,"journal-issue":{"issue":"8","published-print":{"date-parts":[[1993,1]]}},"alternative-id":["10.1002\/scj.4690240808"],"URL":"https:\/\/doi.org\/10.1002\/scj.4690240808","archive":["Portico"],"relation":{},"ISSN":["0882-1666","1520-684X"],"issn-type":[{"value":"0882-1666","type":"print"},{"value":"1520-684X","type":"electronic"}],"subject":[],"published":{"date-parts":[[1993,1]]}}}