{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T21:10:01Z","timestamp":1737234601676,"version":"3.33.0"},"reference-count":21,"publisher":"Wiley","issue":"2","license":[{"start":{"date-parts":[[2007,3,21]],"date-time":"2007-03-21T00:00:00Z","timestamp":1174435200000},"content-version":"vor","delay-in-days":4097,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Systems & Computers in Japan"],"published-print":{"date-parts":[[1996,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>This study aims at the realization of a speaker\u2010independent speech recognition system based on the speaker adaptation with a supervisor. This paper describes the highly accurate speaker\u2010adaptation technique using a small number of training samples. When a small number of speech samples are used for adaptation there arise problems that sufficient information cannot be obtained to update simultaneously a large number of model parameters, and an estimation error is included to the statistical bias of the samples.<\/jats:p><jats:p>From such a viewpoint, this paper proposes a speaker\u2010adaptation technique using the hidden Markov network (HMnet), which employs a smaller number of model parameters than the mixed continuous\u2010distributed phoneme HMM, which is independent of the phoneme context, while realizing an equal or better recognition performance. As the adaptation technique, the moving vector field smoothing (VFS) method is used. This method can realize simultaneously the interpolation for the unadapted model parameters to cope with the small number of samples and the correction of the estimation in the speaker adaptation. 
The standard speaker pre\u2010selection method also is investigated in order to improve the accuracy of the speaker adaptation.<\/jats:p>","DOI":"10.1002\/scj.4690270207","type":"journal-article","created":{"date-parts":[[2007,7,8]],"date-time":"2007-07-08T10:27:22Z","timestamp":1183890442000},"page":"75-86","source":"Crossref","is-referenced-by-count":0,"title":["A speaker\u2010adaptation technique for context\u2010dependent models represented by hidden markov networks"],"prefix":"10.1002","volume":"27","author":[{"given":"Jun\u2010Ichi","family":"Takami","sequence":"first","affiliation":[]},{"given":"Shigeki","family":"Sagayama","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2007,3,21]]},"reference":[{"key":"e_1_2_1_2_2","doi-asserted-by":"crossref","unstructured":"K.Shikano K. F.LeeandR.Reddy.Speaker Adaptation through Vector Quantization.Proc. ICASSP86 pp.2643\u20132646(1986).","DOI":"10.1109\/ICASSP.1986.1168676"},{"key":"e_1_2_1_3_2","unstructured":"S.Furui.Speaker adaptation in hierarchical clustering of spectrum space.Tech. Rep. I.E.I.C.E. Japan SP88\u201021 (1988)."},{"key":"e_1_2_1_4_2","unstructured":"H.Matsumoto Y.YamashitaandS.Nishizawa.Short\u2010term speaker adaptation of the spectrum without supervisor using fuzzy objective function minimization criterion.Tech. Rep. I.E.I.C.E. Japan SP88\u2010122 (1989)."},{"key":"e_1_2_1_5_2","doi-asserted-by":"crossref","unstructured":"Y.HirataandS.Nakagawa.Evaluation of speaker adaptation by continuous output distribution HMM through Japanese phoneme recognition.Tech. Rep. I.E.I.C.E. Japan SP90\u201016 (1990).","DOI":"10.21437\/ICSLP.1990-119"},{"key":"e_1_2_1_6_2","first-page":"35","article-title":"An overview of the SPHINX speech recognition system","volume":"38","author":"Lee K. F.","year":"1990","journal-title":"IEEE Trans."},{"key":"e_1_2_1_7_2","doi-asserted-by":"crossref","unstructured":"Y. L.Chow M. O.Donham O. A.Kimball M. A.Krasner G. F.Kubala J. 
M.Makhjoul P. J.Price S.RoucosandR. M.Schwartz.BYBLOS: The BBN Continuous Speech Recognition System.Proc. ICASSP'87 pp.89\u201392(1987).","DOI":"10.1109\/ICASSP.1987.1169748"},{"key":"e_1_2_1_8_2","doi-asserted-by":"crossref","unstructured":"A.Nagai J.TakamiandS.Sagayama.SSS\u2010LR continuous speech recognition system combining successive state\u2010splitting (SSS) and phoneme\u2010context\u2010dependent LR parser.Tech. Rep. I.E.I.C.E. Japan SP92\u201033 (1992).","DOI":"10.21437\/ICSLP.1992-192"},{"key":"e_1_2_1_9_2","unstructured":"S.HayamiandK.Tanaka.Prediction and evaluation of acoustic fluctuation in unknown phoneme context by tree\u2010structured phoneme model.Tech. Rep. I.E.I.C.E. Japan SP90\u201064 (1990)."},{"key":"e_1_2_1_10_2","doi-asserted-by":"crossref","unstructured":"K. F.Lee S.Hayamizu H. W.Hon C.Huang J.SwartzandR.Weide.Allophone Clustering for Continuous Speech Recognition.Proc. Rep. ICASSP'90 pp.749\u2013752(1990).","DOI":"10.1109\/ICASSP.1990.115900"},{"key":"e_1_2_1_11_2","unstructured":"S.Sagayama.Principle and algorithm of phoneme environment clustering.Tech. Rep. I.E.I.C.E. Japan SP87\u201086 (1987)."},{"key":"e_1_2_1_12_2","doi-asserted-by":"crossref","unstructured":"R.Schwartz Y.Chow O.Kimball S.Roucos M.KrasnerandJ.Makhoul.Context\u2010Dependent Modeling for Acoustic\u2010Phonetic Recognition of Continuous Speech.Proc. ICASSP'85 pp.1205\u20131208(1985).","DOI":"10.1109\/ICASSP.1985.1168283"},{"key":"e_1_2_1_13_2","doi-asserted-by":"crossref","unstructured":"X. D.Huang K. F.Lee H. W.HonandM. Y.Hwang.Improved Acoustic Modeling with the SPHINX Speech Recognition System.Proc. ICASSP'91 pp.345\u2013348(1991).","DOI":"10.1109\/ICASSP.1991.150347"},{"key":"e_1_2_1_14_2","doi-asserted-by":"crossref","unstructured":"S.EulerandJ.Zinke.Extending the Vocabulary of a Speaker Independent Recognition System.Proc. 
ICASSP'91 PP.301\u2013304(1991).","DOI":"10.1109\/ICASSP.1991.150336"},{"issue":"10","key":"e_1_2_1_15_2","first-page":"2155","article-title":"Automatic generation of hidden Markov network by successive state\u2010splitting technique","volume":"76","author":"Takami J.","year":"1993","journal-title":"Trans. (D\u2010II) I.E.I.C.E., Japan"},{"key":"e_1_2_1_16_2","first-page":"49","article-title":"Speaker adaptation by code\u2010book mapping using a small number of training data","volume":"1","author":"Hattori H.","year":"1991","journal-title":"Trans. Acoust. Soc. Jap."},{"key":"e_1_2_1_17_2","first-page":"191","article-title":"Moving vector field smoothing speaker adaptation technique using mixed continuous distribution HMM","volume":"2","author":"Okura K.","year":"1992","journal-title":"Trans. Acoust. Soc. Jap."},{"key":"e_1_2_1_18_2","first-page":"15","article-title":"Speaker adaptation using hidden Markov network (HMnet)","volume":"1","author":"Takami J.","year":"1992","journal-title":"Trans. Acoust. Soc. Jap."},{"key":"e_1_2_1_19_2","first-page":"121","article-title":"A study of standard speaker selection method in moving vector field smoothing speaker adaptation","volume":"2","author":"Kiyazawa Y.","year":"1992","journal-title":"Trans. Acoust. Soc. Jap."},{"key":"e_1_2_1_20_2","unstructured":"A.Nagai K.KitaandS.Sagayama.A realization algorithm for phoneme\u2010environment\u2010dependent parser in HMM\u2010LR continuous speech recognition.Tech. Rep. I.E.I.C.E. Japan SP91\u201023 (1991)."},{"key":"e_1_2_1_21_2","first-page":"155","article-title":"Generation of speaker\u2010shared hidden Markov network by successive state\u2010splitting (SSS) with speaker direction","volume":"3","author":"Takami J.","year":"1992","journal-title":"Trans. Acoust. Soc. Jap."},{"key":"e_1_2_1_22_2","doi-asserted-by":"crossref","unstructured":"C. H.LeeandJ. L.Gauvian.Speaker Adaptation Based on MAP Estimation of HMM parameters.Proc. 
ICASSP'93 pp.558\u2013561(1993).","DOI":"10.1109\/ICASSP.1993.319368"}],"container-title":["Systems and Computers in Japan"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fscj.4690270207","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/scj.4690270207","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T20:31:07Z","timestamp":1737232267000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/scj.4690270207"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1996,1]]},"references-count":21,"journal-issue":{"issue":"2","published-print":{"date-parts":[[1996,1]]}},"alternative-id":["10.1002\/scj.4690270207"],"URL":"https:\/\/doi.org\/10.1002\/scj.4690270207","archive":["Portico"],"relation":{},"ISSN":["0882-1666","1520-684X"],"issn-type":[{"type":"print","value":"0882-1666"},{"type":"electronic","value":"1520-684X"}],"subject":[],"published":{"date-parts":[[1996,1]]}}}