{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,21]],"date-time":"2025-04-21T04:42:58Z","timestamp":1745210578428},"reference-count":35,"publisher":"Elsevier BV","issue":"1-2","license":[{"start":{"date-parts":[[1998,10,1]],"date-time":"1998-10-01T00:00:00Z","timestamp":907200000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.elsevier.com\/tdm\/userlicense\/1.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Speech Communication"],"published-print":{"date-parts":[[1998,10]]},"DOI":"10.1016\/s0167-6393(98)00056-9","type":"journal-article","created":{"date-parts":[[2003,4,5]],"date-time":"2003-04-05T03:57:58Z","timestamp":1049515078000},"page":"149-161","source":"Crossref","is-referenced-by-count":34,"title":["Adaptive fusion of acoustic and visual sources for automatic speech recognition"],"prefix":"10.1016","volume":"26","author":[{"given":"Alexandrina","family":"Rogozan","sequence":"first","affiliation":[]},{"given":"Paul","family":"Del\u00e9glise","sequence":"additional","affiliation":[]}],"member":"78","reference":[{"key":"10.1016\/S0167-6393(98)00056-9_BIB1","unstructured":"Abry, C., Lallouache, M.T., 1991. Audibility and stability of articulatory movements: Deciphering two experiments on anticipatory rounding in French. In: Proc. of the XIIth Internat. Congress of Phon. Sci., 1991, Aix-en-Provence, France, pp. 220\u2013225"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB2","doi-asserted-by":"crossref","unstructured":"Adjoudani, A., Beno\u00eet, C., 1996. On the integration of auditory and visual parameters in an HMM-based ASR. In: Stork, D., Hennecke, M. (Eds.), Speechreading by Humans and Machines, NATO ASI Series, Series F: Computer and Systems Science, Vol. 150, pp. 
461\u2013473","DOI":"10.1007\/978-3-662-13015-5_35"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB3","doi-asserted-by":"crossref","unstructured":"Alissali, M., Del\u00e9glise, P., Rogozan, A., 1996. Asynchronous integration of visual information in an automatic speech recognition system. In: Proc. of Internat. Conf. on Spoken Language Processing, Philadelphia, USA, 3\u20136 October 1996, pp. 34\u201337","DOI":"10.1109\/ICSLP.1996.607018"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB4","unstructured":"Andr\u00e9-Obrecht, R., Jacob, B., Parlangeau, N., 1997. Audio-visual speech recognition and segmental master\u2013slave HMM. In: Beno\u00eet, C., Campbell, R. (Eds.), Proc. of The ESCA\/ESCOP Workshop on Audio-Visual Speech Processing, Rhodes, Greece, 26\u201327 September 1997, pp. 49\u201352"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB5","first-page":"1","article-title":"An inequality and associated maximization technique in statistical estimation of probabilistic functions of Markov processes","volume":"3","author":"Baum","year":"1972","journal-title":"Inequalities"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB6","first-page":"32","article-title":"The intrinsic bimodality of speech communication and the synthesis of talking faces","volume":"43","author":"Beno\u00eet","year":"1992","journal-title":"HIRADA TECHNIKA XLIII Journal on Communications"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB7","unstructured":"Beno\u00eet, C., Lallouache, T., Mohamadi, M.T., Abry, C., 1992. A set of French visemes for visual speech synthesis. In: Bailly, G., Beno\u00eet, C. (Eds.), Talking Machines: Theories, Models, and Designs. Elsevier, Amsterdam, pp. 
485\u2013504"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB8","doi-asserted-by":"crossref","first-page":"1195","DOI":"10.1044\/jshr.3705.1195","article-title":"Effects of phonetic context on audio-visual intelligibility of French","volume":"37","author":"Beno\u00eet","year":"1994","journal-title":"Journal of Speech and Hearing Research"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB9","doi-asserted-by":"crossref","unstructured":"Brooke, M., 1996. Talking heads and speech recognisers that can see: The computer processing of visual speech signals. In: Stork, D., Hennecke, M. (Eds.), Speechreading by Humans and Machines, NATO ASI Series, Series F: Computer and Systems Science, Vol. 150, pp. 351\u2013372","DOI":"10.1007\/978-3-662-13015-5_26"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB10","unstructured":"Caelen, J., Kabre, H., Delemar, O., 1996. Reconnaissance de la parole: Vers l'utilisabilit\u00e9. In: Actes des XXIes Journ\u00e9es d'Etude sur la Parole, Avignon, 10\u201314 June 1996, pp. 325\u2013329"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB11","unstructured":"Chen, J.-S., Garcia, O., 1997. Challenges in the fusion of video and audio for robust speech recognition. In: Proc. of American Association of Artificial Intelligence Spring Symposium, USA, pp. 57\u201360"},{"issue":"4","key":"10.1016\/S0167-6393(98)00056-9_BIB12","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1145\/360924.360971","article-title":"Scalar and planar valued curve fitting using splines under tension","volume":"17","author":"Cline","year":"1974","journal-title":"Communications of the ACM"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB13","first-page":"481","article-title":"Confusions among visually perceived consonants","volume":"40","author":"Fisher","year":"1968","journal-title":"Journal of Speech and Hearing Disorders"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB14","unstructured":"Goldschen, J., 1993. Continuous automatic speech recognition by lipreading. Ph.D. 
Dissertation, George Washington University"},{"issue":"3","key":"10.1016\/S0167-6393(98)00056-9_BIB15","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1016\/0167-6393(94)00059-J","article-title":"Speech recognition in noisy environments: A survey","volume":"16","author":"Gong","year":"1995","journal-title":"Speech Communication"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB16","doi-asserted-by":"crossref","unstructured":"Hennecke, M., Stork, D., Prasad, V., 1996. Visionary speech: Looking ahead to practical speechreading systems. In: Stork, D., Hennecke, M. (Eds.), Speechreading by Humans and Machines, NATO ASI Series, Series F: Computer and Systems Science, Vol. 150, pp. 331\u2013351","DOI":"10.1007\/978-3-662-13015-5_25"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB17","unstructured":"Jourlin, P., 1996. Handling the disynchronization phenomena with HMM in connected speech. In: Proc. of VIII European Signal Processing Conference, Trieste, Italy, September 1996, pp. 133\u2013136"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB18","unstructured":"Jourlin, P., 1997. Word-dependent acoustic-labial weights in HMM-based speech recognition. In: Beno\u00eet, C., Campbell, R. (Eds.), Proc. of The ESCA\/ESCOP Workshop on Audio-Visual Speech Processing, Rhodes, Greece, 26\u201327 September 1997, pp. 69\u201372"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB19","doi-asserted-by":"crossref","unstructured":"Kohonen, T., 1988. Self-Organization and Associative Memory. Springer, Berlin","DOI":"10.1007\/978-3-662-00784-6"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB20","doi-asserted-by":"crossref","unstructured":"Kricos, P.B., 1996. Differences in visual intelligibility across talkers. In: Stork, D., Hennecke, M. (Eds.), Speechreading by Humans and Machines, NATO ASI Series, Series F: Computer and Systems Science, Vol. 150, pp. 43\u201355","DOI":"10.1007\/978-3-662-13015-5_4"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB21","unstructured":"Lallouache, T., 1990. 
Un poste visage-parole: Acquisition et traitement des contours labiaux. In: Actes des Journ\u00e9es d'Etudes sur la Parole, Montr\u00e9al"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB22","unstructured":"Lee, K.-F., 1992. Automatic Speech Recognition. The Development of the SPHINX System. Kluwer Academic Publishers, Boston, MA"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB23","unstructured":"Massaro, D.W., Cohen, M.M., 1995. Modelling the perception of bimodal speech. In: Proc. of The XIIIth Internat. Congress of Phon. Sci., Stockholm, pp. 106\u2013113"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB24","doi-asserted-by":"crossref","unstructured":"Meier, U., H\u00fcrst, W., Duchnowski, P., 1996. Adaptive bimodal sensor fusion for automatic speechreading. In: Proc. of Internat. Conf. on Acoustics, Speech and Signal Processing","DOI":"10.1109\/ICASSP.1996.543250"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB25","unstructured":"Movellan, J.R., Mineiro, P., 1997. Modularity and catastrophic fusion: A Bayesian approach with applications to audio-visual speech recognition. Advances in Neural Information Processing Systems 10"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB26","unstructured":"Reisberg, D., McLean, J., Goldfield, A., 1987. Easy to hear but hard to understand: A lip-reading advantage with intact auditory stimuli. In: Dodd, B., Campbell, R. (Eds.), Hearing by Eye: The Psychology of Lip-Reading. Lawrence Erlbaum, Hillsdale, NJ, pp. 97\u2013113"},{"issue":"4\u20135","key":"10.1016\/S0167-6393(98)00056-9_BIB27","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1007\/BF00849043","article-title":"A comparison of models for fusion of the auditory and visual sensors in speech perception","volume":"9","author":"Robert-Ribes","year":"1995","journal-title":"Artificial Intelligence Review"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB28","doi-asserted-by":"crossref","unstructured":"Robert-Ribes, J., Piquemal, M., Schwartz, J.-L., Escudier, P., 1996. 
Exploiting sensor fusion architectures and stimuli complementarity in AV speech recognition. In: Stork, D., Hennecke, M. (Eds.), Speechreading by Humans and Machines, NATO ASI Series, Series F: Computer and Systems Science, Vol. 150, pp. 193\u2013210","DOI":"10.1007\/978-3-662-13015-5_14"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB29","doi-asserted-by":"crossref","unstructured":"Rogozan, A., Del\u00e9glise, P., 1997. Continuous visual speech recognition using geometric lip-shape models and neural networks. In: Proc. of the Fifth Conf. on Speech Communication and Technology, Rhodes, Greece, 26\u201327 September 1997, pp. 1999\u20132003","DOI":"10.21437\/Eurospeech.1997-530"},{"issue":"5","key":"10.1016\/S0167-6393(98)00056-9_BIB30","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1109\/89.536928","article-title":"Computer lip-reading for improved accuracy in automatic speech recognition","volume":"4","author":"Silsbee","year":"1996","journal-title":"IEEE Transactions on Speech and Audio Processing"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB31","doi-asserted-by":"crossref","unstructured":"Suaudeau, N., Andr\u00e9-Obrecht, R., 1993. Sound duration modeling and time variable speaking rate in a speech recognition system. In: Proc. of Eurospeech'93, pp. 307\u2013310","DOI":"10.21437\/Eurospeech.1993-96"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB32","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1159\/000259969","article-title":"Use of visual information for phonetic perception","volume":"36","author":"Summerfield","year":"1979","journal-title":"Phonetica"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB33","unstructured":"Summerfield, A.Q., 1987. Some preliminaries to a comprehensive account of audio-visual speech perception. In: Dodd, B., Campbell, R. (Eds.), Hearing by Eye: The Psychology of Lip-Reading. Lawrence Erlbaum, Hillsdale, NJ, pp. 
3\u201351"},{"key":"10.1016\/S0167-6393(98)00056-9_BIB34","doi-asserted-by":"crossref","unstructured":"Teissier, P., Robert-Ribes, J., Schwartz, J.-L., Gu\u00e9rin-Dugu\u00e9, A., 1997. Models for audio-visual fusion in a noisy recognition task. In: Proc. of The First IEEE Workshop on Multimedia Signal Processing, Princeton, New Jersey, USA, 23\u201325 June 1997, pp. 37\u201344","DOI":"10.1109\/MMSP.1997.602610"},{"issue":"2","key":"10.1016\/S0167-6393(98)00056-9_BIB35","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1109\/TIT.1967.1054010","article-title":"Error bounds for convolutional codes and an asymptotically optimum decoding algorithm","volume":"IT-13","author":"Viterbi","year":"1967","journal-title":"IEEE Transactions on Information Theory"}],"container-title":["Speech Communication"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.elsevier.com\/content\/article\/PII:S0167639398000569?httpAccept=text\/xml","content-type":"text\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/api.elsevier.com\/content\/article\/PII:S0167639398000569?httpAccept=text\/plain","content-type":"text\/plain","content-version":"vor","intended-application":"text-mining"}],"deposited":{"date-parts":[[2023,4,8]],"date-time":"2023-04-08T19:13:48Z","timestamp":1680981228000},"score":1,"resource":{"primary":{"URL":"https:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S0167639398000569"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1998,10]]},"references-count":35,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[1998,10]]}},"alternative-id":["S0167639398000569"],"URL":"https:\/\/doi.org\/10.1016\/s0167-6393(98)00056-9","relation":{},"ISSN":["0167-6393"],"issn-type":[{"value":"0167-6393","type":"print"}],"subject":[],"published":{"date-parts":[[1998,10]]}}}