{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T16:02:42Z","timestamp":1761580962581,"version":"3.40.3"},"publisher-location":"Berlin, Heidelberg","reference-count":23,"publisher":"Springer Berlin Heidelberg","isbn-type":[{"type":"print","value":"9783540611233"},{"type":"electronic","value":"9783540499503"}],"license":[{"start":{"date-parts":[[1996,1,1]],"date-time":"1996-01-01T00:00:00Z","timestamp":820454400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[1996]]},"DOI":"10.1007\/3-540-61123-1_154","type":"book-chapter","created":{"date-parts":[[2012,2,26]],"date-time":"2012-02-26T21:17:11Z","timestamp":1330291031000},"page":"376-387","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["Real-time lip tracking for audio-visual speech recognition applications"],"prefix":"10.1007","author":[{"given":"Robert","family":"Kaucic","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Barney","family":"Dalton","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrew","family":"Blake","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2005,6,2]]},"reference":[{"key":"32_CR1","doi-asserted-by":"crossref","unstructured":"A. Adjoudani and C. Benoit. On the integration of auditory and visual parameters in an HMM-based ASR. In Proceedings NATO ASI Conference on Speechreading by Man and Machine: Models, Systems and Applications. NATO Scientific Affairs Division, Sep 1995.","DOI":"10.1007\/978-3-662-13015-5_35"},{"issue":"2","key":"32_CR2","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1007\/BF01469225","volume":"11","author":"A. Blake","year":"1993","unstructured":"A. Blake, R. Curwen, and A. Zisserman. A framework for spatio-temporal control in the tracking of visual contours. Int. Journal of Computer Vision, 11(2):127\u2013145, 1993.","journal-title":"Int. Journal of Computer Vision"},{"key":"32_CR3","doi-asserted-by":"crossref","unstructured":"A. Blake and M.A. Isard. 3D position, attitude and shape input using video tracking of hands and lips. In Proc. Siggraph, pp. 185\u2013192. ACM, 1994.","DOI":"10.1145\/192161.192197"},{"key":"32_CR4","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1016\/0004-3702(95)00032-1","volume":"78","author":"A. Blake","year":"1995","unstructured":"A. Blake, M.A. Isard, and D. Reynard. Learning to track the visual motion of contours. Artificial Intelligence, 78:101\u2013134, 1995.","journal-title":"Artificial Intelligence"},{"key":"32_CR5","doi-asserted-by":"crossref","unstructured":"C. Bregler and Y. Konig. Eigenlips for robust speech recognition. In Proc. Int. Conf. on Acoust., Speech, Signal Processing, pp. 669\u2013672, Adelaide, 1994.","DOI":"10.1109\/ICASSP.1994.389567"},{"key":"32_CR6","doi-asserted-by":"crossref","unstructured":"C. Bregler and S.M. Omohundro. Nonlinear manifold learning for visual speech recognition. In Proc. 5th Int. Conf. on Computer Vision, pp. 494\u2013499, Boston, Jun 1995.","DOI":"10.1109\/ICCV.1995.466899"},{"issue":"1","key":"32_CR7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/89.365385","volume":"3","author":"R. Cole","year":"1995","unstructured":"R. Cole, L. Hirschmann, L. Atlas, et al. The challenge of spoken language systems: Research directions for the nineties. IEEE Trans. on Speech and Audio Processing, 3(1):1\u201320, 1995.","journal-title":"IEEE Trans. on Speech and Audio Processing"},{"key":"32_CR8","doi-asserted-by":"crossref","unstructured":"B. Dalton, R. Kaucic, and A. Blake. Automatic Speechreading using dynamic contours. In Proceedings NATO ASI Conference on Speechreading by Man and Machine: Models, Systems and Applications. NATO Scientific Affairs Division, Sep 1995.","DOI":"10.1007\/978-3-662-13015-5_27"},{"key":"32_CR9","unstructured":"B. Dodd and R. Campbell. Hearing By Eye: The Psychology of Lip Reading. Erlbaum, 1987."},{"issue":"3","key":"32_CR10","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1016\/0167-8655(88)90094-3","volume":"8","author":"E. K. Finn","year":"1988","unstructured":"E. K. Finn and A. A. Montgomery. Automatic optically based recognition of speech. Pattern Recognition Letters, 8(3):159\u2013164, 1988.","journal-title":"Pattern Recognition Letters"},{"key":"32_CR11","doi-asserted-by":"crossref","unstructured":"M.J.F. Gales and S. Young. An improved approach to the Hidden Markov Model decomposition of speech and noise. In Proc. Int. Conf. on Acoust., Speech, Signal Processing, pp. 233\u2013239, San Franciso, Mar 1992.","DOI":"10.1109\/ICASSP.1992.225929"},{"issue":"3","key":"32_CR12","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1016\/0167-6393(94)90067-1","volume":"14","author":"M.W. Mak","year":"1994","unstructured":"M.W. Mak and W.G. Allen. Lip-motion analysis for speech segmentation in noise. Speech Communication, 14(3):279\u2013296, 1994.","journal-title":"Speech Communication"},{"key":"32_CR13","doi-asserted-by":"crossref","unstructured":"Y. Moses, D. Reynard, and A. Blake. Determining facial expressions in real-time. In Proc. 5th Int. Conf. on Computer Vision, pp. 296\u2013301, Boston, Jun 1995.","DOI":"10.1109\/ICCV.1995.466926"},{"key":"32_CR14","doi-asserted-by":"crossref","unstructured":"J.P. Openshaw and J.S. Mason. A review of robust techniques for the analysis of degraded speech. In Proc. IEEE Region 10 Conf. on Comp., Control, and Power Engr., pp. 329\u2013332, 1993.","DOI":"10.1109\/TENCON.1993.327989"},{"key":"32_CR15","doi-asserted-by":"crossref","unstructured":"E.D. Petajan, N.M. Brooke, B.J. Bischofy, and D.A. Bodoff. An improved automatic lipreading system to enhance speech recognition. In E. Soloway, D. Frye, and S.B. Sheppard, editors, Proc. Human Factors in Computing Systems, pp. 19\u201325. ACM, 1988.","DOI":"10.1145\/57167.57170"},{"key":"32_CR16","unstructured":"L. Rabiner and J. Bing-Hwang. Fundamentals of speech recognition. Prentice-Hall, 1993."},{"key":"32_CR17","unstructured":"D. Reisberg, J. McLean, and A. Goldfield. Easy to hear but hard to understand: A lip-reading advantage with intact auditory stimuli. In B. Dodd and R. Campbell, editors, Hearing By Eye: The Psychology of Lip Reading, pp. 97\u2013113. Erlbaum, 1987."},{"key":"32_CR18","doi-asserted-by":"crossref","unstructured":"D. Reynard, A. Wildenberg, A. Blake, and J. Marchant. Learning dynamics of complex motions from image sequences. In Proc. 4th European Conf. on Computer Vision, Cambridge, England, Apr 1996.","DOI":"10.1007\/BFb0015550"},{"key":"32_CR19","first-page":"289","volume":"2","author":"D.G. Stork","year":"1992","unstructured":"D.G. Stork, G. Wolff, and E. Levine. Neural network lipreading system for improved speech recognition. In Proceedings International Joint Conference on Neural Networks, volume 2, pp. 289\u2013295, 1992.","journal-title":"Proceedings International Joint Conference on Neural Networks"},{"key":"32_CR20","doi-asserted-by":"crossref","unstructured":"Q. Summerfield, A. MacLeod, M. McGrath, and M. Brooke. Lips, teeth and the benefits of lipreading. In A.W. Young and H.D. Ellis, editors, Handbook of Research on Face Processing, pp. 223\u2013233. Elsevier Science Publishers, 1989.","DOI":"10.1016\/B978-0-444-87143-5.50019-6"},{"key":"32_CR21","doi-asserted-by":"crossref","unstructured":"A.P. Varga and R.K. Moore. Hidden Markov Model decomposition of speech and noise. In Proc. Int. Conf. on Acoust., Speech, Signal Processing, pp. 845\u2013848, 1990.","DOI":"10.1109\/ICASSP.1990.115970"},{"issue":"10","key":"32_CR22","doi-asserted-by":"crossref","first-page":"1658","DOI":"10.1109\/5.58349","volume":"78","author":"B.P. Yuhas","year":"1990","unstructured":"B.P. Yuhas, M.H. Goldstein, T.J. Sejnowski, and R.E. Jenkins. Neural network models of sensory integration for improved vowel recognition. Proceedings of the IEEE, 78(10):1658\u20131668, 1990.","journal-title":"Proceedings of the IEEE"},{"key":"32_CR23","unstructured":"V. Zue, J. Glass, D. Goodine, L. Hirschman, H. Leung, M. Phillips, J. Polifroni, and S. Seneff. From speech recognition to spoken language understanding: The development of the MIT SUMMIT and VOYAGER systems. In R.P. Lippman, J.E. Moody, and D.S. Touretzky, editors, Advances in Neural Information Processing 3, pp. 255\u2013261. Morgan Kaufman, 1991."}],"container-title":["Lecture Notes in Computer Science","Computer Vision \u2014 ECCV '96"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/3-540-61123-1_154","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,21]],"date-time":"2025-03-21T23:14:15Z","timestamp":1742598855000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/3-540-61123-1_154"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1996]]},"ISBN":["9783540611233","9783540499503"],"references-count":23,"URL":"https:\/\/doi.org\/10.1007\/3-540-61123-1_154","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[1996]]},"assertion":[{"value":"2 June 2005","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}}]}}