{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T19:57:10Z","timestamp":1776887830607,"version":"3.51.2"},"reference-count":15,"publisher":"World Scientific Pub Co Pte Ltd","issue":"05","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Patt. Recogn. Artif. Intell."],"published-print":{"date-parts":[[2000,8]]},"abstract":"<jats:p>Lip reading provides useful information in speech perception and language understanding, especially when the auditory speech is degraded. However, many current automatic lip reading systems impose some restrictions on users. In this paper, we present our research efforts in the Interactive System Laboratory, towards unrestricted lip reading. We first introduce a top\u2013down approach to automatically track and extract lip regions. This technique makes it possible to acquire visual information in real-time without limiting the user's freedom of movement. We then discuss normalization algorithms to preprocess images for different lighting conditions (global illumination and side illumination). We also compare different visual preprocessing methods such as raw image, Linear Discriminant Analysis (LDA), and Principal Component Analysis (PCA). We demonstrate the feasibility of the proposed methods by the development of a modular system for flexible human\u2013computer interaction via both visual and acoustic speech. The system is based on an extension of the existing state-of-the-art speech recognition system, a modular Multiple State\u2013Time Delayed Neural Network (MS\u2013TDNN) system. We have developed adaptive combination methods at several different levels of the recognition network. The system can automatically track a speaker and extract his\/her lip region in real-time. The system has been evaluated under different noisy conditions such as white noise, music, and mechanical noise. 
The experimental results indicate that the system can achieve up to 55% error reduction using visual information in addition to the acoustic signal.<\/jats:p>","DOI":"10.1142\/s0218001400000374","type":"journal-article","created":{"date-parts":[[2002,7,27]],"date-time":"2002-07-27T11:03:21Z","timestamp":1027767801000},"page":"571-585","source":"Crossref","is-referenced-by-count":25,"title":["TOWARDS UNRESTRICTED LIP READING"],"prefix":"10.1142","volume":"14","author":[{"given":"UWE","family":"MEIER","sequence":"first","affiliation":[{"name":"Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, USA"},{"name":"University of Karlsruhe, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"RAINER","family":"STIEFELHAGEN","sequence":"additional","affiliation":[{"name":"Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, USA"},{"name":"University of Karlsruhe, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"JIE","family":"YANG","sequence":"additional","affiliation":[{"name":"Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, USA"},{"name":"University of Karlsruhe, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"ALEX","family":"WAIBEL","sequence":"additional","affiliation":[{"name":"Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, USA"},{"name":"University of Karlsruhe, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2011,11,21]]},"reference":[{"key":"p_1","doi-asserted-by":"crossref","first-page":"547","DOI":"10.21437\/ICSLP.1994-139","author":"Duchnowski P.","year":"1994","journal-title":"Int. Conf. Spoken Language Processing, ICSLP"},{"key":"p_2","first-page":"575","author":"Goldschen A. J.","year":"1994","journal-title":"28th Annual Asimolar Conf. 
Signal Speech and Computers"},{"key":"p_4","first-page":"578","author":"Hennecke M. E.","year":"1994","journal-title":"28th Annual Asimolar Conf. Signal Speech and Computers"},{"key":"p_5","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1993.319284"},{"key":"p_6","doi-asserted-by":"crossref","first-page":"1481","DOI":"10.21437\/Eurospeech.1993-296","author":"Hild H.","year":"1993","journal-title":"3rd European Conf. Speech, Communication and Technology (EUROSPEECH 93)"},{"key":"p_8","doi-asserted-by":"publisher","DOI":"10.1002\/scj.4690220607"},{"key":"p_9","doi-asserted-by":"publisher","DOI":"10.1038\/264746a0"},{"key":"p_13","first-page":"833","volume":"2","author":"Meier U.","year":"1996","journal-title":"Proc. ICASSP"},{"key":"p_14","first-page":"851","volume":"94","author":"Movellan J. R.","year":"1994","journal-title":"NIPS"},{"key":"p_16","first-page":"265","author":"Petajan E. D.","year":"1984","journal-title":"Proc. IEEE Communications Society Global Telecommunications Conf."},{"key":"p_20","first-page":"561","author":"Silsbee P. L.","year":"1994","journal-title":"28th Annual Asimolar Conf. Signal Speech and Computers"},{"key":"p_21","doi-asserted-by":"crossref","first-page":"2007","DOI":"10.21437\/Eurospeech.1997-532","volume":"97","author":"Stiefelhagen R.","year":"1997","journal-title":"Eurospeech"},{"key":"p_22","first-page":"289","volume":"2","author":"Stork D. G.","year":"1992","journal-title":"IJCNN"},{"key":"p_23","first-page":"3","volume":"37","author":"Waibel A.","year":"1989","journal-title":"IEEE Trans. Acoust. Speech Sign. Process."},{"key":"p_24","first-page":"142","volume":"96","author":"Yang J.","year":"1996","journal-title":"Proc. 
WACV"}],"container-title":["International Journal of Pattern Recognition and Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218001400000374","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,6]],"date-time":"2024-01-06T01:49:57Z","timestamp":1704505797000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0218001400000374"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2000,8]]},"references-count":15,"journal-issue":{"issue":"05","published-online":{"date-parts":[[2011,11,21]]},"published-print":{"date-parts":[[2000,8]]}},"alternative-id":["10.1142\/S0218001400000374"],"URL":"https:\/\/doi.org\/10.1142\/s0218001400000374","relation":{},"ISSN":["0218-0014","1793-6381"],"issn-type":[{"value":"0218-0014","type":"print"},{"value":"1793-6381","type":"electronic"}],"subject":[],"published":{"date-parts":[[2000,8]]}}}