{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,8]],"date-time":"2026-06-08T21:57:33Z","timestamp":1780955853350,"version":"3.54.1"},"reference-count":88,"publisher":"Wiley","issue":"6","license":[{"start":{"date-parts":[[2005,11,1]],"date-time":"2005-11-01T00:00:00Z","timestamp":1130803200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Cognitive Science"],"published-print":{"date-parts":[[2005,11,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational\u2010level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken\u2010word recognition, built using techniques from the field of ASR that, in contrast to current existing models of HSR, recognizes words from real speech input.<\/jats:p>","DOI":"10.1207\/s15516709cog0000_37","type":"journal-article","created":{"date-parts":[[2005,12,1]],"date-time":"2005-12-01T15:08:54Z","timestamp":1133449734000},"page":"867-918","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":34,"title":["How Should a Speech Recognizer Work?"],"prefix":"10.1111","volume":"29","author":[{"given":"Odette","family":"Scharenborg","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Dennis","family":"Norris","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Louis","family":"ten Bosch","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"James M.","family":"McQueen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"311","published-online":{"date-parts":[[2005,11]]},"reference":[{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1006\/jmla.1997.2558"},{"key":"e_1_2_1_3_1","volume-title":"The adaptive character of thought.","author":"Anderson J. R.","year":"1990"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(94)90042-6"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.3758\/BF03210424"},{"key":"e_1_2_1_6_1","first-page":"59","volume-title":"Weighting phone confidence measures for automatic speech recognition","author":"Bouwman G.","year":"2000"},{"key":"e_1_2_1_7_1","volume-title":"CELEX: A guide for users.","author":"Burnage G.","year":"1990"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(87)90004-7"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.16.3.551"},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","first-page":"1679","DOI":"10.21437\/Eurospeech.2001-393","volume-title":"Proceedings of Eurospeech","author":"Cucchiarini C.","year":"2001"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9280.00447"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.14.1.113"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.28.1.218"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","first-page":"1973","DOI":"10.21437\/Eurospeech.2003-570","volume-title":"Proceedings of Eurospeech","author":"Demuynck K.","year":"2003"},{"key":"e_1_2_1_15_1","first-page":"360","volume-title":"Invariance and variability of speech processes","author":"Elman J. L.","year":"1986"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4613-1367-0_2"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-0277(03)00070-2"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1080\/016909697386646"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.105.2.251"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.28.1.163"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.21.2.344"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6393(99)00059-X"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.93.4.411"},{"key":"e_1_2_1_24_1","unstructured":"Hirose K. Minematsu N. Hashimoto Y. &Iwano K.(2001).Continuous speech recognition of Japanese using prosodic word boundaries detected by mora transition modeling of fundamental frequency contours. InProceedings of the Workshop on Prosody in Automatic Speech Recognition and Understanding(pp.61\u201366). Red Bank NJ."},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","first-page":"2699","DOI":"10.21437\/Eurospeech.1999-578","volume-title":"Proceedings of Eurospeech","author":"H\u00f6ge H.","year":"1999"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(89)90020-8"},{"key":"e_1_2_1_27_1","volume-title":"Statistical methods for speech recognition.","author":"Jelinek F.","year":"1997"},{"issue":"8","key":"e_1_2_1_28_1","article-title":"Spoken language processing [Special issue]","volume":"88","author":"Juang B. H.","year":"2000","journal-title":"Proceedings of the IEEE"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1207\/s15516709cog2002_1"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0095-4470(19)31059-9"},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","first-page":"169","DOI":"10.7551\/mitpress\/4213.003.0010","volume-title":"Lexical representation and process","author":"Klatt D. H.","year":"1989"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1975.1162648"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.3758\/BF03212485"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.3758\/BF03212113"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1097\/00003446-199802000-00001"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511753459"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1207\/s15516709cog2702_6"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.880087"},{"key":"e_1_2_1_39_1","volume-title":"Vision: A computational investigation into the human representation and processing of visual information.","author":"Marr D.","year":"1982"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.101.4.653"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(87)90005-9"},{"key":"e_1_2_1_42_1","first-page":"148","volume-title":"Cognitive models of speech processing: In Psycholinguistic and computational perspectives","author":"Marslen\u2010Wilson W. D.","year":"1990"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0285(78)90018-X"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0285(86)90015-0"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1006\/jmla.1998.2568"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1207\/s15516709cog2705_6"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.4135\/9781848608177.n11"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1080\/01690969508407098"},{"key":"e_1_2_1_49_1","unstructured":"McQueen J. M. Cutler A. &Norris D.(2005).The mental lexicon is not episodic: A belated reply to Goldinger (1998).Manuscript in preparation."},{"key":"e_1_2_1_50_1","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1515\/9783110895094.39","volume-title":"Phonetics and phonology in language comprehension and production: Differences and similarities","author":"McQueen J.M.","year":"2003"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1037\/0278-7393.20.3.621"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.25.5.1363"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.400725"},{"key":"e_1_2_1_54_1","first-page":"145","volume-title":"Proceedings of the Workshop on Speech Recognition as Pattern Classification","author":"Moore R. K.","year":"2001"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4613-1367-0_16"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(82)90007-5"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-0277(86)90001-6"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(94)90043-4"},{"key":"e_1_2_1_59_1","first-page":"331","volume-title":"Twenty\u2010first century psycholinguistics: Four cornerstones","author":"Norris D.","year":"2005"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1037\/0278-7393.21.5.1209"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0140525X00003241"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-0285(03)00006-9"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1006\/cogp.1997.0671"},{"key":"e_1_2_1_64_1","unstructured":"Norris D. McQueen J. M. &Smits R.(2004).Shortlist II: A Bayesian model of continuous speech recognition.Manuscript in preparation."},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.414456"},{"key":"e_1_2_1_66_1","doi-asserted-by":"crossref","unstructured":"Paul D. B.(1992).An efficient A* stack decoder algorithm for continuous speech recognition with a stochastic language model. InProceedings of the IEEE International Conference on Acoustics Speech and Signal Processing(pp.25\u201328).","DOI":"10.1109\/ICASSP.1992.225981"},{"key":"e_1_2_1_67_1","volume-title":"Invariance and variability of speech processes.","author":"Perkell J. S.","year":"1986"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1006\/jmla.1998.2571"},{"key":"e_1_2_1_69_1","volume-title":"Fundamentals of speech processing.","author":"Rabiner L.","year":"1993"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-0277(03)00139-2"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9280.00364"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0749-596X(02)00514-4"},{"key":"e_1_2_1_73_1","unstructured":"Scharenborg O. &Boves L.(2002).Pronunciation variation modelling in a model of human word recognition. InProceedings of Workshop on Pronunciation Modeling and Lexicon Adaptation(pp.65\u201370). Estes Park CO."},{"key":"e_1_2_1_74_1","doi-asserted-by":"crossref","first-page":"2097","DOI":"10.21437\/Eurospeech.2003-606","volume-title":"Proceedings of Eurospeech","author":"Scharenborg O.","year":"2003"},{"key":"e_1_2_1_75_1","first-page":"61","volume-title":"Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop","author":"Scharenborg O.","year":"2003"},{"key":"e_1_2_1_76_1","first-page":"2285","volume-title":"Proceedings of Eurospeech","author":"Scharenborg O.","year":"2003"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.1624065"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0749-596X(02)00513-2"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.1458026"},{"key":"e_1_2_1_80_1","first-page":"361","volume-title":"Proceedings of ICSLP","author":"Sturm J.","year":"2000"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1006\/jmla.1995.1020"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.26.2.758"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9280.00064"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1006\/jmla.1998.2618"},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1037\/0096-1523.21.1.98"},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.880085"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1109\/89.906002"},{"key":"e_1_2_1_88_1","first-page":"11","volume-title":"Proceedings of the ISCA Workshop on Adaptation Methods for Speech Recognition","author":"Woodland P. C.","year":"2001"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(89)90013-9"}],"container-title":["Cognitive Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1207%2Fs15516709cog0000_37","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1207\/s15516709cog0000_37","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T16:06:20Z","timestamp":1729958780000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1207\/s15516709cog0000_37"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,11]]},"references-count":88,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2005,11,12]]}},"alternative-id":["10.1207\/s15516709cog0000_37"],"URL":"https:\/\/doi.org\/10.1207\/s15516709cog0000_37","archive":["Portico"],"relation":{},"ISSN":["0364-0213","1551-6709"],"issn-type":[{"value":"0364-0213","type":"print"},{"value":"1551-6709","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,11]]},"assertion":[{"value":"2005-11-01","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}