{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:52:02Z","timestamp":1750308722227,"version":"3.41.0"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2013,6,1]],"date-time":"2013-06-01T00:00:00Z","timestamp":1370044800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Speech Lang. Process."],"published-print":{"date-parts":[[2013,6]]},"abstract":"<jats:p>What is a multiword expression (MWE) and how many are there? Mark Liberman gave a great invited talk at ACL-89, titled \u201cHow Many Words Do People Know?\u201d where he spent the entire hour questioning the question. Many of the same questions apply to multiword expressions. What is a word? An expression? What is many? What is a person? What does it mean to know? 
Rather than answer these questions, this article will use them as Liberman did, as an excuse for surveying how such issues are addressed in a variety of fields: computer science, Web search, linguistics, lexicography, educational testing, psychology, statistics, and so on.<\/jats:p>","DOI":"10.1145\/2483691.2483693","type":"journal-article","created":{"date-parts":[[2013,7,1]],"date-time":"2013-07-01T12:27:28Z","timestamp":1372681648000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["How many multiword expressions do people know?"],"prefix":"10.1145","volume":"10","author":[{"given":"Kenneth","family":"Church","sequence":"first","affiliation":[{"name":"IBM"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2013,6,21]]},"reference":[{"volume-title":"Word Frequency Distributions","author":"Baayan H.","key":"e_1_2_1_1_1","unstructured":"Baayan, H. 2001. Word Frequency Distributions. Kluwer, Dordrecht."},{"key":"e_1_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Banko M. and Brill E. 2001. Scaling to very very large corpora for natural language disambiguation. ACL. acl.ldc.upenn.edu\/P\/P01\/P01-1005.pdf.","DOI":"10.3115\/1073012.1073017"},{"key":"e_1_2_1_3_1","unstructured":"Bergsma S. Yarowsky D. and Church K. 2011. Using large monolingual and bilingual corpora to improve coordination disambiguation. ACL 1346--1355."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10590-011-9089-6"},{"key":"e_1_2_1_5_1","unstructured":"Boswell D. 2004. 
Speling korecksion: A survey of techniques from past to present. http:\/\/dustwell.com\/PastWork\/SpellingCorrectionResearchExam.pdf."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.3115\/1118693.1118726"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.3115\/1225403.1225423"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/792550.792552"},{"key":"e_1_2_1_9_1","unstructured":"Chapman R. 1998 American Slang. Collins."},{"key":"e_1_2_1_10_1","volume-title":"Roget's Thesaurus","author":"Chapman R.","unstructured":"Chapman, R. 1977. Roget's Thesaurus (4th ed.). Harper & Row.","edition":"4"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Chomsky N. 1957. Syntactic Structures. Mouton Paris.","DOI":"10.1515\/9783112316009"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1956.1056813"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075434.1075450"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/89086.89095"},{"volume-title":"The ESCA Workshop on Speech Synthesis (SSW1'90)","author":"Coker C.","key":"e_1_2_1_15_1","unstructured":"Coker, C., Church, K., and Liberman, M. 1990. Morphology and rhyming: Two powerful alternatives to letter-to-sound rules for speech synthesis. In The ESCA Workshop on Speech Synthesis (SSW1'90), 83--86."},{"key":"e_1_2_1_16_1","unstructured":"Cucerzan S. and Brill E. 2004. 
Spelling correction as an iterative process that exploits the collective knowledge of Web users. EMNLP 293--300."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.3115\/974358.974367"},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Efron B. and Thisted R. 1976. Estimating the number of unseen species: How many words did Shakespeare know? Biometrika 63 3 435--447.","DOI":"10.1093\/biomet\/63.3.435"},{"volume-title":"Studies in Linguistic Analysis","author":"Firth J.","key":"e_1_2_1_19_1","unstructured":"Firth, J. 1967. A synopsis of linguistic theory 1930--1955. In Studies in Linguistic Analysis, Philological Society, Oxford."},{"key":"e_1_2_1_20_1","unstructured":"Franz A. and Brants T. 2006. http:\/\/googleresearch.blogspot.com\/2006\/08\/all-our-n-gram-are-belong-to-you.html."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/972450.972455"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075527.1075579"},{"volume-title":"The Architecture of the Language Faculty","author":"Jackendoff R.","key":"e_1_2_1_23_1","unstructured":"Jackendoff, R. 1997, The Architecture of the Language Faculty. MIT Press, Cambridge, MA."},{"volume-title":"Six Lectures on Sound and Meaning","author":"Jacobson R.","key":"e_1_2_1_24_1","unstructured":"Jacobson, R. 1978. 
Six Lectures on Sound and Meaning. MIT Press, Cambridge, MA. http:\/\/www.scribd.com\/doc\/37655838\/Jakobson-Six-Lectures-on-Sound-Meaning-1."},{"key":"e_1_2_1_25_1","unstructured":"Jelinek F. 2002. Some of my best friends are linguists. Zampolli Award speech LREC."},{"key":"e_1_2_1_26_1","volume-title":"Roget's Thesaurus","author":"Kipfer B.","unstructured":"Kipfer, B. Ed. 2001. Roget's Thesaurus (6th ed.). HarperCollins, New York.","edition":"6"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075096.1075150"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/146370.146380"},{"volume-title":"How many words do people know?","author":"Liberman M.","key":"e_1_2_1_29_1","unstructured":"Liberman, M. 1989. How many words do people know? ACL."},{"key":"e_1_2_1_30_1","unstructured":"Liberman M. and Church K. 1991. Text analysis and word pronunciation in text-to-speech synthesis. In Advances in Speech Signal Processing S. Furui and M. Sondhi (eds.) 791--831."},{"key":"e_1_2_1_31_1","unstructured":"Norvig P. 2002. Better Web search with and without computational linguistics. Invited talk ACL."},{"key":"e_1_2_1_32_1","unstructured":"Mosteller F. and Wallace D. 1964. Inference and Disputed Authorship: The Federalist. Addison-Wesley Reading MA."},
{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Pullum G. and Scholz B. 2010 Recursion and the infinitude claim. In Recursion in Human Language Harry van der Hulst (ed.) Mouton de Gruyter Berlin 113--138.","DOI":"10.1515\/9783110219258.111"},{"key":"e_1_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Sag I. Baldwin T. Bond F. Copestake A. and Flickinger D. 2002. Multiword expressions: A pain in the neck for NLP. In LNCS 2276 Springer Berlin 1--15.","DOI":"10.1007\/3-540-45715-1_1"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1951.tb01366.x"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/239895.239900"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1006\/csla.1994.1004"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop.","author":"Stolcke A.","year":"1998","unstructured":"Stolcke, A. 1998. Entropy-based pruning of backoff language models. In Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0070343"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1162\/COLI_a_00076"},{"key":"e_1_2_1_41_1","volume-title":"Overview of the TREC 2004 question answering track. 
In The Text REtrieval Conference (TREC).","author":"Voorhees E.","year":"2004","unstructured":"Voorhees, E. 2004. Overview of the TREC 2004 question answering track. In The Text REtrieval Conference (TREC)."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1017\/S135132490400364X"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075671.1075731"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.3115\/992133.992140"}],"container-title":["ACM Transactions on Speech and Language Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2483691.2483693","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2483691.2483693","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T20:14:36Z","timestamp":1750277676000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2483691.2483693"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,6]]},"references-count":44,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2013,6]]}},"alternative-id":["10.1145\/2483691.2483693"],"URL":"https:\/\/doi.org\/10.1145\/2483691.2483693","relation":{},"ISSN":["1550-4875","1550-4883"],"issn-type":[{"type":"print","value":"1550-4875"},{"type":"electronic","value":"1550-4883"}],"subject":[],"published":{"date-parts":[[2013,6]]},"assertion":[{"value":"2012-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-06-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}