{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,1]],"date-time":"2026-03-01T10:46:02Z","timestamp":1772361962388,"version":"3.50.1"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Speech Lang. Process."],"published-print":{"date-parts":[[2005,2]]},"abstract":"<jats:p>\n            Previous work demonstrated that Web counts can be used to approximate bigram counts, suggesting that Web-based frequencies should be useful for a wide variety of Natural Language Processing (NLP) tasks. However, only a limited number of tasks have so far been tested using Web-scale data sets. The present article overcomes this limitation by systematically investigating the performance of Web-based models for several NLP tasks, covering both syntax and semantics, both generation and analysis, and a wider range of\n            <jats:italic>n<\/jats:italic>\n            -grams and parts of speech than have been previously explored. For the majority of our tasks, we find that simple, unsupervised models perform better when\n            <jats:italic>n<\/jats:italic>\n            -gram counts are obtained from the Web rather than from a large corpus. In some cases, performance can be improved further by using backoff or interpolation techniques that combine Web counts and corpus counts. However, unsupervised Web-based models generally fail to outperform supervised state-of-the-art models trained on smaller corpora. 
We argue that Web-based models should therefore be used as a baseline for, rather than an alternative to, standard supervised models.\n          <\/jats:p>","DOI":"10.1145\/1075389.1075392","type":"journal-article","created":{"date-parts":[[2005,8,1]],"date-time":"2005-08-01T15:52:28Z","timestamp":1122911548000},"page":"3","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":65,"title":["Web-based models for natural language processing"],"prefix":"10.1145","volume":"2","author":[{"given":"Mirella","family":"Lapata","sequence":"first","affiliation":[{"name":"University of Edinburgh, Edinburgh, UK"}]},{"given":"Frank","family":"Keller","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, UK"}]}],"member":"320","published-online":{"date-parts":[[2005,2]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics","author":"Baldwin T.","unstructured":"Baldwin , T. and Bond , F . 2003. Learning the countability of English nouns from corpus data . In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics . Sapporo, Japan. 463--470. 10.3115\/1075096.1075155 Baldwin, T. and Bond, F. 2003. Learning the countability of English nouns from corpus data. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics. Sapporo, Japan. 463--470. 10.3115\/1075096.1075155"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 1st International Conference on Human Language Technology Research. J. Allan, Ed. Morgan Kaufmann","author":"Banko M.","unstructured":"Banko , M. and Brill , E . 2001a. Mitigating the paucity-of-data problem: Exploring the effect of training corpus size on classifier performance for natural language processing . In Proceedings of the 1st International Conference on Human Language Technology Research. J. Allan, Ed. 
Morgan Kaufmann , San Francisco. 10.3115\/1072133.1072204 Banko, M. and Brill, E. 2001a. Mitigating the paucity-of-data problem: Exploring the effect of training corpus size on classifier performance for natural language processing. In Proceedings of the 1st International Conference on Human Language Technology Research. J. Allan, Ed. Morgan Kaufmann, San Francisco. 10.3115\/1072133.1072204"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics","author":"Banko M.","unstructured":"Banko , M. and Brill , E . 2001b. Scaling to very very large corpora for natural language disambiguation . In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics . Toulouse, France. 10.3115\/1073012.1073017 Banko, M. and Brill, E. 2001b. Scaling to very very large corpora for natural language disambiguation. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics. Toulouse, France. 10.3115\/1073012.1073017"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 15th International Conference on Computational Linguistics","author":"Brill E.","unstructured":"Brill , E. and Resnik , P . 1994. A rule-based approach to prepositional phrase attachment disambiguation . In Proceedings of the 15th International Conference on Computational Linguistics . Kyoto, Japan. 1198--1204. 10.3115\/991250.991346 Brill, E. and Resnik, P. 1994. A rule-based approach to prepositional phrase attachment disambiguation. In Proceedings of the 15th International Conference on Computational Linguistics. Kyoto, Japan. 1198--1204. 10.3115\/991250.991346"},{"key":"e_1_2_1_5_1","volume-title":"Companion","volume":"9","author":"Bulyko I.","unstructured":"Bulyko , I. , Ostendorf , M. , and Stolcke , A . 2003. Getting more mileage from Web text sources for conversational speech language modeling using class-dependent mixtures . 
In Companion Volume of the Proceedings of HLT-NAACL 2003: Short Papers. Edmonton, Canada. 7-- 9 . 10.3115\/1073483.1073486 Bulyko, I., Ostendorf, M., and Stolcke, A. 2003. Getting more mileage from Web text sources for conversational speech language modeling using class-dependent mixtures. In Companion Volume of the Proceedings of HLT-NAACL 2003: Short Papers. Edmonton, Canada. 7--9. 10.3115\/1073483.1073486"},{"key":"e_1_2_1_6_1","volume-title":"The Users Reference Guide for the British National Corpus","author":"Burnard L.","unstructured":"Burnard , L. 1995. The Users Reference Guide for the British National Corpus . British National Corpus Consortium, Oxford University Computing Service . Burnard, L. 1995. The Users Reference Guide for the British National Corpus. British National Corpus Consortium, Oxford University Computing Service."},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 19th International Conference on Computational Linguistics","author":"Cao Y.","unstructured":"Cao , Y. and Li , H . 2002. Base noun phrase translation: Using Web data and the EM algorithm . In Proceedings of the 19th International Conference on Computational Linguistics . Taipei, Taiwan. 127--133. 10.3115\/1072228.1072239 Cao, Y. and Li, H. 2002. Base noun phrase translation: Using Web data and the EM algorithm. In Proceedings of the 19th International Conference on Computational Linguistics. Taipei, Taiwan. 127--133. 10.3115\/1072228.1072239"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing, D. Lin and D. Wu, Eds","author":"Chklovski T.","unstructured":"Chklovski , T. and Pantel , P . 2004. VerbOcean: Mining the Web for fine-grained semantic verb relations . In Proceedings of the Conference on Empirical Methods in Natural Language Processing, D. Lin and D. Wu, Eds . Barcelona, Spain. 33--40. Chklovski, T. and Pantel, P. 2004. VerbOcean: Mining the Web for fine-grained semantic verb relations. 
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, D. Lin and D. Wu, Eds. Barcelona, Spain. 33--40."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 3rd Workshop on Very Large Corpora. D. Yarowsky and K. W. Church, Eds","author":"Collins M.","unstructured":"Collins , M. and Brooks , J . 1995. Prepositional phrase attachment through a backed-off model . In Proceedings of the 3rd Workshop on Very Large Corpora. D. Yarowsky and K. W. Church, Eds . Cambridge, MA. 27--38. Collins, M. and Brooks, J. 1995. Prepositional phrase attachment through a backed-off model. In Proceedings of the 3rd Workshop on Very Large Corpora. D. Yarowsky and K. W. Church, Eds. Cambridge, MA. 27--38."},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1023\/A:1002497503122","article-title":"Finding syntactic structure in unparsed corpora: The Gsearch corpus query system","volume":"35","author":"Corley S.","year":"2001","unstructured":"Corley , S. , Corley , M. , Keller , F. , Crocker , M. W. , and Trewin , S. 2001 . Finding syntactic structure in unparsed corpora: The Gsearch corpus query system . Comput. Humanities 35 , 2, 81 -- 94 . Corley, S., Corley, M., Keller, F., Crocker, M. W., and Trewin, S. 2001. Finding syntactic structure in unparsed corpora: The Gsearch corpus query system. Comput. Humanities 35, 2, 81--94.","journal-title":"Comput. Humanities"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. J. Haji\u010d and Y. Matsumoto, Eds","author":"Cucerzan S.","year":"2002","unstructured":"Cucerzan , S. and Yarowsky , D . 2002. Augmented mixture models for lexical disambiguation . In Proceedings of the Conference on Empirical Methods in Natural Language Processing. J. Haji\u010d and Y. Matsumoto, Eds . Philadelphia, PA. 33--40. 10.3115\/1118693.1118698 Cucerzan, S. and Yarowsky, D. 2002. Augmented mixture models for lexical disambiguation. 
In Proceedings of the Conference on Empirical Methods in Natural Language Processing. J. Haji\u010d and Y. Matsumoto, Eds. Philadelphia, PA. 33--40. 10.3115\/1118693.1118698"},{"key":"e_1_2_1_12_1","first-page":"563","article-title":"Machine translation divergences: A formal description and proposed solution","volume":"20","author":"Dagan I.","year":"1994","unstructured":"Dagan , I. and Itai , A. 1994 . Machine translation divergences: A formal description and proposed solution . Computat. Ling. 20 , 4, 563 -- 597 . Dagan, I. and Itai, A. 1994. Machine translation divergences: A formal description and proposed solution. Computat. Ling. 20, 4, 563--597.","journal-title":"Computat. Ling."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 25th ACM Conference on Research and Development in Information Retrieval","author":"Dumais S.","unstructured":"Dumais , S. , Banko , M. , Brill , E. , Lin , J. , and Ng , A . 2002. Web question answering: Is more always better? In Proceedings of the 25th ACM Conference on Research and Development in Information Retrieval . Tampere, Finland. 291--298. 10.1145\/564376.564428 Dumais, S., Banko, M., Brill, E., Lin, J., and Ng, A. 2002. Web question answering: Is more always better? In Proceedings of the 25th ACM Conference on Research and Development in Information Retrieval. Tampere, Finland. 291--298. 10.1145\/564376.564428"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 3rd Workshop on Very Large Corpora. D. Yarowsky and K. W. Church, Eds","author":"Golding A. R.","year":"1995","unstructured":"Golding , A. R. 1995 . A Bayesian hybrid method for context-sensitive spelling correction . In Proceedings of the 3rd Workshop on Very Large Corpora. D. Yarowsky and K. W. Church, Eds . Cambridge, MA. 39--53. Golding, A. R. 1995. A Bayesian hybrid method for context-sensitive spelling correction. In Proceedings of the 3rd Workshop on Very Large Corpora. D. Yarowsky and K. W. Church, Eds. Cambridge, MA. 
39--53."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007545901558"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics","author":"Golding A. R.","year":"1996","unstructured":"Golding , A. R. and Schabes , Y . 1996. Combining trigram-based and feature-based methods for context-sensitive spelling correction . In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics . Santa Cruz, CA. 71--78. 10.3115\/981863.981873 Golding, A. R. and Schabes, Y. 1996. Combining trigram-based and feature-based methods for context-sensitive spelling correction. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics. Santa Cruz, CA. 71--78. 10.3115\/981863.981873"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the ASLIB Conference on Translating and the Computer","author":"Grefenstette G.","year":"1998","unstructured":"Grefenstette , G. 1998 . The World Wide Web as a resource for example-based machine translation tasks . In Proceedings of the ASLIB Conference on Translating and the Computer . London, UK. Grefenstette, G. 1998. The World Wide Web as a resource for example-based machine translation tasks. In Proceedings of the ASLIB Conference on Translating and the Computer. London, UK."},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 15th International Conference on Computational Linguistics","author":"Grishman R.","year":"1994","unstructured":"Grishman , R. , Macleod , C. , and Meyers , A . 1994. COMLEX Syntax: Building a computational lexicon . In Proceedings of the 15th International Conference on Computational Linguistics . Kyoto, Japan. 268--272. 10.3115\/991886.991931 Grishman, R., Macleod, C., and Meyers, A. 1994. COMLEX Syntax: Building a computational lexicon. In Proceedings of the 15th International Conference on Computational Linguistics. Kyoto, Japan. 268--272. 
10.3115\/991886.991931"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics","author":"Hildebrandt W.","unstructured":"Hildebrandt , W. , Katz , B. , and Lin , J . 2004. Answering definition questions with multiple knowledge sources . In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics . Boston. MA. 49--56. Hildebrandt, W., Katz, B., and Lin, J. 2004. Answering definition questions with multiple knowledge sources. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. Boston. MA. 49--56."},{"key":"e_1_2_1_20_1","first-page":"103","article-title":"Structural ambiguity and lexical relations","volume":"19","author":"Hindle D.","year":"1993","unstructured":"Hindle , D. and Rooth , M. 1993 . Structural ambiguity and lexical relations . Computat. Ling. 19 , 1, 103 -- 120 . Hindle, D. and Rooth, M. 1993. Structural ambiguity and lexical relations. Computat. Ling. 19, 1, 103--120.","journal-title":"Computat. Ling."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 3rd Machine Translation Summit","author":"Ikehara S.","unstructured":"Ikehara , S. , Shirai , S. , Yokoo , A. , and Nakaiwa , H . 1991. Toward an MT system without pre-editing effects of new methods in ALT-J\/E . In Proceedings of the 3rd Machine Translation Summit . Washington, DC. 101--106. Ikehara, S., Shirai, S., Yokoo, A., and Nakaiwa, H. 1991. Toward an MT system without pre-editing effects of new methods in ALT-J\/E. In Proceedings of the 3rd Machine Translation Summit. Washington, DC. 101--106."},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 5th Conference on Applied Natural Language Processing","author":"Jones M. P.","unstructured":"Jones , M. P. and Martin , J. H . 1997. 
Contextual spelling correction using latent semantic analysis . In Proceedings of the 5th Conference on Applied Natural Language Processing . Washington, DC. 166--173. 10.3115\/974557.974582 Jones, M. P. and Martin, J. H. 1997. Contextual spelling correction using latent semantic analysis. In Proceedings of the 5th Conference on Applied Natural Language Processing. Washington, DC. 166--173. 10.3115\/974557.974582"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120103322711604"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120103322711569"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of 12th National Conference on Artificial Intelligence","author":"Knight K.","unstructured":"Knight , K. and Chander , I . 1994. Automated postediting of documents . In Proceedings of 12th National Conference on Artificial Intelligence . Seattle, WA. 770--784. Knight, K. and Chander, I. 1994. Automated postediting of documents. In Proceedings of 12th National Conference on Artificial Intelligence. Seattle, WA. 770--784."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics","author":"Knight K.","unstructured":"Knight , K. and Hatzivassiloglou , V . 1995. Two-level, many paths generation . In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics . Cambridge, MA. 252--260. 10.3115\/981658.981692 Knight, K. and Hatzivassiloglou, V. 1995. Two-level, many paths generation. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. Cambridge, MA. 252--260. 10.3115\/981658.981692"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics","author":"Langkilde I.","unstructured":"Langkilde , I. and Knight , K . 1998. 
Generation that exploits corpus-based statistical knowledge . In Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics . Montr\u00e9al, Canada. 704--710. 10.3115\/980845.980963 Langkilde, I. and Knight, K. 1998. Generation that exploits corpus-based statistical knowledge. In Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics. Montr\u00e9al, Canada. 704--710. 10.3115\/980845.980963"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics","author":"Lapata M.","unstructured":"Lapata , M. and Keller , F . 2004. The Web as a baseline: Evaluating the performance of unsupervised Web-based models for a range of NLP tasks . In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics . Boston, MA. 121--128. Lapata, M. and Keller, F. 2004. The Web as a baseline: Evaluating the performance of unsupervised Web-based models for a range of NLP tasks. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. Boston, MA. 121--128."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics","author":"Lauer M.","year":"1995","unstructured":"Lauer , M. 1995 . Corpus statistics meet the noun compound: Some empirical results . In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics . Cambridge, MA. 47--54. 10.3115\/981658.981665 Lauer, M. 1995. Corpus statistics meet the noun compound: Some empirical results. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. Cambridge, MA. 
47--54. 10.3115\/981658.981665"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics","author":"Lee J.","year":"2004","unstructured":"Lee , J. 2004 . Automatic article restoration . In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics . Boston, MA. 31--36. Lee, J. 2004. Automatic article restoration. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. Boston, MA. 31--36."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics","author":"Malouf R.","year":"2000","unstructured":"Malouf , R. 2000 . The order of prenominal adjectives in natural language generation . In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics . Hong Kong. 85--92. 10.3115\/1075218.1075230 Malouf, R. 2000. The order of prenominal adjectives in natural language generation. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics. Hong Kong. 85--92. 10.3115\/1075218.1075230"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 14th International Conference on Machine Learning","author":"Mangu L.","unstructured":"Mangu , L. and Brill , E . 1997. Automatic rule acquisition of spelling correction . In Proceedings of the 14th International Conference on Machine Learning . Nashville, TN. 187--194. Mangu, L. and Brill, E. 1997. Automatic rule acquisition of spelling correction. In Proceedings of the 14th International Conference on Machine Learning. Nashville, TN. 187--194."},{"key":"e_1_2_1_33_1","first-page":"313","article-title":"Building a large annotated corpus of English","volume":"19","author":"Marcus M. P.","year":"1993","unstructured":"Marcus , M. P. 
, Santorini , B. , and Marcinkiewicz , M. A. 1993 . Building a large annotated corpus of English : The Penn Treebank. Computat. Ling. 19 , 2, 313 -- 330 . Marcus, M. P., Santorini, B., and Marcinkiewicz, M. A. 1993. Building a large annotated corpus of English: The Penn Treebank. Computat. Ling. 19, 2, 313--330.","journal-title":"Computat. Ling."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics","author":"Mihalcea R.","unstructured":"Mihalcea , R. and Moldovan , D . 1999. A method for word sense disambiguation of unrestricted text . In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics . College Park, MD. 152--158. 10.3115\/1034678.1034709 Mihalcea, R. and Moldovan, D. 1999. A method for word sense disambiguation of unrestricted text. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics. College Park, MD. 152--158. 10.3115\/1034678.1034709"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 4th Workshop on Computational Natural Language Learning","author":"Minnen G.","unstructured":"Minnen , G. , Bond , F. , and Copestake , A . 2000. Memory-based learning for article generation . In Proceedings of the 4th Workshop on Computational Natural Language Learning . Lisbon, Portugal. 43--48. 10.3115\/1117601.1117611 Minnen, G., Bond, F., and Copestake, A. 2000. Memory-based learning for article generation. In Proceedings of the 4th Workshop on Computational Natural Language Learning. Lisbon, Portugal. 43--48. 10.3115\/1117601.1117611"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 8th Conference on Empirical Methods in Natural Language Processing","author":"Modjeska N.","year":"2003","unstructured":"Modjeska , N. , Markert , K. , and Nissim , M . 2003. Using the Web in machine learning for other-anaphora resolution . 
In Proceedings of the 8th Conference on Empirical Methods in Natural Language Processing . Sapporo, Japan. 176--183. 10.3115\/1119355.1119378 Modjeska, N., Markert, K., and Nissim, M. 2003. Using the Web in machine learning for other-anaphora resolution. In Proceedings of the 8th Conference on Empirical Methods in Natural Language Processing. Sapporo, Japan. 176--183. 10.3115\/1119355.1119378"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics","author":"Pantel P.","unstructured":"Pantel , P. and Lin , D . 2000. An unsupervised approach to prepositional phrase attachment using contextually similar words . In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics . Hong Kong. 101--108. 10.3115\/1075218.1075232 Pantel, P. and Lin, D. 2000. An unsupervised approach to prepositional phrase attachment using contextually similar words. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics. Hong Kong. 101--108. 10.3115\/1075218.1075232"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 18th International Conference on Computational Linguistics","author":"Prescher D.","unstructured":"Prescher , D. , Riezler , S. , and Rooth , M . 2000. Using a probabilistic class-based lexicon for lexical ambiguity resolution . In Proceedings of the 18th International Conference on Computational Linguistics . Saarbr\u00fccken, Germany. 649--655. 10.3115\/992730.992740 Prescher, D., Riezler, S., and Rooth, M. 2000. Using a probabilistic class-based lexicon for lexical ambiguity resolution. In Proceedings of the 18th International Conference on Computational Linguistics. Saarbr\u00fccken, Germany. 649--655. 10.3115\/992730.992740"},{"key":"e_1_2_1_39_1","first-page":"331","article-title":"Lexical semantic techniques for corpus analysis","volume":"19","author":"Pustejovsky J.","year":"1993","unstructured":"Pustejovsky , J. , Bergler , S. 
, and Anick , P. 1993 . Lexical semantic techniques for corpus analysis . Computat. Ling. 19 , 3, 331 -- 358 . Pustejovsky, J., Bergler, S., and Anick, P. 1993. Lexical semantic techniques for corpus analysis. Computat. Ling. 19, 3, 331--358.","journal-title":"Computat. Ling."},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics","author":"Ratnaparkhi A.","year":"1998","unstructured":"Ratnaparkhi , A. 1998 . Unsupervised statistical models for prepositional phrase attachment . In Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics . Montr\u00e9al, Canada. 1079--1085. 10.3115\/980691.980746 Ratnaparkhi, A. 1998. Unsupervised statistical models for prepositional phrase attachment. In Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics. Montr\u00e9al, Canada. 1079--1085. 10.3115\/980691.980746"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the ARPA Workshop on Human Language Technology","author":"Ratnaparkhi A.","unstructured":"Ratnaparkhi , A. , Reynar , J. , and Roukos , S . 1993. A maximum entropy model for prepositional phrase attachment . In Proceedings of the ARPA Workshop on Human Language Technology . Plainsboro, NJ. 10.3115\/1075812.1075868 Ratnaparkhi, A., Reynar, J., and Roukos, S. 1993. A maximum entropy model for prepositional phrase attachment. In Proceedings of the ARPA Workshop on Human Language Technology. Plainsboro, NJ. 10.3115\/1075812.1075868"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120103322711578"},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of COLING Workshop on a Roadmap for Computational Linguistics","author":"Rigau G.","year":"2002","unstructured":"Rigau , G. 
, Magnini , B. , Agirre , E. , and Carroll , J . 2002. Meaning: A roadmap to knowledge technologies . In Proceedings of COLING Workshop on a Roadmap for Computational Linguistics . Taipei, Taiwan. 10.3115\/1118754.1118758 Rigau, G., Magnini, B., Agirre, E., and Carroll, J. 2002. Meaning: A roadmap to knowledge technologies. In Proceedings of COLING Workshop on a Roadmap for Computational Linguistics. Taipei, Taiwan. 10.3115\/1118754.1118758"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120103322711613"},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics","author":"Shaw J.","unstructured":"Shaw , J. and Hatzivassiloglou , V . 1999. Ordering among premodifiers . In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics . College Park, MD. 135--143. 10.3115\/1034678.1034707 Shaw, J. and Hatzivassiloglou, V. 1999. Ordering among premodifiers. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics. College Park, MD. 135--143. 10.3115\/1034678.1034707"},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics","author":"Shinzato K.","unstructured":"Shinzato , K. and Torisawa , K . 2004. Acquiring hyponymy relations from Web documents . In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics . Boston, MA. 73--80. Shinzato, K. and Torisawa, K. 2004. Acquiring hyponymy relations from Web documents. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. Boston, MA. 
73--80."},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics","author":"Soricut R.","unstructured":"Soricut , R. and Brill , E . 2004. Automatic question answering: Beyond the factoid . In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics . Boston, MA. 57--64. Soricut, R. and Brill, E. 2004. Automatic question answering: Beyond the factoid. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. Boston, MA. 57--64."},{"key":"e_1_2_1_49_1","volume-title":"Proceedings of the 5th Workshop on Very Large Corpora","author":"Stetina J.","unstructured":"Stetina , J. and Nagao , M . 1997. Corpus-based PP attachment ambiguity resolution with a semantic dictionary . In Proceedings of the 5th Workshop on Very Large Corpora . Beijing, China. 66--80. Stetina, J. and Nagao, M. 1997. Corpus-based PP attachment ambiguity resolution with a semantic dictionary. In Proceedings of the 5th Workshop on Very Large Corpora. Beijing, China. 66--80."},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. D. Lin and D. Wu, Eds","author":"Szpektor I.","unstructured":"Szpektor , I. , Tanev , H. , Dagan , I. , and Coppola , B . 2004. Scaling Web-based acquisition of entailment relations . In Proceedings of the Conference on Empirical Methods in Natural Language Processing. D. Lin and D. Wu, Eds . Barcelona, Spain. 41--48. Szpektor, I., Tanev, H., Dagan, I., and Coppola, B. 2004. Scaling Web-based acquisition of entailment relations. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. D. Lin and D. Wu, Eds. Barcelona, Spain. 
41--48."},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the Corpus Linguistics Conference. P. Rayson, A. Wilson, T. McEnery, A. Hardie, and S. Khoja, Eds","author":"Volk M.","year":"2001","unstructured":"Volk , M. 2001 . Exploiting the WWW as a corpus to resolve PP attachment ambiguities . In Proceedings of the Corpus Linguistics Conference. P. Rayson, A. Wilson, T. McEnery, A. Hardie, and S. Khoja, Eds . Lancaster, UK. 601--606. Volk, M. 2001. Exploiting the WWW as a corpus to resolve PP attachment ambiguities. In Proceedings of the Corpus Linguistics Conference. P. Rayson, A. Wilson, T. McEnery, A. Hardie, and S. Khoja, Eds. Lancaster, UK. 601--606."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120103322711596"},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the 1st Workshop on Computational Natural Language Learning","author":"Zavrel J.","unstructured":"Zavrel , J. , Daelemans , W. , and Veenstra , J . 1997. Resolving PP attachment ambiguities with memory-based learning . In Proceedings of the 1st Workshop on Computational Natural Language Learning . Madrid, Spain. 136--144. Zavrel, J., Daelemans, W., and Veenstra, J. 1997. Resolving PP attachment ambiguities with memory-based learning. In Proceedings of the 1st Workshop on Computational Natural Language Learning. Madrid, Spain. 136--144."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the International Conference on Acoustics Speech and Signal Processing","author":"Zhu X.","unstructured":"Zhu , X. and Rosenfeld , R . 2001. Improving trigram language modeling with the World Wide Web . In Proceedings of the International Conference on Acoustics Speech and Signal Processing . Salt Lake City, Utah. Zhu, X. and Rosenfeld, R. 2001. Improving trigram language modeling with the World Wide Web. In Proceedings of the International Conference on Acoustics Speech and Signal Processing. 
Salt Lake City, Utah."}],"container-title":["ACM Transactions on Speech and Language Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1075389.1075392","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T12:18:02Z","timestamp":1672229882000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1075389.1075392"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,2]]},"references-count":53,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2005,2]]}},"alternative-id":["10.1145\/1075389.1075392"],"URL":"https:\/\/doi.org\/10.1145\/1075389.1075392","relation":{},"ISSN":["1550-4875","1550-4883"],"issn-type":[{"value":"1550-4875","type":"print"},{"value":"1550-4883","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,2]]},"assertion":[{"value":"2005-02-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}