{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T15:18:00Z","timestamp":1771600680105,"version":"3.50.1"},"reference-count":11,"publisher":"MIT Press - Journals","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computational Linguistics"],"published-print":{"date-parts":[[2003,9]]},"abstract":"<jats:p> This article shows that the Web can be employed to obtain frequencies for bigrams that are unseen in a given corpus. We describe a method for retrieving counts for adjective-noun, noun-noun, and verb-object bigrams from the Web by querying a search engine. We evaluate this method by demonstrating: (a) a high correlation between Web frequencies and corpus frequencies; (b) a reliable correlation between Web frequencies and plausibility judgments; (c) a reliable correlation between Web frequencies and frequencies recreated using class-based smoothing; (d) a good performance of Web frequencies in a pseudo disambiguation task. <\/jats:p>","DOI":"10.1162\/089120103322711604","type":"journal-article","created":{"date-parts":[[2004,1,23]],"date-time":"2004-01-23T04:43:15Z","timestamp":1074832995000},"page":"459-484","source":"Crossref","is-referenced-by-count":143,"title":["Using the Web to Obtain Frequencies for Unseen Bigrams"],"prefix":"10.1162","volume":"29","author":[{"given":"Frank","family":"Keller","sequence":"first","affiliation":[{"name":"University of Edinburgh, School of Informatics, 2 Buccleuch Place, Edinburgh EH8 9LW, UK."}]},{"given":"Mirella","family":"Lapata","sequence":"additional","affiliation":[{"name":"University of Sheffield, Department of Computer Science, 211 Portobello Street, Sheffield S1 4DP, UK."}]}],"member":"281","reference":[{"key":"p_6","doi-asserted-by":"publisher","DOI":"10.2307\/416793"},{"key":"p_11","doi-asserted-by":"publisher","DOI":"10.1162\/089120102760173643"},{"key":"p_12","doi-asserted-by":"publisher","DOI":"10.3758\/BF03196267"},{"key":"p_13","doi-asserted-by":"publisher","DOI":"10.1023\/A:1002497503122"},{"key":"p_16","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007537716579"},{"issue":"1","key":"p_21","first-page":"103","volume":"19","author":"Hindle Donald","year":"1993","journal-title":"Computational Linguistics"},{"key":"p_24","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1987.1165125"},{"key":"p_25","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-0277(00)00131-1"},{"key":"p_30","doi-asserted-by":"publisher","DOI":"10.1162\/089120102760276018"},{"issue":"2","key":"p_38","first-page":"217","volume":"24","author":"Li Hang","year":"1998","journal-title":"Computational Linguistics"},{"key":"p_45","doi-asserted-by":"publisher","DOI":"10.1093\/ijl\/3.4.235"}],"container-title":["Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/089120103322711604","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:42:11Z","timestamp":1615585331000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/coli\/article\/29\/3\/459-484\/1811"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2003,9]]},"references-count":11,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2003,9]]}},"alternative-id":["10.1162\/089120103322711604"],"URL":"https:\/\/doi.org\/10.1162\/089120103322711604","relation":{},"ISSN":["0891-2017","1530-9312"],"issn-type":[{"value":"0891-2017","type":"print"},{"value":"1530-9312","type":"electronic"}],"subject":[],"published":{"date-parts":[[2003,9]]}}}