{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,4]],"date-time":"2026-07-04T04:51:17Z","timestamp":1783140677068,"version":"3.54.6"},"reference-count":16,"publisher":"Cambridge University Press (CUP)","issue":"2","license":[{"start":{"date-parts":[[2008,9,12]],"date-time":"2008-09-12T00:00:00Z","timestamp":1221177600000},"content-version":"unspecified","delay-in-days":4852,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[1995,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Shannon (1948) showed that a wide range of practical problems can be reduced to the problem of estimating probability distributions of words and ngrams in text. It has become standard practice in text compression, speech recognition, information retrieval and many other applications of Shannon's theory to introduce a \u201cbag-of-words\u201d assumption. But obviously, word rates vary from genre to genre, author to author, topic to topic, document to document, section to section, and paragraph to paragraph. The proposed Poisson mixture captures much of this heterogeneous structure by allowing the Poisson parameter \u03b8 to vary over documents subject to a density function \u03c6. \u03c6 is intended to capture dependencies on hidden variables such as genre, author, topic, etc. (The Negative Binomial is a well-known special case where \u03c6 is a \u0393 distribution.) 
Poisson mixtures fit the data better than standard Poissons, producing more accurate estimates of the variance over documents (\u03c3<jats:sup>2<\/jats:sup>), entropy (H), inverse document frequency (IDF), and adaptation (Pr(<jats:italic>x<\/jats:italic> \u2265 2\/<jats:italic>x<\/jats:italic> \u2265 1)).<\/jats:p>","DOI":"10.1017\/s1351324900000139","type":"journal-article","created":{"date-parts":[[2008,9,12]],"date-time":"2008-09-12T11:21:29Z","timestamp":1221218489000},"page":"163-190","source":"Crossref","is-referenced-by-count":196,"title":["Poisson mixtures"],"prefix":"10.1017","volume":"1","author":[{"given":"Kenneth W.","family":"Church","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"William A.","family":"Gale","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"56","published-online":{"date-parts":[[2008,9,12]]},"reference":[{"key":"S1351324900000139_ref012","volume-title":"Automatic Text Processing","author":"Salton","year":"1989"},{"key":"S1351324900000139_ref009","first-page":"108","volume-title":"Adaptive Language Modeling using the Maximum Entropy Principle","author":"Lau","year":"1993"},{"key":"S1351324900000139_ref015","volume-title":"Information Retrieval","author":"van Rijsbergen","year":"1979"},{"key":"S1351324900000139_ref003","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630250505"},{"key":"S1351324900000139_ref016","volume-title":"Word-Sense Disambiguation Using Statistical Models of Roget's Categories Trained on Large Corpora","author":"Yarowsky","year":"1992"},{"key":"S1351324900000139_ref008","volume-title":"Discrete Distributions","author":"Johnson","year":"1969"},{"key":"S1351324900000139_ref013","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"S1351324900000139_ref002","unstructured":"Bookstein, A. (1982) Explanation and generalization of vector models in information. 
In Conference on Research and Development in Information Retrieval (SIGIR). Pp. 118\u2013132."},{"key":"S1351324900000139_ref010","volume-title":"Inference and Disputed Authorship: The Federalist","author":"Mosteller","year":"1964"},{"key":"S1351324900000139_ref001","volume-title":"Text Compression","author":"Bell","year":"1990"},{"key":"S1351324900000139_ref004","doi-asserted-by":"publisher","DOI":"10.1002\/0471200611"},{"key":"S1351324900000139_ref005","volume-title":"Frequency Analysis of English Usage","author":"Francis","year":"1982"},{"key":"S1351324900000139_ref006","first-page":"415","article-title":"A method for disambiguating word senses in a large corpus","author":"Gale","year":"1993","journal-title":"Computers and Humanities"},{"key":"S1351324900000139_ref007","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630260402"},{"key":"S1351324900000139_ref011","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4471-2099-5_24"},{"key":"S1351324900000139_ref014","doi-asserted-by":"publisher","DOI":"10.1108\/eb026526"}],"container-title":["Natural Language 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324900000139","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,5,13]],"date-time":"2019-05-13T21:42:00Z","timestamp":1557783720000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324900000139\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1995,6]]},"references-count":16,"journal-issue":{"issue":"2","published-print":{"date-parts":[[1995,6]]}},"alternative-id":["S1351324900000139"],"URL":"https:\/\/doi.org\/10.1017\/s1351324900000139","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[1995,6]]}}}