{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,2]],"date-time":"2025-06-02T13:47:54Z","timestamp":1748872074603},"reference-count":5,"publisher":"MIT Press - Journals","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computational Linguistics"],"published-print":{"date-parts":[[2016,6]]},"abstract":"<jats:p> Distributional semantic models, deriving vector-based word representations from patterns of word usage in corpora, have many useful applications (Turney and Pantel 2010 ). Recently, there has been interest in compositional distributional models, which derive vectors for phrases from representations of their constituent words (Mitchell and Lapata 2010 ). Often, the values of distributional vectors are pointwise mutual information (PMI) scores obtained from raw co-occurrence counts. In this article we study the relation between the PMI dimensions of a phrase vector and its components in order to gain insights into which operations an adequate composition model should perform. We show mathematically that the difference between the PMI dimension of a phrase vector and the sum of PMIs in the corresponding dimensions of the phrase's parts is an independently interpretable value, namely, a quantification of the impact of the context associated with the relevant dimension on the phrase's internal cohesion, as also measured by PMI. We then explore this quantity empirically, through an analysis of adjective\u2013noun composition. <\/jats:p>","DOI":"10.1162\/coli_a_00250","type":"journal-article","created":{"date-parts":[[2016,4,27]],"date-time":"2016-04-27T18:43:55Z","timestamp":1461782635000},"page":"345-350","source":"Crossref","is-referenced-by-count":8,"title":["When the Whole Is Less Than the Sum of Its Parts: How Composition Affects PMI Values in Distributional Semantic Vectors"],"prefix":"10.1162","volume":"42","author":[{"given":"Denis","family":"Paperno","sequence":"first","affiliation":[{"name":"University of Trento"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marco","family":"Baroni","sequence":"additional","affiliation":[{"name":"University of Trento"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","reference":[{"key":"R1","doi-asserted-by":"publisher","DOI":"10.3758\/s13428-011-0183-8"},{"key":"R2","unstructured":"Church, Kenneth and Peter Hanks. 1990. Word association norms, mutual information, and lexicography. Computational Linguistics, 16(1):22\u201329."},{"key":"R3","unstructured":"Dinu, Georgiana, Nghia The Pham, and Marco Baroni. 2013. General estimation and evaluation of compositional distributional semantic models. In Proceedings of ACL Workshop on Continuous Vector Space Models and their Compositionality, pages 50\u201358, Sofia."},{"key":"R4","doi-asserted-by":"publisher","DOI":"10.1111\/j.1551-6709.2010.01106.x"},{"key":"R5","doi-asserted-by":"publisher","DOI":"10.1613\/jair.2934"}],"container-title":["Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/COLI_a_00250","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:27:53Z","timestamp":1615584473000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/coli\/article\/42\/2\/345-350\/1529"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,6]]},"references-count":5,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,6]]}},"alternative-id":["10.1162\/COLI_a_00250"],"URL":"https:\/\/doi.org\/10.1162\/coli_a_00250","relation":{},"ISSN":["0891-2017","1530-9312"],"issn-type":[{"value":"0891-2017","type":"print"},{"value":"1530-9312","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,6]]}}}