{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T00:24:48Z","timestamp":1775175888812,"version":"3.50.1"},"reference-count":7,"publisher":"MIT Press - Journals","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["TACL"],"published-print":{"date-parts":[[2016,12]]},"abstract":"<jats:p> Semantic word embeddings represent the meaning of a word via a vector, and are created by diverse methods. Many use nonlinear operations on co-occurrence statistics, and have hand-tuned hyperparameters and reweighting methods. <\/jats:p><jats:p> This paper proposes a new generative model, a dynamic version of the log-linear topic model of Mnih and Hinton (2007). The methodological novelty is to use the prior to compute closed form expressions for word statistics. This provides a theoretical justification for nonlinear models like PMI, word2vec, and GloVe, as well as some hyperparameter choices. It also helps explain why low-dimensional semantic embeddings contain linear algebraic structure that allows solution of word analogies, as shown by Mikolov et al. (2013a) and many subsequent papers. <\/jats:p><jats:p> Experimental support is provided for the generative model assumptions, the most important of which is that latent word vectors are fairly uniformly dispersed in space. 
<\/jats:p>","DOI":"10.1162\/tacl_a_00106","type":"journal-article","created":{"date-parts":[[2018,12,28]],"date-time":"2018-12-28T15:44:07Z","timestamp":1546011847000},"page":"385-399","source":"Crossref","is-referenced-by-count":108,"title":["A Latent Variable Model Approach to PMI-based Word Embeddings"],"prefix":"10.1162","volume":"4","author":[{"given":"Sanjeev","family":"Arora","sequence":"first","affiliation":[{"name":"Computer Science Department, Princeton University, 35 Olden St, Princeton, NJ 08540,"}]},{"given":"Yuanzhi","family":"Li","sequence":"additional","affiliation":[{"name":"Computer Science Department, Princeton University, 35 Olden St, Princeton, NJ 08540,"}]},{"given":"Yingyu","family":"Liang","sequence":"additional","affiliation":[{"name":"Computer Science Department, Princeton University, 35 Olden St, Princeton, NJ 08540,"}]},{"given":"Tengyu","family":"Ma","sequence":"additional","affiliation":[{"name":"Computer Science Department, Princeton University, 35 Olden St, Princeton, NJ 08540,"}]},{"given":"Andrej","family":"Risteski","sequence":"additional","affiliation":[{"name":"Computer Science Department, Princeton University, 35 Olden St, Princeton, NJ 08540,"}]}],"member":"281","reference":[{"key":"p_6","author":"Black Fischer","year":"1973","journal-title":"Journal of Political Economy."},{"key":"p_13","author":"Deerwester Scott C.","year":"1990","journal-title":"Journal of the American Society for Information Science."},{"key":"p_14","author":"Duchi John","year":"2011","journal-title":"The Journal of Machine Learning Research."},{"key":"p_16","author":"Globerson Amir","year":"2007","journal-title":"Journal of Machine Learning Research."},{"key":"p_17","author":"Hashimoto Tatsunori B.","year":"2016","journal-title":"Transactions of the Association for Computational 
Linguistics."},{"key":"p_19","author":"Hsu Daniel","year":"2012","journal-title":"Journal of Computer and System Sciences."},{"key":"p_33","author":"Turney Peter D.","year":"2010","journal-title":"Journal of Artificial Intelligence Research."}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/tacl_a_00106","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:38:28Z","timestamp":1615585108000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/43373"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,12]]},"references-count":7,"alternative-id":["10.1162\/tacl_a_00106"],"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00106","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,12]]}}}