{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T08:30:40Z","timestamp":1776155440729,"version":"3.50.1"},"reference-count":87,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2022,3,2]],"date-time":"2022-03-02T00:00:00Z","timestamp":1646179200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,2]],"date-time":"2022-03-02T00:00:00Z","timestamp":1646179200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Lang Resources &amp; Evaluation"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Distributional semantics has deeply changed in the last decades. First, predict models stole the thunder from traditional count ones, and more recently both of them were replaced in many NLP applications by contextualized vectors produced by neural language models. Although an extensive body of research has been devoted to Distributional Semantic Model (DSM) evaluation, we still lack a thorough comparison with respect to tested models, semantic tasks, and benchmark datasets. Moreover, previous work has mostly focused on task-driven evaluation, instead of exploring the differences between the way models represent the lexical semantic space. In this paper, we perform a large-scale evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT. First of all, we investigate the performance of embeddings in several semantic tasks, carrying out an in-depth statistical analysis to identify the major factors influencing the behavior of DSMs. 
The results show that (i) the alleged superiority of predict based models is more apparent than real, and surely not ubiquitous and (ii) static DSMs surpass BERT representations in most out-of-context semantic tasks and datasets. Furthermore, we borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models. RSA reveals important differences related to the frequency and part-of-speech of lexical items.<\/jats:p>","DOI":"10.1007\/s10579-021-09575-z","type":"journal-article","created":{"date-parts":[[2022,3,2]],"date-time":"2022-03-02T04:45:01Z","timestamp":1646196301000},"page":"1269-1313","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":56,"title":["A comparative evaluation and analysis of three generations of Distributional Semantic Models"],"prefix":"10.1007","volume":"56","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5790-4308","authenticated-orcid":false,"given":"Alessandro","family":"Lenci","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5100-0535","authenticated-orcid":false,"given":"Magnus","family":"Sahlgren","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4105-2021","authenticated-orcid":false,"given":"Patrick","family":"Jeuniaux","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2236-4978","authenticated-orcid":false,"given":"Amaru","family":"Cuba Gyllensten","sequence":"additional","affiliation":[]},{"given":"Martina","family":"Miliani","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,2]]},"reference":[{"key":"9575_CR1","doi-asserted-by":"crossref","unstructured":"Abdou, M., Kulmizev, A., Hill, F., Low, D. M., & S\u00f8gaard, A. (2019). Higher-order comparisons of sentence encoder representations. In Proceedings of EMNLP-IJCNLP 2019 (pp. 
5838\u20135845).","DOI":"10.18653\/v1\/D19-1593"},{"key":"9575_CR2","doi-asserted-by":"crossref","unstructured":"Abnar, S., Beinborn, L., Choenni, R., & Zuidema, W. (2019). Blackbox Meets Blackbox: Representational similarity & stability analysis of neural language models and brains. In Proceedings of the Second BlackboxNLP Workshop (pp. 191\u2013203).","DOI":"10.18653\/v1\/W19-4820"},{"key":"9575_CR3","first-page":"107","volume":"6","author":"M Antoniak","year":"2018","unstructured":"Antoniak, M., & Mimno, D. (2018). Evaluating the stability of embedding-based word similarities. Transactions of the ACL, 6, 107\u2013119.","journal-title":"Transactions of the ACL"},{"issue":"4","key":"9575_CR4","doi-asserted-by":"publisher","first-page":"673","DOI":"10.1162\/coli_a_00016","volume":"36","author":"M Baroni","year":"2010","unstructured":"Baroni, M., & Lenci, A. (2010). Distributional memory: A general framework for corpus-based semantics. Computational Linguistics, 36(4), 673\u2013721.","journal-title":"Computational Linguistics"},{"key":"9575_CR5","unstructured":"Baroni, M., & Lenci, A. (2011). How we BLESSed distributional semantic evaluation. In Proceedings of the GEMS 2011 Workshop (pp. 1\u201310)."},{"key":"9575_CR6","doi-asserted-by":"crossref","unstructured":"Baroni, M., Dinu, G., & Kruszewski, G. (2014). Don\u2019t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of ACL 2014 (pp. 238\u2013247).","DOI":"10.3115\/v1\/P14-1023"},{"key":"9575_CR7","first-page":"993","volume":"3","author":"DM Blei","year":"2003","unstructured":"Blei, D. M., Ng, A., & Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993\u20131022.","journal-title":"Journal of Machine Learning Research"},{"key":"9575_CR8","first-page":"135","volume":"5","author":"P Bojanowski","year":"2017","unstructured":"Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). 
Enriching word vectors with subword information. Transactions of the ACL, 5, 135\u2013146.","journal-title":"Transactions of the ACL"},{"key":"9575_CR9","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1146\/annurev-linguistics-011619-030303","volume":"6","author":"G Boleda","year":"2020","unstructured":"Boleda, G. (2020). Distributional semantics and linguistic theory. Annual Review of Linguistics, 6, 213\u2013234.","journal-title":"Annual Review of Linguistics"},{"key":"9575_CR10","doi-asserted-by":"crossref","unstructured":"Bommasani, R., Davis, K., & Cardie, C. (2020). Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings. In Proceedings of ACL 2020 (pp. 4758\u20134781).","DOI":"10.18653\/v1\/2020.acl-main.431"},{"issue":"2","key":"9575_CR11","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1162\/artl.2006.12.2.229","volume":"12","author":"H Brighton","year":"2006","unstructured":"Brighton, H., & Kirby, S. (2006). Understanding linguistic evolution by visualizing the emergence of topographic mappings. Artificial Life, 12(2), 229\u2013242.","journal-title":"Artificial Life"},{"key":"9575_CR12","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1162\/coli.2006.32.1.13","volume":"32","author":"A Budanitsky","year":"2006","unstructured":"Budanitsky, A., & Hirst, G. (2006). Evaluating WordNet-based measures of lexical semantic relatedness. Computational Linguistics, 32, 13\u201347.","journal-title":"Computational Linguistics"},{"key":"9575_CR13","doi-asserted-by":"publisher","first-page":"510","DOI":"10.3758\/BF03193020","volume":"39","author":"J Bullinaria","year":"2007","unstructured":"Bullinaria, J., & Levy, J. P. (2007). Extracting semantic representations from word co-occurrence statistics: A computational study. 
Behavior Research Methods, 39, 510\u2013526.","journal-title":"Behavior Research Methods"},{"key":"9575_CR14","doi-asserted-by":"publisher","first-page":"890","DOI":"10.3758\/s13428-011-0183-8","volume":"44","author":"J Bullinaria","year":"2012","unstructured":"Bullinaria, J., & Levy, J. P. (2012). Extracting semantic representations from word co-occurrence statistics: Stop-lists, stemming, and SVD. Behavior Research Methods, 44, 890\u2013907.","journal-title":"Behavior Research Methods"},{"key":"9575_CR15","doi-asserted-by":"publisher","first-page":"743","DOI":"10.1613\/jair.1.11259","volume":"63","author":"J Camacho-Collados","year":"2018","unstructured":"Camacho-Collados, J., & Pilehvar, M. T. (2018). From word to sense embeddings: A survey on vector representations of meaning. Journal of Artificial Intelligence Research, 63, 743\u2013788.","journal-title":"Journal of Artificial Intelligence Research"},{"key":"9575_CR16","unstructured":"Carlsson, F., Gyllensten, A. C., Gogoulou, E., Hellqvist, E. Y., & Sahlgren, M. (2021). Semantic re-tuning with contrastive tension. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=Ov_sMNau-PF."},{"key":"9575_CR17","unstructured":"Chersoni, E., Santus, E., Blache, P., & Lenci, A. (2017). Is structure necessary for modeling argument expectations in distributional semantics? In Proceedings of 12th International Conference on Computational Semantics (IWCS 2017)."},{"key":"9575_CR18","doi-asserted-by":"crossref","unstructured":"Chiu, B., Korhonen, A., & Pyysalo, S. (2016). Intrinsic evaluation of word vectors fails to predict extrinsic performance. In Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP (pp. 1\u20136).","DOI":"10.18653\/v1\/W16-2501"},{"key":"9575_CR19","doi-asserted-by":"crossref","unstructured":"Chronis, G., & Erk, K. (2020). When is a bishop not like a rook? When it\u2019s like a rabbi! 
Multi-prototype BERT embeddings for estimating semantic relationships. In Proceedings of CoNLL 2020 (pp 227\u2013244).","DOI":"10.18653\/v1\/2020.conll-1.17"},{"key":"9575_CR20","first-page":"2952","volume":"2019","author":"G Chrupa\u0142a","year":"2019","unstructured":"Chrupa\u0142a, G., & Alishahi, A. (2019). Correlating neural and symbolic representations of language. Proceedings of ACL 2019 (pp. 2952\u20132962).","journal-title":"Proceedings of ACL"},{"issue":"1","key":"9575_CR21","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1017\/S1351324916000334","volume":"23","author":"KW Church","year":"2017","unstructured":"Church, K. W. (2017). Emerging trends: Word2Vec. Natural Language Engineering, 23(1), 155\u2013162.","journal-title":"Natural Language Engineering"},{"key":"9575_CR22","unstructured":"Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT 2019 (pp. 4171\u20134186)."},{"key":"9575_CR23","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1017\/S0140525X98001253","volume":"21","author":"S Edelman","year":"1998","unstructured":"Edelman, S. (1998). Representation is representation of similarities. Behavioral and Brain Sciences, 21, 449\u2013498.","journal-title":"Behavioral and Brain Sciences"},{"key":"9575_CR24","first-page":"92","volume":"2010","author":"K Erk","year":"2010","unstructured":"Erk, K., & Pad\u00f3, S. (2010). Exemplar-based models for word meaning in context. Proceedings of ACL 2010 (pp. 92\u201397).","journal-title":"Proceedings of ACL"},{"key":"9575_CR25","doi-asserted-by":"crossref","unstructured":"Ethayarajh, K. (2019). How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings. In Proceedings of EMNLP-IJCNLP 2019 (pp. 
55\u201365).","DOI":"10.18653\/v1\/D19-1006"},{"key":"9575_CR26","unstructured":"Ghannay, S., Favre, B., Est\u00e8ve, Y., & Camelin, N. (2016). Word embedding evaluation and combination. In Proceedings of LREC 2016, Portoro\u017e (pp. 300\u2013305)."},{"key":"9575_CR27","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-02165-7","volume-title":"Neural network methods for natural language processing","author":"Y Goldberg","year":"2017","unstructured":"Goldberg, Y. (2017). Neural network methods for natural language processing. Morgan & Claypool."},{"key":"9575_CR28","unstructured":"Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. The MIT Press."},{"issue":"2","key":"9575_CR29","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1037\/0033-295X.114.2.211","volume":"114","author":"TL Griffiths","year":"2007","unstructured":"Griffiths, T. L., Tenenbaum, J., & Steyvers, M. (2007). Topics in semantic representation. Psychological Review, 114(2), 211\u2013244.","journal-title":"Psychological Review"},{"key":"9575_CR30","doi-asserted-by":"crossref","unstructured":"Han, X., Zhang, Z., Ding, N., Gu, Y., Liu, X., Huo, Y., Qiu, J., Zhang, L., Han, W., Huang, M., Jin, Q., Lan, Y., Liu, Y., Liu, Z., Lu, Z., Qiu, X., Song, R., Tang, J., Wen, J. R., Yuan, J., Zhao, W. X., & Zhu, J. (2021). Pre-trained models: Past, present and future. arXiv arXiv:2106.07139.","DOI":"10.1016\/j.aiopen.2021.08.002"},{"issue":"2\u20133","key":"9575_CR31","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1080\/00437956.1954.11659520","volume":"10","author":"ZS Harris","year":"1954","unstructured":"Harris, Z. S. (1954). Distributional structure. Word, 10(2\u20133), 146\u2013162.","journal-title":"Word"},{"key":"9575_CR32","unstructured":"Jastrz\u0119bski, S., Le\u015bniak, D., & Czarnecki, W. M. (2017). How to evaluate word embeddings? On importance of data efficiency and simple supervised tasks. 
arXiv arXiv:1702.02170."},{"key":"9575_CR33","doi-asserted-by":"crossref","unstructured":"Jawahar, G., Sagot, B., & Seddah, D. (2019). What Does BERT Learn about the Structure of Language? In Proceedings of ACL 2019 (pp. 3651\u20133657).","DOI":"10.18653\/v1\/P19-1356"},{"key":"9575_CR34","unstructured":"Kanerva, P., Kristofersson, J., & Holst, A. (2000). Random indexing of text samples for latent semantic analysis. In Proceedings of CogSci 2000 (pp. 1036)."},{"key":"9575_CR35","doi-asserted-by":"crossref","unstructured":"Kiela, D., & Clark, S. (2014). A systematic study of semantic vector space model parameters. In Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (pp. 21\u201330).","DOI":"10.3115\/v1\/W14-1503"},{"key":"9575_CR36","doi-asserted-by":"crossref","unstructured":"Kim, Y. (2014). Convolutional neural networks for sentence classification. In Proceedings of EMNLP 2014 (pp. 1746\u20131751).","DOI":"10.3115\/v1\/D14-1181"},{"issue":"8","key":"9575_CR37","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1016\/j.tics.2013.06.007","volume":"17","author":"N Kriegeskorte","year":"2013","unstructured":"Kriegeskorte, N., & Kievit, R. A. (2013). Representational geometry: Integrating cognition, computation, and the brain. Trends in Cognitive Sciences, 17(8), 401\u2013412.","journal-title":"Trends in Cognitive Sciences"},{"key":"9575_CR38","doi-asserted-by":"crossref","first-page":"4","DOI":"10.3389\/neuro.01.016.2008","volume":"2","author":"N Kriegeskorte","year":"2008","unstructured":"Kriegeskorte, N., Mur, M., & Bandettini, P. (2008). Representational similarity analysis: Connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4.","journal-title":"Frontiers in Systems Neuroscience"},{"issue":"2","key":"9575_CR39","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1037\/0033-295X.104.2.211","volume":"104","author":"TK Landauer","year":"1997","unstructured":"Landauer, T. 
K., & Dumais, S. (1997). A solution to Plato\u2019s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211\u2013240.","journal-title":"Psychological Review"},{"key":"9575_CR40","first-page":"531","volume":"2","author":"G Lapesa","year":"2014","unstructured":"Lapesa, G., & Evert, S. (2014). A large scale evaluation of distributional semantic models: Parameters, interactions and model selection. Transactions of the ACL, 2, 531\u2013545.","journal-title":"Transactions of the ACL"},{"key":"9575_CR41","first-page":"394","volume":"2017","author":"G Lapesa","year":"2017","unstructured":"Lapesa, G., & Evert, S. (2017). Large-scale evaluation of dependency-based dsms: Are they worth the effort? Proceedings of EACL, 2017 (pp. 394\u2013400).","journal-title":"Proceedings of EACL"},{"issue":"1","key":"9575_CR42","first-page":"1","volume":"20","author":"A Lenci","year":"2008","unstructured":"Lenci, A. (2008). Distributional approaches in linguistic and cognitive research. Italian Journal of Linguistics, 20(1), 1\u201331.","journal-title":"Italian Journal of Linguistics"},{"key":"9575_CR43","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1146\/annurev-linguistics-030514-125254","volume":"4","author":"A Lenci","year":"2018","unstructured":"Lenci, A. (2018). Distributional models of word meaning. Annual Review of Linguistics, 4, 151\u2013171.","journal-title":"Annual Review of Linguistics"},{"key":"9575_CR44","first-page":"302","volume":"2014","author":"O Levy","year":"2014","unstructured":"Levy, O., & Goldberg, Y. (2014a). Dependency-based word embeddings. Proceedings of ACL 2014 (pp. 302\u2013308).","journal-title":"Proceedings of ACL"},{"key":"9575_CR45","first-page":"171","volume":"2014","author":"O Levy","year":"2014","unstructured":"Levy, O., & Goldberg, Y. (2014b). Linguistic regularities in sparse and explicit word representations. Proceedings of CoNLL 2014 (pp. 
171\u2013180).","journal-title":"Proceedings of CoNLL"},{"key":"9575_CR46","unstructured":"Levy, O., & Goldberg, Y. (2014c). Neural word embedding as implicit matrix factorization. In Proceedings of Advances in Neural Information Processing Systems (NIPS) (pp. 1\u20139)."},{"key":"9575_CR47","first-page":"211","volume":"3","author":"O Levy","year":"2015","unstructured":"Levy, O., Goldberg, Y., & Dagan, I. (2015). Improving distributional similarity with lessons learned from word embeddings. Transactions of the ACL, 3, 211\u2013225.","journal-title":"Transactions of the ACL"},{"key":"9575_CR48","doi-asserted-by":"crossref","unstructured":"Li, B., Tao, L., Zhao, Z., Tang, B., Drozd, A., Rogers, A., & Du, X. (2017). Investigating different context types and representations for learning word embeddings. In Proceedings of EMNLP 2017 (pp. 2411\u20132421).","DOI":"10.18653\/v1\/D17-1257"},{"key":"9575_CR49","doi-asserted-by":"crossref","unstructured":"Linzen, T. (2016). Issues in evaluating semantic spaces using word analogies. In 1st Workshop on Evaluating Vector Space Representations for NLP (pp. 13\u201318).","DOI":"10.18653\/v1\/W16-2503"},{"key":"9575_CR50","unstructured":"Liu, Q., Kusner, M. J., & Blunsom, P. (2020). A survey on contextual embeddings. arXiv arXiv:2003.07278."},{"key":"9575_CR51","doi-asserted-by":"publisher","first-page":"203","DOI":"10.3758\/BF03204766","volume":"28","author":"K Lund","year":"1996","unstructured":"Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28, 203\u2013208.","journal-title":"Behavior Research Methods, Instruments, & Computers"},{"key":"9575_CR52","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/j.jml.2016.04.001","volume":"92","author":"P Mandera","year":"2017","unstructured":"Mandera, P., Keuleers, E., & Brysbaert, M. (2017). 
Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language, 92, 57\u201378.","journal-title":"Journal of Memory and Language"},{"key":"9575_CR53","first-page":"55","volume":"2014","author":"CD Manning","year":"2014","unstructured":"Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J., & McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. Proceedings of ACL 2014 (pp. 55\u201360).","journal-title":"Proceedings of ACL"},{"issue":"2","key":"9575_CR54","doi-asserted-by":"publisher","first-page":"254","DOI":"10.1037\/0033-295X.100.2.254","volume":"100","author":"DL Medin","year":"1993","unstructured":"Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for similarity. Psychological Review, 100(2), 254\u2013278.","journal-title":"Psychological Review"},{"key":"9575_CR55","doi-asserted-by":"crossref","unstructured":"Melamud, O., McClosky, D., Patwardhan, S., & Bansal, M. (2016). The role of context types and dimensionality in learning word embeddings. In Proceedings of NAACL-HLT 2016 (pp. 1030\u20131040).","DOI":"10.18653\/v1\/N16-1118"},{"key":"9575_CR56","unstructured":"Mickus, T., Paperno, D., Constant, M., & van Deemter, K. (2020). What do you mean, BERT? Assessing BERT as a distributional semantics model. In Proceedings of the Society for Computation in Linguistics 2020 (pp. 235\u2013245)."},{"key":"9575_CR57","unstructured":"Mikolov, T., Chen, K., Corrado, G.S., & Dean, J. (2013a). Efficient estimation of word representations in vector space. In Proceedings of ICLR 2013."},{"key":"9575_CR58","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., & Dean, J. (2013b). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26 (NIPS 2013) (pp. 
3111\u20133119)."},{"key":"9575_CR59","unstructured":"Mikolov, T., Yih, W.-t., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations. In Proceedings of NAACL-HLT 2013 (pp. 746\u2013751)."},{"key":"9575_CR60","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/1602.001.0001","volume-title":"The Big Book of Concepts","author":"G Murphy","year":"2002","unstructured":"Murphy, G. (2002). The Big Book of Concepts. The MIT Press."},{"issue":"2","key":"9575_CR61","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1162\/coli.2007.33.2.161","volume":"33","author":"S Pad\u00f3","year":"2007","unstructured":"Pad\u00f3, S., & Lapata, M. (2007). Dependency-based construction of semantic space models. Computational Linguistics, 33(2), 161\u2013199.","journal-title":"Computational Linguistics"},{"key":"9575_CR62","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global Vectors for Word Representation. In Proceedings of EMNLP 2014 (pp. 1532\u20131543).","DOI":"10.3115\/v1\/D14-1162"},{"key":"9575_CR63","doi-asserted-by":"crossref","unstructured":"Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In Proceedings of NAACL-HLT (pp. 2227\u20132237).","DOI":"10.18653\/v1\/N18-1202"},{"key":"9575_CR64","doi-asserted-by":"publisher","first-page":"104440","DOI":"10.1016\/j.cognition.2020.104440","volume":"205","author":"JC Peterson","year":"2020","unstructured":"Peterson, J. C., Chen, D., & Griffiths, T. L. (2020). Parallelograms revisited: Exploring the limitations of vector space models for simple analogies. Cognition, 205, 104440.","journal-title":"Cognition"},{"key":"9575_CR65","unstructured":"Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. OpenAI: Tech. 
rep."},{"key":"9575_CR66","unstructured":"Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI: Tech. rep."},{"key":"9575_CR67","unstructured":"\u0158eh\u016f\u0159ek, R., & Sojka, P. (2010). Software framework for topic modelling with large corpora. In Proceedings of LREC 2010 Workshop New Challenges for NLP Frameworks (pp. 45\u201350)."},{"key":"9575_CR68","unstructured":"Ren, Y., Guo, S., Labeau, M., Cohen, S.B., & Kirby, S. (2020). Compositional languages emerge in a neural iterated learning model. In Proceedings of ICLR 2020 (pp. 1\u201322)."},{"key":"9575_CR69","doi-asserted-by":"crossref","unstructured":"Rogers, A., Drozd, A., & Li, B. (2017). The (Too Many) Problems of Analogical Reasoning with Word Vectors. In Proceedings of *SEM 2017 (pp. 135\u2013148).","DOI":"10.18653\/v1\/S17-1017"},{"key":"9575_CR70","unstructured":"Rogers, A., Ananthakrishna, S. H., & Rumshisky, A. (2018). What\u2019s in your embedding, and how it predicts task performance. In Proceedings of COLING 2018 (pp. 2690\u20132703)."},{"key":"9575_CR71","unstructured":"Sahlgren, M. (2006). The word-space model: Using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. PhD thesis, Stockholm University."},{"issue":"1","key":"9575_CR72","first-page":"31","volume":"20","author":"M Sahlgren","year":"2008","unstructured":"Sahlgren, M. (2008). The distributional hypothesis. Italian Journal of Linguistics, 20(1), 31\u201351.","journal-title":"Italian Journal of Linguistics"},{"key":"9575_CR73","doi-asserted-by":"crossref","unstructured":"Sahlgren, M., & Lenci, A. (2016). The effects of data size and frequency range on distributional semantic models. In Proceedings of EMNLP 2016 (pp. 
975\u2013980).","DOI":"10.18653\/v1\/D16-1099"},{"key":"9575_CR74","first-page":"1300","volume":"2008","author":"M Sahlgren","year":"2008","unstructured":"Sahlgren, M., Holst, A., & Kanerva, P. (2008). Permutations as a means to encode order in word space. Proceedings of CogSci, 2008 (pp. 1300\u20131305).","journal-title":"Proceedings of CogSci"},{"key":"9575_CR75","unstructured":"Sahlgren, M., Gyllensten, A. C., Espinoza, F., Hamfors, O., Karlgren, J., Olsson, F., Persson, P., Viswanathan, A., & Holst, A. (2016). The Gavagai Living Lexicon. In Proceedings of LREC 2016 (pp. 344\u2013350)."},{"issue":"11","key":"9575_CR76","doi-asserted-by":"publisher","first-page":"613","DOI":"10.1145\/361219.361220","volume":"18","author":"G Salton","year":"1975","unstructured":"Salton, G., Wong, A., & Yang, C. S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11), 613\u2013620.","journal-title":"Communications of the ACM"},{"key":"9575_CR77","doi-asserted-by":"crossref","unstructured":"Schluter, N. (2018). The word analogy testing caveat. In Proceedings of NAACL-HLT (pp. 242\u2013246).","DOI":"10.18653\/v1\/N18-2039"},{"key":"9575_CR78","doi-asserted-by":"crossref","unstructured":"Schnabel, T., Labutov, I., Mimno, D., & Joachims, T. (2015). Evaluation methods for unsupervised word embeddings. In Proceedings of EMNLP 2015 (pp. 298\u2013307).","DOI":"10.18653\/v1\/D15-1036"},{"key":"9575_CR79","doi-asserted-by":"crossref","unstructured":"Tenney, I., Das, D., & Pavlick, E. (2019). BERT Rediscovers the Classical NLP Pipeline. In Proceedings of ACL 2019 (pp. 4593\u20134601).","DOI":"10.18653\/v1\/P19-1452"},{"key":"9575_CR80","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1613\/jair.2934","volume":"37","author":"PD Turney","year":"2010","unstructured":"Turney, P. D., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. 
Journal of Artificial Intelligence Research, 37, 141\u2013188.","journal-title":"Journal of Artificial Intelligence Research"},{"key":"9575_CR81","doi-asserted-by":"crossref","unstructured":"Ushio, A., Espinosa-Anke, L., Schockaert, S., & Camacho-Collados, J. (2021). BERT is to NLP what AlexNet is to CV: Can pre-trained language models identify analogies? arXiv arXiv:2105.04949.","DOI":"10.18653\/v1\/2021.acl-long.280"},{"key":"9575_CR82","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, \u0141., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems 30 (NIPS 2017)."},{"key":"9575_CR83","doi-asserted-by":"crossref","unstructured":"Vuli\u0107, I., Ponti, E.M., Litschko, R., Glava\u0161, G., & Korhonen, A. (2020). Probing pretrained language models for lexical semantics. In Proceedings of EMNLP  (pp. 7222\u20137240).","DOI":"10.18653\/v1\/2020.emnlp-main.586"},{"key":"9575_CR84","doi-asserted-by":"crossref","unstructured":"Westera, M., & Boleda, G. (2019). Don\u2019t Blame Distributional Semantics if it can\u2019t do Entailment. In Proceedings of the 13th International Conference on Computational Semantics (pp. 120\u2013133).","DOI":"10.18653\/v1\/W19-0410"},{"key":"9575_CR85","unstructured":"Wiedemann, G., Remus, S., Chawla, A., & Biemann, C. (2019). Does BERT make any sense? Interpretable word sense disambiguation with contextualized embeddings. In Proceedings of the Conference on Natural Language Processing (KONVENS)."},{"key":"9575_CR86","series-title":"Practical machine learning tools and techniques","volume-title":"Data mining","author":"IH Witten","year":"2005","unstructured":"Witten, I. H., & Frank, E. (2005). Data mining. Practical machine learning tools and techniques (2nd ed.). 
Elsevier.","edition":"2"},{"key":"9575_CR87","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Funtowicz, M., Davison, J., Shleifer, S., Von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Le Scao, T., Gugger, S., Drame, M., Lhoest, Q., & Rush, A. M. (2020). Transformers: State-of-the-Art Natural Language Processing. In Proceedings of EMNLP 2020 (pp. 38\u201345).","DOI":"10.18653\/v1\/2020.emnlp-demos.6"}],"container-title":["Language Resources and Evaluation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-021-09575-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10579-021-09575-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-021-09575-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,31]],"date-time":"2022-10-31T13:22:54Z","timestamp":1667222574000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10579-021-09575-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,2]]},"references-count":87,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["9575"],"URL":"https:\/\/doi.org\/10.1007\/s10579-021-09575-z","relation":{},"ISSN":["1574-020X","1574-0218"],"issn-type":[{"value":"1574-020X","type":"print"},{"value":"1574-0218","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,2]]},"assertion":[{"value":"22 December 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 March 
2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 July 2022","order":3,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":4,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Missing Open Access funding information has been added in the Funding Note","order":5,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}}]}}