{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,20]],"date-time":"2026-05-20T17:01:13Z","timestamp":1779296473160,"version":"3.51.4"},"reference-count":145,"publisher":"MIT Press - Journals","issue":"1","license":[{"start":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T00:00:00Z","timestamp":1614988800000},"content-version":"vor","delay-in-days":5,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,4,21]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Research into representation learning models of lexical semantics usually utilizes some form of intrinsic evaluation to ensure that the learned representations reflect human semantic judgments. Lexical semantic similarity estimation is a widely used evaluation method, but efforts have typically focused on pairwise judgments of words in isolation, or are limited to specific contexts and lexical stimuli. There are limitations with these approaches that either do not provide any context for judgments, and thereby ignore ambiguity, or provide very specific sentential contexts that cannot then be used to generate a larger lexical resource. Furthermore, similarity between more than two items is not considered. We provide a full description and analysis of our recently proposed methodology for large-scale data set construction that produces a semantic classification of a large sample of verbs in the first phase, as well as multi-way similarity judgments made within the resultant semantic classes in the second phase. The methodology uses a spatial multi-arrangement approach proposed in the field of cognitive neuroscience for capturing multi-way similarity judgments of visual stimuli. 
We have adapted this method to handle polysemous linguistic stimuli and much larger samples than previous work. We specifically target verbs, but the method can equally be applied to other parts of speech. We perform cluster analysis on the data from the first phase and demonstrate how this might be useful in the construction of a comprehensive verb resource. We also analyze the semantic information captured by the second phase and discuss the potential of the spatially induced similarity judgments to better reflect human notions of word similarity. We demonstrate how the resultant data set can be used for fine-grained analyses and evaluation of representation learning models on the intrinsic tasks of semantic clustering and semantic similarity. In particular, we find that stronger static word embedding methods still outperform lexical representations emerging from more recent pre-training methods, both on word-level similarity and clustering. Moreover, thanks to the data set\u2019s vast coverage, we are able to compare the benefits of specializing vector representations for a particular type of external knowledge by evaluating FrameNet- and VerbNet-retrofitted models on specific semantic domains such as \u201cHeat\u201d or \u201cMotion.\u201d<\/jats:p>","DOI":"10.1162\/coli_a_00396","type":"journal-article","created":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T18:59:47Z","timestamp":1614970787000},"page":"69-116","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":7,"title":["Semantic Data Set Construction from Human Clustering and Spatial Arrangement"],"prefix":"10.1162","volume":"47","author":[{"given":"Olga","family":"Majewska","sequence":"first","affiliation":[{"name":"Language Technology Lab, University of Cambridge. 
om304@cam.ac.uk"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Diana","family":"McCarthy","sequence":"additional","affiliation":[{"name":"Language Technology Lab, University of Cambridge. diana@dianamccarthy.co.uk"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jasper J. F.","family":"van den Bosch","sequence":"additional","affiliation":[{"name":"School of Psychology, University of Birmingham. vandejjf@bham.ac.uk"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nikolaus","family":"Kriegeskorte","sequence":"additional","affiliation":[{"name":"Zuckerman Institute, University of Columbia. nk2765@columbia.edu"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ivan","family":"Vuli\u0107","sequence":"additional","affiliation":[{"name":"Language Technology Lab, University of Cambridge. iv250@cam.ac.uk"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anna","family":"Korhonen","sequence":"additional","affiliation":[{"name":"Language Technology Lab, University of Cambridge. 
alk23@cam.ac.uk"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","published-online":{"date-parts":[[2021,4,21]]},"reference":[{"key":"2021042218045184000_bib1","first-page":"19","article-title":"A study on similarity and relatedness using distributional and WordNet-based approaches","volume-title":"Proceedings of NAACL-HLT","author":"Agirre","year":"2009"},{"issue":"3","key":"2021042218045184000_bib2","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.tjem.2018.08.001","article-title":"User\u2019s guide to correlation coefficients","volume":"18","author":"Akoglu","year":"2018","journal-title":"Turkish Journal of Emergency Medicine"},{"key":"2021042218045184000_bib3","first-page":"183","article-title":"Polyglot: Distributed word representations for multilingual NLP","volume-title":"Proceedings of CoNLL","author":"Al-Rfou","year":"2013"},{"issue":"3","key":"2021042218045184000_bib4","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1016\/S0010-0277(99)00059-1","article-title":"Incremental interpretation at verbs: Restricting the domain of subsequent reference","volume":"73","author":"Altmann","year":"1999","journal-title":"Cognition"},{"issue":"4","key":"2021042218045184000_bib5","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1007\/s10791-008-9066-8","article-title":"A comparison of extrinsic clustering evaluation metrics based on formal constraints","volume":"12","author":"Amig\u00f3","year":"2009","journal-title":"Information Retrieval"},{"key":"2021042218045184000_bib6","first-page":"5878","article-title":"CoSimLex: A resource for evaluating graded word similarity in context","volume-title":"Proceedings of the 12th Language Resources and Evaluation Conference","author":"Armendariz","year":"2020"},{"key":"2021042218045184000_bib7","first-page":"505","article-title":"Big BiRD: A large, fine-grained, bigram relatedness dataset for examining semantic composition","volume-title":"Proceedings of 
NAACL-HLT","author":"Asaadi","year":"2019"},{"key":"2021042218045184000_bib8","first-page":"106","article-title":"Improving reliability of word similarity evaluation by redesigning annotation task and performance measure","volume-title":"Proceedings of REPEVAL","author":"Avraham","year":"2016"},{"key":"2021042218045184000_bib9","first-page":"563","article-title":"Algorithms for scoring coreference chains","volume-title":"Proceedings of LREC","author":"Bagga","year":"1998"},{"key":"2021042218045184000_bib10","first-page":"86","article-title":"The Berkeley FrameNet project","volume-title":"Proceedings of COLING","author":"Baker","year":"1998"},{"key":"2021042218045184000_bib11","first-page":"278","article-title":"An unsupervised model for instance level subcategorization acquisition","volume-title":"Proceedings of EMNLP","author":"Baker","year":"2014"},{"key":"2021042218045184000_bib12","first-page":"238","article-title":"Don\u2019t count, predict! A systematic comparison of context-counting vs. 
context-predicting semantic vectors","volume-title":"Proceedings of ACL","author":"Baroni","year":"2014"},{"key":"2021042218045184000_bib13","first-page":"1","article-title":"How we BLESSed distributional semantic evaluation","volume-title":"Proceedings of the GEMS 2011 Workshop on Geometrical Models of Natural Language Semantics, EMNLP 2011","author":"Baroni","year":"2011"},{"key":"2021042218045184000_bib14","first-page":"7","article-title":"A critique of word similarity as a method for evaluating distributional semantic models","volume-title":"Proceedings of REPEVAL","author":"Batchkarov","year":"2016"},{"key":"2021042218045184000_bib15","article-title":"Automated generation of multilingual clusters for the evaluation of distributed representations","volume-title":"Proceedings of ICLR Workshop Papers","author":"Blair","year":"2017"},{"key":"2021042218045184000_bib16","first-page":"135","article-title":"Enriching word vectors with subword information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Transactions of the ACL"},{"issue":"2","key":"2021042218045184000_bib17","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1080\/0022250X.2001.9990249","article-title":"A faster algorithm for betweenness centrality","volume":"25","author":"Brandes","year":"2001","journal-title":"Journal of Mathematical Sociology"},{"key":"2021042218045184000_bib18","first-page":"117","article-title":"Spectral clustering for German verbs","volume-title":"Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)","author":"Brew","year":"2002"},{"key":"2021042218045184000_bib19","first-page":"136","article-title":"Distributional semantics in technicolor","volume-title":"Proceedings of ACL","author":"Bruni","year":"2012"},{"key":"2021042218045184000_bib20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1613\/jair.4135","article-title":"Multimodal distributional 
semantics","volume":"49","author":"Bruni","year":"2014","journal-title":"Journal of Artificial Intelligence Research"},{"key":"2021042218045184000_bib21","first-page":"29","article-title":"Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures","volume-title":"Proceedings of the Workshop on WordNet and Other Lexical Resources","author":"Budanitsky","year":"2001"},{"issue":"1","key":"2021042218045184000_bib22","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1162\/coli.2006.32.1.13","article-title":"Evaluating WordNet-based measures of lexical semantic relatedness","volume":"32","author":"Budanitsky","year":"2006","journal-title":"Computational Linguistics"},{"key":"2021042218045184000_bib23","article-title":"Multilingual alignment of contextual word representations","volume-title":"International Conference on Learning Representations","author":"Cao","year":"2020"},{"issue":"6","key":"2021042218045184000_bib24","doi-asserted-by":"crossref","first-page":"1047","DOI":"10.3758\/MC.36.6.1047","article-title":"Similarity and proximity: When does close in space mean close in mind?","volume":"36","author":"Casasanto","year":"2008","journal-title":"Memory & Cognition"},{"issue":"12","key":"2021042218045184000_bib25","first-page":"614","article-title":"Biostatistics 104: correlational analysis","volume":"44","author":"Chan","year":"2003","journal-title":"Singapore Medical Journal"},{"issue":"40","key":"2021042218045184000_bib26","doi-asserted-by":"crossref","first-page":"14565","DOI":"10.1073\/pnas.1402594111","article-title":"Unique semantic space in the brain of each beholder predicts perceived similarity","volume":"111","author":"Charest","year":"2014","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"1","key":"2021042218045184000_bib27","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/0093-934X(90)90103-N","article-title":"Semantic and associative priming in the cerebral 
hemispheres: Some words do, some words don\u2019t\u2026sometimes, some places","volume":"38","author":"Chiarello","year":"1990","journal-title":"Brain and Language"},{"key":"2021042218045184000_bib28","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1016\/j.neuroimage.2019.03.031","article-title":"The spatiotemporal neural dynamics underlying perceived similarity for real-world objects","volume":"194","author":"Cichy","year":"2019","journal-title":"NeuroImage"},{"key":"2021042218045184000_bib29","doi-asserted-by":"crossref","first-page":"6022","DOI":"10.18653\/v1\/2020.acl-main.536","article-title":"Emerging cross-lingual structure in pretrained language models","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Conneau","year":"2020"},{"key":"2021042218045184000_bib30","volume-title":"Lexical Semantics","author":"Cruse","year":"1986"},{"key":"2021042218045184000_bib31","first-page":"924","article-title":"Sentiment lexica from paired comparisons","volume-title":"Proceedings of ICDM","author":"Dalitz","year":"2016"},{"issue":"1","key":"2021042218045184000_bib32","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1007\/BF01890115","article-title":"Efficient algorithms for agglomerative hierarchical clustering methods","volume":"1","author":"Day","year":"1984","journal-title":"Journal of Classification"},{"key":"2021042218045184000_bib33","first-page":"4171","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2021042218045184000_bib34","first-page":"226","article-title":"A density-based algorithm for discovering clusters in large spatial databases with noise","volume-title":"Proceedings of the Second International 
Conference on Knowledge Discovery and Data Mining (KDD)","author":"Ester","year":"1996"},{"key":"2021042218045184000_bib35","first-page":"249","article-title":"Thematic thinking: The apprehension and consequences of thematic relations","volume-title":"Psychology of Learning and Motivation","author":"Estes","year":"2011"},{"key":"2021042218045184000_bib36","first-page":"854","article-title":"Classifying French verbs using French and English lexical resources","volume-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1","author":"Falk","year":"2012"},{"key":"2021042218045184000_bib37","first-page":"464","article-title":"Non-distributional word vector representations","volume-title":"Proceedings of NAACL-HLT","author":"Faruqui","year":"2015"},{"key":"2021042218045184000_bib38","first-page":"30","article-title":"Problems with evaluation of word embeddings using word similarity tasks","volume-title":"Proceedings of REPEVAL","author":"Faruqui","year":"2016"},{"key":"2021042218045184000_bib39","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/7287.001.0001","volume-title":"WordNet: An Electronic Lexical Database","author":"Fellbaum","year":"1998"},{"key":"2021042218045184000_bib40","first-page":"20","article-title":"Frame semantics and the nature of language","volume-title":"Annals of the New York Academy of Sciences: Conference on the Origin and Development of Language and Speech","author":"Fillmore","year":"1976"},{"key":"2021042218045184000_bib41","first-page":"5","article-title":"The need for a frame semantics in linguistics","volume-title":"Statistical Methods in Linguistics","author":"Fillmore","year":"1977"},{"key":"2021042218045184000_bib42","first-page":"111","article-title":"Frame semantics","volume-title":"Linguistics in the Morning 
Calm","author":"Fillmore","year":"1982"},{"issue":"1","key":"2021042218045184000_bib43","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1145\/503104.503110","article-title":"Placing search in context: The concept revisited","volume":"20","author":"Finkelstein","year":"2002","journal-title":"ACM Transactions on Information Systems"},{"key":"2021042218045184000_bib44","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/1-4020-3232-3_5","article-title":"Prepositions and results in Italian and English: An analysis from event decomposition","volume-title":"Perspectives on Aspect","author":"Folli","year":"2005"},{"issue":"1\u20132","key":"2021042218045184000_bib45","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1006\/brln.1999.2079","article-title":"Semantic features and semantic categories: Differences in rapid activation of the lexicon","volume":"68","author":"Frenck-Mestre","year":"1999","journal-title":"Brain and Language"},{"key":"2021042218045184000_bib46","volume-title":"Conceptual Spaces: The Geometry of Thought","author":"G\u00e4rdenfors","year":"2004"},{"issue":"2","key":"2021042218045184000_bib47","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1207\/s15516709cog0702_3","article-title":"Structure-mapping: A theoretical framework for analogy","volume":"7","author":"Gentner","year":"1983","journal-title":"Cognitive Science"},{"key":"2021042218045184000_bib48","first-page":"2173","article-title":"SimVerb-3500: A large-scale evaluation set of verb similarity","volume-title":"Proceedings of EMNLP","author":"Gerz","year":"2016"},{"key":"2021042218045184000_bib49","first-page":"36","article-title":"Intrinsic evaluations of word embeddings: What can we do better?","volume-title":"Proceedings of REPEVAL","author":"Gladkova","year":"2016"},{"key":"2021042218045184000_bib50","first-page":"8","article-title":"Analogy-based detection of morphological and semantic relations with word embeddings: What works and what 
doesn\u2019t","volume-title":"Proceedings of the NAACL Student Research Workshop","author":"Gladkova","year":"2016"},{"issue":"4","key":"2021042218045184000_bib51","doi-asserted-by":"crossref","first-page":"381","DOI":"10.3758\/BF03204653","article-title":"An efficient method for obtaining similarity data","volume":"26","author":"Goldstone","year":"1994","journal-title":"Behavior Research Methods, Instruments, & Computers"},{"issue":"3\u20134","key":"2021042218045184000_bib52","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1093\/biomet\/53.3-4.325","article-title":"Some distance properties of latent root and vector methods used in multivariate analysis","volume":"53","author":"Gower","year":"1966","journal-title":"Biometrika"},{"key":"2021042218045184000_bib53","first-page":"3483","article-title":"Learning word vectors for 157 languages","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018)","author":"Grave","year":"2018"},{"key":"2021042218045184000_bib54","first-page":"4129","article-title":"A structural probe for finding syntax in word representations","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Hewitt","year":"2019"},{"issue":"4","key":"2021042218045184000_bib55","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1162\/COLI_a_00237","article-title":"SimLex-999: Evaluating semantic models with (genuine) similarity estimation","volume":"41","author":"Hill","year":"2015","journal-title":"Computational Linguistics"},{"issue":"6","key":"2021042218045184000_bib56","doi-asserted-by":"crossref","first-page":"1744","DOI":"10.3758\/s13423-016-1053-2","article-title":"The principals of meaning: Extracting semantic dimensions from co-occurrence models of 
semantics","volume":"23","author":"Hollis","year":"2016","journal-title":"Psychonomic Bulletin & Review"},{"issue":"1","key":"2021042218045184000_bib57","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1037\/a0028860","article-title":"The versatility of SpAM: A fast, efficient, spatial method of data collection for multidimensional scaling","volume":"142","author":"Hout","year":"2013","journal-title":"Journal of Experimental Psychology: General"},{"key":"2021042218045184000_bib58","first-page":"873","article-title":"Improving word representations via global context and multiple word prototypes","volume-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1","author":"Huang","year":"2012"},{"key":"2021042218045184000_bib59","volume-title":"Semantic Interpretation in Generative Grammar","author":"Jackendoff","year":"1972"},{"key":"2021042218045184000_bib60","first-page":"111","article-title":"Roget\u2019s thesaurus and semantic similarity","volume-title":"Recent Advances in Natural Language Processing III, Selected Papers from RANLP 2003","author":"Jarmasz","year":"2003"},{"key":"2021042218045184000_bib61","doi-asserted-by":"crossref","first-page":"3651","DOI":"10.18653\/v1\/P19-1356","article-title":"What does BERT learn about the structure of language?","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Jawahar","year":"2019"},{"key":"2021042218045184000_bib62","first-page":"290","article-title":"SemEval-2013 Task 13: Word sense induction for graded and non-graded senses","volume-title":"Proceedings of SEMEVAL","author":"Jurgens","year":"2013"},{"key":"2021042218045184000_bib63","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1007\/s10579-019-09452-w","article-title":"Capturing and measuring thematic relatedness","volume":"54","author":"Kacmajor","year":"2020","journal-title":"Language Resources and 
Evaluation"},{"key":"2021042218045184000_bib64","volume-title":"Roget\u2019s 21st Century Thesaurus (3rd Edition)","author":"Kipfer","year":"2009"},{"key":"2021042218045184000_bib65","first-page":"1027","article-title":"Extending VerbNet with novel verb classes","volume-title":"Proceedings of LREC","author":"Kipper","year":"2006"},{"key":"2021042218045184000_bib66","unstructured":"Kipper Schuler, Karin. 2005. VerbNet: A Broad-coverage, Comprehensive Verb Lexicon. Ph.D. thesis, University of Pennsylvania."},{"key":"2021042218045184000_bib67","first-page":"465","article-title":"Best-worst scaling more reliable than rating scales: A case study on sentiment intensity annotation","volume-title":"Proceedings of ACL","author":"Kiritchenko","year":"2017"},{"key":"2021042218045184000_bib68","first-page":"811","article-title":"Capturing reliable fine-grained sentiment associations by crowdsourcing and best\u2013worst scaling","volume-title":"Proceedings of NAACL-HLT","author":"Kiritchenko","year":"2016"},{"key":"2021042218045184000_bib69","first-page":"79","article-title":"Europarl: A parallel corpus for statistical machine translation","volume-title":"MT Summit","author":"Koehn","year":"2005"},{"key":"2021042218045184000_bib70","doi-asserted-by":"crossref","first-page":"245","DOI":"10.3389\/fpsyg.2012.00245","article-title":"Inverse MDS: Inferring dissimilarity structure from multiple item arrangements","volume":"3","author":"Kriegeskorte","year":"2012","journal-title":"Frontiers in Psychology"},{"issue":"4","key":"2021042218045184000_bib71","first-page":"1","article-title":"Representational similarity analysis\u2014Connecting the branches of systems neuroscience","volume":"2","author":"Kriegeskorte","year":"2008","journal-title":"Frontiers in Systems Neuroscience"},{"issue":"7","key":"2021042218045184000_bib72","doi-asserted-by":"crossref","first-page":"1956","DOI":"10.1007\/s11263-020-01316-z","article-title":"The Open Images Data set V4: Unified image 
classification, object detection, and visual relationship detection at scale","volume":"128","author":"Kuznetsova","year":"2020","journal-title":"International Journal of Computer Vision"},{"key":"2021042218045184000_bib73","volume-title":"Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought","author":"Lakoff","year":"1999"},{"issue":"2","key":"2021042218045184000_bib74","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1037\/0033-295X.104.2.211","article-title":"A solution to Plato\u2019s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge","volume":"104","author":"Landauer","year":"1997","journal-title":"Psychological Review"},{"issue":"18","key":"2021042218045184000_bib75","article-title":"Effects of high-order co-occurrences on word semantic similarity","volume":"1","author":"Lemaire","year":"2006","journal-title":"Current Psychology Letters. Behaviour, Brain & Cognition"},{"key":"2021042218045184000_bib76","volume-title":"English Verb Classes and Alternations: Preliminary Investigation","author":"Levin","year":"1993"},{"issue":"2","key":"2021042218045184000_bib77","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1037\/0022-3514.70.2.230","article-title":"Reasoning and the weighting of attributes in attitude judgments","volume":"70","author":"Levine","year":"1996","journal-title":"Journal of Personality and Social Psychology"},{"key":"2021042218045184000_bib78","first-page":"302","article-title":"Dependency-based word embeddings","volume-title":"Proceedings of ACL","author":"Levy","year":"2014"},{"issue":"1","key":"2021042218045184000_bib79","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1186\/1471-2105-9-398","article-title":"Modifying the DPClus algorithm for identifying protein complexes based on new topological structures","volume":"9","author":"Li","year":"2008","journal-title":"BMC 
Bioinformatics"},{"issue":"9","key":"2021042218045184000_bib80","doi-asserted-by":"crossref","first-page":"1880","DOI":"10.3390\/ijms18091880","article-title":"CytoCluster: A cytoscape plugin for cluster analysis and visualization of biological networks","volume":"18","author":"Li","year":"2017","journal-title":"International Journal of Molecular Sciences"},{"issue":"1","key":"2021042218045184000_bib81","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1037\/0096-3445.130.1.3","article-title":"Thematic relations in adults\u2019 concepts.","volume":"130","author":"Lin","year":"2001","journal-title":"Journal of Experimental Psychology: General"},{"key":"2021042218045184000_bib82","first-page":"1073","article-title":"Linguistic knowledge and transferability of contextual representations","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Liu","year":"2019"},{"key":"2021042218045184000_bib83","first-page":"33","article-title":"Investigating cross-lingual alignment methods for contextualized embeddings with token-level evaluation","volume-title":"Proceedings of CoNLL","author":"Liu","year":"2019"},{"key":"2021042218045184000_bib84","article-title":"RoBERTa: A robustly optimized BERT pretraining approach","volume":"abs\/1907.11692","author":"Liu","year":"2019","journal-title":"CoRR"},{"key":"2021042218045184000_bib85","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781107337855","volume-title":"Best-Worst Scaling: Theory, Methods and Applications","author":"Louviere","year":"2015"},{"key":"2021042218045184000_bib86","unstructured":"Louviere, Jordan J. and George G. Woodworth. 1991. Best-worst scaling: A model for the largest difference judgments. 
Technical report, University of Alberta."},{"issue":"6","key":"2021042218045184000_bib87","doi-asserted-by":"crossref","first-page":"709","DOI":"10.1016\/S0022-5371(84)90434-1","article-title":"Semantic priming without association: A second look","volume":"23","author":"Lupker","year":"1984","journal-title":"Journal of Verbal Learning and Verbal Behavior"},{"key":"2021042218045184000_bib88","first-page":"5749","article-title":"Spatial multi-arrangement for clustering and multi-way similarity data set construction","volume-title":"Proceedings of the 12th Language Resources and Evaluation Conference","author":"Majewska","year":"2020"},{"key":"2021042218045184000_bib89","first-page":"952","article-title":"Acquiring verb classes through bottom-up semantic verb clustering","volume-title":"Proceedings of LREC","author":"Majewska","year":"2018"},{"issue":"3","key":"2021042218045184000_bib90","doi-asserted-by":"crossref","first-page":"771","DOI":"10.1007\/s10579-017-9403-x","article-title":"Investigating the cross-lingual translatability of VerbNet-style classification","volume":"52","author":"Majewska","year":"2018","journal-title":"Language Resources and Evaluation"},{"key":"2021042218045184000_bib91","doi-asserted-by":"crossref","first-page":"3428","DOI":"10.18653\/v1\/P19-1334","article-title":"Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"McCoy","year":"2019"},{"issue":"2\u20133","key":"2021042218045184000_bib92","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1080\/016909697386835","article-title":"Thematic roles as verb-specific concepts","volume":"12","author":"McRae","year":"1997","journal-title":"Language and Cognitive Processes"},{"key":"2021042218045184000_bib93","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1037\/13493-002","article-title":"Semantic and associative relations in 
adolescents and young adults: Examining a tenuous dichotomy","volume-title":"The Adolescent Brain: Learning, Reasoning, and Decision Making","author":"McRae","year":"2012"},{"key":"2021042218045184000_bib94","first-page":"177","article-title":"A random walks view of spectral segmentation","volume-title":"Proceedings of AI and STATISTICS (AISTATS) 2001","author":"Meila","year":"2001"},{"key":"2021042218045184000_bib95","article-title":"Efficient estimation of word representations in vector space","volume":"abs\/1301.3781","author":"Mikolov","year":"2013","journal-title":"CoRR"},{"key":"2021042218045184000_bib96","first-page":"52","article-title":"Advances in pre-training distributed word representations","volume-title":"Proceedings of LREC","author":"Mikolov","year":"2018"},{"key":"2021042218045184000_bib97","first-page":"3111","article-title":"Distributed representations of words and phrases and their compositionality","volume-title":"Advances in Neural Information Processing Systems","author":"Mikolov","year":"2013"},{"key":"2021042218045184000_bib98","first-page":"127","article-title":"A proposal for linguistic similarity data sets based on commonality lists","volume-title":"Proceedings of REPEVAL","author":"Milajevs","year":"2016"},{"issue":"11","key":"2021042218045184000_bib99","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1145\/219717.219748","article-title":"WordNet: A lexical database for English","volume":"38","author":"Miller","year":"1995","journal-title":"Communications of the ACM"},{"key":"2021042218045184000_bib100","first-page":"2265","article-title":"Learning word embeddings efficiently with noise-contrastive estimation","volume-title":"Proceedings of NIPS","author":"Mnih","year":"2013"},{"key":"2021042218045184000_bib101","first-page":"142","article-title":"Counter-fitting word vectors to linguistic constraints","volume-title":"Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational 
Linguistics: Human Language Technologies","author":"Mrk\u0161i\u0107","year":"2016"},{"key":"2021042218045184000_bib102","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1162\/tacl_a_00063","article-title":"Semantic specialization of distributional word vector spaces using monolingual and cross-lingual constraints","volume":"5","author":"Mrk\u0161i\u0107","year":"2017","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2021042218045184000_bib103","doi-asserted-by":"crossref","first-page":"128","DOI":"10.3389\/fpsyg.2013.00128","article-title":"Human object-similarity judgments reflect and transcend the primate-IT object representation","volume":"4","author":"Mur","year":"2013","journal-title":"Frontiers in Psychology"},{"issue":"5","key":"2021042218045184000_bib104","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1038\/nmeth.1938","article-title":"Detecting overlapping protein complexes in protein-protein interaction networks","volume":"9","author":"Nepusz","year":"2012","journal-title":"Nature Methods"},{"issue":"1","key":"2021042218045184000_bib105","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/j.socnet.2004.11.009","article-title":"A measure of betweenness centrality based on random walks","volume":"27","author":"Newman","year":"2005","journal-title":"Social Networks"},{"issue":"4","key":"2021042218045184000_bib106","doi-asserted-by":"crossref","first-page":"e1003553","DOI":"10.1371\/journal.pcbi.1003553","article-title":"A toolbox for representational similarity analysis","volume":"10","author":"Nili","year":"2014","journal-title":"PLoS Computational Biology"},{"key":"2021042218045184000_bib107","first-page":"649","article-title":"Semantic classification with distributional kernels","volume-title":"Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1","author":"\u00d3 
S\u00e9aghdha","year":"2008"},{"key":"2021042218045184000_bib108","doi-asserted-by":"crossref","first-page":"1532","DOI":"10.3115\/v1\/D14-1162","article-title":"GloVe: Global vectors for word representation","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Pennington","year":"2014"},{"key":"2021042218045184000_bib109","first-page":"2227","article-title":"Deep contextualized word representations","volume-title":"Proceedings of NAACL-HLT","author":"Peters","year":"2018"},{"key":"2021042218045184000_bib110","first-page":"1267","article-title":"WiC: The word-in-context data set for evaluating context-sensitive meaning representations","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Pilehvar","year":"2019"},{"key":"2021042218045184000_bib111","first-page":"1391","article-title":"Card-660: Cambridge Rare Word Data set\u2014a reliable benchmark for infrequent word representation models","volume-title":"Proceedings of EMNLP","author":"Pilehvar","year":"2018"},{"issue":"8","key":"2021042218045184000_bib112","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"key":"2021042218045184000_bib113","first-page":"448","article-title":"Using information content to evaluate semantic similarity in a taxonomy","volume-title":"Proceedings of IJCAI","author":"Resnik","year":"1995"},{"key":"2021042218045184000_bib114","first-page":"399","article-title":"Measuring verb similarity","volume-title":"Proceedings of the 22nd Annual Meeting of the Cognitive Science Society (CogSci 2000)","author":"Resnik","year":"2000"},{"key":"2021042218045184000_bib115","first-page":"8722","article-title":"Getting closer to AI complete question answering: A set of prerequisite real 
tasks","volume-title":"The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020","author":"Rogers","year":"2020"},{"key":"2021042218045184000_bib116","article-title":"A primer in BERTology: What we know about how BERT works","author":"Rogers","year":"2020","journal-title":"arXiv preprint arXiv:2002.12327"},{"key":"2021042218045184000_bib117","doi-asserted-by":"crossref","first-page":"95","DOI":"10.3389\/fpsyg.2016.00095","article-title":"Verbal semantics drives early anticipatory eye movements during the comprehension of verb-initial sentences","volume":"7","author":"Sauppe","year":"2016","journal-title":"Frontiers in Psychology"},{"key":"2021042218045184000_bib118","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1007\/978-3-642-54906-9_3","article-title":"Verb clustering for Brazilian Portuguese","volume-title":"International Conference on Intelligent Text Processing and Computational Linguistics","author":"Scarton","year":"2014"},{"issue":"5","key":"2021042218045184000_bib119","doi-asserted-by":"crossref","first-page":"1763","DOI":"10.1213\/ANE.0000000000002864","article-title":"Correlation coefficients: Appropriate use and interpretation","volume":"126","author":"Schober","year":"2018","journal-title":"Anesthesia & Analgesia"},{"key":"2021042218045184000_bib120","first-page":"258","article-title":"Symmetric pattern based word embeddings for improved word similarity prediction","volume-title":"Proceedings of CoNLL","author":"Schwartz","year":"2015"},{"issue":"11","key":"2021042218045184000_bib121","doi-asserted-by":"crossref","first-page":"2498","DOI":"10.1101\/gr.1239303","article-title":"Cytoscape: A software environment for integrated models of biomolecular interaction networks","volume":"13","author":"Shannon","year":"2003","journal-title":"Genome 
Research"},{"key":"2021042218045184000_bib122","article-title":"What does BERT learn from multiple-choice reading comprehension data sets?","author":"Si","year":"2019","journal-title":"arXiv preprint arXiv:1910.12391"},{"key":"2021042218045184000_bib123","first-page":"8918","article-title":"Assessing the benchmarking capacity of machine reading comprehension data sets","volume-title":"The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020","author":"Sugawara","year":"2020"},{"key":"2021042218045184000_bib124","first-page":"638","article-title":"Improving verb clustering with automatically acquired selectional preferences","volume-title":"Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2","author":"Sun","year":"2009"},{"key":"2021042218045184000_bib125","first-page":"1056","article-title":"Investigating the cross-linguistic potential of VerbNet-style classification","volume-title":"Proceedings of the 23rd International Conference on Computational Linguistics","author":"Sun","year":"2010"},{"issue":"99","key":"2021042218045184000_bib126","first-page":"36","article-title":"Lexicalization patterns: Semantic structure in lexical forms","volume":"3","author":"Talmy","year":"1985","journal-title":"Language Typology and Syntactic Description"},{"key":"2021042218045184000_bib127","doi-asserted-by":"crossref","first-page":"4593","DOI":"10.18653\/v1\/P19-1452","article-title":"BERT rediscovers the classical NLP pipeline","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Tenney","year":"2019"},{"key":"2021042218045184000_bib128","volume-title":"Rediscovering the Social Group: A Self-Categorization 
Theory.","author":"Turner","year":"1987"},{"key":"2021042218045184000_bib129","first-page":"491","article-title":"Mining the web for synonyms: PMI-IR versus LSA on TOEFL","volume-title":"Proceedings of ECML","author":"Turney","year":"2001"},{"issue":"3","key":"2021042218045184000_bib130","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1162\/coli.2006.32.3.379","article-title":"Similarity of semantic relations","volume":"32","author":"Turney","year":"2006","journal-title":"Computational Linguistics"},{"key":"2021042218045184000_bib131","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1613\/jair.2934","article-title":"From frequency to meaning: Vector space models of semantics","volume":"37","author":"Turney","year":"2010","journal-title":"Journal of Artificial Intelligence Research"},{"issue":"4","key":"2021042218045184000_bib132","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1037\/0033-295X.84.4.327","article-title":"Features of similarity.","volume":"84","author":"Tversky","year":"1977","journal-title":"Psychological Review"},{"key":"2021042218045184000_bib133","first-page":"163","article-title":"Evaluation by association: A systematic study of quantitative word association evaluation","volume-title":"Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers","author":"Vuli\u0107","year":"2017"},{"key":"2021042218045184000_bib134","doi-asserted-by":"crossref","first-page":"137","DOI":"10.18653\/v1\/W18-3018","article-title":"Injecting lexical contrast into word vectors by guiding vector space specialisation","volume-title":"Proceedings of The Third Workshop on Representation Learning for NLP","author":"Vuli\u0107","year":"2018"},{"key":"2021042218045184000_bib135","first-page":"2546","article-title":"Cross-lingual induction and transfer of verb classes based on word vector space specialisation","volume-title":"Proceedings of the 2017 Conference on Empirical 
Methods in Natural Language Processing","author":"Vuli\u0107","year":"2017"},{"key":"2021042218045184000_bib136","article-title":"Multi-SimLex: A large-scale evaluation of multilingual and cross-lingual lexical semantic similarity","volume":"abs\/2003.04866","author":"Vuli\u0107","year":"2020","journal-title":"CoRR"},{"issue":"3","key":"2021042218045184000_bib137","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1109\/TCBB.2010.75","article-title":"A fast hierarchical clustering algorithm for functional modules discovery in protein interaction networks","volume":"8","author":"Wang","year":"2011","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"issue":"4","key":"2021042218045184000_bib138","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1109\/TNB.2012.2210907","article-title":"Identification of hierarchical and overlapping functional modules in PPI networks","volume":"11","author":"Wang","year":"2012","journal-title":"IEEE Transactions on NanoBioscience"},{"key":"2021042218045184000_bib139","doi-asserted-by":"crossref","first-page":"5721","DOI":"10.18653\/v1\/D19-1575","article-title":"Cross-lingual BERT transformation for zero-shot dependency parsing","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Wang","year":"2019"},{"key":"2021042218045184000_bib140","first-page":"345","article-title":"From paraphrase database to compositional paraphrase model and back","volume":"3","author":"Wieting","year":"2015","journal-title":"Transactions of the ACL"},{"key":"2021042218045184000_bib141","first-page":"1504","article-title":"CHARAGRAM: Embedding words and sentences via character n-grams","volume-title":"Proceedings of EMNLP","author":"Wieting","year":"2016"},{"key":"2021042218045184000_bib142","article-title":"HuggingFace\u2019s Transformers: State-of-the-art 
Natural Language Processing","volume":"abs\/1910.03771","author":"Wolf","year":"2019","journal-title":"ArXiv"},{"key":"2021042218045184000_bib143","first-page":"121","article-title":"Verb similarity on the taxonomy of WordNet","volume-title":"Proceedings of the 3rd International WordNet Conference (GWC-06)","author":"Yang","year":"2006"},{"key":"2021042218045184000_bib144","first-page":"5753","article-title":"XLNet: Generalized autoregressive pretraining for language understanding","volume-title":"Advances in Neural Information Processing Systems 32","author":"Yang","year":"2019"},{"key":"2021042218045184000_bib145","doi-asserted-by":"crossref","first-page":"4791","DOI":"10.18653\/v1\/P19-1472","article-title":"Hellaswag: Can a machine really finish your sentence?","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Zellers","year":"2019"}],"container-title":["Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/direct.mit.edu\/coli\/article-pdf\/47\/1\/69\/1911493\/coli_a_00396.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/direct.mit.edu\/coli\/article-pdf\/47\/1\/69\/1911493\/coli_a_00396.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,4,22]],"date-time":"2021-04-22T23:11:29Z","timestamp":1619133089000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/coli\/article\/47\/1\/69\/97331\/Semantic-Data-Set-Construction-from-Human"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3]]},"references-count":145,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,4,21]]},"published-print":{"date-parts":[[2021,4,21]]}},"URL":"https:\/\/doi.org\/10.1162\/coli_a_00396","relation":{},"ISSN":["0891-2017","1530-9312"],"issn-type":[{"value":"0891-2017","type":"print"},{"value":"
1530-9312","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,3]]},"published":{"date-parts":[[2021,3]]}}}