{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,4]],"date-time":"2024-07-04T10:37:22Z","timestamp":1720089442584},"reference-count":75,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T00:00:00Z","timestamp":1712880000000},"content-version":"vor","delay-in-days":102,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,4,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>When deriving contextualized word representations from language models, a decision needs to be made on how to obtain one for out-of-vocabulary (OOV) words that are segmented into subwords. What is the best way to represent these words with a single vector, and are these representations of worse quality than those of in-vocabulary words? We carry out an intrinsic evaluation of embeddings from different models on semantic similarity tasks involving OOV words. Our analysis reveals, among other interesting findings, that the quality of representations of words that are split is often, but not always, worse than that of the embeddings of known words. Their similarity values, however, must be interpreted with caution.<\/jats:p>","DOI":"10.1162\/tacl_a_00647","type":"journal-article","created":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T19:02:56Z","timestamp":1712948576000},"page":"299-320","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":1,"title":["The Impact of Word Splitting on the Semantic Content of Contextualized Word Representations"],"prefix":"10.1162","volume":"12","author":[{"given":"Aina Gar\u00ed","family":"Soler","sequence":"first","affiliation":[{"name":"LTCI, T\u00e9l\u00e9com-Paris, Institut Polytechnique de Paris, France. aina.garisoler@telecom-paris.fr"}]},{"given":"Matthieu","family":"Labeau","sequence":"additional","affiliation":[{"name":"LTCI, T\u00e9l\u00e9com-Paris, Institut Polytechnique de Paris, France. matthieu.labeau@telecom-paris.fr"}]},{"given":"Chlo\u00e9","family":"Clavel","sequence":"additional","affiliation":[{"name":"INRIA, Paris, France. chloe.clavel@inria.fr"}]}],"member":"281","published-online":{"date-parts":[[2024,4,5]]},"reference":[{"key":"2024041219024377300_bib1","doi-asserted-by":"publisher","first-page":"19","DOI":"10.3115\/1620754.1620758","article-title":"A study on similarity and relatedness using distributional and WordNet-based approaches","volume-title":"Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics","author":"Agirre","year":"2009"},{"key":"2024041219024377300_bib2","doi-asserted-by":"publisher","first-page":"54","DOI":"10.18653\/v1\/N19-4010","article-title":"FLAIR: An easy-to-use framework for state-of-the-art NLP","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)","author":"Akbik","year":"2019"},{"key":"2024041219024377300_bib3","doi-asserted-by":"publisher","first-page":"5878","DOI":"10.18653\/v1\/2020.semeval-1.3","article-title":"CoSimLex: A resource for evaluating graded word similarity in context","volume-title":"Proceedings of The 12th Language Resources and Evaluation Conference","author":"Armendariz","year":"2020"},{"key":"2024041219024377300_bib4","first-page":"4193","article-title":"Evaluating tokenizers impact on OOVs representation with transformers models","volume-title":"Proceedings of the Language Resources and Evaluation Conference","author":"Benamar","year":"2022"},{"key":"2024041219024377300_bib5","volume-title":"Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit","author":"Bird","year":"2009"},{"key":"2024041219024377300_bib6","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1162\/tacl_a_00051","article-title":"Enriching word vectors with subword information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2024041219024377300_bib7","doi-asserted-by":"publisher","first-page":"4758","DOI":"10.18653\/v1\/2020.acl-main.431","article-title":"Interpreting pretrained contextualized representations via reductions to static embeddings","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Bommasani","year":"2020"},{"key":"2024041219024377300_bib8","doi-asserted-by":"publisher","first-page":"4617","DOI":"10.18653\/v1\/2020.findings-emnlp.414","article-title":"Byte pair encoding is suboptimal for language model pretraining","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Bostrom","year":"2020"},{"key":"2024041219024377300_bib9","first-page":"1877","article-title":"Language models are few-shot learners","volume-title":"Advances in Neural Information Processing Systems","author":"Brown","year":"2020"},{"issue":"3","key":"2024041219024377300_bib10","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1017\/S1351324920000145","article-title":"Emerging trends: Subwords, seriously?","volume":"26","author":"Church","year":"2020","journal-title":"Natural Language Engineering"},{"key":"2024041219024377300_bib11","article-title":"ELECTRA: Pre-training text encoders as discriminators rather than generators","volume-title":"ICLR","author":"Clark","year":"2020"},{"key":"2024041219024377300_bib12","article-title":"Cross-lingual language model pretraining","volume-title":"Advances in Neural Information Processing Systems","author":"Conneau","year":"2019"},{"key":"2024041219024377300_bib13","doi-asserted-by":"publisher","first-page":"4171","DOI":"10.18653\/v1\/N19-1423","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2024041219024377300_bib14","doi-asserted-by":"publisher","first-page":"1504","DOI":"10.18653\/v1\/N19-1154","article-title":"One size does not fit all: Comparing NMT representations of different granularities","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Durrani","year":"2019"},{"key":"2024041219024377300_bib15","doi-asserted-by":"publisher","first-page":"6903","DOI":"10.18653\/v1\/2020.coling-main.609","article-title":"CharacterBERT: Reconciling ELMo and BERT for word-level open-vocabulary representations from characters","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics","author":"El Boukkouri","year":"2020"},{"key":"2024041219024377300_bib16","doi-asserted-by":"publisher","first-page":"10","DOI":"10.3115\/1687878.1687882","article-title":"Investigations on word senses and word usages","volume-title":"Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP","author":"Erk","year":"2009"},{"issue":"3","key":"2024041219024377300_bib17","doi-asserted-by":"publisher","first-page":"511","DOI":"10.1162\/COLI_a_00142","article-title":"Measuring word meaning in context","volume":"39","author":"Erk","year":"2013","journal-title":"Computational Linguistics"},{"key":"2024041219024377300_bib18","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/7287.001.0001","volume-title":"WordNet: An Electronic Lexical Database","author":"Fellbaum","year":"1998"},{"issue":"2","key":"2024041219024377300_bib19","first-page":"23","article-title":"A new algorithm for data compression","volume":"12","author":"Gage","year":"1994","journal-title":"C Users Journal"},{"issue":"5","key":"2024041219024377300_bib20","doi-asserted-by":"publisher","first-page":"2152","DOI":"10.3758\/s13428-019-01282-6","article-title":"LADEC: The large database of English compounds","volume":"51","author":"Gagn\u00e9","year":"2019","journal-title":"Behavior Research Methods"},{"key":"2024041219024377300_bib21","doi-asserted-by":"publisher","first-page":"1375","DOI":"10.18653\/v1\/D19-1141","article-title":"Investigating the effectiveness of BPE: The power of shorter sequences","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Gall\u00e9","year":"2019"},{"key":"2024041219024377300_bib22","doi-asserted-by":"publisher","first-page":"825","DOI":"10.1162\/tacl_a_00400","article-title":"Let\u2019s play mono-poly: BERT can reveal words\u2019 polysemy level and partitionability into senses","volume":"9","author":"Soler","year":"2021","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2024041219024377300_bib23","doi-asserted-by":"publisher","first-page":"9","DOI":"10.18653\/v1\/S19-1002","article-title":"Word usage similarity estimation with sentence representations and automatic substitutes","volume-title":"Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)","author":"Soler","year":"2019"},{"key":"2024041219024377300_bib24","first-page":"3950","article-title":"One word, two sides: Traces of stance in contextualized word representations","volume-title":"Proceedings of the 29th International Conference on Computational Linguistics","author":"Soler","year":"2022"},{"key":"2024041219024377300_bib25","doi-asserted-by":"publisher","first-page":"3960","DOI":"10.18653\/v1\/2020.acl-main.365","article-title":"Analysing lexical semantic change with contextualised word representations","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Giulianelli","year":"2020"},{"key":"2024041219024377300_bib26","article-title":"OpenWebText corpus","author":"Gokaslan","year":"2019"},{"key":"2024041219024377300_bib27","doi-asserted-by":"publisher","first-page":"304","DOI":"10.18653\/v1\/D17-1030","article-title":"High-risk learning: Acquiring new word vectors from tiny data","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Herbelot","year":"2017"},{"issue":"4","key":"2024041219024377300_bib28","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1162\/COLI_a_00237","article-title":"SimLex-999: Evaluating semantic models with (genuine) similarity estimation","volume":"41","author":"Hill","year":"2015","journal-title":"Computational Linguistics"},{"key":"2024041219024377300_bib29","doi-asserted-by":"publisher","first-page":"3594","DOI":"10.18653\/v1\/2021.acl-long.279","article-title":"Superbizarre is not superb: Derivational morphology improves BERT\u2019s interpretation of complex words","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Hofmann","year":"2021"},{"key":"2024041219024377300_bib30","doi-asserted-by":"publisher","first-page":"385","DOI":"10.18653\/v1\/2022.acl-short.43","article-title":"An embarrassingly simple method to mitigate undesirable properties of pretrained language model tokenizers","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"Hofmann","year":"2022"},{"key":"2024041219024377300_bib31","doi-asserted-by":"publisher","first-page":"4692","DOI":"10.18653\/v1\/2021.emnlp-main.385","article-title":"AVocaDo: Strategy for adapting vocabulary to downstream domain","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Hong","year":"2021"},{"key":"2024041219024377300_bib32","first-page":"873","article-title":"Improving word representations via global context and multiple word prototypes","volume-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Huang","year":"2012"},{"key":"2024041219024377300_bib33","doi-asserted-by":"publisher","first-page":"56","DOI":"10.18653\/v1\/W17-4706","article-title":"Target-side word segmentation strategies for neural machine translation","volume-title":"Proceedings of the Second Conference on Machine Translation","author":"Huck","year":"2017"},{"key":"2024041219024377300_bib34","article-title":"Breaking character: Are subwords good enough for MRLs after all?","author":"Keren","year":"2022","journal-title":"ArXiv"},{"key":"2024041219024377300_bib35","doi-asserted-by":"publisher","first-page":"66","DOI":"10.18653\/v1\/P18-1007","article-title":"Subword regularization: Improving neural network translation models with multiple subword candidates","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Kudo","year":"2018"},{"key":"2024041219024377300_bib36","doi-asserted-by":"publisher","first-page":"66","DOI":"10.18653\/v1\/D18-2012","article-title":"SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Kudo","year":"2018"},{"key":"2024041219024377300_bib37","doi-asserted-by":"publisher","first-page":"192","DOI":"10.18653\/v1\/2021.eacl-srw.25","article-title":"Explaining and improving BERT performance on lexical semantic change detection","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop","author":"Laicher","year":"2021"},{"issue":"2","key":"2024041219024377300_bib38","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1037\/\/0033-295X.104.2.211","article-title":"A solution to Plato\u2019s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge","volume":"104","author":"Landauer","year":"1997","journal-title":"Psychological Review"},{"issue":"1","key":"2024041219024377300_bib39","first-page":"147","article-title":"Using corpus statistics and WordNet relations for sense identification","volume":"24","author":"Leacock","year":"1998","journal-title":"Computational Linguistics"},{"key":"2024041219024377300_bib40","doi-asserted-by":"publisher","first-page":"543","DOI":"10.18653\/v1\/2021.acl-short.69","article-title":"When is char better than subword: A systematic study of segmentation algorithms for neural machine translation","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)","author":"Li","year":"2021"},{"key":"2024041219024377300_bib41","doi-asserted-by":"publisher","first-page":"278","DOI":"10.18653\/v1\/2021.starsem-1.26","article-title":"Learning embeddings for rare words leveraging Internet search engine and spatial location relationships","volume-title":"Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics","author":"Li","year":"2021"},{"key":"2024041219024377300_bib42","doi-asserted-by":"publisher","first-page":"4066","DOI":"10.18653\/v1\/2020.emnlp-main.333","article-title":"Towards better context-aware lexical semantics: Adjusting contextualized representations through static anchors","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Liu","year":"2020"},{"key":"2024041219024377300_bib43","article-title":"RoBERTa: A robustly optimized BERT pretraining approach","author":"Liu","year":"2019","journal-title":"arXiv preprint arXiv:1907.11692"},{"key":"2024041219024377300_bib44","first-page":"104","article-title":"Better word representations with recursive neural networks for morphology","volume-title":"Proceedings of the Seventeenth Conference on Computational Natural Language Learning","author":"Luong","year":"2013"},{"key":"2024041219024377300_bib45","doi-asserted-by":"publisher","first-page":"961","DOI":"10.18653\/v1\/2022.findings-acl.78","article-title":"BPE vs. morphological segmentation: A case study on machine translation of four polysynthetic languages","volume-title":"Findings of the Association for Computational Linguistics: ACL 2022","author":"Mager","year":"2022"},{"key":"2024041219024377300_bib46","doi-asserted-by":"publisher","first-page":"1247","DOI":"10.18653\/v1\/2021.acl-long.100","article-title":"Measure and evaluation of semantic divergence across two languages","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Montariol","year":"2021"},{"key":"2024041219024377300_bib47","first-page":"345","article-title":"Fine-tuning de mod\u00e8les de langues pour la veille \u00e9pid\u00e9miologique multilingue avec peu de ressources (Fine-tuning Language Models for Low-resource Multilingual Epidemic Surveillance)","volume-title":"Actes de la 29e Conf\u00e9rence sur le Traitement Automatique des Langues Naturelles. Volume 1 : Conf\u00e9rence principale","author":"Mutuvi","year":"2022"},{"key":"2024041219024377300_bib48","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18653\/v1\/2020.insights-1.1","article-title":"Domain adaptation challenges of BERT in tokenization and sub-word representations of out-of-vocabulary words","volume-title":"Proceedings of the First Workshop on Insights from Negative Results in NLP","author":"Nayak","year":"2020"},{"key":"2024041219024377300_bib49","doi-asserted-by":"publisher","first-page":"17","DOI":"10.3115\/v1\/E14-2005","article-title":"RDRPOSTagger: A ripple down rules-based part-of-speech tagger","volume-title":"Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics","author":"Nguyen","year":"2014"},{"key":"2024041219024377300_bib50","doi-asserted-by":"publisher","first-page":"1267","DOI":"10.18653\/v1\/N19-1128","article-title":"WiC: The word-in-context dataset for evaluating context-sensitive meaning representations","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Pilehvar","year":"2019"},{"key":"2024041219024377300_bib51","doi-asserted-by":"publisher","first-page":"1391","DOI":"10.18653\/v1\/D18-1169","article-title":"Card-660: Cambridge rare word dataset - A reliable benchmark for infrequent word representation models","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Pilehvar","year":"2018"},{"key":"2024041219024377300_bib52","doi-asserted-by":"publisher","first-page":"31","DOI":"10.18653\/v1\/2021.eacl-main.3","article-title":"Disambiguatory signals are stronger in word-initial positions","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Pimentel","year":"2021"},{"issue":"01","key":"2024041219024377300_bib53","doi-asserted-by":"publisher","first-page":"6900","DOI":"10.1609\/aaai.v33i01.33016900","article-title":"Unseen word representation by aligning heterogeneous lexical semantic spaces","volume":"33","author":"Prokhorov","year":"2019","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"2024041219024377300_bib54","article-title":"Stanza: A Python natural language processing toolkit for many human languages","author":"Qi","year":"2020","journal-title":"arXiv preprint arXiv:2003.07082"},{"issue":"8","key":"2024041219024377300_bib55","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"issue":"140","key":"2024041219024377300_bib56","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"Journal of Machine Learning Research"},{"key":"2024041219024377300_bib57","doi-asserted-by":"publisher","first-page":"3118","DOI":"10.18653\/v1\/2021.acl-long.243","article-title":"How good is your tokenizer? On the monolingual performance of multilingual language models","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Rust","year":"2021"},{"key":"2024041219024377300_bib58","doi-asserted-by":"publisher","first-page":"1568","DOI":"10.3758\/s13428-017-0981-8","article-title":"MorphoLex: A derivational morphological database for 70,000 English words","volume":"50","author":"S\u00e1nchez-Guti\u00e9rrez","year":"2018","journal-title":"Behavior Research Methods"},{"key":"2024041219024377300_bib59","doi-asserted-by":"publisher","first-page":"3996","DOI":"10.18653\/v1\/2020.acl-main.368","article-title":"BERTRAM: Improved word embeddings have big impact on contextualized model performance","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Schick","year":"2020"},{"issue":"05","key":"2024041219024377300_bib60","doi-asserted-by":"publisher","first-page":"8766","DOI":"10.1609\/aaai.v34i05.6403","article-title":"Rare words: A major problem for contextualized embeddings and how to fix it by attentive mimicking","volume":"34","author":"Schick","year":"2020","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"2024041219024377300_bib61","doi-asserted-by":"publisher","first-page":"7079","DOI":"10.18653\/v1\/2021.emnlp-main.567","article-title":"DWUG: A large resource of diachronic word usage graphs in four languages","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Schlechtweg","year":"2021"},{"key":"2024041219024377300_bib62","doi-asserted-by":"publisher","first-page":"5149","DOI":"10.1109\/ICASSP.2012.6289079","article-title":"Japanese and Korean voice search","volume-title":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Schuster","year":"2012"},{"key":"2024041219024377300_bib63","doi-asserted-by":"publisher","first-page":"1715","DOI":"10.18653\/v1\/P16-1162","article-title":"Neural machine translation of rare words with subword units","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Sennrich","year":"2016"},{"key":"2024041219024377300_bib64","doi-asserted-by":"publisher","DOI":"10.1201\/9781420036268","volume-title":"Handbook of parametric and nonparametric statistical procedures","author":"Sheskin","year":"2003"},{"key":"2024041219024377300_bib65","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.7199437","article-title":"rspeer\/wordfreq: v3.0 (v3.0.2)","author":"Speer","year":"2022"},{"key":"2024041219024377300_bib66","doi-asserted-by":"publisher","first-page":"7222","DOI":"10.18653\/v1\/2020.emnlp-main.586","article-title":"Probing pretrained language models for lexical semantics","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Vuli\u0107","year":"2020"},{"key":"2024041219024377300_bib67","doi-asserted-by":"publisher","first-page":"353","DOI":"10.18653\/v1\/W18-5446","article-title":"GLUE: A multi-task benchmark and analysis platform for natural language understanding","volume-title":"Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP","author":"Wang","year":"2018"},{"key":"2024041219024377300_bib68","first-page":"161","article-title":"Does BERT make any sense? Interpretable word sense disambiguation with contextualized embeddings","volume-title":"Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019): Long Papers","author":"Wiedemann","year":"2019"},{"key":"2024041219024377300_bib69","doi-asserted-by":"publisher","first-page":"38","DOI":"10.18653\/v1\/2020.emnlp-demos.6","article-title":"Transformers: State-of-the-art natural language processing","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Wolf","year":"2020"},{"key":"2024041219024377300_bib70","article-title":"Google\u2019s neural machine translation system: Bridging the gap between human and machine translation","author":"Yonghui","year":"2016","journal-title":"arXiv preprint:1609.08144"},{"key":"2024041219024377300_bib71","doi-asserted-by":"publisher","first-page":"133","DOI":"10.3115\/981732.981751","article-title":"Verb semantics and lexical selection","volume-title":"32nd Annual Meeting of the Association for Computational Linguistics","author":"Zhibiao","year":"1994"},{"key":"2024041219024377300_bib72","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1109\/ICDEW.2019.000-5","article-title":"Semantic similarity computation in knowledge graphs: Comparisons and improvements","volume-title":"2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW)","author":"Yang","year":"2019"},{"key":"2024041219024377300_bib73","article-title":"XLNet: Generalized autoregressive pretraining for language understanding","volume-title":"Advances in Neural Information Processing Systems","author":"Yang","year":"2019"},{"key":"2024041219024377300_bib74","article-title":"BERTScore: Evaluating text generation with BERT","volume-title":"8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26\u201330, 2020","author":"Zhang","year":"2020"},{"key":"2024041219024377300_bib75","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1109\/ICCV.2015.11","article-title":"Aligning books and movies: Towards story-like visual explanations by watching movies and reading books","volume-title":"Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV\u201915)","author":"Zhu","year":"2015"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00647\/2362190\/tacl_a_00647.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00647\/2362190\/tacl_a_00647.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T19:03:13Z","timestamp":1712948593000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00647\/120475\/The-Impact-of-Word-Splitting-on-the-Semantic"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"references-count":75,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00647","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024]]},"published":{"date-parts":[[2024]]}}}