{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,3]],"date-time":"2025-12-03T17:59:09Z","timestamp":1764784749201,"version":"3.40.5"},"reference-count":60,"publisher":"Cambridge University Press (CUP)","issue":"3","license":[{"start":{"date-parts":[[2021,10,13]],"date-time":"2021-10-13T00:00:00Z","timestamp":1634083200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2023,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Word embeddings have become a standard resource in the toolset of any Natural Language Processing practitioner. While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are integrated together. Current state-of-the-art approaches learn these embeddings by aligning two disjoint monolingual vector spaces through an orthogonal transformation which preserves the structure of the monolingual counterparts. In this work, we propose to apply an additional transformation after this initial alignment step, which aims to bring the vector representations of a given word and its translations closer to their average. Since this additional transformation is non-orthogonal, it also affects the structure of the monolingual spaces. We show that our approach both improves the integration of the monolingual spaces and the quality of the monolingual spaces themselves. Furthermore, because our transformation can be applied to an arbitrary number of languages, we are able to effectively obtain a truly multilingual space. The resulting (monolingual and multilingual) spaces show consistent gains over the current state-of-the-art in standard intrinsic tasks, namely dictionary induction and word similarity, as well as in extrinsic tasks such as cross-lingual hypernym discovery and cross-lingual natural language inference.<\/jats:p>","DOI":"10.1017\/s1351324921000280","type":"journal-article","created":{"date-parts":[[2021,10,13]],"date-time":"2021-10-13T13:52:45Z","timestamp":1634133165000},"page":"746-768","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":2,"title":["Meemi: A simple method for post-processing and integrating cross-lingual word embeddings"],"prefix":"10.1017","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5018-8222","authenticated-orcid":false,"given":"Yerai","family":"Doval","sequence":"first","affiliation":[]},{"given":"Jose","family":"Camacho-Collados","sequence":"additional","affiliation":[]},{"given":"Luis","family":"Espinosa-Anke","sequence":"additional","affiliation":[]},{"given":"Steven","family":"Schockaert","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2021,10,13]]},"reference":[{"key":"S1351324921000280_ref22","doi-asserted-by":"crossref","unstructured":"Doval, Y. , Camacho-Collados, J. , Espinosa-Anke, L. and Schockaert, S. 2018. Improving cross-lingual word embeddings by meeting in the middle. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium. Association for Computational Linguistics, pp. 294\u2013304.","DOI":"10.18653\/v1\/D18-1027"},{"volume-title":"WALS Online","year":"2013","author":"Dryer","key":"S1351324921000280_ref23"},{"key":"S1351324921000280_ref20","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1269"},{"key":"S1351324921000280_ref27","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1070"},{"key":"S1351324921000280_ref13","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00051"},{"key":"S1351324921000280_ref29","unstructured":"Han, L. , Kashyap, A. , Finin, T. , Mayfield, J. and Weese, J. (2013). UMBC EBIQUITY-CORE: semantic textual similarity systems. In Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics, Volume 1, Atlanta, Georgia. Association for Computational Linguistics, pp. 44\u201352."},{"key":"S1351324921000280_ref19","unstructured":"Conneau, A. , Lample, G. , Ranzato, M. , Denoyer, L. and J\u00e9gou, H. (2018a). Word translation without parallel data. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada. OpenReview.net."},{"key":"S1351324921000280_ref15","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S18-1115"},{"key":"S1351324921000280_ref32","doi-asserted-by":"publisher","DOI":"10.1145\/2600428.2611177"},{"key":"S1351324921000280_ref4","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1250"},{"key":"S1351324921000280_ref33","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1043"},{"key":"S1351324921000280_ref34","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1330"},{"key":"S1351324921000280_ref38","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-5010"},{"key":"S1351324921000280_ref41","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1018"},{"key":"S1351324921000280_ref60","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1207"},{"key":"S1351324921000280_ref10","doi-asserted-by":"crossref","unstructured":"Barone, A.V.M. (2016). Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders. In Proceedings of the 1st Workshop on Representation Learning for NLP, Berlin, Germany. Association for Computational Linguistics, pp. 121\u2013126.","DOI":"10.18653\/v1\/W16-1614"},{"key":"S1351324921000280_ref43","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4020-4746-6_10"},{"key":"S1351324921000280_ref5","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1042"},{"key":"S1351324921000280_ref40","unstructured":"Mikolov, T. , Le, Q.V. and Sutskever, I. (2013b). Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168."},{"key":"S1351324921000280_ref46","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1226"},{"key":"S1351324921000280_ref24","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1041"},{"key":"S1351324921000280_ref58","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324918000293"},{"key":"S1351324921000280_ref35","doi-asserted-by":"crossref","unstructured":"Kementchedjhieva, Y. , Ruder, S. , Cotterell, R. and S\u00f8gaard, A. (2018). Generalizing procrustes analysis for better bilingual dictionary induction. In Proceedings of the 22nd Conference on Computational Natural Language Learning, Brussels, Belgium. Association for Computational Linguistics, pp. 211\u2013220.","DOI":"10.18653\/v1\/K18-1021"},{"key":"S1351324921000280_ref1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2009.05.002"},{"key":"S1351324921000280_ref2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1214"},{"key":"S1351324921000280_ref54","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1104"},{"key":"S1351324921000280_ref47","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/E17-1007"},{"key":"S1351324921000280_ref7","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1073"},{"key":"S1351324921000280_ref49","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1056"},{"key":"S1351324921000280_ref53","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1101"},{"key":"S1351324921000280_ref25","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/E14-1049"},{"key":"S1351324921000280_ref30","doi-asserted-by":"publisher","DOI":"10.3115\/992133.992154"},{"key":"S1351324921000280_ref56","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505677"},{"key":"S1351324921000280_ref52","doi-asserted-by":"publisher","DOI":"10.1613\/jair.4986"},{"key":"S1351324921000280_ref39","unstructured":"Mikolov, T. , Chen, K. , Corrado, G. and Dean, J. (2013a). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781."},{"key":"S1351324921000280_ref57","unstructured":"Yu, Z. , Wang, H. , Lin, X. and Wang, M. (2015). Learning term embeddings for hypernymy identification. In Proceedings of the 24th International Conference on Artificial Intelligence, Buenos Aires, Argentina. AAAI Press, pp. 1390\u20131397."},{"key":"S1351324921000280_ref37","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1028"},{"key":"S1351324921000280_ref42","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"S1351324921000280_ref14","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S16-1168"},{"key":"S1351324921000280_ref16","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S17-2002"},{"key":"S1351324921000280_ref28","unstructured":"Goodfellow, I. , Pouget-Abadie, J. , Mirza, M. , Xu, B. , Warde-Farley, D. , Ozair, S. , Courville, A. and Bengio, Y. (2014). Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, Montreal, Canada. MIT Press, pp. 2672\u20132680."},{"key":"S1351324921000280_ref44","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1234"},{"key":"S1351324921000280_ref8","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.421"},{"key":"S1351324921000280_ref3","unstructured":"Ammar, W. , Mulcaire, G. , Tsvetkov, Y. , Lample, G. , Dyer, C. and Smith, N.A. (2016). Massively multilingual word embeddings. arXiv preprint arXiv:1602.01925."},{"key":"S1351324921000280_ref59","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1179"},{"key":"S1351324921000280_ref50","doi-asserted-by":"crossref","unstructured":"Vulic, I. , Baker, S. , Ponti, E.M. , Petti, U. , Leviant, I. , Wing, K. , Majewska, O. , Bar, E. , Malone, M. , Poibeau, T. , Reichart, R. and Korhonen, A. (2020). Multi-simlex: a large-scale evaluation of multilingual and cross-lingual lexical semantic similarity. arXiv preprint arXiv:2003.04866.","DOI":"10.1162\/coli_a_00391"},{"key":"S1351324921000280_ref55","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1268"},{"key":"S1351324921000280_ref12","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S18-1116"},{"key":"S1351324921000280_ref26","doi-asserted-by":"publisher","DOI":"10.3115\/1219840.1219854"},{"key":"S1351324921000280_ref51","doi-asserted-by":"crossref","unstructured":"Vuli\u0107, I. , Glava\u0161, G. , Reichart, R. and Korhonen, A. (2019). Do we really need fully unsupervised cross-lingual embeddings? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China. Association for Computational Linguistics, pp. 4398\u20134409.","DOI":"10.18653\/v1\/D19-1449"},{"key":"S1351324921000280_ref18","unstructured":"Conneau, A. and Kiela, D. (2018). SentEval: an evaluation toolkit for universal sentence representations. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA), pp. 1699\u20131704."},{"key":"S1351324921000280_ref48","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1072"},{"key":"S1351324921000280_ref17","unstructured":"Cardellino, C. (2016). Spanish Billion Words Corpus and Embeddings. http:\/\/crscardellino.me\/SBWCE\/."},{"key":"S1351324921000280_ref6","doi-asserted-by":"crossref","unstructured":"Artetxe, M. , Labaka, G. and Agirre, E. (2018a). Generalizing and improving bilingual word embedding mappings with a multi-step framework of linear transformations. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana. Association for the Advancement of Artificial Intelligence, pp. 5012\u20135019.","DOI":"10.1609\/aaai.v32i1.11992"},{"key":"S1351324921000280_ref45","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.11640"},{"key":"S1351324921000280_ref36","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1027"},{"key":"S1351324921000280_ref9","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S18-2010"},{"key":"S1351324921000280_ref31","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1327"},{"key":"S1351324921000280_ref21","unstructured":"Devlin, J. , Chang, M.-W. , Lee, K. and Toutanova, K. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota. Association for Computational Linguistics, pp. 4171\u20134186."},{"key":"S1351324921000280_ref11","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-009-9081-4"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324921000280","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,11]],"date-time":"2023-11-11T02:36:31Z","timestamp":1699670191000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324921000280\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,13]]},"references-count":60,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,5]]}},"alternative-id":["S1351324921000280"],"URL":"https:\/\/doi.org\/10.1017\/s1351324921000280","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2021,10,13]]},"assertion":[{"value":"\u00a9 The Author(s), 2021. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https:\/\/creativecommons.org\/licenses\/by\/4.0\/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.","name":"license","label":"License","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}