{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T15:55:43Z","timestamp":1774540543859,"version":"3.50.1"},"reference-count":36,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T00:00:00Z","timestamp":1737158400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100006476","name":"Romanian Academy","doi-asserted-by":"publisher","award":["Annual Research Plan"],"award-info":[{"award-number":["Annual Research Plan"]}],"id":[{"id":"10.13039\/501100006476","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Transformer models produce advanced text representations that have been used to break through the hard challenge of natural language understanding. Using the Transformer\u2019s attention mechanism, which acts as a language learning memory, trained on tens of billions of words, a word sense disambiguation (WSD) algorithm can now construct a more faithful vectorial representation of the context of a word to be disambiguated. Working with a set of 34 lemmas of nouns, verbs, adjectives and adverbs selected from the National Reference Corpus of Romanian (CoRoLa), we show that using BERT\u2019s attention heads at all hidden layers, we can devise contextual vectors of the target lemma that produce better clusters of lemma\u2019s senses than the ones obtained with standard BERT embeddings. If we automatically translate the Romanian example sentences of the target lemma into English, we show that we can reliably infer the number of senses with which the target lemma appears in the CoRoLa. We also describe an unsupervised WSD algorithm that, using a Romanian BERT model and a few example sentences of the target lemma\u2019s senses, can label the Romanian induced sense clusters with the appropriate sense labels, with an average accuracy of 64%.<\/jats:p>","DOI":"10.3390\/make7010010","type":"journal-article","created":{"date-parts":[[2025,1,20]],"date-time":"2025-01-20T04:04:12Z","timestamp":1737345852000},"page":"10","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Unsupervised Word Sense Disambiguation Using Transformer\u2019s Attention Mechanism"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3421-3173","authenticated-orcid":false,"given":"Radu","family":"Ion","sequence":"first","affiliation":[{"name":"Research Institute for Artificial Intelligence \u201cMihai Dr\u0103g\u0103nescu\u201d, \u201cCalea 13 Septembrie\u201d, 050711 Bucharest, Romania"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0019-7574","authenticated-orcid":false,"given":"Vasile","family":"P\u0103i\u0219","sequence":"additional","affiliation":[{"name":"Research Institute for Artificial Intelligence \u201cMihai Dr\u0103g\u0103nescu\u201d, \u201cCalea 13 Septembrie\u201d, 050711 Bucharest, Romania"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1945-2587","authenticated-orcid":false,"given":"Verginica Barbu","family":"Mititelu","sequence":"additional","affiliation":[{"name":"Research Institute for Artificial Intelligence \u201cMihai Dr\u0103g\u0103nescu\u201d, \u201cCalea 13 Septembrie\u201d, 050711 Bucharest, Romania"}]},{"given":"Elena","family":"Irimia","sequence":"additional","affiliation":[{"name":"Research Institute for Artificial Intelligence \u201cMihai Dr\u0103g\u0103nescu\u201d, \u201cCalea 13 Septembrie\u201d, 050711 Bucharest, Romania"}]},{"given":"Maria","family":"Mitrofan","sequence":"additional","affiliation":[{"name":"Research Institute for Artificial Intelligence \u201cMihai Dr\u0103g\u0103nescu\u201d, \u201cCalea 13 Septembrie\u201d, 050711 Bucharest, Romania"}]},{"given":"Valentin","family":"Badea","sequence":"additional","affiliation":[{"name":"Research Institute for Artificial Intelligence \u201cMihai Dr\u0103g\u0103nescu\u201d, \u201cCalea 13 Septembrie\u201d, 050711 Bucharest, Romania"}]},{"given":"Dan","family":"Tufi\u0219","sequence":"additional","affiliation":[{"name":"Research Institute for Artificial Intelligence \u201cMihai Dr\u0103g\u0103nescu\u201d, \u201cCalea 13 Septembrie\u201d, 050711 Bucharest, Romania"}]}],"member":"1968","published-online":{"date-parts":[[2025,1,18]]},"reference":[{"key":"ref_1","unstructured":"Vaswani, A., Jones, L., Shazeer, N., Parmar, N., Gomez, A.N., Uszkoreit, J., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention Is All You Need. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_2","unstructured":"(2024, October 10). Hello GPT-4o. Available online: https:\/\/openai.com\/index\/hello-gpt-4o\/."},{"key":"ref_3","unstructured":"(2024, October 10). Gemini Models. Available online: https:\/\/deepmind.google\/technologies\/gemini\/."},{"key":"ref_4","first-page":"97","article-title":"Automatic word sense discrimination","volume":"24","year":"1998","journal-title":"Comput. Linguist."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Song, X., Salcianu, A., Song, Y., Dopson, D., and Zhou, D. (2020). Fast WordPiece Tokenization. arXiv.","DOI":"10.18653\/v1\/2021.emnlp-main.160"},{"key":"ref_6","unstructured":"Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Tufi\u0219, D., Ion, R., and Ide, N. (2004, January 23\u201327). Fine-Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets. Proceedings of the 20th International Conference on Computational Linguistics COLING 2004, Geneva, Switzerland.","DOI":"10.3115\/1220355.1220547"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Gala, N., Rapp, R., and Bel-Enguix, N. (2014). The Lexical Ontology for Romanian. Language Production, Cognition, and the Lexicon, Series Text, Speech and Language Technology, Springer.","DOI":"10.1007\/978-3-319-08043-7"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Fellbaum, C. (1998). WordNet: An Electronic Lexical Database, MIT Press.","DOI":"10.7551\/mitpress\/7287.001.0001"},{"key":"ref_10","first-page":"1","article-title":"Generalizing from a Few Examples: A Survey on Few-shot Learning","volume":"53","author":"Wang","year":"2020","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Raganato, A., Camacho-Collados, J., and Navigli, R. (2017, January 3\u20137). Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Valencia, Spain.","DOI":"10.18653\/v1\/E17-1010"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Manandhar, S., Klapaftis, I.P., Dligach, D., and Pradhan, S.S. (2010, January 15\u201316). SemEval-2010 Task 14: Word Sense Induction & Disambiguation. Proceedings of the 5th International Workshop on Semantic Evaluation, ACL 2010, Uppsala, Sweden.","DOI":"10.3115\/1621969.1621990"},{"key":"ref_13","unstructured":"(2024, October 17). Word Sense Induction. Available online: https:\/\/paperswithcode.com\/task\/word-sense-induction."},{"key":"ref_14","first-page":"130","article-title":"Breaking Sticks and Ambiguities with Adaptive Skip-gram","volume":"51","author":"Bartunov","year":"2016","journal-title":"PMLR"},{"key":"ref_15","unstructured":"Sun, Y., Rao, N., and Ding, W. (2017). A Simple Approach to Learn Polysemous Word Embeddings. arXiv."},{"key":"ref_16","unstructured":"Huang, E.H., Socher, R., Manning, C.D., and Ng, A.Y. (2012, January 8\u201314). Improving word representations via global context and multiple word prototypes. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics Volume 1: Long Papers, Jeju, Republic of Korea."},{"key":"ref_17","unstructured":"Amplayo, R.K., Hwang, S., and Song, M. (February, January 27). AutoSense Model for Word Sense Induction. Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, HI, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Eyal, M., Sadde, S., Taub-Tabib, H., and Goldberg, Y. (2022, January 22\u201327). Large Scale Substitution-based Word Sense Induction. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics Volume 1: Long Papers, Dublin, Ireland.","DOI":"10.18653\/v1\/2022.acl-long.325"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Ansell, A., Bravo-Marquez, F., and Pfahringer, B. (2021, January 19\u201323). PolyLM: Learning about Polysemy through Language Modeling. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, Online.","DOI":"10.18653\/v1\/2021.eacl-main.45"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., and Hidayanto, A.N. (2021). A Comparative Study of Transformers on Word Sense Disambiguation. Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, Springer.","DOI":"10.1007\/978-3-030-92307-5"},{"key":"ref_21","unstructured":"Vandenbussche, P.-Y., Scerri, T., and Daniel, R. (2021, January 8). Word Sense Disambiguation with Transformer Models. Proceedings of the 6th Workshop on Semantic Deep Learning (SemDeep-6), Online."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Tripodi, R., and Navigli, R. (2019, January 3\u20137). Game Theory Meets Embeddings: A Unified Framework for Word Sense Disambiguation. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Hong Kong, China.","DOI":"10.18653\/v1\/D19-1009"},{"key":"ref_23","unstructured":"Academia Rom\u00e2n\u0103, Institutul de Lingvistic\u0103 \u201cIorgu Iordan\u2014Al. Rosetti\u201d (2016). DEX\u2014Dic\u021bionarul Explicativ al Limbii Rom\u00e2ne, Univers Enciclopedic."},{"key":"ref_24","first-page":"227","article-title":"Little Strokes Fell Great Oaks. Creating CoRoLA, The Reference Corpus of Contemporary Romanian","volume":"64","author":"Irimia","year":"2019","journal-title":"RRL"},{"key":"ref_25","unstructured":"Ion, R. (2022). Evaluating and User-Testing Rodna, A New Romanian Text Processing Pipeline, Romanian Academy. Research report."},{"key":"ref_26","unstructured":"(2024, October 24). Mistral. Available online: https:\/\/ollama.com\/library\/mistral."},{"key":"ref_27","unstructured":"(2024, October 22). Scikit-Learn. Available online: https:\/\/scikit-learn.org\/1.5\/index.html."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Masala, M., Ruseti, S., and Dascalu, M. (2020, January 8\u201313). RoBERT\u2014A Romanian BERT Model. Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain.","DOI":"10.18653\/v1\/2020.coling-main.581"},{"key":"ref_29","unstructured":"(2024, October 22). Readerbench\/RoBERT-Small. Available online: https:\/\/huggingface.co\/readerbench\/RoBERT-small."},{"key":"ref_30","unstructured":"(2024, October 22). FacebookAI\/Roberta-Base. Available online: https:\/\/huggingface.co\/FacebookAI\/roberta-base."},{"key":"ref_31","unstructured":"(2024, October 23). Collins Dictionary. Available online: https:\/\/www.collinsdictionary.com\/dictionary\/english."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Elmakias, I., and Vilenchik, D. (2021). An Oblivious Approach to Machine Translation Quality Estimation. Mathematics, 9.","DOI":"10.3390\/math9172090"},{"key":"ref_33","unstructured":"Moosa, I.M., Zhang, R., and Yin, W. (2024, January 7\u201311). MT-Ranker: Reference-free machine translation evaluation by inter-system ranking. Proceedings of the 12th International Conference on Learning Representations, ICLR 2024, Vienna, Austria."},{"key":"ref_34","unstructured":"(2025, January 06). Readerbench\/RoBERT-Base. Available online: https:\/\/huggingface.co\/readerbench\/RoBERT-base."},{"key":"ref_35","unstructured":"(2025, January 06). Readerbench\/RoBERT-Large. Available online: https:\/\/huggingface.co\/readerbench\/RoBERT-large."},{"key":"ref_36","unstructured":"(2025, January 06). FacebookAI\/xlm-Roberta-Large. Available online: https:\/\/huggingface.co\/FacebookAI\/xlm-roberta-large."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/1\/10\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,8]],"date-time":"2025-10-08T10:31:29Z","timestamp":1759919489000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/1\/10"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,18]]},"references-count":36,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["make7010010"],"URL":"https:\/\/doi.org\/10.3390\/make7010010","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,18]]}}}