{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:19:35Z","timestamp":1750220375993,"version":"3.41.0"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2020,10,13]],"date-time":"2020-10-13T00:00:00Z","timestamp":1602547200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100008982","name":"Qatar National Research Fund","doi-asserted-by":"publisher","award":["NPRP 6-716-1-138"],"award-info":[{"award-number":["NPRP 6-716-1-138"]}],"id":[{"id":"10.13039\/100008982","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2020,11,30]]},"abstract":"<jats:p>Success of Natural Language Processing (NLP) models, just like all advanced machine learning models, rely heavily on large -scale lexical resources. For English, English WordNet (EWN) is a leading example of a large-scale resource that has enabled advances in Natural Language Understanding (NLU) tasks such as word sense disambiguation, question answering, sentiment analysis, and emotion recognition. EWN includes sets of cognitive synonyms called synsets, which are interlinked by means of conceptual-semantic and lexical relations and where each synset expresses a distinct concept. However, other languages are still lagging behind in having large-scale and rich lexical resources similar to EWN. In this article, we focus on enabling the development of such resources for Arabic. 
While there have been efforts in developing an Arabic WordNet (AWN), the current version of AWN has its limitations in size and in lacking transliteration standards, which are important for compatibility with Arabic NLP tools. Previous efforts for extending AWN resulted in a lexicon, called ArSenL, that overcame the size and the transliteration standard limitation but was limited in accuracy due to the heuristic approach that only considered surface matching between the English definitions from the Standard Arabic Morphological Analyzer (SAMA) and EWN synset terms, and that resulted in inaccurate mapping of Arabic lemmas to EWN\u2019s synsets. Furthermore, there has been limited exploration of other expansion methods due to expensive manual validation needed. To address these limitations of simultaneously having large-scale size with high accuracy and standard representations, the mapping problem is formulated as a link prediction problem between a large-scale Arabic lexicon and EWN, where a word in one lexicon is linked to a word in another lexicon if the two words are semantically related. We use a semi-supervised approach to create a training dataset by finding common terms in the large-scale Arabic resource and AWN. This set of data becomes implicitly linked to EWN and can be used for training and evaluating prediction models. We propose the use of a two-step Boosting method, where the first step aims at linking English translations of SAMA\u2019s terms to EWN\u2019s synsets. The second step uses surface similarity between SAMA\u2019s glosses and EWN\u2019s synsets. The method results in a new large-scale Arabic lexicon that we call ArSenL 2.0 as a sequel to the previously developed sentiment lexicon ArSenL. A comprehensive study covering both intrinsic and extrinsic evaluations shows the superiority of the method compared to several baseline and state-of-the-art link prediction methods. 
Compared to previously developed ArSenL, ArSenL 2.0 included a larger set of sentimentally charged adjectives and verbs. It also showed higher linking accuracy on the ground truth data compared to previous ArSenL. For extrinsic evaluation, ArSenL 2.0 was used for sentiment analysis and showed, here, too, higher accuracy compared to previous ArSenL.<\/jats:p>","DOI":"10.1145\/3404854","type":"journal-article","created":{"date-parts":[[2020,10,13]],"date-time":"2020-10-13T11:47:34Z","timestamp":1602589654000},"page":"1-38","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["A Link Prediction Approach for Accurately Mapping a Large-scale Arabic Lexical Resource to English WordNet"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7277-617X","authenticated-orcid":false,"given":"Gilbert","family":"Badaro","sequence":"first","affiliation":[{"name":"American University of Beirut, Lebanon"}]},{"given":"Hazem","family":"Hajj","sequence":"additional","affiliation":[{"name":"American University of Beirut, Lebanon"}]},{"given":"Nizar","family":"Habash","sequence":"additional","affiliation":[{"name":"New York University Abu Dhabi, Abu Dhabi, UAE"}]}],"member":"320","published-online":{"date-parts":[[2020,10,13]]},"reference":[{"volume-title":"Proceedings of the 6th International Global WordNet Conference. 18--22","year":"2012","author":"Abdul-Mageed Muhammad","key":"e_1_2_1_1_1"},{"volume-title":"Proceedings of the International Conference on Language Resources and Evaluation. 1162--1169","author":"Abdul-Mageed Muhammad","key":"e_1_2_1_2_1"},{"volume-title":"Proceedings of the 49th Meeting of the Association for Computational Linguistics: Human Language Technologies. 
Association for Computational Linguistics, 587--591","year":"2011","author":"Abdul-Mageed Muhammad","key":"e_1_2_1_3_1"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/AEECT.2013.6716448"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-013-9237-0"},{"volume-title":"Proceedings of the 50th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 399--409","year":"2012","author":"Abu-Jbara Amjad","key":"e_1_2_1_6_1"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1504\/IJSNM.2015.072280"},{"volume-title":"Proceedings of the 4th International Conference on Information and Communication Systems (ICICS\u201913)","year":"2013","author":"Al-Kabi Mohammed","key":"e_1_2_1_8_1"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1177\/0165551516683908"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.diin.2015.07.006"},{"key":"e_1_2_1_11_1","first-page":"25","article-title":"Aroma: A recursive deep learning model for opinion mining in Arabic as a low resource language","volume":"16","author":"Al-Sallab Ahmad","year":"2017","journal-title":"ACM Trans. Asian Low-resour. Lang. Inf. Proc."},{"volume":"9","volume-title":"Proceedings of the Arabic Natural Language Processing Workshop","author":"Al Sallab Ahmad A.","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1066"},{"key":"e_1_2_1_14_1","article-title":"Arabic SentiWordNet in relation to SentiWordNet 3.0","volume":"4","author":"Alhazmi Samah","year":"2013","journal-title":"Int. J. Comput. 
Ling."},{"volume-title":"Proceedings of the 3rd International Conference on Arabic Language Processing (CITALA\u201909)","year":"2009","author":"Alkhalifa Musa","key":"e_1_2_1_15_1"},{"key":"e_1_2_1_16_1","first-page":"20","article-title":"Automatically extending named entities coverage of Arabic WordNet using","volume":"3","author":"Alkhalifa Musa","year":"2010","journal-title":"Wikipedia. Int. J. Inf. Commun. Technol."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3278605"},{"volume-title":"Proceedings of the 3rd Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial\u201916)","year":"2016","author":"Aminian Maryam","key":"e_1_2_1_18_1"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the International Conference on Language Resources and Evaluation","volume":"10","author":"Baccianella Stefano","year":"2010"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-3203"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295662"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-3623"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5339\/qfarc.2014.ITPP0631"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S18-1036"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/IWCMC.2013.6583584"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2659480.2659501"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDMW.2014.28"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S18-2009"},{"volume-title":"Proceedings of the11th International Conference on Language Resources and Evaluation (LREC\u201918)","year":"2018","author":"Badaro 
Gilbert","key":"e_1_2_1_29_1"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-1314"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S17-2099"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45715-1_11"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360016"},{"volume-title":"Proceedings of the 3rd International WordNet Conference. Citeseer, 295--300","year":"2006","author":"Black William","key":"e_1_2_1_34_1"},{"volume-title":"Proceedings of the 51st Meeting of the Association for Computational Linguistics. 1352--1362","year":"2013","author":"Bond Francis","key":"e_1_2_1_35_1"},{"volume-title":"Arabic Morphological Analyzer (AraMorph). Version 1.0","year":"2002","author":"Buckwalter Tim","key":"e_1_2_1_36_1"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-2063"},{"volume-title":"Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD\u201916), Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM\u201916)","year":"2016","author":"Constantine Layale","key":"e_1_2_1_39_1"},{"volume-title":"Proceedings of the 50th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 65--69","year":"2012","author":"Dasigi Pradeep","key":"e_1_2_1_40_1"},{"volume-title":"Proceedings of the International Conference on Language Resources and Evaluation. 
3782--3789","year":"2014","author":"Diab Mona T.","key":"e_1_2_1_41_1"},{"volume-title":"Proceedings of the International Conference on Language Resources and Evaluation.","year":"2016","author":"El-Beltagy Samhaa R.","key":"e_1_2_1_42_1"},{"volume-title":"Computational Linguistics, Speech 8 Image Processing for Arabic Language (Language Processing, Pattern Recognition and Intelligent Systems)","author":"El-Beltagy Samhaa R.","key":"e_1_2_1_43_1"},{"volume-title":"Proceedings of the 9th International Conference on Innovations in Information Technology (IIT\u201913)","author":"Samhaa","key":"e_1_2_1_44_1"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10590-011-9110-0"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2789210"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/AICCSA.2016.7945800"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1304"},{"key":"e_1_2_1_49_1","first-page":"1","article-title":"SentiWordNet: A high-coverage lexical resource for opinion mining","volume":"17","author":"Esuli Andrea","year":"2007","journal-title":"Evaluation"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/1644879.1644881"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2436256.2436274"},{"volume-title":"Proceedings of the Qatar Foundation Research Conference. ITPP1075","year":"2014","author":"Baly Ramy Georges","key":"e_1_2_1_52_1"},{"volume-title":"Proceedings of the 50th Meeting of the Association for Computational Linguistics. 140--144","year":"2012","author":"Guo Weiwei","key":"e_1_2_1_53_1"},{"volume-title":"Proceedings of the 50th Meeting of the Association for Computational Linguistics. 
Association for Computational Linguistics, 864--872","year":"2012","author":"Guo Weiwei","key":"e_1_2_1_54_1"},{"volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 739--745","year":"2013","author":"Guo Weiwei","key":"e_1_2_1_55_1"},{"volume-title":"Proceedings of the Session Traitement Automatique de l\u2019Arabe (JEP-TALN\u201904)","year":"2004","author":"Habash Nizar","key":"e_1_2_1_56_1"},{"volume-title":"Arabic Computational Morphology: Knowledge-based and Empirical Methods, Antal van den Bosch and Abdelhadi Soudi (Eds.)","author":"Habash Nizar","key":"e_1_2_1_57_1"},{"volume-title":"Arabic Computational Morphology","author":"Habash Nizar","key":"e_1_2_1_58_1"},{"key":"e_1_2_1_59_1","first-page":"32","article-title":"Clasenti: A class-specific sentiment analysis framework","volume":"17","author":"Hamdi Ali","year":"2018","journal-title":"ACM Trans. Asian Low-resour. Lang. Inf. Proc."},{"volume-title":"Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC\u201912)","year":"2012","author":"Hanoka Val\u00e9rie","key":"e_1_2_1_60_1"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2184436.2184437"},{"volume-title":"Idioms-proverbs lexicon for modern standard Arabic and colloquial sentiment analysis. Arxiv Preprint Arxiv:1506.01906","year":"2015","author":"Ibrahim Hossam S.","key":"e_1_2_1_62_1"},{"volume-title":"Proceedings of COLING 2012: Demonstration Papers","year":"2012","author":"Joshi Salil","key":"e_1_2_1_63_1"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-019-02755-3"},{"volume-title":"LDC Catalog No. LDC2010L01","year":"2010","author":"Maamouri Mohamed","key":"e_1_2_1_65_1"},{"volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems. 
3111--3119","year":"2013","author":"Mikolov Tomas","key":"e_1_2_1_66_1"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1093\/ijl\/3.4.235"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/4236.815847"},{"volume-title":"Proceedings of the 48th Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 216--225","year":"2010","author":"Navigli Roberto","key":"e_1_2_1_69_1"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2012.07.001"},{"volume-title":"Proceedings of the 9th Global WordNet Conference (GWC'18)","year":"2018","author":"Patel Kevin","key":"e_1_2_1_71_1"},{"volume-title":"Proceedings of the Workshop on Making Sense of Sense: Bringing Psycholinguistics and Computational Linguistics Together.","year":"2006","author":"Patwardhan Siddharth","key":"e_1_2_1_72_1"},{"key":"e_1_2_1_73_1","unstructured":"Jennifer Chu-Carroll and John Prager. 2001. Use of WordNet hypernyms for answering what-is questions. 
In TREC-2001."},{"volume-title":"Proceedings of the 4th Global WordNet Conference.","year":"2008","author":"Rodr\u00edguez Horacio","key":"e_1_2_1_74_1"},{"volume-title":"Proceedings of the International Conference on Language Resources and Evaluation.","year":"2008","author":"Rodr\u00edguez Horacio","key":"e_1_2_1_75_1"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/SENSET.2017.8125054"},{"volume-title":"Proceedings of the 5th Language and Technology Conference (LTC\u201911)","year":"2011","author":"Sagot Beno\u00eet","key":"e_1_2_1_77_1"},{"volume-title":"Habash","year":"2012","author":"Salloum Wael Sameer","key":"e_1_2_1_78_1"},{"volume-title":"Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL\u201907)","year":"2007","author":"Shen Dan","key":"e_1_2_1_79_1"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/1644879.1644884"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information 
Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3404854","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3404854","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:44Z","timestamp":1750191464000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3404854"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,13]]},"references-count":79,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2020,11,30]]}},"alternative-id":["10.1145\/3404854"],"URL":"https:\/\/doi.org\/10.1145\/3404854","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2020,10,13]]},"assertion":[{"value":"2020-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-10-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}