{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T15:04:47Z","timestamp":1753887887975,"version":"3.41.2"},"reference-count":39,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,2,17]],"date-time":"2021-02-17T00:00:00Z","timestamp":1613520000000},"content-version":"vor","delay-in-days":47,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002383","name":"King Saud University","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002383","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>Automatic synonym extraction plays an important role in many natural language processing systems, such as those involving information retrieval and question answering. Recently, research has focused on extracting semantic relations from word embeddings since they capture relatedness and similarity between words. However, using word embeddings alone poses problems for synonym extraction because it cannot determine whether the relation between words is synonymy or some other semantic relation. In this paper, we present a novel solution for this problem by proposing the SynoExtractor pipeline, which can be used to filter similar word embeddings to retain synonyms based on specified linguistic rules. Our experiments were conducted using KSUCCA and Gigaword embeddings and trained with CBOW and SG models. We evaluated automatically extracted synonyms by comparing them with Alma\u2019any Arabic synonym thesauri. We also arranged for a manual evaluation by two Arabic linguists. The results of experiments we conducted show that using the SynoExtractor pipeline enhances the precision of synonym extraction compared to using the cosine similarity measure alone. 
SynoExtractor obtained a 0.605 mean average precision (MAP) for the King Saud University Corpus of Classical Arabic with 21% improvement over the baseline and a 0.748 MAP for the Gigaword corpus with 25% improvement. SynoExtractor outperformed the Sketch Engine thesaurus for synonym extraction by 32% in terms of MAP. Our work shows promising results for synonym extraction suggesting that our method can also be used with other languages.<\/jats:p>","DOI":"10.1155\/2021\/6627434","type":"journal-article","created":{"date-parts":[[2021,2,17]],"date-time":"2021-02-17T20:46:45Z","timestamp":1613594805000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["SynoExtractor: A Novel Pipeline for Arabic Synonym Extraction Using Word2Vec Word Embeddings"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2184-3210","authenticated-orcid":false,"given":"Rawan N.","family":"Al-Matham","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7328-4935","authenticated-orcid":false,"given":"Hend S.","family":"Al-Khalifa","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2021,2,17]]},"reference":[{"key":"e_1_2_13_1_2","doi-asserted-by":"publisher","DOI":"10.1145\/219717.219748"},{"key":"e_1_2_13_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2019.12.002"},{"key":"e_1_2_13_3_2","doi-asserted-by":"crossref","unstructured":"LinN. KudinovV. A. ZawH. M. andNaingS. Query expansion for Myanmar information retrieval used by wordnet Proceedings of the 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus) January 2020 Saint Petersburg Russia 395\u2013399 https:\/\/doi.org\/10.1109\/EIConRus49466.2020.9039137.","DOI":"10.1109\/EIConRus49466.2020.9039137"},{"key":"e_1_2_13_4_2","doi-asserted-by":"crossref","unstructured":"PisarevI. A. 
Method for automated thesaurus development in learning process support systems Proceedings of the 2015 XVIII International Conference on Soft Computing and Measurements (SCM) May 2015 21\u201323.","DOI":"10.1109\/SCM.2015.7190399"},{"key":"e_1_2_13_5_2","doi-asserted-by":"crossref","unstructured":"YadavJ.andMeenaY. K. \u201cUse of fuzzy logic and wordnet for improving performance of extractive automatic text summarization Proceedings of the 2016 International Conference on Advances in Computing Communications and Informatics (ICACCI) September 2016 Jaipur 2071\u20132077 https:\/\/doi.org\/10.1109\/ICACCI.2016.7732356 2-s2.0-85007314824.","DOI":"10.1109\/ICACCI.2016.7732356"},{"key":"e_1_2_13_6_2","doi-asserted-by":"crossref","unstructured":"MirkinS. DaganI. andGeffetM. Integrating pattern-based and distributional similarity methods for lexical entailment acquisition Proceedings of the COLING\/ACL on Main Conference Poster Sessions July 2006 USA 579\u2013586 Accessed: Oct. 29 2020. [Online].","DOI":"10.3115\/1273073.1273148"},{"key":"e_1_2_13_7_2","unstructured":"ManishinaE. JabaianB. HuetS. andLef\u00e8vreF. Automatic corpus extension for data-driven natural language generation Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC) May 2016 Portoro\u017e Slovenia 3624\u20133631."},{"key":"e_1_2_13_8_2","unstructured":"TeamA. 
\u0645\u0639\u062c\u0645 \u0627\u0644\u0645\u0639\u0627\u0646\u064a \u0627\u0644\u0645\u0631\u0627\u062f\u0641\u0629 \u0648 \u0627\u0644\u0645\u062a\u0636\u0627\u062f\u0629 - \u0645\u0631\u0627\u062f\u0641 \u062e\u0627\u0631\u060c \u0639\u0643\u0633 \u062e\u0627\u0631 - \u0645\u0631\u0627\u062f\u0641\u0627\u062a \u0648 \u0623\u0636\u062f\u0627\u062f \u0627\u0644\u0644\u063a\u0629 \u0627\u0644\u0639\u0631\u0628\u064a\u0629 \u0648 \u0627\u0644\u0627\u0646\u062c\u0644\u064a\u0632\u064a\u0629 \u0641\u064a \u0642\u0627\u0645\u0648\u0633 \u0648 \u0645\u0639\u062c\u0645 \u0627\u0644\u0645\u0639\u0627\u0646\u064a \u0627\u0644\u0641\u0648\u0631\u064a accessed Oct. 29 2020 https:\/\/www.almaany.com\/ar\/thes\/ar-ar\/."},{"key":"e_1_2_13_9_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-9-159"},{"key":"e_1_2_13_10_2","doi-asserted-by":"publisher","DOI":"10.1017\/s1351324911000210"},{"key":"e_1_2_13_11_2","first-page":"116","volume-title":"Advances in Natural Language Processing","author":"Y\u0131ld\u0131z T.","year":"2014"},{"key":"e_1_2_13_12_2","doi-asserted-by":"publisher","DOI":"10.1177\/0165551518799640"},{"key":"e_1_2_13_13_2","first-page":"1133","article-title":"Self-supervised synonym extraction from the web","volume":"31","author":"Hu F.","year":"2015","journal-title":"Journal of Information Science and Engineering"},{"key":"e_1_2_13_14_2","unstructured":"AsrF. T. ZinkovR. andJonesM. \u201cQuerying word embeddings for similarity and relatedness 1 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies June 2018 New Orleans LA USA 675\u2013684."},{"key":"e_1_2_13_15_2","doi-asserted-by":"crossref","unstructured":"OnoM. MiwaM. andSasakiY. 
\u201cWord embedding-based antonym detection using thesauri and distributional information Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2015 984\u2013989.","DOI":"10.3115\/v1\/N15-1100"},{"key":"e_1_2_13_16_2","doi-asserted-by":"crossref","unstructured":"DouZ. WeiW. andWanX. Improving word embeddings for antonym detection using thesauri and sentiwordnet Proceedings of the CCF International Conference on Natural Language Processing and Chinese Computing October 2018 Zhengzhou China 67\u201379.","DOI":"10.1007\/978-3-319-99501-4_6"},{"key":"e_1_2_13_17_2","doi-asserted-by":"crossref","unstructured":"NguyenK. A. im WaldeS. S. andVuN. T. \u201cIntegrating distributional lexical contrast into word embeddings for antonym\u2013synonym distinction Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics August 2016 Berlin Germany.","DOI":"10.18653\/v1\/P16-2074"},{"key":"e_1_2_13_18_2","doi-asserted-by":"crossref","unstructured":"ZhangL. LiJ. andWangC. 
Automatic synonym extraction using Word2Vec and spectral clustering Proceedings of the Control Conference (CCC) 2017 36th Chinese July 2017 Dalian China 5629\u20135632.","DOI":"10.23919\/ChiCC.2017.8028251"},{"key":"e_1_2_13_19_2","doi-asserted-by":"publisher","DOI":"10.1515\/pralin-2016-0006"},{"volume-title":"Cosine Similarity - an Overview | ScienceDirect Topics","key":"e_1_2_13_20_2"},{"key":"e_1_2_13_21_2","doi-asserted-by":"publisher","DOI":"10.34028\/iajit\/17\/1\/6"},{"key":"e_1_2_13_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2016.09.019"},{"key":"e_1_2_13_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-015-9301-9"},{"key":"e_1_2_13_24_2","article-title":"Arabic relation extraction: a survey","volume":"5","author":"Sarhan I.","year":"2016","journal-title":"International Journal of Computer and Information Technology"},{"key":"e_1_2_13_25_2","unstructured":"AlrabiahM. Al-SalmanA. andAtwellE. S. The design and construction of the 50 million words KSUCCA Proceedings of WACL\u20192 Second Workshop on Arabic Corpus Linguistics July 2013 Lancster University UK 5\u20138."},{"volume-title":"Arabic Gigaword Third Edition - Linguistic Data Consortium","key":"e_1_2_13_26_2"},{"volume-title":"Sketch Engine | Language Corpus Management And Query System","key":"e_1_2_13_27_2"},{"volume-title":"ROGET\u2019s Hyperlinked Thesaurus","key":"e_1_2_13_28_2"},{"key":"e_1_2_13_29_2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511486494"},{"key":"e_1_2_13_30_2","unstructured":"Kami\u0144skiM. Corpus-based extraction of collocations for near-synonym discrimination Proceedings of the Xvii Euralex International Congress September 2016 Tbilisi Georgia 367\u2013374."},{"volume-title":"Embedding Viewer","key":"e_1_2_13_31_2"},{"key":"e_1_2_13_32_2","unstructured":"MikolovT. ChenK. CorradoG. andDeanJ. 
Efficient estimation of word representations in vector space 2013 http:\/\/arxiv.org\/abs\/1301.3781."},{"key":"e_1_2_13_33_2","first-page":"31","article-title":"Normalized (pointwise) mutual information in collocation extraction","author":"Bouma G.","year":"2009","journal-title":"Proceedings of GSCL"},{"key":"e_1_2_13_34_2","unstructured":"LeQ.andMikolovT. Distributed representations of sentences and documents Proceedings of the International Conference on Machine Learning June 2014 Washington WA USA 1188\u20131196."},{"key":"e_1_2_13_35_2","first-page":"33","article-title":"The distributional hypothesis","volume":"20","author":"Sahlgren M.","year":"2008","journal-title":"Italian Journal of Linguistics"},{"key":"e_1_2_13_36_2","doi-asserted-by":"crossref","unstructured":"AbdelaliA. DarwishK. DurraniN. andMubarakH. Farasa: a fast and furious segmenter for Arabic Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations 2016 11\u201316.","DOI":"10.18653\/v1\/N16-3003"},{"key":"e_1_2_13_37_2","doi-asserted-by":"publisher","DOI":"10.22452\/mjcs.vol29no1.5"},{"key":"e_1_2_13_38_2","doi-asserted-by":"crossref","unstructured":"BatitaM. A.andZriguiM. The enrichment of Arabic wordnet antonym relations Proceedings of the International Conference on Computational Linguistics and Intelligent Text Processing April 2017 Budapest Hungary 342\u2013353.","DOI":"10.1007\/978-3-319-77113-7_27"},{"key":"e_1_2_13_39_2","unstructured":"AldhubayiL. B. M. 
Machine Learning of Antonyms in English and Arabic Corpora 2019 University of Leeds Leeds UK Phd Thesis."}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/6627434.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/6627434.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/6627434","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,9]],"date-time":"2024-08-09T22:20:30Z","timestamp":1723242030000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/6627434"}},"subtitle":[],"editor":[{"given":"M. Irfan","family":"Uddin","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/6627434"],"URL":"https:\/\/doi.org\/10.1155\/2021\/6627434","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"type":"print","value":"1076-2787"},{"type":"electronic","value":"1099-0526"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2020-12-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-01-27","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-02-17","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"6627434"}}