{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:07:01Z","timestamp":1760710021874,"version":"3.41.2"},"reference-count":59,"publisher":"Emerald","issue":"5","license":[{"start":{"date-parts":[[2019,12,2]],"date-time":"2019-12-02T00:00:00Z","timestamp":1575244800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJWIS"],"published-print":{"date-parts":[[2019,12,2]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>This paper aims to propose an approach to automatically annotate a large corpus in Arabic dialect. This corpus is used to analyse the sentiments of Arabic users on social media. It focuses on the Algerian dialect, which is a sub-dialect of Maghrebi Arabic. Although Algerian is spoken by roughly 40 million speakers, few studies address its automated processing in general or its sentiment analysis in particular.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>The approach is based on the construction and use of a sentiment lexicon to automatically annotate a large corpus of Algerian text that is extracted from Facebook. This approach significantly increases the size of the training corpus without requiring manual annotation. The annotated corpus is then vectorized using document embedding (doc2vec), which is an extension of word embeddings (word2vec). 
For sentiment classification, the authors used different classifiers such as support vector machines (SVM), Naive Bayes (NB) and logistic regression (LR).<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The results suggest that NB and SVM classifiers generally led to the best results and MLP generally had the worst results. Further, the threshold that the authors used in selecting messages for the training set had a noticeable impact on recall and precision, with a threshold of 0.6 producing the best results. Using PV-DBOW led to slightly higher results than using PV-DM. Combining PV-DBOW and PV-DM representations led to slightly lower results than using PV-DBOW alone. The best results were obtained by the NB classifier with F1 up to 86.9 per cent.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>The principal originality of this paper is to determine the right parameters for automatically annotating an Algerian dialect corpus. 
This annotation is based on a sentiment lexicon that was also constructed automatically.<\/jats:p><\/jats:sec>","DOI":"10.1108\/ijwis-03-2019-0008","type":"journal-article","created":{"date-parts":[[2019,10,14]],"date-time":"2019-10-14T06:31:54Z","timestamp":1571034714000},"page":"594-615","source":"Crossref","is-referenced-by-count":7,"title":["A set of parameters for automatically annotating a Sentiment Arabic Corpus"],"prefix":"10.1108","volume":"15","author":[{"given":"Guellil","family":"Imane","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Darwish","family":"Kareem","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Azouaou","family":"Faical","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","reference":[{"key":"key2020012715181488000_ref001","first-page":"3907","article-title":"Awatif: a multi-genre corpus for modern standard Arabic subjectivity and sentiment analysis","volume-title":"LREC","year":"2012"},{"issue":"1","key":"key2020012715181488000_ref002","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.csl.2013.03.001","article-title":"Samar: subjectivity and sentiment analysis for Arabic social media","volume":"28","year":"2014","journal-title":"Computer Speech and Language"},{"key":"key2020012715181488000_ref003","first-page":"547","article-title":"Automatic lexicon construction for Arabic sentiment analysis","volume-title":"International Conference on Future internet of Things and Cloud (FiCloud)","year":"2014"},{"key":"key2020012715181488000_ref004","first-page":"1","article-title":"Arabic sentiment analysis: lexicon-based and corpus-based","volume-title":"IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies 
(AEECT)","year":"2013"},{"issue":"3","key":"key2020012715181488000_ref005","doi-asserted-by":"crossref","first-page":"55","DOI":"10.4018\/ijitwe.2014070104","article-title":"Towards improving the lexicon-based approach for Arabic sentiment analysis","volume":"9","year":"2014","journal-title":"International Journal of Information Technology and Web Engineering"},{"issue":"2","key":"key2020012715181488000_ref006","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1504\/IJSNM.2015.072280","article-title":"Lexicon-based sentiment analysis of Arabic tweets","volume":"2","year":"2015","journal-title":"International Journal of Social Network Mining"},{"key":"key2020012715181488000_ref007","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/j.procs.2017.05.365","article-title":"Using word embedding and ensemble learning for highly imbalanced data sentiment analysis in short Arabic text","volume":"109","year":"2017","journal-title":"Procedia Computer Science"},{"key":"key2020012715181488000_ref008","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1016\/j.procs.2017.10.094","article-title":"Arasenti-tweet: a corpus for Arabic sentiment analysis of Saudi tweets","volume":"117","year":"2017","journal-title":"Procedia Computer Science"},{"key":"key2020012715181488000_ref009","first-page":"114","article-title":"Arabic language sentiment analysis on health services","volume-title":"1st International Workshop on Arabic Script Analysis and Recognition (ASAR)","year":"2017"},{"issue":"1","key":"key2020012715181488000_ref010","first-page":"364","article-title":"Survey on Arabic sentiment analysis in twitter","volume":"9","year":"2015","journal-title":"International Science Index"},{"issue":"2","key":"key2020012715181488000_ref011","first-page":"256","article-title":"Semantic sentiment analysis of Arabic texts","volume":"8","year":"2017","journal-title":"International Journal of Advanced Computer Science and 
Applications"},{"key":"key2020012715181488000_ref012","first-page":"3820","article-title":"Word embeddings for Arabic sentiment analysis","volume-title":"IEEE International Conference on Big Data (Big Data)","year":"2016"},{"article-title":"Combining sentiment lexicons of Arabic terms","volume-title":"AMCIS","year":"2017","key":"key2020012715181488000_ref013"},{"key":"key2020012715181488000_ref014","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1016\/j.procs.2017.10.097","article-title":"Challenges in sentiment analysis for Arabic social networks","volume":"117","year":"2017","journal-title":"Procedia Computer Science"},{"first-page":"494","article-title":"Labr: a large scale Arabic book reviews dataset","year":"2013","key":"key2020012715181488000_ref015"},{"issue":"12","key":"key2020012715181488000_ref016","first-page":"75","article-title":"Arabic sentiment analysis: a survey","volume":"6","year":"2015","journal-title":"International Journal of Advanced Computer Science and Applications"},{"key":"key2020012715181488000_ref017","first-page":"31","article-title":"Alg\/fr: a step by step construction of a lexicon between Algerian dialect and French","volume-title":"The 31st Pacific Asia Conference on Language, Information and Computation PACLIC","year":"2017"},{"article-title":"Document embeddings for Arabic sentiment analysis","volume-title":"International Workshop on Language Processing and Knowledge Management LPKM","year":"2017","key":"key2020012715181488000_ref018"},{"key":"key2020012715181488000_ref019","first-page":"339","article-title":"Sentiment classification techniques for Arabic language: a survey","volume-title":"7th International Conference on Information and Communication Systems (ICICS)","year":"2016"},{"key":"key2020012715181488000_ref020","first-page":"1","article-title":"A new modeling approach for Arabic opinion mining recognition","volume-title":"Intelligent Systems and Computer Vision (ISCV)","year":"2015"},{"volume-title":"Arabic 
Opinion Mining Using Combined Classification Approach","year":"2011","key":"key2020012715181488000_ref021"},{"key":"key2020012715181488000_ref022","first-page":"32","article-title":"Arabic text classification based on word and document embeddings","volume-title":"International Conference on Advanced Intelligent Systems and Informatics","year":"2016"},{"issue":"2","key":"key2020012715181488000_ref023","doi-asserted-by":"crossref","first-page":"45","DOI":"10.5121\/ijaia.2012.3205","article-title":"A machine learning approach for opinion holder extraction in Arabic language","volume":"3","year":"2012","journal-title":"International Journal of Artificial Intelligence and Applications"},{"year":"2017","key":"key2020012715181488000_ref024","article-title":"Arabic multi-dialect segmentation: bi-lstm-crf vs. svm"},{"key":"key2020012715181488000_ref025","first-page":"97","article-title":"Sentiment analysis of French movie reviews","volume-title":"Advances in Distributed Agent-Based Retrieval Tools","year":"2011"},{"key":"key2020012715181488000_ref026","first-page":"724","article-title":"Arabic dialect identification with an unsupervised learning (based on a lexicon) application case: Algerian dialect","volume-title":"IEEE International Conference on Computational Science and Engineering (CSE) and IEEE Intl Conference on Embedded and Ubiquitous Computing (EUC) and 15th Intl Symposium on Distributed Computing and Applications for Business Engineering (DCABES)","year":"2016"},{"year":"2017","key":"key2020012715181488000_ref027","article-title":"Asda: Analyseur syntaxique du dialecte alg\u00e9rien dans un but d\u2019analyse s\u00e9mantique"},{"key":"key2020012715181488000_ref028","first-page":"1","article-title":"Social big data mining: a survey focused on opinion mining and sentiments analysis","volume-title":"12th International Symposium on Programming and Systems (ISPS)","year":"2015"},{"key":"key2020012715181488000_ref029","unstructured":"Guellil, I. 
and Faical, A. (2017), \u201cBilingual lexicon for Algerian Arabic dialect treatment in social media. In: WiNLP: women and underrepresented minorities in natural language processing (co-located with ACL 2017)\u201d, available at: www.winlp.org\/wp-content\/uploads\/2017\/final_papers_2017\/92_Paper.pdf"},{"journal-title":"Journal of King Saud University-Computer and Information Sciences","article-title":"Arabic natural language processing: an overview","year":"2019","key":"key2020012715181488000_ref030"},{"issue":"1","key":"key2020012715181488000_ref031","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2200\/S00277ED1V01Y201008HLT010","article-title":"Introduction to Arabic natural language processing","volume":"3","year":"2010","journal-title":"Synthesis Lectures on Human Language Technologies"},{"article-title":"Estimating the sentiment of Arabic social media contents: a survey","volume-title":"5th International Conference on Arabic Language Processing","year":"2014","key":"key2020012715181488000_ref032"},{"journal-title":"Information Processing and Management","article-title":"Machine translation for Arabic dialects (survey)","year":"2017","key":"key2020012715181488000_ref033"},{"year":"2014","key":"key2020012715181488000_ref034","article-title":"Building resources for Algerian Arabic dialects"},{"first-page":"703","article-title":"Exploiting emoticons in sentiment analysis","year":"2013","key":"key2020012715181488000_ref035"},{"key":"key2020012715181488000_ref036","first-page":"192","article-title":"Classifying sentiment in Arabic social networks: naive search versus naive Bayes","volume-title":"2nd International Conference on Advances in Computational Tools for Engineering Applications (ACTEA)","year":"2012"},{"issue":"10","key":"key2020012715181488000_ref037","article-title":"Arabic sentiment analysis approaches: an analytical survey","volume":"7","year":"2016","journal-title":"International Journal of Scientific and Engineering 
Research"},{"issue":"10","key":"key2020012715181488000_ref038","doi-asserted-by":"crossref","first-page":"1961","DOI":"10.3844\/jcssp.2014.1961.1968","article-title":"A hybrid method using lexicon-based approach and naive Bayes classifier for Arabic opinion question answering","volume":"10","year":"2014","journal-title":"Journal of Computer Science"},{"volume-title":"Stemming Arabic Text","year":"1999","key":"key2020012715181488000_ref039"},{"key":"key2020012715181488000_ref040","first-page":"128","article-title":"Subjectivity and sentiment analysis of Arabic: a survey","volume-title":"International Conference on Advanced Machine Learning Technologies and Applications","year":"2012"},{"first-page":"1188","article-title":"Distributed representations of sentences and documents","year":"2014","key":"key2020012715181488000_ref041"},{"key":"key2020012715181488000_ref042","first-page":"466","article-title":"The Penn Arabic treebank: building a large-scale annotated Arabic corpus","volume-title":"NEMLAR Conference on Arabic Language Resources and Tools","year":"2004"},{"key":"key2020012715181488000_ref043","doi-asserted-by":"crossref","first-page":"55","DOI":"10.13053\/rcs-110-1-5","article-title":"A proposed lexicon-based sentiment analysis approach for the vernacular Algerian Arabic","volume":"110","year":"2016","journal-title":"Research in Computing Science"},{"first-page":"55","article-title":"Sentiment analysis of Tunisian dialects: linguistic resources and experiments","year":"2017","key":"key2020012715181488000_ref044"},{"issue":"4","key":"key2020012715181488000_ref045","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/j.asej.2014.04.011","article-title":"Sentiment analysis algorithms and applications: a survey","volume":"5","year":"2014","journal-title":"Ain Shams Engineering Journal"},{"article-title":"A study of a non-resourced language: the case of one of the Algerian dialects","volume-title":"The Third International Workshop on Spoken Languages 
Technologies for Under-resourced Languages-SLTU\u201912","year":"2012","key":"key2020012715181488000_ref046"},{"article-title":"Machine translation experiments on padic: a parallel Arabic dialect corpus","volume-title":"The 29th Pacific Asia Conference on Language, Information and Computation","year":"2015","key":"key2020012715181488000_ref047"},{"key":"key2020012715181488000_ref048","first-page":"3111","article-title":"Distributed representations of words and phrases and their compositionality","volume-title":"Advances in Neural Information Processing Systems","year":"2013"},{"first-page":"55","article-title":"Subjectivity and sentiment analysis of modern standard Arabic and Arabic microblogs","year":"2013","key":"key2020012715181488000_ref049"},{"first-page":"2515","article-title":"ASTD: Arabic sentiment tweets dataset","year":"2015","key":"key2020012715181488000_ref050"},{"issue":"4","key":"key2020012715181488000_ref051","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1007\/s12559-017-9470-8","article-title":"A review of sentiment analysis research in Chinese language","volume":"9","year":"2017","journal-title":"Cognitive Computation"},{"first-page":"502","article-title":"Semeval-2017 task 4: sentiment analysis in twitter","year":"2017","key":"key2020012715181488000_ref052"},{"issue":"10","key":"key2020012715181488000_ref053","first-page":"2045","article-title":"OCA: opinion corpus for Arabic","volume":"62","year":"2011","journal-title":"Journal of the Association for Information Science and Technology"},{"first-page":"69","article-title":"A conventional orthography for Algerian Arabic","year":"2015","key":"key2020012715181488000_ref054"},{"first-page":"432","article-title":"Learning from relatives: unified dialectal Arabic segmentation","year":"2017","key":"key2020012715181488000_ref055"},{"key":"key2020012715181488000_ref056","first-page":"78","article-title":"A hybrid approach for sentiment classification of Egyptian dialect 
tweets","volume-title":"First International Conference on Arabic Computational Linguistics (ACLing)","year":"2015"},{"key":"key2020012715181488000_ref057","first-page":"409","article-title":"Sentiment analysis in Arabic","volume-title":"International Conference on Applications of Natural Language to Information Systems","year":"2016"},{"issue":"2","key":"key2020012715181488000_ref058","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1162\/COLI_a_00049","article-title":"Lexicon-based methods for sentiment analysis","volume":"37","year":"2011","journal-title":"Computational Linguistics"},{"key":"key2020012715181488000_ref059","first-page":"467","article-title":"Sentireview: sentiment analysis based on text and emoticons","volume-title":"International Conference on Innovative Mechanisms for Industry Applications (ICIMIA)","year":"2017"}],"container-title":["International Journal of Web Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-03-2019-0008\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-03-2019-0008\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:23:49Z","timestamp":1753395829000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/ijwis\/article\/15\/5\/594-615\/165246"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12,2]]},"references-count":59,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2019,12,2]]}},"alternative-id":["10.1108\/IJWIS-03-2019-0008"],"URL":"https:\/\/doi.org\/10.1108\/ijwis-03-2019-0008","relation":{},"ISSN":["1744-0084","1744-0084"],"issn-type":[{"type":"print","value":"1744-0084"},{"type":"print","value":"1744-0084"}],"subject":[],"publi
shed":{"date-parts":[[2019,12,2]]}}}