{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,19]],"date-time":"2025-08-19T10:18:39Z","timestamp":1755598719637,"version":"3.41.2"},"reference-count":28,"publisher":"Emerald","issue":"5","license":[{"start":{"date-parts":[[2018,3,5]],"date-time":"2018-03-05T00:00:00Z","timestamp":1520208000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["K"],"published-print":{"date-parts":[[2018,5,2]]},"abstract":"<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title>\n<jats:p>This paper aims to propose a statistical and context-aware feature reduction algorithm that improves sentiment classification accuracy. Classification of reviews with different granularities in two classes of reviews with negative and positive polarities is among the objectives of sentiment analysis. One of the major issues in sentiment analysis is feature engineering while it severely affects time complexity and accuracy of sentiment classification.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title>\n<jats:p>In this paper, a feature reduction method is proposed that uses context-based knowledge as well as synset statistical knowledge. To do so, one-dimensional presentation proposed for SentiWordNet calculates statistical knowledge that involves polarity concentration and variation tendency for each synset. Feature reduction involves two phases. In the first phase, features that combine semantic and statistical similarity conditions are put in the same cluster. In the second phase, features are ranked and then the features which are given lower ranks are eliminated. 
The experiments use support vector machine (SVM), naive Bayes (NB), decision tree (DT) and k-nearest neighbors (KNN) algorithms to classify the vectors of unigram and bigram features into two classes, positive and negative sentiment.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Findings<\/jats:title>\n<jats:p>The results showed that the applied clustering algorithm reduced the SentiWordNet synsets to less than half, which reduced the size of the feature vector by less than half. In addition, the accuracy of sentiment classification improved by at least 1.5 per cent.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title>\n<jats:p>The presented feature reduction method is the first use of synset clustering for feature reduction. The feature reduction algorithm first aggregates similar features into clusters and then eliminates unsatisfactory clusters.<\/jats:p>\n<\/jats:sec>","DOI":"10.1108\/k-06-2017-0229","type":"journal-article","created":{"date-parts":[[2018,3,5]],"date-time":"2018-03-05T09:40:44Z","timestamp":1520242844000},"page":"957-984","source":"Crossref","is-referenced-by-count":9,"title":["A proposed scheme for sentiment analysis"],"prefix":"10.1108","volume":"47","author":[{"given":"Sajjad","family":"Tofighy","sequence":"first","affiliation":[]},{"given":"Seyed Mostafa","family":"Fakhrahmad","sequence":"additional","affiliation":[]}],"member":"140","published-online":{"date-parts":[[2018,3,5]]},"reference":[{"key":"key2021041509160823200_ref001","unstructured":"Andrew, L.M. Daly, R.E. Pham, P.T. Huang, D. and Ng, A.Y. (2017), available at: http:\/\/ai.stanford.edu\/amaas\/data\/sentiment\/, http:\/\/ai.stanford.edu\/amaas\/data\/sentiment\/ (accessed 11 August 2017)."},{"key":"key2021041509160823200_ref002","unstructured":"Bo, P. Lee, L. and Vaithyanathan, S. 
(2017), Movie Review Data, available at: www.cs.cornell.edu\/people\/pabo\/movie-review-data\/ (accessed 11 August 2017)."},{"article-title":"A large-scale multilingual disambiguation of glosses","volume-title":"Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)","year":"2016","key":"key2021041509160823200_ref004"},{"key":"key2021041509160823200_ref003","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1016\/j.ins.2014.05.009","article-title":"An empirical study of sentence features for subjectivity and polarity classification","volume":"280","year":"2014","journal-title":"Information Sciences"},{"key":"key2021041509160823200_ref005","unstructured":"Esuli, A. and Sebastiani, F. (2006), \u201cSENTIWORDNET: A high-coverage lexical resource for opinion mining\u201d, (ISTI-PP-002\/2007), Technical report, Institute of Information Science and Technologies (ISTI) of the Italian National Research Council (CNR), pp. 1-26."},{"issue":"16","key":"key2021041509160823200_ref006","doi-asserted-by":"crossref","first-page":"6266","DOI":"10.1016\/j.eswa.2013.05.057","article-title":"Twitter Brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network","volume":"40","year":"2013","journal-title":"Expert Systems with Applications"},{"key":"key2021041509160823200_ref007","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.dss.2015.04.002","article-title":"Polarity classification using structure-based vector representations of text","volume":"74","year":"2015","journal-title":"Decision Support Systems"},{"key":"key2021041509160823200_ref008","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1016\/j.knosys.2016.02.011","article-title":"SWIMS: Semi-supervised subjective feature weighting and intelligent model selection for sentiment analysis","volume":"100","year":"2016","journal-title":"Knowledge-Based Systems"},{"volume-title":"Multi-Class Sentiment Classification: The 
Experimental Comparisons of Feature Selection and Machine Learning Algorithms","year":"2017","key":"key2021041509160823200_ref009"},{"key":"key2021041509160823200_ref028","first-page":"142","article-title":"Learning word vectors for sentiment analysis","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies","year":"2011"},{"issue":"11","key":"key2021041509160823200_ref010","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1145\/219717.219748","article-title":"WordNet: a lexical database for English","volume":"38","year":"1995","journal-title":"Communications of the Acm"},{"key":"key2021041509160823200_ref011","first-page":"412","article-title":"Sentiment analysis using support vector machines with diverse information sources","volume-title":"EMNLP","year":"2004"},{"first-page":"67","article-title":"Feature selection and weighting methods in sentiment analysis","year":"2009","key":"key2021041509160823200_ref012"},{"first-page":"191","article-title":"Learning to classify subjective sentences from multiple domains using extended subjectivity lexicon and subjective predicates","year":"2013","key":"key2021041509160823200_ref013"},{"first-page":"79","article-title":"Thumbs up?: Sentiment classification using machine learning techniques","year":"2002","key":"key2021041509160823200_ref014"},{"volume-title":"Aspect Based Sentiment Analysis","year":"2014","key":"key2021041509160823200_ref015"},{"key":"key2021041509160823200_ref016","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1109\/ICACCI.2013.6637154","article-title":"Identifying the best feature combination for sentiment analysis of customer reviews","volume-title":"2013 International Conference on Advances in Computing, Communications and Informatics (ICACCI)","year":"2013"},{"key":"key2021041509160823200_ref017","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.future.2017.09.049","article-title":"A learning 
automata-based ensemble resource usage prediction algorithm for cloud computing environment","volume":"79","year":"2018","journal-title":"Future Generation Computer Systems"},{"issue":"6","key":"key2021041509160823200_ref018","first-page":"1235","article-title":"Use of coefficient of variation in assessing variability of quantitative assays","volume":"9","year":"2002","journal-title":"Clinical and Diagnostic Laboratory Immunology"},{"issue":"3","key":"key2021041509160823200_ref019","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1007\/s10791-010-9161-5","article-title":"Sentiment classification: a lexical similarity based approach for extracting subjectivity in documents","volume":"14","year":"2011","journal-title":"Information Retrieval"},{"key":"key2021041509160823200_ref020","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1016\/j.eswa.2017.02.038","article-title":"Enriched LDA (ELDA): combination of latent dirichlet allocation with word co-occurrence analysis for aspect extraction","volume":"80","year":"2017","journal-title":"Expert Systems with Applications"},{"key":"key2021041509160823200_ref021","first-page":"15","article-title":"Performance investigation of feature selection methods and sentiment lexicons for sentiment analysis","volume":"3","year":"2012","journal-title":"IJCA Special Issue on Advanced Computing and Communication Technologies for HPC Applications"},{"key":"key2021041509160823200_ref022","first-page":"712","article-title":"Sentiment analysis of movie reviews: a new feature-based heuristic for aspect-level sentiment classification","volume-title":"2013 International Multi-Conference on Automation, Computing, Communication, Control and Compressed Sensing (iMac4s)","year":"2013"},{"issue":"4","key":"key2021041509160823200_ref023","doi-asserted-by":"crossref","first-page":"2622","DOI":"10.1016\/j.eswa.2007.05.028","article-title":"An empirical study of sentiment analysis for Chinese 
documents","volume":"34","year":"2008","journal-title":"Expert Systems with Applications"},{"key":"key2021041509160823200_ref024","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1016\/j.eswa.2016.03.028","article-title":"Classification of sentiment reviews using n-gram machine learning approach","volume":"57","year":"2016","journal-title":"Expert Systems with Applications"},{"key":"key2021041509160823200_ref025","unstructured":"Waikato, U. (2017), Weka 3. University of Waikato, available at: www.cs.waikato.ac.nz\/ml\/weka\/index.html (accessed 11 August 2017)."},{"first-page":"246","article-title":"Development and use of a gold-standard data set for subjectivity classifications","year":"1999","key":"key2021041509160823200_ref026"},{"volume-title":"Review Popularity and Review Helpfulness: A Model for User Review Effectiveness","year":"2017","key":"key2021041509160823200_ref027"}],"container-title":["Kybernetes"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/K-06-2017-0229\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/K-06-2017-0229\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T21:48:45Z","timestamp":1753393725000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/k\/article\/47\/5\/957-984\/267031"}},"subtitle":["Effective feature reduction based on statistical information of 
SentiWordNet"],"short-title":[],"issued":{"date-parts":[[2018,3,5]]},"references-count":28,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2018,3,5]]},"published-print":{"date-parts":[[2018,5,2]]}},"alternative-id":["10.1108\/K-06-2017-0229"],"URL":"https:\/\/doi.org\/10.1108\/k-06-2017-0229","relation":{},"ISSN":["0368-492X"],"issn-type":[{"type":"print","value":"0368-492X"}],"subject":[],"published":{"date-parts":[[2018,3,5]]}}}