{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,21]],"date-time":"2025-10-21T00:28:21Z","timestamp":1761006501111,"version":"build-2065373602"},"reference-count":34,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2014,5,8]],"date-time":"2014-05-08T00:00:00Z","timestamp":1399507200000},"content-version":"vor","delay-in-days":492,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc of Assoc for Info"],"published-print":{"date-parts":[[2013,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Document level sentiment analysis, the task of determining whether the sentiment expressed in a document is positive or negative, is commonly performed by supervised methods. As with all supervised tasks, obtaining training data for these methods can be expensive and time\u2010consuming. Some semi\u2010supervised approaches have been proposed that rely on sentiment lexicons. We propose a novel supervised and a novel semi\u2010supervised sentiment analysis method that are both based on a probabilistic graphical model, without requiring any lexicon. Our semi\u2010supervised method takes advantage of the numerical ratings that are often included in online reviews (e.g., 4 out of 5 stars). While these numerical ratings are related to sentiment, they are noisy and hence, by themselves, they are an imperfect indicator of reviews' sentiments. We incorporate unlabeled user reviews as training data by treating the reviews' numerical ratings as sentiment labels while modeling the ratings' noisy nature. 
Our empirical results, utilizing a corpus of labeled sentences from hotel reviews and unlabeled hotel reviews with numerical ratings, show that treating reviews' ratings as noisy and utilizing them to augment a small amount of labeled sentences outperforms strong existing supervised and semi\u2010supervised classification\u2010based and lexicon\u2010based approaches.<\/jats:p>","DOI":"10.1002\/meet.14505001031","type":"journal-article","created":{"date-parts":[[2014,5,8]],"date-time":"2014-05-08T16:35:28Z","timestamp":1399566928000},"page":"1-10","source":"Crossref","is-referenced-by-count":4,"title":["Semi\u2010supervised probabilistic sentiment analysis: Merging labeled sentences with unlabeled reviews to identify sentiment"],"prefix":"10.1002","volume":"50","author":[{"given":"Andrew","family":"Yates","sequence":"first","affiliation":[]},{"given":"Nazli","family":"Goharian","sequence":"additional","affiliation":[]},{"given":"Wai Gen","family":"Yee","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2014,5,8]]},"reference":[{"key":"e_1_2_8_2_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1020281327116"},{"key":"e_1_2_8_3_1","doi-asserted-by":"publisher","DOI":"10.1162\/jmlr.2003.3.4-5.993"},{"key":"e_1_2_8_4_1","unstructured":"Brody S. &Elhadad N.(2010).An unsupervised aspect\u2010sentiment model for online reviews.Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT\u2010NAACL '10)(pp.804\u2013812)."},{"key":"e_1_2_8_5_1","doi-asserted-by":"publisher","DOI":"10.1080\/00031305.1992.10475878"},{"key":"e_1_2_8_6_1","doi-asserted-by":"crossref","unstructured":"Ding X. Liu B. 
&Yu P.(2008).A holistic lexicon\u2010based approach to opinion mining.Proceedings of the 2008 International Conference on Web Search and Data Mining (WSDM '08).","DOI":"10.1145\/1341531.1341561"},{"key":"e_1_2_8_7_1","doi-asserted-by":"crossref","unstructured":"Hu M. &Liu B.(2004).Mining and summarizing customer reviews.Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '04)(pp.168\u2013177).","DOI":"10.1145\/1014052.1014073"},{"key":"e_1_2_8_8_1","doi-asserted-by":"crossref","unstructured":"Jo Y. &Oh A. H.(2011).Aspect and sentiment unification model for online review analysis.Proceedings of the fourth ACM international conference on Web search and data mining (WSDM '11)(p.815).","DOI":"10.1145\/1935826.1935932"},{"key":"e_1_2_8_9_1","first-page":"169","volume-title":"Advances in Kernel Methods \u2010 Support Vector Learning","author":"Joachims T.","year":"1999"},{"key":"e_1_2_8_10_1","doi-asserted-by":"crossref","unstructured":"Kim J. Li J. &Lee J.(2009).Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis.Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP (ACL\u2010IJCNLP '09)(pp.253\u2013261).","DOI":"10.3115\/1687878.1687915"},{"key":"e_1_2_8_11_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli.2006.32.4.485"},{"key":"e_1_2_8_12_1","doi-asserted-by":"crossref","unstructured":"Lin C. 
&He Y.(2009).Joint sentiment\/topic model for sentiment analysis.Proceedings of the 18th ACM conference on Information and knowledge management (CIKM '09).","DOI":"10.1145\/1645953.1646003"},{"key":"e_1_2_8_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4614-3223-4_13"},{"key":"e_1_2_8_14_1","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1145\/1367497.1367514","volume-title":"Proceedings of the 17th international conference on World Wide Web (WWW '08)","author":"Lu Y.","year":"2008"},{"key":"e_1_2_8_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526728"},{"key":"e_1_2_8_16_1","doi-asserted-by":"crossref","unstructured":"Mei Q. Ling X. Wondra M. Su H. &Zhai C.(2007).Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs.Proceedings of the 16th international conference on World Wide Web (WWW '07).","DOI":"10.1145\/1242572.1242596"},{"key":"e_1_2_8_17_1","doi-asserted-by":"crossref","unstructured":"Melville P. Ox O. &Lawrence R. D.(2009).Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification.Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '09).","DOI":"10.1145\/1557019.1557156"},{"key":"e_1_2_8_18_1","doi-asserted-by":"crossref","unstructured":"Moghaddam S. &Ester M.(2011).ILDA: Interdependent LDA Model for Learning Latent Aspects and their Ratings from Online Product Reviews Categories and Subject Descriptors.Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval (SIGIR '11)(pp.665\u2013674).","DOI":"10.1145\/2009916.2010006"},{"key":"e_1_2_8_19_1","unstructured":"Mukherjee A. &Liu B.(2012a).Aspect extraction through semi\u2010supervised modeling.Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL '12)."},{"key":"e_1_2_8_20_1","unstructured":"Mukherjee A. 
&Liu B.(2012b).Modeling review comments.Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL '10)."},{"key":"e_1_2_8_21_1","doi-asserted-by":"crossref","unstructured":"Ng V. Dasgupta S. &Arifin S.(2006).Examining the role of linguistic knowledge sources in the automatic identification and classification of reviews.Proceedings of the COLING\/ACL on Main conference poster sessions (COLING\u2010ACL '06)(pp.611\u2013618).","DOI":"10.3115\/1273073.1273152"},{"key":"e_1_2_8_22_1","first-page":"1","article-title":"Evaluating rater quality and rating difficulty in online annotation activities","volume":"49","author":"Organisciak P.","year":"2012","journal-title":"ASIST '12"},{"key":"e_1_2_8_23_1","unstructured":"Paltoglou G. &Thelwall M.(2010).A study of information retrieval weighting schemes for sentiment analysis.Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL '10)(pp.1386\u20131395)."},{"key":"e_1_2_8_24_1","first-page":"1","volume-title":"Foundations and Trends\u00ae in Information Retrieval","author":"Pang B.","year":"2008"},{"key":"e_1_2_8_25_1","doi-asserted-by":"crossref","unstructured":"Pang B. Lee L. &Vaithyanathan S.(2002).Thumbs up?: sentiment classification using machine learning techniques.Proceedings of the ACL\u201002 conference on Empirical methods in natural language processing (EMNLP '02).","DOI":"10.3115\/1118693.1118704"},{"key":"e_1_2_8_26_1","unstructured":"Plummer M.(2003).JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling.Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003)."},{"key":"e_1_2_8_27_1","unstructured":"Qiu G. Liu B. Bu J. &Chen C.(2009).Expanding Domain Sentiment Lexicon through Double Propagation.Proceedings of the 21st international joint conference on Artificial intelligence (IJCAI '09)."},{"key":"e_1_2_8_28_1","doi-asserted-by":"crossref","unstructured":"Qiu L. Zhang W. Hu C. 
&Zhao K.(2009).SELC: A Self\u2010Supervised Model for Sentiment Classification.Proceedings of the 18th ACM conference on Information and knowledge management (CIKM '09)(pp.929\u2013936).","DOI":"10.1145\/1645953.1646072"},{"key":"e_1_2_8_29_1","doi-asserted-by":"crossref","unstructured":"Ramage D. Hall D. Nallapati R. &Manning C. D.(2009).Labeled LDA: A supervised topic model for credit attribution in multi\u2010labeled corpora.EMNLP '09(pp.248\u2013256).","DOI":"10.3115\/1699510.1699543"},{"key":"e_1_2_8_30_1","unstructured":"Sauper C. Haghighi A. &Barzilay R.(2011).Content models with attitude.Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL '11)."},{"key":"e_1_2_8_31_1","doi-asserted-by":"publisher","DOI":"10.1162\/COLI_a_00049"},{"key":"e_1_2_8_32_1","first-page":"743","volume-title":"Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '08)","author":"Tan S.","year":"2008"},{"key":"e_1_2_8_33_1","unstructured":"Titov I. 
&McDonald R.(2008).A joint model of text and aspect ratings for sentiment summarization.Proceedings of ACL\u201008: HLT."},{"key":"e_1_2_8_34_1","doi-asserted-by":"crossref","first-page":"783","DOI":"10.1145\/1835804.1835903","volume-title":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '10)","author":"Wang H.","year":"2010"},{"key":"e_1_2_8_35_1","first-page":"618","volume-title":"Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '11)","author":"Wang H.","year":"2011"}],"container-title":["Proceedings of the American Society for Information Science and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fmeet.14505001031","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fmeet.14505001031","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/meet.14505001031","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T09:20:24Z","timestamp":1760952024000},"score":1,"resource":{"primary":{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/10.1002\/meet.14505001031"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,1]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,1]]}},"alternative-id":["10.1002\/meet.14505001031"],"URL":"https:\/\/doi.org\/10.1002\/meet.14505001031","archive":["Portico"],"relation":{},"ISSN":["0044-7870","1550-8390"],"issn-type":[{"type":"print","value":"0044-7870"},{"type":"electronic","value":"1550-8390"}],"subject":[],"published":{"date-parts":[[2013,1]]}}}