{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:28:41Z","timestamp":1777854521087,"version":"3.51.4"},"reference-count":74,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2016,7,10]],"date-time":"2016-07-10T00:00:00Z","timestamp":1468108800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2017,2]]},"abstract":"<jats:p>Clustering is a powerful unsupervised tool for sentiment analysis from text. However, the clustering results may be affected by any step of the clustering process, such as data pre-processing strategy, term weighting method in Vector Space Model and clustering algorithm. This paper presents the results of an experimental study of some common clustering techniques with respect to the task of sentiment analysis. Different from previous studies, in particular, we investigate the combination effects of these factors with a series of comprehensive experimental studies. The experimental results indicate that, first, the K-means-type clustering algorithms show clear advantages on balanced review datasets, while performing rather poorly on unbalanced datasets by considering clustering accuracy. Second, the comparatively newly designed weighting models are better than the traditional weighting models for sentiment clustering on both balanced and unbalanced datasets. Furthermore, adjective and adverb words extraction strategy can offer obvious improvements on clustering performance, while strategies of adopting stemming and stopword removal will bring negative influences on sentiment clustering. The experimental results would be valuable for both the study and usage of clustering methods in online review sentiment analysis.<\/jats:p>","DOI":"10.1177\/0165551515617374","type":"journal-article","created":{"date-parts":[[2015,12,3]],"date-time":"2015-12-03T21:29:45Z","timestamp":1449178185000},"page":"54-74","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":47,"title":["Exploring performance of clustering methods on document sentiment analysis"],"prefix":"10.1177","volume":"43","author":[{"given":"Baojun","family":"Ma","sequence":"first","affiliation":[{"name":"School of Economics and Management, Beijing University of Posts and Telecommunications, China"}]},{"given":"Hua","family":"Yuan","sequence":"additional","affiliation":[{"name":"School of Management and Economics, University of Electronic Science and Technology of China, China"}]},{"given":"Ye","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Science, Beijing University of Posts and Telecommunications, China"}]}],"member":"179","published-online":{"date-parts":[[2016,7,10]]},"reference":[{"key":"bibr1-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1561\/1500000011"},{"key":"bibr2-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1118693.1118704"},{"key":"bibr3-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1016\/j.joi.2009.01.003"},{"key":"bibr4-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1613\/jair.2934"},{"key":"bibr5-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/860435.860485"},{"key":"bibr6-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/331499.331504"},{"key":"bibr7-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646075"},{"key":"bibr8-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-004-0194-1"},{"key":"bibr9-0165551515617374","volume-title":"Data mining: Concepts and techniques","author":"Han J","year":"2006","edition":"2"},{"key":"bibr10-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-02145-9"},{"key":"bibr11-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2013.30"},{"key":"bibr12-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1599081.1599185"},{"key":"bibr13-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1621829.1621835"},{"key":"bibr14-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1935826.1935884"},{"key":"bibr15-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2593682"},{"key":"bibr16-0165551515617374","first-page":"525","volume-title":"KDD-2000 workshop on text mining","author":"Steinbach M","year":"2000"},{"key":"bibr17-0165551515617374","unstructured":"Karypis G. CLUTO\u2014software for clustering high-dimensional datasets, 2007. Available from: http:\/\/www.cs.umn.edu\/~cluto"},{"key":"bibr18-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-85287-2_4"},{"key":"bibr19-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1002\/asi.20553"},{"key":"bibr20-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-00958-7_41"},{"key":"bibr21-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1774088.1774461"},{"key":"bibr22-0165551515617374","first-page":"129","volume-title":"Proceedings of the 22nd annual conference on neural information processing systems","author":"Blitzer J","year":"2008"},{"key":"bibr23-0165551515617374","first-page":"440","volume-title":"Proceedings of the 45th annual meeting of the Association for Computational Linguistics (ACL\u201907)","author":"Blitzer J","year":"2007"},{"key":"bibr24-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-008-9070-z"},{"key":"bibr25-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1613715.1613801"},{"key":"bibr26-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390190"},{"key":"bibr27-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/2063576.2063728"},{"key":"bibr28-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1654758.1654769"},{"key":"bibr29-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242759"},{"key":"bibr30-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1341531.1341560"},{"key":"bibr31-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1871437.1871669"},{"key":"bibr32-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1177\/0165551511432670"},{"key":"bibr33-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1572093"},{"key":"bibr34-0165551515617374","first-page":"1041","volume-title":"Proceedings of the 23rd annual conference on neural information processing systems","author":"Mansour Y","year":"2009"},{"key":"bibr35-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/2187836.2187863"},{"key":"bibr36-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772767"},{"key":"bibr37-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1218955.1218990"},{"key":"bibr38-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1219840.1219855"},{"key":"bibr39-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.113"},{"key":"bibr40-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1963192.1963262"},{"key":"bibr41-0165551515617374","doi-asserted-by":"publisher","DOI":"10.3115\/1117794.1117802"},{"key":"bibr42-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1109\/52.50773"},{"key":"bibr43-0165551515617374","volume-title":"The SMART retrieval system \u2013 Experiments in automatic document processing","author":"Salton G","year":"1971"},{"key":"bibr44-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630340406"},{"key":"bibr45-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/182.358466"},{"key":"bibr46-0165551515617374","volume-title":"Modern information retrieval","author":"Baeza-Yates R","year":"1999"},{"key":"bibr47-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630320304"},{"key":"bibr48-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1108\/eb026526"},{"key":"bibr49-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1002\/asi.4630270302"},{"key":"bibr50-0165551515617374","volume-title":"The third text retrieval conference (TREC \u201894)","author":"Robertson SE","year":"1994"},{"key":"bibr51-0165551515617374","volume-title":"Proceedings of the Sixteenth Text REtrieval Conference (TREC 2007)","author":"Amati G","year":"2007"},{"key":"bibr52-0165551515617374","volume-title":"Using language models for information retrieval","author":"Hiemstra D","year":"2001"},{"key":"bibr53-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772780"},{"key":"bibr54-0165551515617374","volume-title":"Criterion functions for document clustering: Experiments and analysis","author":"Zhao Y","year":"2001"},{"key":"bibr55-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-005-0361-3"},{"key":"bibr56-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1023\/B:MACH.0000027785.44527.d6"},{"key":"bibr57-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1982.1056489"},{"key":"bibr58-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1002\/9780470316801"},{"key":"bibr59-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2002.1033770"},{"key":"bibr60-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/s11222-007-9033-z"},{"key":"bibr61-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1109\/34.868688"},{"key":"bibr62-0165551515617374","first-page":"849","volume-title":"Advances in Neural Information Processing Systems","author":"Ng AY","year":"2001"},{"key":"bibr63-0165551515617374","first-page":"2559","author":"Pearson K","year":"1901","journal-title":"Philosophical Magazine"},{"key":"bibr64-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775110"},{"key":"bibr65-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972733.6"},{"key":"bibr66-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1145\/345508.345578"},{"key":"bibr67-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2009.04.002"},{"key":"bibr68-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007617005950"},{"key":"bibr69-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2009.06.007"},{"key":"bibr70-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-006-6540-7"},{"key":"bibr71-0165551515617374","volume-title":"Learning to select for information retrieval","author":"Peng J","year":"2010"},{"key":"bibr72-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCB.2008.2004559"},{"key":"bibr73-0165551515617374","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2009.06.012"},{"key":"bibr74-0165551515617374","first-page":"2837","volume":"11","author":"Vinh NX","year":"2010","journal-title":"The Journal of Machine Learning Research"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515617374","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0165551515617374","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515617374","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:09:22Z","timestamp":1777504162000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551515617374"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,7,10]]},"references-count":74,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,2]]}},"alternative-id":["10.1177\/0165551515617374"],"URL":"https:\/\/doi.org\/10.1177\/0165551515617374","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,7,10]]}}}