{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T04:27:09Z","timestamp":1777696029550,"version":"3.51.4"},"reference-count":33,"publisher":"SAGE Publications","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IDA"],"published-print":{"date-parts":[[2021,4,20]]},"abstract":"<jats:p>Sentiment classification aims to solve the problem of automatic judgment of sentiment polarity. In the sentiment classification task of text data, such as online reviews, traditional deep learning models are dedicated to algorithm optimization but ignore the characteristics of imbalanced distribution of the number of classified samples and the inclusion of weak tagging information such as ratings and tags. Based on the traditional deep learning model, the method of random oversampling and cost sensitivity is used to increase the contribution of a minority of samples to the model loss function and avoid the model biasing to the majority of samples. The model training is divided into two stages. In the first stage, a large amount of weak tagging data is used to train the model, therefore a model that captures the sentiment semantics of the data is obtained. After that, the model parameters trained in the first stage are used as the initial parameters of the second stage model training, and only a small amount of tagging data is used to continue training the model to reduce the impact of noise, thus reducing the use of manual tagging samples. 
The experimental results show that the method is considerably better than traditional deep learning models in the sentiment classification task of hotel review data.<\/jats:p>","DOI":"10.3233\/ida-205408","type":"journal-article","created":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T14:50:31Z","timestamp":1619189431000},"page":"555-570","source":"Crossref","is-referenced-by-count":4,"title":["Sentiment classification based on weak tagging information and imbalanced data"],"prefix":"10.1177","volume":"25","author":[{"given":"Chuantao","family":"Wang","sequence":"first","affiliation":[{"name":"School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China"},{"name":"Beijing Engineering Research Center of Monitoring for Construction Safety, Beijing, China"}]},{"given":"Xuexin","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China"},{"name":"Beijing Engineering Research Center of Monitoring for Construction Safety, Beijing, China"}]},{"given":"Linkai","family":"Ding","sequence":"additional","affiliation":[{"name":"School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China"},{"name":"Beijing Engineering Research Center of Monitoring for Construction Safety, Beijing, China"}]}],"member":"179","reference":[{"key":"10.3233\/IDA-205408_ref1","doi-asserted-by":"crossref","unstructured":"Cambria and Erik, Affective computing and sentiment analysis, IEEE Intelligent Systems 31.2 (2016), 102\u2013107.","DOI":"10.1109\/MIS.2016.31"},{"key":"10.3233\/IDA-205408_ref2","unstructured":"L. Jiang, M. Yu, M. Zhou, X. Liu and T. 
Zhao, Target-dependent twitter sentiment classification, Proceedings of Annual Meeting of the Association for Computational Linguistics Human Language Technologies 1 (2011), 151\u2013160."},{"key":"10.3233\/IDA-205408_ref3","doi-asserted-by":"crossref","unstructured":"S. Kiritchenko, X. Zhu, C. Cherry and S. Mohammad, NRC-Canada-2014: Detecting aspects and sentiment in customer reviews, in: Proceedings of the 8th International Workshop on Semantic Evaluation, 2014, pp. 437\u2013442.","DOI":"10.3115\/v1\/S14-2076"},{"key":"10.3233\/IDA-205408_ref4","unstructured":"D.T. Vo and Y. Zhang, Target-dependent twitter sentiment classification with rich automatic features, in: Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015."},{"key":"10.3233\/IDA-205408_ref6","unstructured":"T. Mikolov, I. Sutskever, K. Chen, G.S. Corrado and J. Dean, Distributed representations of words and phrases and their compositionality, in: Advances in Neural Information Processing Systems, 2013, pp. 3111\u20133119."},{"key":"10.3233\/IDA-205408_ref7","doi-asserted-by":"crossref","unstructured":"J. Pennington, R. Socher and C.D. Manning, Glove: Global vectors for word representation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 
1532\u20131543.","DOI":"10.3115\/v1\/D14-1162"},{"key":"10.3233\/IDA-205408_ref8","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1007\/s12559-018-9549-x","article-title":"Sentic LSTM: a hybrid network for targeted aspect-based sentiment analysis","volume":"104","author":"Ma","year":"2018","journal-title":"Cognitive Computation"},{"key":"10.3233\/IDA-205408_ref9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1007730.1007733","article-title":"Special issue on learning from imbalanced data sets","volume":"6.1","author":"Chawla","year":"2004","journal-title":"ACM SIGKDD Explorations Newsletter"},{"key":"10.3233\/IDA-205408_ref10","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.elerap.2015.10.003","article-title":"A topic sentence-based instance transfer method for imbalanced sentiment classification of Chinese product reviews","volume":"16","author":"Tian","year":"2016","journal-title":"Electronic Commerce Research and Applications"},{"key":"10.3233\/IDA-205408_ref11","doi-asserted-by":"crossref","first-page":"28281","DOI":"10.1109\/ACCESS.2019.2892094","article-title":"Improving the performance of sentiment classification on imbalanced datasets with transfer learning","volume":"7","author":"Xiao","year":"2019","journal-title":"IEEE Access"},{"key":"10.3233\/IDA-205408_ref12","first-page":"1263","article-title":"Learning from imbalanced data","volume":"21.9","author":"He","year":"2009","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"10.3233\/IDA-205408_ref13","doi-asserted-by":"crossref","first-page":"307","DOI":"10.3390\/info9120307","article-title":"Improving the accuracy in sentiment classification in the light of modelling the latent semantic relations","volume":"9.12","author":"Rizun","year":"2018","journal-title":"Information"},{"key":"10.3233\/IDA-205408_ref14","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1177\/0165551517703514","article-title":"Lexicon-based sentiment 
analysis: comparative evaluation of six sentiment lexicons","volume":"44.4","author":"Khoo","year":"2018","journal-title":"Journal of Information Science"},{"key":"10.3233\/IDA-205408_ref16","doi-asserted-by":"crossref","first-page":"43749","DOI":"10.1109\/ACCESS.2019.2907772","article-title":"Chinese text sentiment analysis based on extended sentiment dictionary","volume":"7","author":"Xu","year":"2019","journal-title":"IEEE Access"},{"key":"10.3233\/IDA-205408_ref17","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1016\/j.eswa.2018.04.006","article-title":"A domain transferable lexicon set for Twitter sentiment analysis using a supervised machine learning approach","volume":"106","author":"Ghiassi","year":"2018","journal-title":"Expert Systems with Applications"},{"key":"10.3233\/IDA-205408_ref20","doi-asserted-by":"crossref","first-page":"151","DOI":"10.2478\/fcds-2019-0009","article-title":"Tackling the problem of class imbalance in multi-class sentiment classification: an experimental study","volume":"44.2","author":"Lango","year":"2019","journal-title":"Foundations of Computing and Decision Sciences"},{"key":"10.3233\/IDA-205408_ref21","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1017\/S1351324917000298","article-title":"To use or not to use: feature selection for sentiment analysis of highly imbalanced data","volume":"24.1","author":"K\u00fcbler","year":"2018","journal-title":"Natural Language Engineering"},{"key":"10.3233\/IDA-205408_ref22","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.elerap.2015.10.003","article-title":"A topic sentence-based instance transfer method for imbalanced sentiment classification of Chinese product reviews","volume":"16","author":"Tian","year":"2016","journal-title":"Electronic Commerce Research and Applications"},{"key":"10.3233\/IDA-205408_ref23","first-page":"1","article-title":"Multi-channel embedding convolutional neural network model for arabic sentiment 
classification","volume":"18.4","author":"Dahou","year":"2019","journal-title":"ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)"},{"key":"10.3233\/IDA-205408_ref24","doi-asserted-by":"crossref","first-page":"190","DOI":"10.3390\/fi11090190","article-title":"Deep learning-based sentimental analysis for large-scale imbalanced twitter data","volume":"11.9","author":"Jamal","year":"2019","journal-title":"Future Internet"},{"key":"10.3233\/IDA-205408_ref25","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.neucom.2019.01.059","article-title":"Semi-supervised target-oriented sentiment classification","volume":"337","author":"Xu","year":"2019","journal-title":"Neurocomputing"},{"key":"10.3233\/IDA-205408_ref26","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/j.engappai.2014.07.020","article-title":"Cross-lingual sentiment classification using multiple source languages in multi-view semi-supervised learning","volume":"36","author":"Hajmohammadi","year":"2014","journal-title":"Engineering Applications of Artificial Intelligence"},{"key":"10.3233\/IDA-205408_ref27","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1016\/j.neucom.2013.10.011","article-title":"Fuzzy deep belief networks for semi-supervised sentiment classification","volume":"131","author":"Zhou","year":"2014","journal-title":"Neurocomputing"},{"key":"10.3233\/IDA-205408_ref28","doi-asserted-by":"crossref","first-page":"22945","DOI":"10.1109\/ACCESS.2020.2969205","article-title":"Multi-modal sentiment classification with independent and interactive knowledge via semi-supervised learning","volume":"8","author":"Zhang","year":"2020","journal-title":"IEEE Access"},{"key":"10.3233\/IDA-205408_ref29","doi-asserted-by":"crossref","unstructured":"P.K. Novak, J. Smailovi\u0107, B. Sluba and I. 
Mozeti\u010d, Sentiment of emojis, PloS One 10.12 (2015).","DOI":"10.1371\/journal.pone.0144296"},{"key":"10.3233\/IDA-205408_ref30","doi-asserted-by":"crossref","first-page":"101615","DOI":"10.1016\/j.scs.2019.101615","article-title":"Thai sentiment analysis with deep learning techniques: a comparative study based on word embedding, POS-tag, and sentic features","volume":"50","author":"Pasupa","year":"2019","journal-title":"Sustainable Cities and Society"},{"key":"10.3233\/IDA-205408_ref32","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1504\/IJWBC.2019.098693","article-title":"Lexicon-based twitter sentiment analysis for vote share prediction using emoji and N-gram features","volume":"15.1","author":"Bansal","year":"2019","journal-title":"International Journal of Web Based Communities"},{"key":"10.3233\/IDA-205408_ref33","unstructured":"Y. Wang and A. Pal, Detecting emotions in social media: a constrained optimization approach, in: Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015."},{"key":"10.3233\/IDA-205408_ref34","doi-asserted-by":"crossref","unstructured":"B. O\u2019Connor, R. Balasubramanyan, B.R. Routledge and N.A. Smith, From tweets to polls: Linking text sentiment to public opinion time series, in: Fourth International AAAI Conference on Weblogs and Social Media, 2010.","DOI":"10.1609\/icwsm.v4i1.14031"},{"key":"10.3233\/IDA-205408_ref35","doi-asserted-by":"crossref","unstructured":"B. Krawczyk, B.T. McInnes and A. Cano, Sentiment classification from multi-class imbalanced twitter data using binarization, in: International Conference on Hybrid Artificial Intelligence Systems, 2017, pp. 26\u201337.","DOI":"10.1007\/978-3-319-59650-1_3"},{"key":"10.3233\/IDA-205408_ref36","doi-asserted-by":"crossref","unstructured":"S. Li, G. Zhou, Z. Wang, S.Y.M. Lee and R. Wang, Imbalanced sentiment classification, in: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, 2011, pp. 
2469\u20132472.","DOI":"10.1145\/2063576.2063994"},{"key":"10.3233\/IDA-205408_ref37","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1007\/s12559-015-9319-y","article-title":"Word embedding composition for data imbalances in sentiment and emotion classification","volume":"7.2","author":"Xu","year":"2015","journal-title":"Cognitive Computation"},{"key":"10.3233\/IDA-205408_ref38","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.knosys.2018.06.019","article-title":"Imbalanced text sentiment classification using universal and domain-specific knowledge","volume":"160","author":"Li","year":"2018","journal-title":"Knowledge-Based Systems"}],"container-title":["Intelligent Data Analysis"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/IDA-205408","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:19:05Z","timestamp":1777454345000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/IDA-205408"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,20]]},"references-count":33,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.3233\/ida-205408","relation":{},"ISSN":["1088-467X","1571-4128"],"issn-type":[{"value":"1088-467X","type":"print"},{"value":"1571-4128","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,20]]}}}