{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T23:17:29Z","timestamp":1775171849488,"version":"3.50.1"},"reference-count":38,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T00:00:00Z","timestamp":1684281600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Science Foundation of Autonomous Region","award":["2021D01C118"],"award-info":[{"award-number":["2021D01C118"]}]},{"name":"Natural Science Foundation of Autonomous Region","award":["042419006"],"award-info":[{"award-number":["042419006"]}]},{"name":"Autonomous Region High-Level Innovative Talent Project","award":["2021D01C118"],"award-info":[{"award-number":["2021D01C118"]}]},{"name":"Autonomous Region High-Level Innovative Talent Project","award":["042419006"],"award-info":[{"award-number":["042419006"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In sentiment analysis, biased user reviews can have a detrimental impact on a company\u2019s evaluation. Therefore, identifying such users can be highly beneficial as their reviews are not based on reality but on their characteristics rooted in their psychology. Furthermore, biased users may be seen as instigators of other prejudiced information on social media. Thus, proposing a method to help detect polarized opinions in product reviews would offer significant advantages. This paper proposes a new method for sentiment classification of multimodal data, which is called UsbVisdaNet (User Behavior Visual Distillation and Attention Network). The method aims to identify biased user reviews by analyzing their psychological behaviors. It can identify both positive and negative users and improves sentiment classification results that may be skewed due to subjective biases in user opinions by leveraging user behavior information. Through ablation and comparison experiments, the effectiveness of UsbVisdaNet is demonstrated, achieving superior sentiment classification performance on the Yelp multimodal dataset. Our research pioneers the integration of user behavior features, text features, and image features at multiple hierarchical levels within this domain.<\/jats:p>","DOI":"10.3390\/s23104829","type":"journal-article","created":{"date-parts":[[2023,5,18]],"date-time":"2023-05-18T07:03:58Z","timestamp":1684393438000},"page":"4829","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["UsbVisdaNet: User Behavior Visual Distillation and Attention Network for Multimodal Sentiment Classification"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0258-4893","authenticated-orcid":false,"given":"Shangwu","family":"Hou","sequence":"first","affiliation":[{"name":"Xinjiang Multilingual Information Technology Laboratory, Xinjiang Multilingual Information Technology Research Center, College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2441-2483","authenticated-orcid":false,"given":"Gulanbaier","family":"Tuerhong","sequence":"additional","affiliation":[{"name":"Xinjiang Multilingual Information Technology Laboratory, Xinjiang Multilingual Information Technology Research Center, College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9480-6915","authenticated-orcid":false,"given":"Mairidan","family":"Wushouer","sequence":"additional","affiliation":[{"name":"Xinjiang Multilingual Information Technology Laboratory, Xinjiang Multilingual Information Technology Research Center, College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,5,17]]},"reference":[{"key":"ref_1","unstructured":"Calabrese, B., and Cannataro, M. (2015, January 6\u201310). Sentiment analysis and affective computing: Methods and applications. Proceedings of the Brain-Inspired Computing: Second International Workshop, BrainComp 2015, Cetraro, Italy. Revised Selected Papers 2."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1007\/BF01238028","article-title":"Affective computing","volume":"1","author":"Lisetti","year":"1998","journal-title":"Pattern Anal. Appl."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"106803","DOI":"10.1016\/j.knosys.2021.106803","article-title":"Cross-modal image sentiment analysis via deep correlation of textual semantic","volume":"216","author":"Zhang","year":"2021","journal-title":"Knowl.-Based Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"e5954","DOI":"10.1002\/cpe.5954","article-title":"Various syncretic co-attention network for multimodal sentiment analysis","volume":"32","author":"Cao","year":"2020","journal-title":"Concurr. Comput. Pract. Exp."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2974","DOI":"10.1109\/TII.2020.3005405","article-title":"Social image sentiment analysis by exploiting multimodal content and heterogeneous relations","volume":"17","author":"Xu","year":"2020","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.knosys.2019.04.018","article-title":"Visual-textual sentiment classification with bi-directional multi-level attention networks","volume":"178","author":"Xu","year":"2019","journal-title":"Knowl.-Based Syst."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Yang, X., Feng, S., Zhang, Y., and Wang, D. (2021, January 1\u20136). Multimodal sentiment detection based on multi-channel graph neural networks. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online.","DOI":"10.18653\/v1\/2021.acl-long.28"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhang, S., Li, B., and Yin, C. (2021). Cross-modal sentiment sensing with visual-augmented representation and diverse decision fusion. Sensors, 22.","DOI":"10.3390\/s22010074"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1145\/3388861","article-title":"Attention-based modality-gated networks for image-text sentiment analysis","volume":"16","author":"Huang","year":"2020","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl."},{"key":"ref_10","unstructured":"Arevalo, J., Solorio, T., Montes-y G\u00f3mez, M., and Gonz\u00e1lez, F.A. (2017). Gated multimodal units for information fusion. arXiv."},{"key":"ref_11","unstructured":"Jin, S., and Zafarani, R. (2018). Proceedings of the 2018 IEEE International Conference on Data Mining Workshops (ICDMW), Beijing, China, 17\u201320 November 2018, IEEE."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Tan, C., Lee, L., Tang, J., Jiang, L., Zhou, M., and Li, P. (2011, January 21\u201324). User-level sentiment analysis incorporating social networks. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA.","DOI":"10.1145\/2020408.2020614"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1162\/tacl_a_00062","article-title":"Overcoming language variation in sentiment analysis with social attention","volume":"5","author":"Yang","year":"2017","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Yang, Y., Chang, M.W., and Eisenstein, J. (2016). Toward socially-infused information extraction: Embedding authors, mentions, and entities. arXiv.","DOI":"10.18653\/v1\/D16-1152"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tang, D., Qin, B., and Liu, T. (2015, January 26\u201331). Learning semantic representations of users and products for document level sentiment classification. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China.","DOI":"10.3115\/v1\/P15-1098"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/j.knosys.2017.02.030","article-title":"Learning representations from heterogeneous network for sentiment classification of product reviews","volume":"124","author":"Gui","year":"2017","journal-title":"Knowl.-Based Syst."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Gong, L., and Wang, H. (2018, January 19\u201323). When sentiment analysis meets social network: A holistic user behavior modeling in opinionated data. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK.","DOI":"10.1145\/3219819.3220120"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zou, X., Yang, J., and Zhang, J. (2018). Microblog sentiment analysis using social and topic context. PLoS ONE, 13.","DOI":"10.1371\/journal.pone.0191163"},{"key":"ref_19","unstructured":"Fornacciari, P., Mordonini, M., and Tomaiuolo, M. (2015). Social Network and Sentiment Analysis on Twitter: Towards a Combined Approach, KDWeb."},{"key":"ref_20","unstructured":"Rubin, K.H., and Bowker, J. (2017). The SAGE Encyclopedia of Lifespan Human Development, Sage."},{"key":"ref_21","unstructured":"Allport, G., and Murchison, C. (1935). Handbook of Social Psychology, Clark University Press."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/S0065-2601(08)60372-X","article-title":"Direct experience and attitude-behavior consistency","volume":"Volume 14","author":"Fazio","year":"1981","journal-title":"Advances in Experimental Social Psychology"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2168","DOI":"10.1109\/TVCG.2019.2903943","article-title":"Deepvid: Deep visual interpretation and diagnosis for image classifiers via knowledge distillation","volume":"25","author":"Wang","year":"2019","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"6684","DOI":"10.1109\/TCYB.2020.3041212","article-title":"An adaptive localized decision variable analysis approach to large-scale multiobjective and many-objective optimization","volume":"52","author":"Ma","year":"2021","journal-title":"IEEE Trans. Cybern."},{"key":"ref_25","unstructured":"Truong, Q.T., and Lauw, H.W. (February, January 27). Vistanet: Visual aspect attention network for multimodal sentiment analysis. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1016\/j.inffus.2020.01.011","article-title":"Emotion recognition using multi-modal data and machine learning techniques: A tutorial and review","volume":"59","author":"Zhang","year":"2020","journal-title":"Inf. Fusion"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1109\/MSP.2021.3106895","article-title":"Emotion recognition from multiple modalities: Fundamentals and methodologies","volume":"38","author":"Zhao","year":"2021","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1763","DOI":"10.1213\/ANE.0000000000002864","article-title":"Correlation coefficients: Appropriate use and interpretation","volume":"126","author":"Schober","year":"2018","journal-title":"Anesth. Analg."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Hou, S., Tuerhong, G., and Wushouer, M. (2023). VisdaNet: Visual Distillation and Attention Network for Multimodal Sentiment Classification. Sensors, 23.","DOI":"10.3390\/s23020661"},{"key":"ref_30","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021, January 18\u201324). Learning transferable visual models from natural language supervision. Proceedings of the International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Luo, Y., Ji, J., Sun, X., Cao, L., Wu, Y., Huang, F., Lin, C.W., and Ji, R. (2021, January 2\u20139). Dual-level collaborative transformer for image captioning. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual.","DOI":"10.1609\/aaai.v35i3.16328"},{"key":"ref_32","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., and Hovy, E. (2016, January 12\u201317). Hierarchical attention networks for document classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA.","DOI":"10.18653\/v1\/N16-1174"},{"key":"ref_34","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. arXiv.","DOI":"10.3115\/v1\/D14-1181"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1162\/tacl_a_00051","article-title":"Enriching word vectors with subword information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Tang, D., Qin, B., and Liu, T. (2015, January 17\u201321). Document modeling with gated recurrent neural network for sentiment classification. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal.","DOI":"10.18653\/v1\/D15-1167"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"108107","DOI":"10.1016\/j.knosys.2021.108107","article-title":"Gated attention fusion network for multimodal sentiment classification","volume":"240","author":"Du","year":"2022","journal-title":"Knowl.-Based Syst."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/10\/4829\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:36:47Z","timestamp":1760125007000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/10\/4829"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,17]]},"references-count":38,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2023,5]]}},"alternative-id":["s23104829"],"URL":"https:\/\/doi.org\/10.3390\/s23104829","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,17]]}}}