{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T21:55:42Z","timestamp":1777758942007,"version":"3.51.4"},"reference-count":56,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T00:00:00Z","timestamp":1775001600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T00:00:00Z","timestamp":1776124800000},"content-version":"vor","delay-in-days":13,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Malatya Turgut \u00d6zal University"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis Comput"],"published-print":{"date-parts":[[2026,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Emotion analysis is a critical research domain focused on detecting the emotional states of individuals or communities across multiple data modalities, including text, images, and audio. While substantial progress has been made in unimodal (text-based) sentiment analysis, real-world scenarios often involve multimodal data, making integrated approaches essential for capturing contextual richness and improving predictive accuracy. This study introduces a hybrid deep learning model that combines text and visual features through an intermediate fusion mechanism and multi-task learning framework. Textual inputs are processed using RoBERTa and BiGRU layers, while visual inputs are analyzed through ViT and ResNet50 architectures enhanced by the Convolutional Block Attention Module (CBAM). The fused multimodal representations enable simultaneous and more robust emotion classification. 
Experimental results on the MVSA dataset demonstrate the superior performance of the proposed model, achieving 96.02% accuracy, 95.51% precision, 94.07% recall, and 94.73% F1-score, outperforming several state-of-the-art multimodal benchmarks. These findings underscore the model\u2019s methodological contributions and its strong potential for advancing the field of multimodal emotion analysis in both academic research and real-world applications.<\/jats:p>","DOI":"10.1007\/s00371-026-04475-1","type":"journal-article","created":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T07:11:58Z","timestamp":1776150718000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Advancing multimodal emotion analysis: a hybrid deep learning approach with intermediate fusion and multi-task learning"],"prefix":"10.1007","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-6621-9240","authenticated-orcid":false,"given":"Fatih","family":"Kaya","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9398-084X","authenticated-orcid":false,"given":"Yunus Emre","family":"Karaca","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8009-063X","authenticated-orcid":false,"given":"Serpil","family":"Aslan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Muhammed","family":"Y\u0131ld\u0131r\u0131m","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2026,4,14]]},"reference":[{"key":"4475_CR1","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2023.110404","volume":"143","author":"S Aslan","year":"2023","unstructured":"Aslan, S.: A deep learning-based sentiment analysis approach (MF-CNN-BILSTM) and topic 
modeling of tweets related to the Ukraine-Russia conflict. Appl. Soft Comput. 143, 110404 (2023). https:\/\/doi.org\/10.1016\/j.asoc.2023.110404","journal-title":"Appl. Soft Comput."},{"key":"4475_CR2","doi-asserted-by":"publisher","first-page":"63373","DOI":"10.1109\/ACCESS.2019.2916887","volume":"7","author":"W Guo","year":"2019","unstructured":"Guo, W., Wang, J., Wang, S.: Deep multimodal representation learning: a survey. IEEE Access 7, 63373\u201363394 (2019). https:\/\/doi.org\/10.1109\/ACCESS.2019.2916887","journal-title":"IEEE Access"},{"key":"4475_CR3","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1007\/978-3-319-23654-6_2","volume-title":"Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis","author":"E Cambria","year":"2015","unstructured":"Cambria, E., Hussain, A.: SenticNet. In: Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis, pp. 23\u201371. Springer, Cham (2015). https:\/\/doi.org\/10.1007\/978-3-319-23654-6_2"},{"key":"4475_CR4","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1016\/j.inffus.2017.02.003","volume":"37","author":"S Poria","year":"2017","unstructured":"Poria, S., Cambria, E., Bajpai, R., Hussain, A.: A review of affective computing: from unimodal analysis to multimodal fusion. Inf. Fusion 37, 98\u2013125 (2017). https:\/\/doi.org\/10.1016\/j.inffus.2017.02.003","journal-title":"Inf. Fusion"},{"issue":"1","key":"4475_CR5","doi-asserted-by":"publisher","first-page":"54","DOI":"10.3390\/ai4010004","volume":"4","author":"A Rahali","year":"2023","unstructured":"Rahali, A., Akhloufi, M.A.: End-to-end transformer-based models in textual-based NLP. AI 4(1), 54\u2013110 (2023). https:\/\/doi.org\/10.3390\/ai4010004","journal-title":"AI"},{"key":"4475_CR6","doi-asserted-by":"publisher","unstructured":"You, Q., Luo, J., Jin, H., Yang, J.: Cross-modality consistent regression for joint visual\u2013textual sentiment analysis of social multimedia. 
In: Proceedings of the 9th ACM International Conference on Web Search and Data Mining (WSDM), pp. 13\u201322. ACM, New York (2016). https:\/\/doi.org\/10.1145\/2835776.2835820","DOI":"10.1145\/2835776.2835820"},{"key":"4475_CR7","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1016\/j.jvcir.2017.02.017","volume":"48","author":"B Jiang","year":"2017","unstructured":"Jiang, B., Yang, J., Lv, Z., Tian, K., Meng, Q., Yan, Y.: Internet cross-media retrieval based on deep learning. J. Vis. Commun. Image Represent. 48, 356\u2013366 (2017). https:\/\/doi.org\/10.1016\/j.jvcir.2017.02.017","journal-title":"J. Vis. Commun. Image Represent."},{"key":"4475_CR8","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1016\/j.jvcir.2017.02.020","volume":"48","author":"WZ Nie","year":"2017","unstructured":"Nie, W.Z., Peng, W.J., Wang, X.Y., Zhao, Y.L., Su, Y.T.: Multimedia venue semantic modeling based on multimodal data. J. Vis. Commun. Image Represent. 48, 375\u2013385 (2017). https:\/\/doi.org\/10.1016\/j.jvcir.2017.02.020","journal-title":"J. Vis. Commun. Image Represent."},{"key":"4475_CR9","doi-asserted-by":"publisher","unstructured":"Machajdik, J., Hanbury, A.: Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 83\u201392. ACM, New York (2010). https:\/\/doi.org\/10.1145\/1873951.1873965","DOI":"10.1145\/1873951.1873965"},{"key":"4475_CR10","doi-asserted-by":"publisher","unstructured":"Murthy, J.S., Shekar, A.C., Bhattacharya, D., Namratha, R., Sripriya, D.: A novel framework for multimodal Twitter sentiment analysis using feature learning. In: Advances in Computing and Data Sciences. ICACDS 2021, Nashik, India, April 23\u201324, 2021. Revised Selected Papers, Part II, Lecture Notes in Computer Science, vol. 12779, pp. 252\u2013261. Springer, Cham (2021). 
https:\/\/doi.org\/10.1007\/978-3-030-78399-3_24","DOI":"10.1007\/978-3-030-78399-3_24"},{"key":"4475_CR11","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2023.111206","volume":"152","author":"A Pandey","year":"2024","unstructured":"Pandey, A., Vishwakarma, D.K.: Progress, achievements, and challenges in multimodal sentiment analysis using deep learning: a survey. Appl. Soft Comput. 152, 111206 (2024). https:\/\/doi.org\/10.1016\/j.asoc.2023.111206","journal-title":"Appl. Soft Comput."},{"key":"4475_CR12","doi-asserted-by":"publisher","first-page":"424","DOI":"10.1016\/j.inffus.2022.09.025","volume":"91","author":"A Gandhi","year":"2023","unstructured":"Gandhi, A., Adhvaryu, K., Poria, S., Cambria, E., Hussain, A.: Multimodal sentiment analysis: a systematic review of history, datasets, multimodal fusion methods, applications, challenges and future directions. Inf. Fusion 91, 424\u2013444 (2023). https:\/\/doi.org\/10.1016\/j.inffus.2022.09.025","journal-title":"Inf. Fusion"},{"issue":"28","key":"4475_CR13","doi-asserted-by":"publisher","first-page":"e7387","DOI":"10.1002\/cpe.7387","volume":"34","author":"S Aslan","year":"2022","unstructured":"Aslan, S.: A novel TCNN\u2013Bi-LSTM deep learning model for predicting sentiments of tweets about COVID-19 vaccines. Concurr. Comput. Pract. Exp. 34(28), e7387 (2022). https:\/\/doi.org\/10.1002\/cpe.7387","journal-title":"Concurr. Comput. Pract. Exp."},{"issue":"1","key":"4475_CR14","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-025-85859-6","volume":"15","author":"Y Cai","year":"2025","unstructured":"Cai, Y., Li, X., Zhang, Y., Li, J., Zhu, F., Rao, L.: Multimodal sentiment analysis based on multi-layer feature fusion and multi-task learning. Sci. Rep. 15(1), 2126 (2025). https:\/\/doi.org\/10.1038\/s41598-025-85859-6","journal-title":"Sci. 
Rep."},{"issue":"11","key":"4475_CR15","doi-asserted-by":"publisher","first-page":"8403","DOI":"10.1007\/s11760-024-03482-w","volume":"18","author":"Y Li","year":"2024","unstructured":"Li, Y., Zheng, X., Zhu, M., Mei, J., Chen, Z., Tao, Y.: Compact bilinear pooling and multi-loss network for social media multimodal classification. Signal Image Video Process. 18(11), 8403\u20138412 (2024). https:\/\/doi.org\/10.1007\/s11760-024-03482-w","journal-title":"Signal Image Video Process."},{"issue":"1","key":"4475_CR16","doi-asserted-by":"publisher","DOI":"10.1002\/isaf.1549","volume":"31","author":"A Todd","year":"2024","unstructured":"Todd, A., Bowden, J., Moshfeghi, Y.: Text-based sentiment analysis in finance: synthesising the existing literature and exploring future directions. Intell. Syst. Account. Finance Manag. 31(1), e1549 (2024). https:\/\/doi.org\/10.1002\/isaf.1549","journal-title":"Intell. Syst. Account. Finance Manag."},{"key":"4475_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2025.111369","volume":"139","author":"W Zou","year":"2025","unstructured":"Zou, W., Sun, X., Lu, Q., Wang, X., Feng, J.: A vision and language hierarchical alignment for multimodal aspect-based sentiment analysis. Pattern Recogn. 139, 111369 (2025). https:\/\/doi.org\/10.1016\/j.patcog.2025.111369","journal-title":"Pattern Recogn."},{"issue":"1","key":"4475_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.ijresmar.2021.09.005","volume":"39","author":"HJ Alantari","year":"2022","unstructured":"Alantari, H.J., Currim, I.S., Deng, Y., Singh, S.: An empirical comparison of machine learning methods for text-based sentiment analysis of online consumer reviews. Int. J. Res. Mark. 39(1), 1\u201319 (2022). https:\/\/doi.org\/10.1016\/j.ijresmar.2021.09.005","journal-title":"Int. J. Res. 
Mark."},{"issue":"17","key":"4475_CR19","doi-asserted-by":"publisher","first-page":"50061","DOI":"10.1007\/s11042-023-17569-y","volume":"83","author":"D Chakraborty","year":"2024","unstructured":"Chakraborty, D., Rudrapal, D., Bhattacharya, B.: A multimodal sentiment analysis approach for tweets by comprehending co-relations between information modalities. Multimed. Tools Appl. 83(17), 50061\u201350085 (2024). https:\/\/doi.org\/10.1007\/s11042-023-17569-y","journal-title":"Multimed. Tools Appl."},{"issue":"4","key":"4475_CR20","doi-asserted-by":"publisher","first-page":"1323","DOI":"10.1007\/s00371-022-02594-5","volume":"39","author":"ZB Fan","year":"2023","unstructured":"Fan, Z.B., Zhu, Y.X., Markovi\u0107, S., Zhang, K.: A comparative study of oil paintings and Chinese ink paintings on composition. Vis. Comput. 39(4), 1323\u20131334 (2023). https:\/\/doi.org\/10.1007\/s00371-022-02594-5","journal-title":"Vis. Comput."},{"key":"4475_CR21","doi-asserted-by":"publisher","unstructured":"Cai, Z., Gao, H., Li, J., Wang, X.: Deep learning approaches on multimodal sentiment analysis. In: 2022 IEEE International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA), pp. 1127\u20131131. IEEE (2022). https:\/\/doi.org\/10.1109\/EEBDA53927.2022.9745018","DOI":"10.1109\/EEBDA53927.2022.9745018"},{"key":"4475_CR22","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.120857","volume":"232","author":"J Liu","year":"2023","unstructured":"Liu, J., Ang, M.C., Chaw, J.K., Kor, A.L., Ng, K.W.: Emotion assessment and application in human\u2013computer interaction interface based on backpropagation neural network and artificial bee colony algorithm. Expert Syst. Appl. 232, 120857 (2023). https:\/\/doi.org\/10.1016\/j.eswa.2023.120857","journal-title":"Expert Syst. 
Appl."},{"key":"4475_CR23","doi-asserted-by":"publisher","unstructured":"Marchesotti, L., Perronnin, F., Larlus, D., Csurka, G.: Assessing the aesthetic quality of photographs using generic image descriptors. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 1784\u20131791. IEEE (2011). https:\/\/doi.org\/10.1109\/ICCV.2011.6126444","DOI":"10.1109\/ICCV.2011.6126444"},{"issue":"15","key":"4475_CR24","doi-asserted-by":"publisher","first-page":"22323","DOI":"10.1007\/s11042-019-08312-7","volume":"80","author":"A Ortis","year":"2021","unstructured":"Ortis, A., Farinella, G.M., Torrisi, G., Battiato, S.: Exploiting objective text description of images for visual sentiment analysis. Multimed. Tools Appl. 80(15), 22323\u201322346 (2021). https:\/\/doi.org\/10.1007\/s11042-019-08312-7","journal-title":"Multimed. Tools Appl."},{"key":"4475_CR25","doi-asserted-by":"crossref","unstructured":"Jagadeesh, M., Ravikumar, A., Deivasigamani, A., Dharshini, S.S.: Customer emotion analysis on food review images using deep learning: a review. In: Disruptive Technologies for Sustainable Development, pp. 208\u2013213 (2024)","DOI":"10.1201\/9781003428473-44"},{"key":"4475_CR26","doi-asserted-by":"publisher","first-page":"3375","DOI":"10.1109\/TMM.2022.3160060","volume":"25","author":"T Zhu","year":"2022","unstructured":"Zhu, T., Li, L., Yang, J., Zhao, S., Liu, H., Qian, J.: Multimodal sentiment analysis with image\u2013text interaction network. IEEE Trans. Multimed. 25, 3375\u20133385 (2022). https:\/\/doi.org\/10.1109\/TMM.2022.3160060","journal-title":"IEEE Trans. Multimed."},{"key":"4475_CR27","doi-asserted-by":"publisher","unstructured":"Borth, D., Ji, R., Chen, T., Breuel, T., Chang, S.-F.: Large-scale visual sentiment ontology and detectors using adjective noun pairs. In: Proc. ACM Int. Conf. Multimedia, pp. 223\u2013232 (2013). 
https:\/\/doi.org\/10.1145\/2502081.2502282","DOI":"10.1145\/2502081.2502282"},{"issue":"4","key":"4475_CR28","doi-asserted-by":"publisher","first-page":"1793","DOI":"10.1007\/s10044-023-01178-2","volume":"26","author":"L Sun","year":"2023","unstructured":"Sun, L., Li, Q., Liu, L., Su, Y.: Unsupervised multimodal learning for image\u2013text relation classification in tweets. Pattern Anal. Appl. 26(4), 1793\u20131804 (2023). https:\/\/doi.org\/10.1007\/s10044-023-01178-2","journal-title":"Pattern Anal. Appl."},{"issue":"6","key":"4475_CR29","doi-asserted-by":"publisher","first-page":"15711","DOI":"10.1007\/s11042-023-16174-3","volume":"83","author":"G Meena","year":"2024","unstructured":"Meena, G., Mohbey, K.K., Indian, A., Khan, M.Z., Kumar, S.: Identifying emotions from facial expressions using a deep convolutional neural network-based approach. Multimed. Tools Appl. 83(6), 15711\u201315732 (2024). https:\/\/doi.org\/10.1007\/s11042-023-16174-3","journal-title":"Multimed. Tools Appl."},{"issue":"15","key":"4475_CR30","doi-asserted-by":"publisher","first-page":"8955","DOI":"10.1007\/s11042-014-2337-z","volume":"75","author":"D Cao","year":"2016","unstructured":"Cao, D., Ji, R., Lin, D., Li, S.: Visual sentiment topic model based microblog image sentiment analysis. Multimed. Tools Appl. 75(15), 8955\u20138968 (2016). https:\/\/doi.org\/10.1007\/s11042-014-2337-z","journal-title":"Multimed. Tools Appl."},{"key":"4475_CR31","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2023.101958","volume":"100","author":"C Zhu","year":"2023","unstructured":"Zhu, C., Chen, M., Zhang, S., Sun, C., Liang, H., Liu, Y., Chen, J.: SKEAFN: sentiment knowledge enhanced attention fusion network for multimodal sentiment analysis. Inf. Fusion 100, 101958 (2023). https:\/\/doi.org\/10.1016\/j.inffus.2023.101958","journal-title":"Inf. 
Fusion"},{"key":"4475_CR32","doi-asserted-by":"publisher","unstructured":"Tan, Q., Shen, X., Bai, Z., Sun, Y.: Cross-modality fused graph convolutional network for image\u2013text sentiment analysis. In: International Conference on Image and Graphics (ICIG 2023), pp. 397\u2013411. Springer, Cham (2023). https:\/\/doi.org\/10.1007\/978-3-031-46314-3_32","DOI":"10.1007\/978-3-031-46314-3_32"},{"issue":"16","key":"4475_CR33","doi-asserted-by":"publisher","DOI":"10.3390\/rs17162750","volume":"17","author":"Y Huang","year":"2025","unstructured":"Huang, Y., Zhu, X., Wang, R., Xie, Y., Fong, S.: A dynamic global\u2013local spatiotemporal graph framework for multi-city PM2.5 long-term forecasting. Remote Sens 17(16), 2750 (2025). https:\/\/doi.org\/10.3390\/rs17162750","journal-title":"Remote Sens"},{"key":"4475_CR34","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2025.3577794","author":"J Wang","year":"2025","unstructured":"Wang, J., Gao, M., Zhai, W., Rida, I., Zhu, X., Li, Q.: Knowledge generation and distillation for road segmentation in intelligent transportation systems. IEEE Trans. Intell. Transp. Syst. (2025). https:\/\/doi.org\/10.1109\/TITS.2025.3577794","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"4475_CR35","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2025.111993","volume":"157","author":"Y Ye","year":"2025","unstructured":"Ye, Y., Liu, N., Zhao, Y., Zhu, X., Wang, J., Liu, Y.: Advancing federated domain generalization in ophthalmology: Vision enhancement and consistency assurance for multicenter fundus image segmentation. Pattern Recognit. 157, 111993 (2025). 
https:\/\/doi.org\/10.1016\/j.patcog.2025.111993","journal-title":"Pattern Recognit."},{"key":"4475_CR36","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2025.105519","volume":"158","author":"M Gao","year":"2025","unstructured":"Gao, M., Sun, J., Li, Q., Khan, M.A., Shang, J., Zhu, X., Jeon, G.: Towards trustworthy image super-resolution via symmetrical and recursive artificial neural network. Image Vis. Comput. 158, 105519 (2025). https:\/\/doi.org\/10.1016\/j.imavis.2025.105519","journal-title":"Image Vis. Comput."},{"issue":"3","key":"4475_CR37","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1007\/s12559-025-10463-9","volume":"17","author":"R Wang","year":"2025","unstructured":"Wang, R., Wang, Y., Cambria, E., Fan, X., Yu, X., Huang, Y., et al.: Contrastive-based removal of negative information in multimodal emotion analysis. Cogn. Comput. 17(3), 107 (2025). https:\/\/doi.org\/10.1007\/s12559-025-10463-9","journal-title":"Cogn. Comput."},{"key":"4475_CR38","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2025.105582","volume":"159","author":"S Guo","year":"2025","unstructured":"Guo, S., Li, Q., Gao, M., Zhu, X., Rida, I.: Generalizable deepfake detection via spatial kernel selection and halo attention network. Image Vis. Comput. 159, 105582 (2025). https:\/\/doi.org\/10.1016\/j.imavis.2025.105582","journal-title":"Image Vis. Comput."},{"key":"4475_CR39","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2025.105663","volume":"159","author":"W Song","year":"2025","unstructured":"Song, W., Guo, S., Gao, M., Li, Q., Zhu, X., Rida, I.: Deepfake detection via feature refinement and enhancement network. Image Vis. Comput. 159, 105663 (2025). https:\/\/doi.org\/10.1016\/j.imavis.2025.105663","journal-title":"Image Vis. 
Comput."},{"key":"4475_CR40","doi-asserted-by":"publisher","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019). https:\/\/doi.org\/10.48550\/arXiv.1907.11692","DOI":"10.48550\/arXiv.1907.11692"},{"key":"4475_CR41","doi-asserted-by":"publisher","first-page":"33027","DOI":"10.1016\/j.ijhydene.2022.07.188","volume":"47","author":"R Zhang","year":"2022","unstructured":"Zhang, R., Chen, T., Xiao, F., Luo, J.: Bi-directional gated recurrent unit recurrent neural networks for failure prognosis of proton exchange membrane fuel cells. Int. J. Hydrogen Energy 47, 33027\u201333038 (2022). https:\/\/doi.org\/10.1016\/j.ijhydene.2022.07.188","journal-title":"Int. J. Hydrogen Energy"},{"key":"4475_CR42","doi-asserted-by":"publisher","DOI":"10.3390\/s23073447","volume":"23","author":"N Ebert","year":"2023","unstructured":"Ebert, N., Stricker, D., Wasenm\u00fcller, O.: PLG-ViT: Vision transformer with parallel local and global self-attention. Sensors 23, 3447 (2023). https:\/\/doi.org\/10.3390\/s23073447","journal-title":"Sensors"},{"key":"4475_CR43","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1007\/978-1-4842-6168-2_6","volume-title":"Convolutional Neural Networks with Swift for TensorFlow: Image Recognition and Dataset Categorization","author":"B Koonce","year":"2021","unstructured":"Koonce, B.: ResNet-50. In: Convolutional Neural Networks with Swift for TensorFlow: Image Recognition and Dataset Categorization, pp. 63\u201372. Apress, Berkeley (2021). https:\/\/doi.org\/10.1007\/978-1-4842-6168-2_6"},{"key":"4475_CR44","doi-asserted-by":"publisher","unstructured":"Ma, B., Wang, X., Zhang, H., Li, F., Dan, J.: CBAM-GAN: Generative adversarial networks based on convolutional block attention module. In: International Conference on Artificial Intelligence and Security, pp. 227\u2013236. 
Springer, Cham (2019). https:\/\/doi.org\/10.1007\/978-3-030-24274-9_20","DOI":"10.1007\/978-3-030-24274-9_20"},{"key":"4475_CR45","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1023\/A:1007379606734","volume":"28","author":"R Caruana","year":"1997","unstructured":"Caruana, R.: Multitask learning. Mach. Learn. 28, 41\u201375 (1997). https:\/\/doi.org\/10.1023\/A:1007379606734","journal-title":"Mach. Learn."},{"key":"4475_CR46","doi-asserted-by":"publisher","unstructured":"Niu, T., Zhu, S., Pang, L., El Saddik, A.: Sentiment analysis on multi-view social data. In: MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol. 9517, pp. 15\u201327. Springer, Cham (2016). https:\/\/doi.org\/10.1007\/978-3-319-27674-8_2","DOI":"10.1007\/978-3-319-27674-8_2"},{"key":"4475_CR47","doi-asserted-by":"publisher","DOI":"10.14569\/IJACSA.2025.0160167","author":"LLX Wei","year":"2025","unstructured":"Wei, L.L.X., Sani, N.S.: Enhanced facial expression recognition based on ResNet50 with a convolutional block attention module. Int. J. Adv. Comput. Sci. Appl. (2025). https:\/\/doi.org\/10.14569\/IJACSA.2025.0160167","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"issue":"11","key":"4475_CR48","doi-asserted-by":"publisher","DOI":"10.3390\/s24113683","volume":"24","author":"TH Noor","year":"2024","unstructured":"Noor, T.H., Noor, A., Alharbi, A.F., Faisal, A., Alrashidi, R., Alsaedi, A.S., Alsaeedi, A.: Real-time Arabic sign language recognition using a hybrid deep learning model. Sensors 24(11), 3683 (2024). https:\/\/doi.org\/10.3390\/s24113683","journal-title":"Sensors"},{"issue":"1","key":"4475_CR49","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-024-54927-8","volume":"14","author":"UK Lilhore","year":"2024","unstructured":"Lilhore, U.K., Dalal, S., Varshney, N., Sharma, Y.K., Rao, K.B., Rao, V.M., Chakrabarti, P.: Prevalence and risk factors analysis of postpartum depression at early stage using hybrid deep learning model. Sci. Rep. 
14(1), 4533 (2024). https:\/\/doi.org\/10.1038\/s41598-024-54927-8","journal-title":"Sci. Rep."},{"issue":"4","key":"4475_CR50","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpcardiol.2024.102454","volume":"49","author":"H Sutanto","year":"2024","unstructured":"Sutanto, H.: Transforming clinical cardiology through neural networks and deep learning: a guide for clinicians. Curr. Probl. Cardiol. 49(4), 102454 (2024). https:\/\/doi.org\/10.1016\/j.cpcardiol.2024.102454","journal-title":"Curr. Probl. Cardiol."},{"key":"4475_CR51","doi-asserted-by":"publisher","DOI":"10.5435\/JAAOS-D-23-00831","volume":"30","author":"A Bozzo","year":"2022","unstructured":"Bozzo, A., Tsui, J.M., Bhatnagar, S., Forsberg, J.: Deep learning and multimodal artificial intelligence in orthopaedic surgery. JAAOS J. Am. Acad. Orthop. Surg. 30, 10\u20135435 (2022). https:\/\/doi.org\/10.5435\/JAAOS-D-23-00831","journal-title":"JAAOS J. Am. Acad. Orthop. Surg."},{"issue":"3","key":"4475_CR52","doi-asserted-by":"publisher","first-page":"662","DOI":"10.3390\/electronics13030662","volume":"13","author":"Q Pan","year":"2024","unstructured":"Pan, Q., Meng, Z.: Hybrid uncertainty calibration for multimodal sentiment analysis. Electronics 13(3), 662 (2024). https:\/\/doi.org\/10.3390\/electronics13030662","journal-title":"Electronics"},{"key":"4475_CR53","doi-asserted-by":"publisher","unstructured":"Tomita, V.A.K., Marcacini, R.M.: TF-MVSA: Multimodal video sentiment analysis tool using transfer learning. In: Anais Estendidos (Extended Abstracts), 2023. https:\/\/doi.org\/10.5753\/webmedia_estendido.2023.235544","DOI":"10.5753\/webmedia_estendido.2023.235544"},{"issue":"5","key":"4475_CR54","doi-asserted-by":"publisher","DOI":"10.3390\/s23052679","volume":"23","author":"H Wang","year":"2023","unstructured":"Wang, H., Li, X., Ren, Z., Wang, M., Ma, C.: Multimodal sentiment analysis representations learning via contrastive learning with condense attention fusion. Sensors 23(5), 2679 (2023). 
https:\/\/doi.org\/10.3390\/s23052679","journal-title":"Sensors"},{"key":"4475_CR55","doi-asserted-by":"publisher","unstructured":"Chen, F., Huang, P., Ge, X., Huang, J., Bao, Z.: Multimodal sentiment analysis based on causal reasoning. arXiv preprint arXiv:2412.07292 (2024). https:\/\/doi.org\/10.48550\/arXiv.2412.07292","DOI":"10.48550\/arXiv.2412.07292"},{"issue":"3","key":"4475_CR56","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3517139","volume":"18","author":"RK Yadav","year":"2022","unstructured":"Yadav, R.K., Vishwakarma, D.K.: DMLANet: a dual-modality-level attention network for image\u2013text sentiment analysis. ACM Trans Multim. Comput. Commun. Appl. (TOMM) 18(3), 1\u201324 (2022). https:\/\/doi.org\/10.1145\/3517139","journal-title":"ACM Trans Multim. Comput. Commun. Appl. (TOMM)"}],"container-title":["The Visual Computer"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00371-026-04475-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00371-026-04475-1","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00371-026-04475-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T13:16:54Z","timestamp":1777468614000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00371-026-04475-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4]]},"references-count":56,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2026,4]]}},"alternative-id":["4475"],"URL":"https:\/\/doi.org\/10.1007\/s00371-026-04475-1","relation":{},"ISSN":["0178-2789","1432-2315"],"issn-type":[{"value":"0178-2789","type":"print"},{"value":"1432
-2315","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4]]},"assertion":[{"value":"25 July 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 March 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 April 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"253"}}