{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T15:33:30Z","timestamp":1777736010406,"version":"3.51.4"},"reference-count":80,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T00:00:00Z","timestamp":1756166400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T00:00:00Z","timestamp":1756166400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003789","name":"Helwan University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003789","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Emotion recognition in dynamic and real-world environments presents significant challenges due to the complexity and variability of multimodal data. This paper introduces an innovative Multimodal Emotion Recognition (MER) framework that seamlessly integrates text, audio, video, and motion data using advanced machine learning techniques. To address challenges such as class imbalance, the framework employs Generative Adversarial Networks (GANs) for synthetic sample generation and Dynamic Prompt Engineering (DPE) for enhanced feature extraction across modalities. Text features are processed with Mistral-7B, audio with HuBERT, video with TimeSformer and LLaVA, and motion with MediaPipe Pose. 
The system efficiently fuses these inputs using Hierarchical Attention-based Graph Neural Networks (HAN-GNN) and Cross-Modality Transformer Fusion (XMTF), further improved by contrastive learning with Prototypical Networks to enhance class separation. The framework demonstrates exceptional performance, achieving training accuracies of 99.92% on IEMOCAP and 99.95% on MELD, with testing accuracies of 99.82% and 99.81%, respectively. High precision, recall, and specificity further highlight the robustness of the model. While trained on batch-processed datasets, the framework has been optimized for real-time applications, demonstrating computational efficiency with training completed in just 5\u00a0min and inference times under 0.4\u00a0ms per sample. This makes the system well-suited for real-time emotion recognition tasks despite being trained on batch data. It also generalizes effectively to noisy and multilingual settings, achieving strong results on SAVEE and CMU-MOSEAS, thereby confirming its resilience in diverse real-world scenarios. This research advances the field of MER, offering a scalable and efficient solution for affective computing. 
The findings emphasize the importance of refining these systems for real-world applications, particularly in complex, multimodal big data environments.<\/jats:p>","DOI":"10.1186\/s40537-025-01264-w","type":"journal-article","created":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T10:41:57Z","timestamp":1756204917000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Advancing multimodal emotion recognition in big data through prompt engineering and deep adaptive learning"],"prefix":"10.1186","volume":"12","author":[{"given":"Abeer A.","family":"Wafa","sequence":"first","affiliation":[]},{"given":"Mai M.","family":"Eldefrawi","sequence":"additional","affiliation":[]},{"given":"Marwa S.","family":"Farhan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,8,26]]},"reference":[{"issue":"6","key":"1264_CR1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10489-025-06245-3","volume":"55","author":"M Zhao","year":"2025","unstructured":"Zhao M, Gong L, Din AS. A review of the emotion recognition model of robots. Appl Intell. 2025;55(6):1\u201333.","journal-title":"Appl Intell"},{"issue":"1","key":"1264_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-025-01062-4","volume":"12","author":"S Kusal","year":"2025","unstructured":"Kusal S, Patil S, Kotecha K. Multimodal text-emoji fusion using deep neural networks for text-based emotion detection in online communication. J Big Data. 2025;12(1):1\u201325.","journal-title":"J Big Data"},{"key":"1264_CR3","first-page":"1","volume":"2025","author":"R Sujatha","year":"2025","unstructured":"Sujatha R, Chatterjee JM, Pathy B, Hu Y-C. Automatic emotion recognition using deep neural network. Multimed Tools Appl. 
2025;2025:1\u201330.","journal-title":"Multimed Tools Appl"},{"issue":"1","key":"1264_CR4","doi-asserted-by":"publisher","first-page":"5473","DOI":"10.1038\/s41598-025-89202-x","volume":"15","author":"M Khan","year":"2025","unstructured":"Khan M, Tran P-N, Pham NT, El Saddik A, Othmani A. Memocmt: multimodal emotion recognition using cross-modal transformer-based feature fusion. Sci Rep. 2025;15(1):5473.","journal-title":"Sci Rep"},{"issue":"5","key":"1264_CR5","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/s10462-025-11126-9","volume":"58","author":"R Pillalamarri","year":"2025","unstructured":"Pillalamarri R, Shanmugam U. A review on eeg-based multimodal learning for emotion recognition. Artif Intell Rev. 2025;58(5):131.","journal-title":"Artif Intell Rev"},{"issue":"3","key":"1264_CR6","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1177\/10870547241297005","volume":"29","author":"EC Shepard","year":"2025","unstructured":"Shepard EC, Ruben M, Weyandt LL. Emotion recognition accuracy among individuals with adhd: a systematic review. J Atten Disord. 2025;29(3):174\u201394.","journal-title":"J Atten Disord"},{"issue":"1","key":"1264_CR7","first-page":"1","volume":"58","author":"P Pereira","year":"2025","unstructured":"Pereira P, Moniz H, Carvalho JP. Deep emotion recognition in textual conversations: a survey. Artif Intell Rev. 2025;58(1):1\u201337.","journal-title":"Artif Intell Rev"},{"key":"1264_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2025.110004","volume":"143","author":"Z Ma","year":"2025","unstructured":"Ma Z, Li A, Tang J, Zhang J, Yin Z. Multimodal emotion recognition by fusing complementary patterns from central to peripheral neurophysiological signals across feature domains. Eng Appl Artif Intell. 
2025;143: 110004.","journal-title":"Eng Appl Artif Intell"},{"issue":"11","key":"1264_CR9","doi-asserted-by":"publisher","first-page":"5184","DOI":"10.3390\/s23115184","volume":"23","author":"A Aguilera","year":"2023","unstructured":"Aguilera A, Mellado D, Rojas F. An assessment of in-the-wild datasets for multimodal emotion recognition. Sensors. 2023;23(11):5184.","journal-title":"Sensors"},{"issue":"21","key":"1264_CR10","doi-asserted-by":"publisher","first-page":"59699","DOI":"10.1007\/s11042-023-17803-7","volume":"83","author":"M Tellai","year":"2024","unstructured":"Tellai M, Gao L, Mao Q, Abdelaziz M. A novel conversational hierarchical attention network for speech emotion recognition in dyadic conversation. Multimed Tools Appl. 2024;83(21):59699\u2013723.","journal-title":"Multimed Tools Appl"},{"issue":"10","key":"1264_CR11","doi-asserted-by":"publisher","first-page":"4199","DOI":"10.3390\/app14104199","volume":"14","author":"F Makhmudov","year":"2024","unstructured":"Makhmudov F, Kultimuratov A, Cho Y-I. Enhancing multimodal emotion recognition through attention mechanisms in bert and cnn architectures. Appl Sci. 2024;14(10):4199.","journal-title":"Appl Sci"},{"key":"1264_CR12","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2023.107708","volume":"130","author":"U Bilotti","year":"2024","unstructured":"Bilotti U, Bisogni C, De Marsico M, Tramonte S. Multimodal emotion recognition via convolutional neural networks: comparison of different strategies on two multimodal datasets. Eng Appl Artif Intell. 2024;130: 107708.","journal-title":"Eng Appl Artif Intell"},{"key":"1264_CR13","doi-asserted-by":"publisher","DOI":"10.1016\/j.bspc.2023.105052","volume":"85","author":"S Zhang","year":"2023","unstructured":"Zhang S, Yang Y, Chen C, Liu R, Tao X, Guo W, Xu Y, Zhao X. Multimodal emotion recognition based on audio and text by using hybrid attention networks. Biomed Signal Process Control. 
2023;85: 105052.","journal-title":"Biomed Signal Process Control"},{"key":"1264_CR14","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2024.102590","volume":"112","author":"Y Shou","year":"2024","unstructured":"Shou Y, Meng T, Ai W, Zhang F, Yin N, Li K. Adversarial alignment and graph fusion via information bottleneck for multimodal emotion recognition in conversations. Inf Fusion. 2024;112: 102590.","journal-title":"Inf Fusion"},{"key":"1264_CR15","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2025.103268","volume":"123","author":"X Zhu","year":"2025","unstructured":"Zhu X, Wang Y, Cambria E, Rida I, L\u00f3pez JS, Cui L, Wang R. Rmer-dt: robust multimodal emotion recognition in conversational contexts based on diffusion and transformers. Inf Fusion. 2025;123: 103268.","journal-title":"Inf Fusion"},{"key":"1264_CR16","doi-asserted-by":"publisher","DOI":"10.1016\/j.array.2025.100445","volume":"27","author":"R Wang","year":"2025","unstructured":"Wang R, Xu D, Cascone L, Wang Y, Chen H, Zheng J, Zhu X. Raft: robust adversarial fusion transformer for multimodal sentiment analysis. Array. 2025;27: 100445.","journal-title":"Array"},{"issue":"1","key":"1264_CR17","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1186\/s40537-025-01090-0","volume":"12","author":"S Jena","year":"2025","unstructured":"Jena S, Basak S, Agrawal H, Saini B, Gite S, Kotecha K, Alfarhood S. Developing a negative speech emotion recognition model for safety systems using deep learning. J Big Data. 2025;12(1):54.","journal-title":"J Big Data"},{"key":"1264_CR18","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2024.106764","volume":"181","author":"C Fu","year":"2025","unstructured":"Fu C, Qian F, Su K, Su Y, Wang Z, Shi J, Liu Z, Liu C, Ishi CT. Himul-lgg: a hierarchical decision fusion-based local-global graph neural network for multimodal emotion recognition in conversation. Neural Netw. 
2025;181: 106764.","journal-title":"Neural Netw"},{"key":"1264_CR19","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.126236","volume":"270","author":"E Boitel","year":"2025","unstructured":"Boitel E, Mohasseb A, Haig E. Mist: multimodal emotion recognition using deberta for text, semi-cnn for speech, resnet-50 for facial, and 3d-cnn for motion analysis. Expert Syst Appl. 2025;270: 126236.","journal-title":"Expert Syst Appl"},{"issue":"1","key":"1264_CR20","doi-asserted-by":"publisher","first-page":"40","DOI":"10.3390\/info16010040","volume":"16","author":"H Filali","year":"2025","unstructured":"Filali H, Boulealam C, El Fazazy K, Mahraz AM, Tairi H, Riffi J. Meaningful multimodal emotion recognition based on capsule graph transformer architecture. Information. 2025;16(1):40.","journal-title":"Information"},{"key":"1264_CR21","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2024.129306","volume":"622","author":"H Li","year":"2025","unstructured":"Li H, Zhong H, Xu C, Liu X, Wen G, Liu L. Global distilling framework with cognitive gravitation for multimodal emotion recognition. Neurocomputing. 2025;622: 129306.","journal-title":"Neurocomputing"},{"key":"1264_CR22","doi-asserted-by":"crossref","unstructured":"Wang Z, He J, Liang Y, Hu X, Peng T, Wang K, Wang J, Zhang C, Zhang W, Niu S et al. Milmer: a framework for multiple instance learning based multimodal emotion recognition. 2025. arXiv preprint arXiv:2502.00547.","DOI":"10.2139\/ssrn.5143190"},{"key":"1264_CR23","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.125822","volume":"264","author":"C Li","year":"2025","unstructured":"Li C, Xie L, Wang X, Pan H, Wang Z. A twin disentanglement transformer network with hierarchical-level feature reconstruction for robust multimodal emotion recognition. Expert Syst Appl. 
2025;264: 125822.","journal-title":"Expert Syst Appl"},{"key":"1264_CR24","doi-asserted-by":"publisher","first-page":"335","DOI":"10.1007\/s10579-008-9076-6","volume":"42","author":"C Busso","year":"2008","unstructured":"Busso C, Bulut M, Lee C-C, Kazemzadeh A, Mower E, Kim S, Chang JN, Lee S, Narayanan SS. Iemocap: interactive emotional dyadic motion capture database. Lang Resour Eval. 2008;42:335\u201359.","journal-title":"Lang Resour Eval"},{"key":"1264_CR25","doi-asserted-by":"crossref","unstructured":"Poria S, Hazarika D, Majumder N, Naik G, Cambria E, Mihalcea R. Meld: A multimodal multi-party dataset for emotion recognition in conversations. 2018. arXiv preprint arXiv:1810.02508.","DOI":"10.18653\/v1\/P19-1050"},{"key":"1264_CR26","volume-title":"Surrey audio-visual expressed emotion (savee) database","author":"P Jackson","year":"2014","unstructured":"Jackson P, Haq S. Surrey audio-visual expressed emotion (savee) database. Guildford: University of Surrey; 2014."},{"key":"1264_CR27","unstructured":"Zadeh A, Cao YS, Hessner S, Liang PP, Poria S, Morency L-P. Cmu-moseas: A multimodal language dataset for spanish, portuguese, german and french. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing, vol. 2020, p. 1801 (2020)."},{"key":"1264_CR28","doi-asserted-by":"crossref","unstructured":"Gupta R, Tiwari S, Chaudhary P. Prompt engineering. In: Generative AI: Techniques, Models and Applications. Springer 2025. pp 163\u2013186.","DOI":"10.1007\/978-3-031-82062-5_7"},{"key":"1264_CR29","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2023.102038","volume":"102","author":"K Zhang","year":"2024","unstructured":"Zhang K, Zhou F, Wu L, Xie N, He Z. Semantic understanding and prompt engineering for large-scale traffic data imputation. Inf Fusion. 2024;102: 102038.","journal-title":"Inf Fusion"},{"key":"1264_CR30","doi-asserted-by":"crossref","unstructured":"Siino M. 
Mistral at semeval-2024 task 5: Mistral 7b for argument reasoning in civil procedure. In: Proceedings Of The 18th International Workshop On Semantic Evaluation (SemEval-2024), 2024. pp 155\u2013162.","DOI":"10.18653\/v1\/2024.semeval-1.24"},{"key":"1264_CR31","unstructured":"Yoon JW, Woo BJ, Kim NS. Hubert-ee: early exiting hubert for efficient speech recognition. 2022. arXiv preprint arXiv:2204.06328."},{"key":"1264_CR32","unstructured":"Zhang S, Fang Q, Yang Z, Feng Y. Llava-mini: efficient image and video large multimodal models with one vision token. 2025. arXiv preprint arXiv:2501.03895."},{"key":"1264_CR33","doi-asserted-by":"crossref","unstructured":"Ding X, Wang L. Do language models understand time? 2024. arXiv preprint arXiv:2412.13845.","DOI":"10.1145\/3701716.3717744"},{"issue":"4","key":"1264_CR34","doi-asserted-by":"publisher","first-page":"2700","DOI":"10.3390\/app13042700","volume":"13","author":"J-W Kim","year":"2023","unstructured":"Kim J-W, Choi J-Y, Ha E-J, Choi J-H. Human pose estimation using mediapipe pose and optimization method based on a humanoid model. Appl Sci. 2023;13(4):2700.","journal-title":"Appl Sci"},{"key":"1264_CR35","first-page":"6531","volume":"35","author":"S Shi","year":"2022","unstructured":"Shi S, Jiang L, Dai D, Schiele B. Motion transformer with global intention localization and local movement refinement. Adv Neural Inf Process Syst. 2022;35:6531\u201343.","journal-title":"Adv Neural Inf Process Syst"},{"key":"1264_CR36","unstructured":"Koroteev MV. Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021)."},{"key":"1264_CR37","unstructured":"Kulkarni S, Watanabe H, Homma F. Uncertainty quantification and calibration for audio-driven disease diagnosis. 
In: NeurIPS 2024 Workshop on Bayesian Decision-making and Uncertainty."},{"key":"1264_CR38","doi-asserted-by":"publisher","first-page":"510","DOI":"10.1016\/j.procs.2018.08.203","volume":"135","author":"K Santoso","year":"2018","unstructured":"Santoso K, Kusuma GP. Face recognition using modified openface. Proc Comput Sci. 2018;135:510\u20137.","journal-title":"Proc Comput Sci"},{"key":"1264_CR39","first-page":"28541","volume":"36","author":"C Li","year":"2023","unstructured":"Li C, Wong C, Zhang S, Usuyama N, Liu H, Yang J, Naumann T, Poon H, Gao J. Llava-med: training a large language-and-vision assistant for biomedicine in one day. Adv Neural Inf Process Syst. 2023;36:28541\u201364.","journal-title":"Adv Neural Inf Process Syst"},{"key":"1264_CR40","doi-asserted-by":"publisher","first-page":"41403","DOI":"10.1109\/ACCESS.2022.3164711","volume":"10","author":"Q Wang","year":"2022","unstructured":"Wang Q, Zhang K, Asghar MA. Skeleton-based st-gcn for human action recognition with extended skeleton graph and partitioning strategy. IEEE Access. 2022;10:41403\u201310.","journal-title":"IEEE Access"},{"key":"1264_CR41","doi-asserted-by":"crossref","unstructured":"Newell A, Deng J. How useful is self-supervised pretraining for visual tasks? In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 2020. p. 7345\u20137354.","DOI":"10.1109\/CVPR42600.2020.00737"},{"key":"1264_CR42","unstructured":"Chithrananda S, Grand G, Ramsundar B. Chemberta: large-scale self-supervised pretraining for molecular property prediction. 2020. arXiv preprint arXiv:2010.09885."},{"key":"1264_CR43","first-page":"23321","volume":"34","author":"J Zhao","year":"2021","unstructured":"Zhao J, Dong Y, Ding M, Kharlamov E, Tang J. Adaptive diffusion in graph neural networks. Adv Neural Inf Process Syst. 
2021;34:23321\u201333.","journal-title":"Adv Neural Inf Process Syst"},{"key":"1264_CR44","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.124430","volume":"255","author":"F Chen","year":"2024","unstructured":"Chen F, Sun X, Wang Y, Xu Z, Ma W. Adaptive graph neural network for traffic flow prediction considering time variation. Expert Syst Appl. 2024;255: 124430.","journal-title":"Expert Syst Appl"},{"key":"1264_CR45","first-page":"9720","volume":"34","author":"X Liu","year":"2021","unstructured":"Liu X, Ding J, Jin W, Xu H, Ma Y, Liu Z, Tang J. Graph neural networks with adaptive residual. Adv Neural Inf Process Syst. 2021;34:9720\u201333.","journal-title":"Adv Neural Inf Process Syst"},{"issue":"4","key":"1264_CR46","doi-asserted-by":"publisher","first-page":"3357","DOI":"10.1007\/s00521-022-07862-6","volume":"35","author":"X Jia","year":"2023","unstructured":"Jia X, Jiang M, Dong Y, Zhu F, Lin H, Xin Y, Chen H. Multimodal heterogeneous graph attention network. Neural Comput Appl. 2023;35(4):3357\u201372.","journal-title":"Neural Comput Appl"},{"key":"1264_CR47","doi-asserted-by":"crossref","unstructured":"Hong H, Guo H, Lin Y, Yang X, Li Z, Ye J. An attention-based graph neural network for heterogeneous structural learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, 2020. p. 4132\u20134139.","DOI":"10.1609\/aaai.v34i04.5833"},{"key":"1264_CR48","unstructured":"Li Q, Ji C, Guo S, Zhao Y, Mao Q, Wang S, Wei Y, Li J. Variational multi-modal hypergraph attention network for multi-modal relation extraction. 2024. arXiv preprint arXiv:2404.12006."},{"key":"1264_CR49","doi-asserted-by":"crossref","unstructured":"Ung HQ, Niu H, Dao M-S, Wada S, Minamikawa A. xmtrans: Temporal attentive cross-modality fusion transformer for long-term traffic prediction. In: 2024 25th IEEE International Conference on Mobile Data Management (MDM), 2024. p. 195\u2013202. 
IEEE.","DOI":"10.1109\/MDM61037.2024.00043"},{"issue":"5","key":"1264_CR50","doi-asserted-by":"publisher","first-page":"114","DOI":"10.1007\/s00138-024-01598-0","volume":"35","author":"J Wang","year":"2024","unstructured":"Wang J, Xia L, Wen X. Cmf-transformer: cross-modal fusion transformer for human action recognition. Mach Vis Appl. 2024;35(5):114.","journal-title":"Mach Vis Appl"},{"key":"1264_CR51","doi-asserted-by":"crossref","unstructured":"Ahn D, Kim S, Hong H, Ko BC. Star-transformer: a spatio-temporal cross attention transformer for human action recognition. In: Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, 2023. p. 3330\u20133339.","DOI":"10.1109\/WACV56688.2023.00333"},{"key":"1264_CR52","doi-asserted-by":"crossref","unstructured":"Liu H, Zhang F, Zhang X, Zhao S, Sun J, Yu H, Zhang X. Label-enhanced prototypical network with contrastive learning for multi-label few-shot aspect category detection. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022. p. 1079\u20131087.","DOI":"10.1145\/3534678.3539340"},{"key":"1264_CR53","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2024.121372","volume":"686","author":"M Jiang","year":"2025","unstructured":"Jiang M, Fan J, He J, Du W, Wang Y, Li F. Contrastive prototype network with prototype augmentation for few-shot classification. Inf Sci. 2025;686: 121372.","journal-title":"Inf Sci"},{"key":"1264_CR54","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2024.111730","volume":"295","author":"E Zha","year":"2024","unstructured":"Zha E, Zeng D, Lin M, Shen Y. Ceptner: contrastive learning enhanced prototypical network for two-stage few-shot named entity recognition. Knowl Based Syst. 2024;295: 111730.","journal-title":"Knowl Based Syst"},{"key":"1264_CR55","first-page":"1","volume":"2025","author":"Z Wang","year":"2025","unstructured":"Wang Z, Li H, Wang Q. 
An approach to truck driving risk identification: a machine learning method based on optuna optimization. IEEE Access. 2025;2025:1.","journal-title":"IEEE Access"},{"key":"1264_CR56","doi-asserted-by":"crossref","unstructured":"Hamed K, Ozgunalp U. A comparative analysis of pretrained models for brain tumor classification and their optimization using optuna. In: 2024 Innovations in Intelligent Systems and Applications Conference (ASYU). 2024. p. 1\u20137. IEEE.","DOI":"10.1109\/ASYU62119.2024.10757117"},{"key":"1264_CR57","doi-asserted-by":"crossref","unstructured":"Agrawal T, Agrawal T. Optuna and automl. Hyperparameter Optimization in Machine Learning: Make Your Machine Learning and Deep Learning Models More Efficient, 2021. p. 109\u2013129.","DOI":"10.1007\/978-1-4842-6579-6_5"},{"key":"1264_CR58","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijcard.2024.132757","volume":"420","author":"S Dhanka","year":"2025","unstructured":"Dhanka S, Maini S. A hybridization of xgboost machine learning model by optuna hyperparameter tuning suite for cardiovascular disease classification with significant effect of outliers and heterogeneous training datasets. Int J Cardiol. 2025;420: 132757.","journal-title":"Int J Cardiol"},{"key":"1264_CR59","doi-asserted-by":"crossref","unstructured":"Jajal P, Jiang W, Tewari A, Kocinare E, Woo J, Sarraf A, Lu Y-H, Thiruvathukal GK, Davis JC. Interoperability in deep learning: A user survey and failure analysis of onnx model converters. In: Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. 2024. p. 1466\u20131478.","DOI":"10.1145\/3650212.3680374"},{"key":"1264_CR60","doi-asserted-by":"publisher","first-page":"5806","DOI":"10.1109\/TPAMI.2025.3554560","volume":"47","author":"D Ren","year":"2025","unstructured":"Ren D, Li W, Ding T, Wang L, Fan Q, Huo J, Pan H, Gao Y. Onnxpruner: onnx-based general model pruning adapter. IEEE Trans Pattern Anal Mach Intell. 
2025;47:5806\u201317.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"2","key":"1264_CR61","first-page":"299","volume":"28","author":"V \u00c7etin","year":"2022","unstructured":"\u00c7etin V, Y\u0131ld\u0131z O. A comprehensive review on data preprocessing techniques in data analysis. Pamukkale \u00dcniversitesi M\u00fchendislik Bilimleri Dergisi. 2022;28(2):299\u2013312.","journal-title":"Pamukkale \u00dcniversitesi M\u00fchendislik Bilimleri Dergisi"},{"issue":"3","key":"1264_CR62","doi-asserted-by":"publisher","first-page":"509","DOI":"10.1017\/S1351324922000213","volume":"29","author":"CP Chai","year":"2023","unstructured":"Chai CP. Comparison of text preprocessing methods. Nat Lang Eng. 2023;29(3):509\u201353.","journal-title":"Nat Lang Eng"},{"key":"1264_CR63","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.120520","volume":"229","author":"VK Singh","year":"2023","unstructured":"Singh VK, Sharma K, Sur SN. A survey on preprocessing and classification techniques for acoustic scene. Expert Syst Appl. 2023;229: 120520.","journal-title":"Expert Syst Appl"},{"key":"1264_CR64","unstructured":"Ma C, Wu Z, Cai C, Zhang P, Wang Y, Zheng L, Chen C, Zhou Q. Rate-perception optimized preprocessing for video coding. 2023. arXiv preprint arXiv:2301.10455."},{"key":"1264_CR65","first-page":"1","volume":"2024","author":"J Ren","year":"2025","unstructured":"Ren J, An N, Lin C, Zhang Y, Sun Z, Zhang W, Li S, Guo N, Cui W, Hu Q, et al. Deepprep: an accelerated, scalable and robust pipeline for neuroimaging preprocessing empowered by deep learning. Nat Methods. 2025;2024:1\u20134.","journal-title":"Nat Methods"},{"issue":"1","key":"1264_CR66","first-page":"1","volume":"58","author":"A Schwarz","year":"2025","unstructured":"Schwarz A, Rahal JR, Sahelices B, Barroso-Garc\u00eda V, Weis R, Duque Ant\u00f3n S. Data augmentation in predictive maintenance applicable to hydrogen combustion engines: a review. Artif Intell Rev. 
2025;58(1):1\u201324.","journal-title":"Artif Intell Rev"},{"key":"1264_CR67","volume":"74","author":"Y Zhao","year":"2025","unstructured":"Zhao Y, Sheng T, Li D. Data augmentation fault diagnosis of rolling machinery using condition denoising diffusion probabilistic model and improved cnn. IEEE Trans Instrum Meas. 2025;74: 3517712.","journal-title":"IEEE Trans Instrum Meas"},{"key":"1264_CR68","unstructured":"Sapkota R, Raza S, Shoman M, Paudel A, Karkee M. Image, text, and speech data augmentation using multimodal llms for deep learning: a survey. 2025. arXiv preprint arXiv:2501.18648."},{"issue":"4","key":"1264_CR69","doi-asserted-by":"publisher","first-page":"1163","DOI":"10.3390\/s25041163","volume":"25","author":"E Mathe","year":"2025","unstructured":"Mathe E, Vernikos I, Spyrou E, Mylonas P. Leveraging artificial occluded samples for data augmentation in human activity recognition. Sensors. 2025;25(4):1163.","journal-title":"Sensors"},{"issue":"1","key":"1264_CR70","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1007\/s11053-024-10442-1","volume":"34","author":"Q Sun","year":"2025","unstructured":"Sun Q, Yang X, Zhong M. Forecasting copper price with multi-view graph transformer and fractional brownian motion-based data augmentation. Nat Resour Res. 2025;34(1):253\u201369.","journal-title":"Nat Resour Res"},{"issue":"12","key":"1264_CR71","first-page":"2935","volume":"10","author":"A Gunawardana","year":"2009","unstructured":"Gunawardana A, Shani G. A survey of accuracy evaluation metrics of recommendation tasks. J Mach Learn Res. 2009;10(12):2935\u201362.","journal-title":"J Mach Learn Res"},{"key":"1264_CR72","doi-asserted-by":"crossref","unstructured":"Wise MN. The values of precision 2020.","DOI":"10.2307\/j.ctv14163t2"},{"key":"1264_CR73","unstructured":"Opitz J, Burst S. Macro f1 and macro f1. 2019. 
arXiv preprint arXiv:1911.03347."},{"key":"1264_CR74","first-page":"1024","volume":"2","author":"K Von Heusinger","year":"2011","unstructured":"Von Heusinger K. Specificity. Semanti An Int Handb Nat Lang Mean. 2011;2:1024\u201357.","journal-title":"Semanti An Int Handb Nat Lang Mean"},{"key":"1264_CR75","unstructured":"Pugnana A, Ruggieri S. Auc-based selective classification. In: International Conference on Artificial Intelligence and Statistics. 2023. p. 2494\u20132514. PMLR."},{"issue":"1","key":"1264_CR76","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1186\/s12864-019-6413-7","volume":"21","author":"D Chicco","year":"2020","unstructured":"Chicco D, Jurman G. The advantages of the matthews correlation coefficient (mcc) over f1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21(1):6.","journal-title":"BMC Genomics"},{"issue":"4","key":"1264_CR77","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1109\/TNN.2006.875974","volume":"17","author":"G-B Huang","year":"2006","unstructured":"Huang G-B, Zhu Q-Y, Siew C-K. Real-time learning capability of neural networks. IEEE Trans Neural Netw. 2006;17(4):863\u201378.","journal-title":"IEEE Trans Neural Netw"},{"key":"1264_CR78","doi-asserted-by":"publisher","first-page":"24587","DOI":"10.1109\/ACCESS.2025.3538642","volume":"13","author":"SE G\u00fcler","year":"2025","unstructured":"G\u00fcler SE, Akbulut FP. Multimodal emotion recognition: emotion classification through the integration of eeg and facial expressions. IEEE Access. 2025;13:24587\u2013603.","journal-title":"IEEE Access"},{"key":"1264_CR79","doi-asserted-by":"publisher","first-page":"586","DOI":"10.1016\/j.aej.2024.12.060","volume":"116","author":"X Hao","year":"2025","unstructured":"Hao X, Li H, Wen Y. Real-time music emotion recognition based on multimodal fusion. Alex Eng J. 
2025;116:586\u2013600.","journal-title":"Alex Eng J"},{"issue":"4","key":"1264_CR80","doi-asserted-by":"publisher","first-page":"1182","DOI":"10.3390\/s25041182","volume":"25","author":"Y Wang","year":"2025","unstructured":"Wang Y, Hao R, Li Z, Kuang X, Dong J, Zhang Q, Qian F, Fu C. Hgf-milag: hierarchical graph fusion for emotion recognition in conversation with mid-late gender-aware strategy. Sensors. 2025;25(4):1182.","journal-title":"Sensors"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-025-01264-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-025-01264-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-025-01264-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,9]],"date-time":"2025-09-09T18:31:41Z","timestamp":1757442701000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-025-01264-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,26]]},"references-count":80,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["1264"],"URL":"https:\/\/doi.org\/10.1186\/s40537-025-01264-w","relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,26]]},"assertion":[{"value":"20 April 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 August 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"26 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"210"}}