{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,19]],"date-time":"2026-06-19T06:41:58Z","timestamp":1781851318973,"version":"3.54.5"},"reference-count":15,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2025,1,2]],"date-time":"2025-01-02T00:00:00Z","timestamp":1735776000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>Depression is one of the most common mental health disorders in the world, affecting millions of people. Early detection of depression is crucial for effective medical intervention. Multimodal networks can greatly assist in the detection of depression, especially in situations where patients are not always aware of or able to express their symptoms. By analyzing text and audio data, such networks are able to automatically identify patterns in speech and behavior that indicate a depressive state. In this study, we propose two multimodal information fusion networks: early and late fusion. These networks were developed using convolutional neural network (CNN) layers to learn local patterns, a bidirectional LSTM (Bi-LSTM) to process sequences, and a self-attention mechanism to improve focus on key parts of the data. The DAIC-WOZ and EDAIC-WOZ datasets were used for the experiments. The experiments compared the precision, recall, f1-score, and accuracy metrics for the cases of using early and late multimodal data fusion and found that the early information fusion multimodal network achieved higher classification accuracy results. 
On the test dataset, this network achieved an f1-score of 0.79 and an overall classification accuracy of 0.86, indicating its effectiveness in detecting depression.<\/jats:p>","DOI":"10.3390\/computation13010009","type":"journal-article","created":{"date-parts":[[2025,1,2]],"date-time":"2025-01-02T10:32:26Z","timestamp":1735813946000},"page":"9","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":37,"title":["Multimodal Data Fusion for Depression Detection Approach"],"prefix":"10.3390","volume":"13","author":[{"given":"Mariia","family":"Nykoniuk","sequence":"first","affiliation":[{"name":"Department of Artificial Intelligence, Lviv Polytechnic National University, Stepan Bandera 12, 79013 Lviv, Ukraine"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0064-6584","authenticated-orcid":false,"given":"Oleh","family":"Basystiuk","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, Lviv Polytechnic National University, Stepan Bandera 12, 79013 Lviv, Ukraine"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6875-8534","authenticated-orcid":false,"given":"Nataliya","family":"Shakhovska","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, Lviv Polytechnic National University, Stepan Bandera 12, 79013 Lviv, Ukraine"},{"name":"Department of Civil and Environmental Engineering, Brunel University of London, Uxbridge UB8 3PH, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2114-3436","authenticated-orcid":false,"given":"Nataliia","family":"Melnykova","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, Lviv Polytechnic National University, Stepan Bandera 12, 79013 Lviv, 
Ukraine"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,1,2]]},"reference":[{"key":"ref_1","unstructured":"World Health Organization (2024, December 22). Depressive Disorder (Depression). Available online: https:\/\/www.who.int\/news-room\/fact-sheets\/detail\/depression."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"V\u00e1zquez-Romero, A., and Gallardo-Antol\u00edn, A. (2020). Automatic detection of depression in speech using ensemble convolutional neural networks. Entropy, 22.","DOI":"10.3390\/e22060688"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Yin, F., Du, J., Xu, X., and Zhao, L. (2023). Depression detection in speech using transformer and parallel convolutional neural networks. Electronics, 12.","DOI":"10.3390\/electronics12020328"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.specom.2022.07.006","article-title":"Fusing features of speech for depression classification based on higher-order spectral analysis","volume":"143","author":"Miao","year":"2022","journal-title":"Speech Commun."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zhao, Y., Liang, Z., Du, J., Zhang, L., Liu, C., and Zhao, L. (2021). Multi-head attention-based long short-term memory for depression detection from speech. Front. Neurorobotics, 15.","DOI":"10.3389\/fnbot.2021.684037"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1186\/s40708-023-00185-9","article-title":"Towards automatic text-based estimation of depression through symptom prediction","volume":"10","author":"Milintsevich","year":"2023","journal-title":"Brain Inform."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Park, J., and Moon, N. (2022). Design and implementation of attention depression detection model based on multimodal analysis. 
Sustainability, 14.","DOI":"10.3390\/su14063569"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2251","DOI":"10.1109\/TAFFC.2022.3154332","article-title":"Prediction of depression severity based on the prosodic and semantic features with bidirectional LSTM and time distributed CNN","volume":"14","author":"Mao","year":"2022","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"20479","DOI":"10.1109\/ACCESS.2024.3362233","article-title":"Additive cross-modal attention network (ACMA) for depression detection based on audio and textual features","volume":"12","author":"Iyortsuun","year":"2024","journal-title":"IEEE Access"},{"key":"ref_10","unstructured":"(2024, December 22). DAIC-WOZ Database. Available online: https:\/\/dcapswoz.ict.usc.edu\/."},{"key":"ref_11","unstructured":"Roberts, L. (2024, December 22). Understanding the Mel Spectrogram. Available online: https:\/\/medium.com\/analytics-vidhya\/understanding-the-mel-spectrogram-fca2afa2ce53."},{"key":"ref_12","unstructured":"Wikipedia Contributors (2024, December 22). Mel-Frequency Cepstrum. Available online: https:\/\/en.wikipedia.org\/w\/index.php?title=Mel-frequency_cepstrum&oldid=1233509682."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/S0167-6393(02)00057-2","article-title":"Spectral contrast enhancement: Algorithms and comparisons","volume":"39","author":"Yang","year":"2003","journal-title":"Speech Commun."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Basystiuk, O., Melnykova, N., and Rybchak, Z. (2023, January 19\u201321). Multimodal Learning Analytics: An Overview of the Data Collection Methodology. 
Proceedings of the 2023 IEEE 18th International Conference on Computer Science and Information Technologies (CSIT), Lviv, Ukraine.","DOI":"10.1109\/CSIT61576.2023.10324177"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"118523","DOI":"10.1016\/j.eswa.2022.118523","article-title":"Multimodal fusion methods with deep neural networks and meta-information for aggression detection in surveillance","volume":"211","author":"Jaafar","year":"2022","journal-title":"Expert Syst. Appl."}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/1\/9\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,7]],"date-time":"2025-10-07T15:23:52Z","timestamp":1759850632000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/1\/9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,2]]},"references-count":15,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,1]]}},"alternative-id":["computation13010009"],"URL":"https:\/\/doi.org\/10.3390\/computation13010009","relation":{},"ISSN":["2079-3197"],"issn-type":[{"value":"2079-3197","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,2]]}}}