{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T14:32:10Z","timestamp":1772548330025,"version":"3.50.1"},"reference-count":90,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T00:00:00Z","timestamp":1772496000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Computational analysis of therapeutic communication presents challenges in multi-label classification, severe class imbalance, and heterogeneous multimodal data integration. We introduce a bidirectional analytical framework addressing patient emotion recognition and provider behavior analysis. For patient-side analysis, we employ ClinicalBERT on human-annotated CounselChat (1482 interactions, 25 categories, imbalance 60:1), achieving a macro-F1 of 0.74 through class weighting and threshold optimization, representing a six-fold improvement over naive baselines and a 6\u201313 point improvement over modern imbalance methods. For provider-side analysis, we process 330 YouTube therapy sessions through automated pipelines (speaker diarization, automatic speech recognition, temporal segmentation), yielding 14,086 annotated segments. Our architecture combines DeBERTa-v3-base with WavLM-base-plus through cross-modal attention mechanisms adapted from multimodal Transformer frameworks. On controlled human-annotated HOPE data (178 sessions, 12,500 utterances), the model achieves a macro-F1 of 0.91 with Cohen\u2019s kappa of 0.87, comparable to inter-rater reliability reported in psychotherapy process research. On YouTube data, a macro-F1 of 0.71 demonstrates feasibility while highlighting annotation quality impacts. 
Cross-dataset transfer and systematic attention analyses validate domain-specific effectiveness and interpretability.<\/jats:p>","DOI":"10.3390\/computers15030161","type":"journal-article","created":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T09:45:12Z","timestamp":1772531112000},"page":"161","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["From Patient Emotion Recognition to Provider Understanding: A Multimodal Data Mining Framework for Emotion-Aware Clinical Counseling Systems"],"prefix":"10.3390","volume":"15","author":[{"given":"Saahithi","family":"Mallarapu","sequence":"first","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University, California, CA 95112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xinyan","family":"Liu","sequence":"additional","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University, California, CA 95112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-7046-7099","authenticated-orcid":false,"given":"Pegah","family":"Zargarian","sequence":"additional","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University, California, CA 95112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Seyyedeh Fatemeh","family":"Mottaghian","sequence":"additional","affiliation":[{"name":"College of Computing & Data Sciences, Boston University, Boston, MA 02215, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ramyashree","family":"Suresha","sequence":"additional","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University, California, CA 95112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vasudha","family":"Jain","sequence":"additional","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University, California, CA 
95112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Akram","family":"Bayat","sequence":"additional","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University, California, CA 95112, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2026,3,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4018\/jdwm.2007070101","article-title":"Multi-label classification: An overview","volume":"3","author":"Tsoumakas","year":"2007","journal-title":"Int. J. Data Warehous. Min."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1819","DOI":"10.1109\/TKDE.2013.39","article-title":"A review on multi-label learning algorithms","volume":"26","author":"Zhang","year":"2014","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"429","DOI":"10.3233\/IDA-2002-6504","article-title":"The class imbalance problem: A systematic study","volume":"6","author":"Japkowicz","year":"2002","journal-title":"Intell. Data Anal."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Kumar, V., Lalotra, G.S., Sasikala, P., Rajput, D.S., Kaluri, R., Lakshmanna, K., Shorfuzzaman, M., Alsufyani, A., and Uddin, M. (2022). Addressing Binary Classification over Class Imbalanced Clinical Datasets Using Computationally Intelligent Techniques. Healthcare, 10.","DOI":"10.3390\/healthcare10071293"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","article-title":"Learning from Imbalanced Data","volume":"21","author":"He","year":"2009","journal-title":"IEEE Trans. Knowl. 
Data Eng."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.neucom.2014.08.091","article-title":"Addressing imbalance in multilabel classification: Measures and random resampling algorithms","volume":"163","author":"Charte","year":"2015","journal-title":"Neurocomputing"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Saito, T., and Rehmsmeier, M. (2015). The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE, 10.","DOI":"10.1371\/journal.pone.0118432"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2055","DOI":"10.1016\/j.patcog.2013.01.012","article-title":"Threshold optimisation for multi-label classifiers","volume":"46","author":"Pillai","year":"2013","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Cui, Y., Jia, M., Lin, T.Y., Song, Y., and Belongie, S. (2019, January 15\u201320). Class-Balanced Loss Based on Effective Number of Samples. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00949"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1109\/TPAMI.2018.2798607","article-title":"Multimodal machine learning: A survey and taxonomy","volume":"41","author":"Baltrusaitis","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.imavis.2017.08.003","article-title":"A survey of multimodal sentiment analysis","volume":"65","author":"Soleymani","year":"2017","journal-title":"Image Vis. 
Comput."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1062","DOI":"10.1016\/j.specom.2011.01.011","article-title":"Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge","volume":"53","author":"Schuller","year":"2011","journal-title":"Speech Commun."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1037\/0022-3514.70.3.614","article-title":"Acoustic profiles in vocal emotion expression","volume":"70","author":"Banse","year":"1996","journal-title":"J. Pers. Soc. Psychol."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Scherer, K.R., Johnstone, T., and Klasmeyer, G. (2003). Vocal expression of emotion. Handbook of Affective Sciences, Oxford University Press.","DOI":"10.1093\/oso\/9780195126013.003.0023"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"572","DOI":"10.1016\/j.patcog.2010.09.020","article-title":"Survey on speech emotion recognition: Features, classification schemes, and databases","volume":"44","author":"Kamel","year":"2011","journal-title":"Pattern Recognit."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zadeh, A., Chen, M., Poria, S., Cambria, E., and Morency, L.P. (2017, January 7\u201311). Tensor fusion network for multimodal sentiment analysis. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark.","DOI":"10.18653\/v1\/D17-1115"},{"key":"ref_17","unstructured":"Poria, S., Cambria, E., Hazarika, D., Majumder, N., Zadeh, A., and Morency, L.P. (August, January 30). Context-dependent sentiment analysis in user-generated videos. 
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"649","DOI":"10.1017\/S1351324916000383","article-title":"Natural language processing in mental health applications using non-clinical texts","volume":"23","author":"Calvo","year":"2017","journal-title":"Nat. Lang. Eng."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1016\/j.cobeha.2017.07.005","article-title":"Detecting depression and mental illness on social media: An integrative review","volume":"18","author":"Guntuku","year":"2017","journal-title":"Curr. Opin. Behav. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"De Choudhury, M., Counts, S., and Horvitz, E. (2013, January 2\u20134). Social media as a measurement tool of depression in populations. Proceedings of the 5th Annual ACM Web Science Conference, Paris, France.","DOI":"10.1145\/2464464.2464480"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yates, A., Cohan, A., and Goharian, N. (2017, January 7\u201311). Depression and self-harm risk assessment in online forums. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark.","DOI":"10.18653\/v1\/D17-1322"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Benton, A., Mitchell, M., and Hovy, D. (2017, January 3\u20137). Multitask learning for mental health conditions with limited social media data. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Valencia, Spain.","DOI":"10.18653\/v1\/E17-1015"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Coppersmith, G., Dredze, M., Harman, C., and Hollingshead, K. (2015, January 5). From ADHD to SAD: Analyzing the Language of Mental Health on Twitter through Self-Reported Diagnoses. 
Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, Denver, CO, USA.","DOI":"10.3115\/v1\/W15-1201"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Resnik, P., Armstrong, W., Claudino, L., Nguyen, T., Nguyen, V.A., and Boyd-Graber, J. (2015, January 5). Beyond LDA: Exploring supervised topic modeling for depression-related language in Twitter. Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, Denver, CO, USA.","DOI":"10.3115\/v1\/W15-1212"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Losada, D.E., Crestani, F., and Parapar, J. (2017). eRISK 2017: CLEF lab on early risk prediction on the Internet: Experimental foundations. International Conference of the Cross-Language Evaluation Forum for European Languages, Dublin, Ireland, 11\u201314 September 2017, Springer.","DOI":"10.1007\/978-3-319-65813-1_30"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Shen, J.H., and Rudzicz, F. (2017, January 3). Detecting Anxiety through Reddit. Proceedings of the Fourth Workshop on Computational Linguistics and Clinical Psychology\u2014From Linguistic Signal to Clinical Reality, Vancouver, BC, Canada.","DOI":"10.18653\/v1\/W17-3107"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1038\/s41746-022-00589-7","article-title":"Natural language processing applied to mental illness detection: A narrative review","volume":"5","author":"Zhang","year":"2022","journal-title":"NPJ Digit. 
Med."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1016\/j.specom.2015.03.004","article-title":"A review of depression and suicide risk assessment using speech analysis","volume":"71","author":"Cummins","year":"2015","journal-title":"Speech Commun."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1002\/lio2.354","article-title":"Automated assessment of psychiatric disorders using speech: A systematic review","volume":"5","author":"Low","year":"2020","journal-title":"Laryngoscope Investig. Otolaryngol."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1001\/archpsyc.62.6.617","article-title":"Prevalence, Severity, and Comorbidity of 12-Month DSM-IV Disorders in the National Comorbidity Survey Replication","volume":"62","author":"Kessler","year":"2005","journal-title":"Arch. Gen. Psychiatry"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1037\/0021-843X.110.4.585","article-title":"Current and lifetime comorbidity of the DSM-IV anxiety and mood disorders in a large clinical sample","volume":"110","author":"Brown","year":"2001","journal-title":"J. Abnorm. Psychol."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1146\/annurev.psych.49.1.377","article-title":"Comorbidity of anxiety and unipolar mood disorders","volume":"49","author":"Mineka","year":"1998","journal-title":"Annu. Rev. 
Psychol."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1037\/pst0000175","article-title":"Therapist empathy and client outcome: An updated meta-analysis","volume":"55","author":"Elliott","year":"2018","journal-title":"Psychotherapy"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1037\/pst0000193","article-title":"Psychotherapy relationships that work III","volume":"55","author":"Norcross","year":"2018","journal-title":"Psychotherapy"},{"key":"ref_35","first-page":"461","article-title":"Empathy","volume":"56","author":"Greenberg","year":"2019","journal-title":"Psychotherapy"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1037\/0033-3204.44.3.268","article-title":"Reassessing Rogers\u2019 necessary and sufficient conditions of change","volume":"44","author":"Watson","year":"2007","journal-title":"Psychotherapy"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1037\/a0022186","article-title":"Alliance in individual psychotherapy","volume":"48","author":"Horvath","year":"2011","journal-title":"Psychotherapy"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"438","DOI":"10.1037\/0022-006X.68.3.438","article-title":"Relation of the therapeutic alliance with outcome and other variables: A meta-analytic review","volume":"68","author":"Martin","year":"2000","journal-title":"J. Consult. Clin. Psychol."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1037\/pst0000172","article-title":"The alliance in adult psychotherapy: A meta-analytic synthesis","volume":"55","author":"Wampold","year":"2018","journal-title":"Psychotherapy"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Wampold, B.E., and Imel, Z.E. (2015). The Great Psychotherapy Debate: The Evidence for What Makes Psychotherapy Work, Routledge. 
[2nd ed.].","DOI":"10.4324\/9780203582015"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Flemotomos, N., Martinez, V.R., Chen, Z., Singla, K., Ardulov, V., Peri, R., Imel, Z.E., Atkins, D.C., and Narayanan, S. (2021). Automated quality assessment of cognitive behavioral therapy sessions through highly contextualized language representations. PLoS ONE, 16.","DOI":"10.1371\/journal.pone.0258639"},{"key":"ref_42","unstructured":"Flemotomos, N., Martinez, V.R., Gibson, J., Atkins, D.C., Creed, T.A., and Narayanan, S.S. (2021). Language features for automated evaluation of cognitive behavior psychotherapy sessions: A machine learning approach. Front. Psychol., 12."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Orlinsky, D.E., and R\u00f8nnestad, M.H. (2005). How Psychotherapists Develop: A study of Therapeutic Work and Professional Growth, American Psychological Association.","DOI":"10.1037\/11157-000"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Juslin, P.N., and Scherer, K.R. (2005). Vocal expression of affect. The New Handbook of Methods in Nonverbal Behavior Research, Oxford University Press.","DOI":"10.1093\/oso\/9780198529613.003.0003"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/S0167-6393(02)00071-7","article-title":"Describing the emotional states that are expressed in speech","volume":"40","author":"Cowie","year":"2003","journal-title":"Speech Commun."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1037\/0033-2909.111.2.256","article-title":"Thin slices of expressive behavior as predictors of interpersonal consequences: A meta-analysis","volume":"111","author":"Ambady","year":"1992","journal-title":"Psychol. Bull."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic minority over-sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. 
Res."},{"key":"ref_48","unstructured":"Boyd, K., Eng, K.H., and Page, C.D. (2014, January 15\u201319). Area under the precision-recall curve: Point estimates and confidence intervals. Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Nancy, France."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"e2351075","DOI":"10.1001\/jamanetworkopen.2023.52590","article-title":"Mental Health Counseling From Conversational Content with Transformer-Based Machine Learning","volume":"7","author":"Imel","year":"2024","journal-title":"JAMA Netw. Open"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Bredin, H., and Laurent, A. (September, January 30). End-to-End Speaker Segmentation for Overlap-Aware Resegmentation. In Proceedings of Interspeech 2021, Brno, Czech Republic.","DOI":"10.21437\/Interspeech.2021-560"},{"key":"ref_51","unstructured":"Radford, A., Kim, J.W., Xu, T., Brockman, G., McLeavey, C., and Sutskever, I. (2023, January 23\u201329). Robust Speech Recognition via Large-Scale Weak Supervision. In Proceedings of ICML 2023, Honolulu, HI, USA."},{"key":"ref_52","unstructured":"Anthropic (2024). Claude 3 Model Family: Introducing the Next Generation of AI Assistants, Anthropic. Technical Report."},{"key":"ref_53","unstructured":"Tsai, Y.H., Bai, S., Liang, P.P., Kolter, J.Z., Morency, L.P., and Salakhutdinov, R. (August, January 28). Multimodal Transformer for Unaligned Multimodal Language Sequences. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy."},{"key":"ref_54","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is All You Need. 
Advances in Neural Information Processing Systems, Proceedings of the Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4\u20139 December 2017, Neural Information Processing Systems Foundation."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"1505","DOI":"10.1109\/JSTSP.2022.3188113","article-title":"WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing","volume":"16","author":"Chen","year":"2022","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"ref_56","unstructured":"He, P., Gao, J., and Chen, W. (2021). DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing. arXiv."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Jones, E.E. (2000). Therapeutic Action: A Guide to Psychoanalytic Therapy, Jason Aronson.","DOI":"10.5040\/9798216436812"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1093\/ptr\/8.1.71","article-title":"How expert clinicians\u2019 prototypes of an ideal treatment correlate with outcome in psychodynamic and cognitive-behavioral therapy","volume":"8","author":"Ablon","year":"1998","journal-title":"Psychother. Res."},{"key":"ref_59","unstructured":"Gadzicki, K., Khamsehashari, R., and Zetzsche, C. (2020, January 6\u20139). Early vs Late Fusion in Multimodal Convolutional Neural Networks. Proceedings of the 2020 IEEE 23rd International Conference on Information Fusion (FUSION), Rustenburg, South Africa."},{"key":"ref_60","unstructured":"Gratch, J., Artstein, R., Lucas, G., Stratou, G., Scherer, S., Nazarian, A., Wood, R., Boberg, J., DeVault, D., and Marsella, S. (2014, January 26\u201331). The Distress Analysis Interview Corpus of human and computer interviews. 
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC), Reykjavik, Iceland."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1037\/0022-006X.61.2.306","article-title":"Comparing the process in psychodynamic and cognitive-behavioral therapies","volume":"61","author":"Jones","year":"1993","journal-title":"J. Consult. Clin. Psychol."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"341","DOI":"10.4088\/JCP.10m06176blu","article-title":"Comorbidity patterns of anxiety and depressive disorders in a large cohort study: The Netherlands Study of Depression and Anxiety (NESDA)","volume":"72","author":"Lamers","year":"2011","journal-title":"J. Clin. Psychiatry"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1080\/026999399379195","article-title":"Emotions and psychopathology","volume":"13","author":"Kring","year":"1999","journal-title":"Cogn. Emot."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1111\/j.1468-2850.1995.tb00036.x","article-title":"Emotion regulation and mental health","volume":"2","author":"Gross","year":"1995","journal-title":"Clin. Psychol. Sci. Pract."},{"key":"ref_65","unstructured":"Alsentzer, E., Murphy, J., Boag, W., Weng, W.H., Jin, D., Naumann, T., and McDermott, M. (2019). Publicly Available Clinical BERT Embeddings. arXiv."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"McFee, B., Raffel, C., Liang, D., Ellis, D.P., McVicar, M., Battenberg, E., and Nieto, O. (2015, January 6\u201312). librosa: Audio and music signal analysis in python. 
Proceedings of the 14th Python in Science Conference, Austin, TX, USA.","DOI":"10.25080\/Majora-7b98e3ed-003"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"438","DOI":"10.1037\/cou0000382","article-title":"Machine learning and natural language processing in psychotherapy research: Alliance as example use case","volume":"67","author":"Goldberg","year":"2020","journal-title":"J. Couns. Psychol."},{"key":"ref_68","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT, Minneapolis, MN, USA."},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Pepino, L., Riera, P., and Ferrer, L. (2021). Emotion Recognition from Speech Using wav2vec 2.0 Embeddings. arXiv.","DOI":"10.21437\/Interspeech.2021-703"},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"988","DOI":"10.1111\/j.1553-2712.2008.00227.x","article-title":"Deliberate practice and acquisition of expert performance: A general overview","volume":"15","author":"Ericsson","year":"2008","journal-title":"Acad. Emerg. Med."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal Loss for Dense Object Detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_72","unstructured":"Menon, A.K., Jayasumana, S., Rawat, A.S., Jain, H., Veit, A., and Kumar, S. (2021, January 3\u20137). Long-tail learning via logit adjustment. Proceedings of the International Conference on Learning Representations, Virtual Event."},{"key":"ref_73","unstructured":"Hassan, A.A., Hanafy, R.J., and Fouda, M.E. (2024). Automated Multi-Label Annotation for Mental Health Illnesses Using Large Language Models. 
arXiv."},{"key":"ref_74","first-page":"104565","article-title":"Multimodal depression detection: A comparative study of machine learning models and feature fusion techniques","volume":"149","author":"Almeida","year":"2024","journal-title":"J. Biomed. Inform."},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Al Hanai, T., Ghassemi, M., and Glass, J. (2018, January 2\u20136). Detecting Depression with Audio\/Text Sequence Modeling of Interviews. In Proceedings of Interspeech 2018, Hyderabad, India.","DOI":"10.21437\/Interspeech.2018-2522"},{"key":"ref_76","unstructured":"Majumder, N., Poria, S., Hazarika, D., Mihalcea, R., Gelbukh, A., and Cambria, E. (February, January 27). DialogueRNN: An attentive RNN for emotion detection in conversations. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_77","unstructured":"Poria, S., Hazarika, D., Majumder, N., Naik, G., Cambria, E., and Mihalcea, R. (August, January 28). MELD: A multimodal multi-party dataset for emotion recognition in conversations. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy."},{"key":"ref_78","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1007\/s10579-008-9076-6","article-title":"IEMOCAP: Interactive emotional dyadic motion capture database","volume":"42","author":"Busso","year":"2008","journal-title":"Lang. Resour. Eval."},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"Valstar, M., Schuller, B., Smith, K., Almaev, T., Eyben, F., Krajewski, J., Cowie, R., and Pantic, M. (2014, January 7). AVEC 2014: 3D dimensional affect and depression recognition challenge. 
Proceedings of the 4th International Workshop on Audio\/Visual Emotion Challenge, Orlando, FL, USA.","DOI":"10.1145\/2661806.2661807"},{"key":"ref_80","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1037\/bul0000084","article-title":"Risk factors for suicidal thoughts and behaviors: A meta-analysis of 50 years of research","volume":"143","author":"Franklin","year":"2017","journal-title":"Psychol. Bull."},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1093\/epirev\/mxn002","article-title":"Suicide and suicidal behavior","volume":"30","author":"Nock","year":"2008","journal-title":"Epidemiol. Rev."},{"key":"ref_82","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1016\/j.cpr.2019.01.001","article-title":"Doing no harm in mindfulness-based programs: Conceptual issues and empirical findings","volume":"71","author":"Baer","year":"2019","journal-title":"Clin. Psychol. Rev."},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"i37","DOI":"10.1093\/bioinformatics\/btx228","article-title":"Deep learning with word embeddings improves biomedical named entity recognition","volume":"33","author":"Habibi","year":"2017","journal-title":"Bioinformatics"},{"key":"ref_84","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci. Data"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Gaur, M., Alambo, A., Sain, J.P., Kursuncu, U., Thirunarayan, K., Kavuluru, R., Sheth, A., Welton, R., and Pathak, J. (2019, January 13\u201317). Knowledge-aware assessment of severity of suicide risk for early intervention. Proceedings of the World Wide Web Conference, San Francisco, CA, USA.","DOI":"10.1145\/3308558.3313698"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Sharma, A., Lin, I.W., Miner, A.S., Atkins, D.C., and Althoff, T. (2021, January 19\u201323). 
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach. Proceedings of the Web Conference 2021, Ljubljana, Slovenia.","DOI":"10.1145\/3442381.3450097"},{"key":"ref_87","unstructured":"P\u00e9rez-Rosas, V., Mihalcea, R., Resnicow, K., Singh, S., and An, L. (August, January 30). Understanding and Predicting Empathic Behavior in Counseling Therapy. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada."},{"key":"ref_88","doi-asserted-by":"crossref","unstructured":"Wu, T., Huang, Q., Liu, Z., Wang, Y., and Lin, D. (2020, January 23\u201328). Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58548-8_10"},{"key":"ref_89","unstructured":"Kenny, P.G., Parsons, T.D., Gratch, J., Leuski, A., and Rizzo, A.A. (2007, January 20\u201322). Virtual patients for clinical therapist skills training. Proceedings of the International Conference on Intelligent Virtual Agents, Philadelphia, PA, USA."},{"key":"ref_90","first-page":"311","article-title":"Detection and computational analysis of psychological signals using a virtual human interviewing agent","volume":"9","author":"Rizzo","year":"2016","journal-title":"J. 
Pain Manag."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/15\/3\/161\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T10:17:35Z","timestamp":1772533055000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/15\/3\/161"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,3]]},"references-count":90,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2026,3]]}},"alternative-id":["computers15030161"],"URL":"https:\/\/doi.org\/10.3390\/computers15030161","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,3]]}}}