{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T11:10:40Z","timestamp":1770030640680,"version":"3.49.0"},"reference-count":16,"publisher":"SAGE Publications","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IFS"],"published-print":{"date-parts":[[2021,8,11]]},"abstract":"<jats:p>Distinguishing emotions in oral communication has long played an important role in emotion-based studies, and many algorithms have been proposed for emotion classification. However, there is no established measure of the fidelity of the emotion under consideration, largely because most readily available annotated datasets are produced by actors rather than generated in real-world scenarios. The predicted emotion therefore lacks an important aspect, authenticity: whether the emotion is actual or stimulated. In this work, we develop a hybrid convolutional neural network algorithm, based on transfer learning and style transfer, that classifies both the emotion and its fidelity. The model is trained on features extracted from a dataset containing both stimulated and actual utterances. We compare the developed algorithm with conventional machine learning and deep learning techniques on metrics including accuracy, precision, recall and F1 score, and it performs substantially better than the conventional models. 
The research aims to dive deeper into human emotion and build a model that understands it as humans do, achieving precision, recall and F1 scores of 0.994, 0.996 and 0.995 for speech authenticity and 0.992, 0.989 and 0.99 for speech emotion classification, respectively.<\/jats:p>","DOI":"10.3233\/jifs-210711","type":"journal-article","created":{"date-parts":[[2021,6,30]],"date-time":"2021-06-30T05:12:20Z","timestamp":1625029940000},"page":"2013-2024","source":"Crossref","is-referenced-by-count":6,"title":["Transfer learning based convolution neural net for authentication and classification of emotions from natural and stimulated speech signals"],"prefix":"10.1177","volume":"41","author":[{"given":"Mukul","family":"Kumar","sequence":"first","affiliation":[{"name":"School of Electrical Engineering, Vellore Institute Technology Vellore, India"}]},{"given":"Nipun","family":"Katyal","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering, Vellore Institute Technology Vellore, India"}]},{"given":"Nersisson","family":"Ruban","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering, Vellore Institute Technology Vellore, India"}]},{"given":"Elena","family":"Lyakso","sequence":"additional","affiliation":[{"name":"Saint-Petersburg State University, Russia"}]},{"given":"A.","family":"Mary Mekala","sequence":"additional","affiliation":[{"name":"School of Information Technology and Engineering, Vellore Institute Technology Vellore, India"}]},{"given":"Alex Noel","family":"Joseph Raj","sequence":"additional","affiliation":[{"name":"Key Laboratory of Digital Signal and Image Processing of Guangdong Province, Department of Electronics Engineering, College of Engineering, Shantou University, China"}]},{"given":"G.","family":"Maarc Richard","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering, Vellore Institute Technology Vellore, 
India"}]}],"member":"179","reference":[{"key":"10.3233\/JIFS-210711_ref1","doi-asserted-by":"crossref","unstructured":"Ang J. , Dhillon R. , Krupski A. , Shriberg E. and Stolcke A. , Prosody-based automatic detection of annoyance and frustration in human-computer dialog, In Seventh International Conference on Spoken Language Processing (2002).","DOI":"10.21437\/ICSLP.2002-559"},{"key":"10.3233\/JIFS-210711_ref2","doi-asserted-by":"crossref","unstructured":"Burkhardt F. , Van Ballegooy M. , Engelbrecht K.P. , Polzehl T. and Stegmann J. , Emotion detection in dialog systems: Applications, strategies and challenges, In 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops 2009 Sep 10 (pp. 1\u20136). IEEE.","DOI":"10.1109\/ACII.2009.5349498"},{"key":"10.3233\/JIFS-210711_ref3","doi-asserted-by":"crossref","unstructured":"Busso C. , Lee S. and Narayanan S.S. , Using neutral speech models for emotional speech analysis, In Eighth Annual Conference of the International Speech Communication Association (2007).","DOI":"10.21437\/Interspeech.2007-605"},{"issue":"4","key":"10.3233\/JIFS-210711_ref4","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1109\/TASL.2008.2009578","article-title":"Analysis of emotionally salient aspects of fundamental frequency for emotion detection","volume":"17","author":"Busso","year":"2009","journal-title":"IEEE transactions on audio, speech, and language processing"},{"issue":"4","key":"10.3233\/JIFS-210711_ref6","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1016\/j.neunet.2005.03.007","article-title":"Challenges in real-life emotion annotation and machine learning based detection","volume":"18","author":"Devillers","year":"2005","journal-title":"Neural Networks"},{"key":"10.3233\/JIFS-210711_ref7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2011\/906789","article-title":"Robust emotional stressed speech detection using weighted frequency 
subbands","volume":"2011","author":"Hansen","year":"2011","journal-title":"EURASIP Journal on Advances in Signal Processing"},{"key":"10.3233\/JIFS-210711_ref13","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"2","key":"10.3233\/JIFS-210711_ref14","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1109\/TSA.2004.838534","article-title":"Toward detecting emotions in spoken dialogs","volume":"13","author":"Lee","year":"2005","journal-title":"IEEE transactions on speech and audio processing"},{"issue":"2","key":"10.3233\/JIFS-210711_ref16","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/j.csl.2007.06.001","article-title":"Applying an analysis of acted vocal emotions to improve the simulation of synthetic speech","volume":"22","author":"Murray","year":"2008","journal-title":"Computer Speech & Language"},{"issue":"13","key":"10.3233\/JIFS-210711_ref18","doi-asserted-by":"crossref","first-page":"5858","DOI":"10.1016\/j.eswa.2014.03.026","article-title":"A new approach of audio emotion recognition","volume":"41","author":"Ooi","year":"2014","journal-title":"Expert Systems with Applications"},{"key":"10.3233\/JIFS-210711_ref22","first-page":"422","article-title":"Emotion recognition through speech using neural network","volume":"5","author":"Rawat","year":"2015","journal-title":"Int J"},{"key":"10.3233\/JIFS-210711_ref27","doi-asserted-by":"publisher","DOI":"10.5683\/SP2\/E8H2MF"},{"issue":"1\/2","key":"10.3233\/JIFS-210711_ref30","doi-asserted-by":"crossref","first-page":"14","DOI":"10.17743\/jaes.2019.0043","article-title":"Continuous speech emotion recognition with convolutional neural networks","volume":"68","author":"Vryzas","year":"2020","journal-title":"Journal of the Audio Engineering 
Society"},{"issue":"1","key":"10.3233\/JIFS-210711_ref31","first-page":"183","article-title":"A CNN-assisted enhanced audio signal processing for speech emotion recognition","volume":"20","author":"Kwon","year":"2020","journal-title":"Sensors"},{"key":"10.3233\/JIFS-210711_ref32","doi-asserted-by":"crossref","unstructured":"Rachman F.H. , Sarno R. and Fatichah C. , Hybrid Approach of Structural Lyric and Audio Segments for Detecting Song Emotion, International Journal of Intelligent Engineering & Systems 13(1) (2020).","DOI":"10.22266\/ijies2020.0229.09"},{"key":"10.3233\/JIFS-210711_ref33","doi-asserted-by":"crossref","first-page":"101077","DOI":"10.1016\/j.csl.2020.101077","article-title":"Transfer learning from adult to children for speech recognition: Evaluation, analysis and recommendations","volume":"63","author":"Shivakumar","year":"2020","journal-title":"Computer Speech & Language"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JIFS-210711","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T00:56:14Z","timestamp":1769993774000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JIFS-210711"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,11]]},"references-count":16,"journal-issue":{"issue":"1"},"URL":"https:\/\/doi.org\/10.3233\/jifs-210711","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,11]]}}}