{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T15:26:39Z","timestamp":1767626799390,"version":"3.37.3"},"reference-count":64,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2020,2,25]],"date-time":"2020-02-25T00:00:00Z","timestamp":1582588800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,2,25]],"date-time":"2020-02-25T00:00:00Z","timestamp":1582588800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002341","name":"The Academy of Finland","doi-asserted-by":"crossref","award":["312490"],"award-info":[{"award-number":["312490"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Circuits Syst Signal Process"],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In generation of emotional speech, there are deviations in the speech production features when compared to neutral (non-emotional) speech. The objective of this study is to capture the deviations in features related to the excitation component of speech and to develop a system for automatic recognition of emotions based on these deviations. The emotions considered in this study are anger, happiness, sadness and neutral state. The study shows that there are useful features in the deviations of the excitation features, which can be exploited to develop an emotion recognition system. The excitation features used in this study are the instantaneous fundamental frequency (<jats:inline-formula><jats:alternatives><jats:tex-math>$$F_0$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>F<\/mml:mi><mml:mn>0<\/mml:mn><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>), the strength of excitation, the energy of excitation and the ratio of the high-frequency to low-frequency band energy (<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\beta $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mi>\u03b2<\/mml:mi><\/mml:math><\/jats:alternatives><\/jats:inline-formula>). A hierarchical binary decision tree approach is used to develop an emotion recognition system with neutral speech as reference. 
The recognition experiments showed that the excitation features are comparable to or better than the existing prosody features and spectral features, such as mel-frequency cepstral coefficients, perceptual linear predictive coefficients and modulation spectral features.<\/jats:p>","DOI":"10.1007\/s00034-020-01377-y","type":"journal-article","created":{"date-parts":[[2020,2,25]],"date-time":"2020-02-25T18:02:51Z","timestamp":1582653771000},"page":"4459-4481","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["Excitation Features of Speech for Emotion Recognition Using Neutral Speech as Reference"],"prefix":"10.1007","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5806-3053","authenticated-orcid":false,"given":"Sudarsana Reddy","family":"Kadiri","sequence":"first","affiliation":[]},{"given":"P.","family":"Gangamohan","sequence":"additional","affiliation":[]},{"given":"Suryakanth V.","family":"Gangashetty","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8173-9418","authenticated-orcid":false,"given":"Paavo","family":"Alku","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7080-1239","authenticated-orcid":false,"given":"B.","family":"Yegnanarayana","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,2,25]]},"reference":[{"issue":"1","key":"1377_CR1","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1159\/000091405","volume":"63","author":"M Airas","year":"2006","unstructured":"M. Airas, P. Alku, Emotions in vowel segments of continuous speech: analysis of the glottal flow using the normalized amplitude quotient. Phonetica 63(1), 26\u201346 (2006)","journal-title":"Phonetica"},{"issue":"5","key":"1377_CR2","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1007\/s12046-011-0041-5","volume":"36","author":"P Alku","year":"2011","unstructured":"P. Alku, Glottal inverse filtering analysis of human voice production-a review of estimation and parameterization methods of the glottal excitation and their applications. Sadhana 36(5), 623\u2013650 (2011)","journal-title":"Sadhana"},{"issue":"11","key":"1377_CR3","doi-asserted-by":"crossref","first-page":"1558","DOI":"10.1109\/PROC.1977.10770","volume":"65","author":"JB Allen","year":"1977","unstructured":"J.B. Allen, L. Rabiner, A unified approach to short-time Fourier analysis and synthesis. Proc. IEEE 65(11), 1558\u20131564 (1977)","journal-title":"Proc. IEEE"},{"issue":"1","key":"1377_CR4","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1016\/j.csl.2013.07.002","volume":"28","author":"JP Arias","year":"2014","unstructured":"J.P. Arias, C. Busso, N.B. Yoma, Shape-based modeling of the fundamental frequency contour for emotion detection in speech. Comput. Speech Lang. 28(1), 278\u2013294 (2014)","journal-title":"Comput. Speech Lang."},{"issue":"6","key":"1377_CR5","doi-asserted-by":"crossref","first-page":"4547","DOI":"10.1121\/1.2909562","volume":"123","author":"M Bulut","year":"2008","unstructured":"M. Bulut, S. Narayanan, On the robustness of overall F0-only modifications to the perception of emotions in speech. J. Acoust. Soc. Am. 123(6), 4547\u20134558 (2008)","journal-title":"J. Acoust. Soc. Am."},{"key":"1377_CR6","unstructured":"F. Burkhardt, A. Paeschke, M. Rolfes, W.F. Sendlmeier, B. Weiss, A database of German emotional speech, in INTERSPEECH (2005), pp. 
1517\u20131520"},{"issue":"4","key":"1377_CR7","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1109\/TASL.2008.2009578","volume":"17","author":"C Busso","year":"2009","unstructured":"C. Busso, S. Lee, S. Narayanan, Analysis of emotionally salient aspects of fundamental frequency for emotion detection. IEEE Trans. Audio Speech Lang. Process. 17(4), 582\u2013596 (2009)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"1377_CR8","doi-asserted-by":"crossref","unstructured":"C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, 2 July 2007","DOI":"10.1145\/1961189.1961199"},{"key":"1377_CR9","doi-asserted-by":"crossref","DOI":"10.1002\/0471200611","volume-title":"Elements of Information Theory","author":"TM Cover","year":"1991","unstructured":"T.M. Cover, J.A. Thomas, Elements of Information Theory (Wiley, New York, 1991)"},{"key":"1377_CR10","unstructured":"L. Devillers, C. Vaudable, C. Chastagnol, Real-life emotion-related states detection in call centers: a cross-corpora study, in INTERSPEECH (2010), pp. 2350\u20132353"},{"issue":"3","key":"1377_CR11","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1109\/LSP.2009.2038507","volume":"17","author":"N Dhananjaya","year":"2010","unstructured":"N. Dhananjaya, B. Yegnanarayana, Voiced\/nonvoiced detection based on robustness of voiced epochs. IEEE Signal Process. Lett. 17(3), 273\u2013276 (2010)","journal-title":"IEEE Signal Process. Lett."},{"issue":"1\u20132","key":"1377_CR12","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/S0167-6393(02)00070-5","volume":"40","author":"E Douglas-Cowie","year":"2003","unstructured":"E. Douglas-Cowie, N. Campbell, R. Cowie, P. Roach, Emotional speech: towards a new generation of databases. Speech Commun. 40(1\u20132), 33\u201360 (2003)","journal-title":"Speech Commun."},{"key":"1377_CR13","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.csl.2011.03.003","volume":"26","author":"T Drugman","year":"2012","unstructured":"T. Drugman, B. Bozkurt, T. Dutoit, A comparative study of glottal source estimation techniques. Comput. Speech Lang. 26, 20\u201334 (2012)","journal-title":"Comput. Speech Lang."},{"key":"1377_CR14","doi-asserted-by":"crossref","unstructured":"I.S. Engberg, A.\u00a0Varnich Hansen, O. Andersen, P. Dalsgaard, Design, recording and verification of a Danish emotional speech database, in EUROSPEECH (ISCA, 1997), pp. 1695\u20131698","DOI":"10.21437\/Eurospeech.1997-482"},{"key":"1377_CR15","unstructured":"P.\u00a0Gangamohan, S.R. Kadiri, S.V. Gangashetty, B.\u00a0Yegnanarayana, Excitation source features for discrimination of anger and happy emotions, in INTERSPEECH (2014), pp. 1253\u20131257"},{"key":"1377_CR16","unstructured":"P.\u00a0Gangamohan, S.R. Kadiri, B. Yegnanarayana, Analysis of emotional speech at subsegmental level, in INTERSPEECH, August (2013), pp. 1916\u20131920"},{"issue":"6","key":"1377_CR17","doi-asserted-by":"crossref","first-page":"1056","DOI":"10.1109\/TASLP.2014.2319157","volume":"22","author":"MJ Gangeh","year":"2014","unstructured":"M.J. Gangeh, P. Fewzee, A. Ghodsi, M.S. Kamel, F. Karray, Multiview supervised dictionary learning in speech emotion recognition. IEEE\/ACM Trans. Audio Speech Lang. Process. 22(6), 1056\u20131068 (2014)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"1377_CR18","unstructured":"D.\u00a0Govind, S.R.M. Prasanna, B.\u00a0Yegnanarayana, Neutral to target emotion conversion using source and suprasegmental information, in Interspeech (2011), pp. 
2969\u20132972"},{"issue":"10\u201311","key":"1377_CR19","doi-asserted-by":"crossref","first-page":"787","DOI":"10.1016\/j.specom.2007.01.010","volume":"49","author":"M Grimm","year":"2007","unstructured":"M. Grimm, K. Kroschel, E. Mower, S. Narayanan, Primitives-based evaluation and estimation of emotions in speech. Speech Commun. 49(10\u201311), 787\u2013800 (2007)","journal-title":"Speech Commun."},{"key":"1377_CR20","unstructured":"M. Grimm, K. Kroschel, S.S. Narayanan, The Vera am Mittag German audio-visual emotional speech database, in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Hannover, Germany, June (2008), pp. 865\u2013868"},{"key":"1377_CR21","unstructured":"S.\u00a0Guruprasad, Significance of processing regions of high signal-to-noise ratio in speech signals. PhD Thesis, Apr (2011)"},{"issue":"7","key":"1377_CR22","doi-asserted-by":"crossref","first-page":"1853","DOI":"10.1109\/TASL.2010.2101595","volume":"19","author":"S Guruprasad","year":"2011","unstructured":"S. Guruprasad, B. Yegnanarayana, Performance of an event-based instantaneous fundamental frequency estimator for distant speech signals. IEEE Trans. Audio Speech Lang. Process. 19(7), 1853\u20131864 (2011)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"issue":"7","key":"1377_CR23","doi-asserted-by":"crossref","first-page":"1458","DOI":"10.1109\/TASL.2013.2255278","volume":"21","author":"A Hassan","year":"2013","unstructured":"A. Hassan, R. Damper, M. Niranjan, On acoustic emotion recognition: compensating for covariate shift. IEEE Trans. Audio Speech Lang. Process. 21(7), 1458\u20131468 (2013)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"issue":"7","key":"1377_CR24","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1016\/j.specom.2012.03.003","volume":"54","author":"A Hassan","year":"2012","unstructured":"A. Hassan, R.I. Damper, Classification of emotional speech using 3DEC hierarchical classifier. Speech Commun. 54(7), 903\u2013916 (2012)","journal-title":"Speech Commun."},{"issue":"4","key":"1377_CR25","doi-asserted-by":"crossref","first-page":"1738","DOI":"10.1121\/1.399423","volume":"87","author":"H Hermansky","year":"1990","unstructured":"H. Hermansky, Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87(4), 1738\u20131752 (1990)","journal-title":"J. Acoust. Soc. Am."},{"key":"1377_CR26","unstructured":"J.H. Jeon, R. Xia, Y. Liu, Sentence level emotion recognition based on decisions from subsentence segments, in ICASSP (2011), pp. 4940\u20134943"},{"key":"1377_CR27","unstructured":"S.R. Kadiri, P.\u00a0Gangamohan, S.V. Gangashetty, B. Yegnanarayana, Analysis of excitation source features of speech for emotion recognition, in INTERSPEECH (2015), pp. 1324\u20131328"},{"issue":"9\u201310","key":"1377_CR28","doi-asserted-by":"crossref","first-page":"1172","DOI":"10.1016\/j.specom.2011.01.007","volume":"53","author":"M Kockmann","year":"2011","unstructured":"M. Kockmann, L. Burget, J. Cernock\u00fd, Application of speaker- and language identification state-of-the-art techniques for emotion recognition. Speech Commun. 53(9\u201310), 1172\u20131185 (2011)","journal-title":"Speech Commun."},{"issue":"2","key":"1377_CR29","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1007\/s10772-011-9125-1","volume":"15","author":"SG Koolagudi","year":"2012","unstructured":"S.G. Koolagudi, K. Sreenivasa Rao, Emotion recognition from speech: a review. Int. J. Speech Technol. 15(2), 99\u2013117 (2012)","journal-title":"Int. J. 
Speech Technol."},{"issue":"9\u201310","key":"1377_CR30","doi-asserted-by":"crossref","first-page":"1162","DOI":"10.1016\/j.specom.2011.06.004","volume":"53","author":"C-C Lee","year":"2011","unstructured":"C.-C. Lee, E. Mower, C. Busso, S. Lee, S. Narayanan, Emotion recognition using a hierarchical binary decision tree approach. Speech Commun. 53(9\u201310), 1162\u20131171 (2011)","journal-title":"Speech Commun."},{"issue":"2","key":"1377_CR31","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1109\/TSA.2004.838534","volume":"13","author":"CM Lee","year":"2005","unstructured":"C.M. Lee, S.S. Narayanan, Toward detecting emotions in spoken dialogs. IEEE Trans. Audio Speech Lang. Process. 13(2), 293\u2013303 (2005)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"1377_CR32","unstructured":"L. Li, Y. Zhao, D. Jiang, Y. Zhang, F. Wang, I. Gonzalez, E. Valentin, H. Sahli, Hybrid deep neural network\u2013hidden Markov model (DNN\u2013HMM) based speech emotion recognition, in 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (2013), pp. 312\u2013317"},{"issue":"6","key":"1377_CR33","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1109\/TMM.2010.2051872","volume":"12","author":"I Luengo","year":"2010","unstructured":"I. Luengo, E. Navas, I. Hern\u00e1ez, Feature analysis and evaluation for automatic emotion identification in speech. IEEE Trans. Multimed. 12(6), 490\u2013501 (2010)","journal-title":"IEEE Trans. Multimed."},{"key":"1377_CR34","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1109\/PROC.1975.9792","volume":"63","author":"J Makhoul","year":"1975","unstructured":"J. Makhoul, Linear prediction: a tutorial review. Proc. IEEE 63, 561\u2013580 (1975)","journal-title":"Proc. IEEE"},{"issue":"3","key":"1377_CR35","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1016\/j.csl.2013.08.004","volume":"28","author":"A Milton","year":"2014","unstructured":"A. Milton, S. Tamil Selvi, Class-specific multiple classifiers scheme to recognize emotions from speech signals. Comput. Speech Lang. 28(3), 727\u2013742 (2014)","journal-title":"Comput. Speech Lang."},{"issue":"5","key":"1377_CR36","doi-asserted-by":"crossref","first-page":"3050","DOI":"10.1121\/1.4796110","volume":"133","author":"VK Mittal","year":"2013","unstructured":"V.K. Mittal, B. Yegnanarayana, Effect of glottal dynamics in the production of shouted speech. J. Acoust. Soc. Am. 133(5), 3050\u20133061 (2013)","journal-title":"J. Acoust. Soc. Am."},{"issue":"2","key":"1377_CR37","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1016\/j.specom.2006.11.004","volume":"49","author":"D Morrison","year":"2007","unstructured":"D. Morrison, R. Wang, L.C. De Silva, Ensemble methods for spoken emotion recognition in call-centres. Speech Commun. 49(2), 98\u2013112 (2007)","journal-title":"Speech Commun."},{"issue":"8","key":"1377_CR38","doi-asserted-by":"crossref","first-page":"1602","DOI":"10.1109\/TASL.2008.2004526","volume":"16","author":"KSR Murty","year":"2008","unstructured":"K.S.R. Murty, B. Yegnanarayana, Epoch extraction from speech signals. IEEE Trans. Audio Speech Lang. Process. 16(8), 1602\u20131613 (2008)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"issue":"6","key":"1377_CR39","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1109\/LSP.2009.2016829","volume":"16","author":"KSR Murty","year":"2009","unstructured":"K.S.R. Murty, B. Yegnanarayana, M. Anand Joseph, Characterization of glottal activity from speech signals. 
IEEE Signal Process. Lett. 16(6), 469\u2013472 (2009)","journal-title":"IEEE Signal Process. Lett."},{"issue":"2","key":"1377_CR40","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1109\/T-AFFC.2011.8","volume":"2","author":"T Pfister","year":"2011","unstructured":"T. Pfister, P. Robinson, Real-time recognition of affective states from nonverbal features of speech and its application for public speaking skill analysis. IEEE Trans. Affect. Comput. 2(2), 66\u201378 (2011)","journal-title":"IEEE Trans. Affect. Comput."},{"key":"1377_CR41","unstructured":"S.R.M. Prasanna, D. Govind, Analysis of excitation source information in emotional speech, in INTERSPEECH. ISCA (2010), pp. 781\u2013784"},{"issue":"4","key":"1377_CR42","doi-asserted-by":"crossref","first-page":"787","DOI":"10.1007\/s10772-017-9445-x","volume":"20","author":"D Pravena","year":"2017","unstructured":"D. Pravena, D. Govind, Significance of incorporating excitation source parameters for improved emotion recognition from speech and electroglottographic signals. Int. J. Speech Technol. 20(4), 787\u2013797 (2017)","journal-title":"Int. J. Speech Technol."},{"issue":"3","key":"1377_CR43","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1037\/0022-3514.70.3.614","volume":"70","author":"KR Scherer","year":"1996","unstructured":"K.R. Scherer, R. Banse, Acoustic profiles in vocal emotion expression. J. Personal. Soc. Psychol. 70(3), 614\u2013636 (1996)","journal-title":"J. Personal. Soc. Psychol."},{"key":"1377_CR44","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1007\/s10772-012-9175-z","volume":"16","author":"K Sreenivasa Rao","year":"2013","unstructured":"K. Sreenivasa Rao, S.G. Koolagudi, Characterization and recognition of emotions from speech using excitation source information. Int. J. Speech Technol. 16, 181\u2013201 (2013)","journal-title":"Int. J. Speech Technol."},{"key":"1377_CR45","unstructured":"M. Sarma, P. Ghahremani, D. Povey, N.K. Goel, K.K. Sarma, N. Dehak, Emotion identification from raw speech signals using DNNs, in INTERSPEECH (2018), pp. 3097\u20133101"},{"issue":"1\u20132","key":"1377_CR46","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1016\/S0167-6393(02)00084-5","volume":"40","author":"KR Scherer","year":"2003","unstructured":"K.R. Scherer, Vocal communication of emotion: a review of research paradigms. Speech Commun. 40(1\u20132), 227\u2013256 (2003)","journal-title":"Speech Commun."},{"issue":"9\u201310","key":"1377_CR47","doi-asserted-by":"crossref","first-page":"1062","DOI":"10.1016\/j.specom.2011.01.011","volume":"53","author":"B Schuller","year":"2011","unstructured":"B. Schuller, A. Batliner, S. Steidl, D. Seppi, Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge. Speech Commun. 53(9\u201310), 1062\u20131087 (2011)","journal-title":"Speech Commun."},{"key":"1377_CR48","unstructured":"B. Schuller, S. Steidl, A. Batliner, The interspeech 2009 emotion challenge, in INTERSPEECH (2009), pp. 312\u2013315"},{"issue":"2","key":"1377_CR49","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1109\/T-AFFC.2010.8","volume":"1","author":"B Schuller","year":"2010","unstructured":"B. Schuller, B. Vlasenko, F. Eyben, M. W\u00f6llmer, A. Stuhlsatz, A. Wendemuth, G. Rigoll, Cross-corpus acoustic emotion recognition: variances and strategies. IEEE Trans. Affect. Comput. 1(2), 119\u2013131 (2010)","journal-title":"IEEE Trans. Affect. 
Comput."},{"issue":"3","key":"1377_CR50","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1016\/j.specom.2007.01.006","volume":"49","author":"M Shami","year":"2007","unstructured":"M. Shami, W. Verhelst, An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech. Speech Commun. 49(3), 201\u2013212 (2007)","journal-title":"Speech Commun."},{"key":"1377_CR51","unstructured":"A.\u00a0Stuhlsatz, C.\u00a0Meyer, F.\u00a0Eyben, T.\u00a0ZieIke, G.\u00a0Meier, B.\u00a0Schuller, Deep neural networks for acoustic emotion recognition: raising the benchmarks, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2011), pp. 5688\u20135691"},{"key":"1377_CR52","unstructured":"R. Sun, E. Moore, A preliminary study on cross-databases emotion recognition using the glottal features in speech, in INTERSPEECH (2012), pp. 1628\u20131631"},{"key":"1377_CR53","unstructured":"R. Sun, E. Moore, J.F. Torres, Investigating glottal parameters for differentiating emotional categories with similar prosodics, in ICASSP (2009), pp. 4509\u20134512"},{"issue":"3","key":"1377_CR54","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1109\/T-AFFC.2011.14","volume":"2","author":"J Sundberg","year":"2011","unstructured":"J. Sundberg, S. Patel, E. Bj\u00f6rkner, K.R. Scherer, Interdependencies among voice source parameters in emotional speech. IEEE Trans. Affect. Comput. 2(3), 162\u2013174 (2011)","journal-title":"IEEE Trans. Affect. Comput."},{"key":"1377_CR55","unstructured":"P. Tzirakis, J. Zhang, B.W. Schuller, End-to-end speech emotion recognition using deep neural networks, in ICASSP (2018), pp. 5089\u20135093"},{"issue":"1","key":"1377_CR56","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.jvoice.2008.04.004","volume":"24","author":"T Waaramaa","year":"2010","unstructured":"T. Waaramaa, A.-M. Laukkanen, M. Airas, P. Alku, Perception of emotional valences and activity levels from vowel segments of continuous speech. J. Voice 24(1), 30\u201338 (2010)","journal-title":"J. Voice"},{"key":"1377_CR57","volume-title":"Emotions in Voice: Acoustic and Perceptual Analysis of Voice Quality in the Vocal Expression of Emotions","author":"T Waaramaa-M\u00e4ki-Kulmala","year":"2009","unstructured":"T. Waaramaa-M\u00e4ki-Kulmala, T. Yliopisto, Emotions in Voice: Acoustic and Perceptual Analysis of Voice Quality in the Vocal Expression of Emotions (Acta univesitatis Tamperensis. Tampere University Press, Tampere, 2009)"},{"key":"1377_CR58","first-page":"1","volume-title":"Progress in Nonlinear Speech Processing. Lecture Notes in Computer Science","author":"J Walker","year":"2007","unstructured":"J. Walker, P. Murphy, A review of glottal waveform analysis, in Progress in Nonlinear Speech Processing. Lecture Notes in Computer Science, vol. 4391, ed. by Y. Stylianou, M. Faundez-Zanuy, A. Esposito (Springer, Berlin, 2007), pp. 1\u201321"},{"issue":"4B","key":"1377_CR59","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1121\/1.1913238","volume":"52","author":"CE Williams","year":"1972","unstructured":"C.E. Williams, K.N. Stevens, Emotions and speech: some acoustical correlates. J. Acoust. Soc. Am. 52(4B), 1238\u20131250 (1972)","journal-title":"J. Acoust. Soc. Am."},{"issue":"5","key":"1377_CR60","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1016\/j.specom.2010.08.013","volume":"53","author":"S Wu","year":"2011","unstructured":"S. Wu, T.H. Falk, W.-Y. 
Chan, Automatic speech emotion recognition using modulation spectral features. Speech Commun. 53(5), 768\u2013785 (2011)","journal-title":"Speech Commun."},{"issue":"5","key":"1377_CR61","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1007\/s12046-011-0046-0","volume":"36","author":"B Yegnanarayana","year":"2011","unstructured":"B. Yegnanarayana, S.V. Gangashetty, Epoch-based analysis of speech signals. Sadhana 36(5), 651\u2013697 (2011)","journal-title":"Sadhana"},{"issue":"4","key":"1377_CR62","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1109\/TASL.2008.2012194","volume":"17","author":"B Yegnanarayana","year":"2009","unstructured":"B. Yegnanarayana, K. Sri\u00a0Rama Murty, Event-based instantaneous fundamental frequency estimation from speech signals. IEEE Trans. Audio Speech Lang. Process. 17(4), 614\u2013624 (2009)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"issue":"1","key":"1377_CR63","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1109\/TPAMI.2008.52","volume":"31","author":"Z Zeng","year":"2009","unstructured":"Z. Zeng, M. Pantic, G.I. Roisman, T.S. Huang, A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Trans. Pattern Anal. Mach. Intell. 31(1), 39\u201358 (2009)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"issue":"5","key":"1377_CR64","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1109\/LSP.2014.2308954","volume":"21","author":"W Zheng","year":"2014","unstructured":"W. Zheng, M. Xin, X. Wang, B. Wang, A novel speech emotion recognition method via incomplete sparse least square regression. IEEE Signal Process. Lett. 21(5), 569\u2013572 (2014)","journal-title":"IEEE Signal Process. Lett."}],"container-title":["Circuits, Systems, and Signal Processing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-020-01377-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00034-020-01377-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-020-01377-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,16]],"date-time":"2022-10-16T14:14:07Z","timestamp":1665929647000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s00034-020-01377-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,25]]},"references-count":64,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["1377"],"URL":"https:\/\/doi.org\/10.1007\/s00034-020-01377-y","relation":{},"ISSN":["0278-081X","1531-5878"],"issn-type":[{"type":"print","value":"0278-081X"},{"type":"electronic","value":"1531-5878"}],"subject":[],"published":{"date-parts":[[2020,2,25]]},"assertion":[{"value":"12 March 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 February 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 February 2020","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 February 2020","order":4,"name":"first_online","label":"First 
Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"The authors declare no competing financial interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}