{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,3]],"date-time":"2026-02-03T23:38:10Z","timestamp":1770161890858,"version":"3.49.0"},"reference-count":49,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2020,3,6]],"date-time":"2020-03-06T00:00:00Z","timestamp":1583452800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2020,4,30]]},"abstract":"<jats:p>Deep learning is far and wide considered to be the most powerful method in computer vision fields, which has a lot of applications such as image recognition, robot navigation systems, and self-driving cars. Recent developments in neural networks have led to an efficient end-to-end architecture to human activity representation and classification. In light of these recent events in deep learning, there is now much considerable concern about developing less expensive computation and memory-wise methods. This paper presents an optimized end-to-end approach named stochastic deep conviction network (SDCN) formulated using the deep learning method. It comprises of deep learning method namely deep belief network (DBN), two supervised machine learning algorithm support vector machine (SVM) and decision tree (DT) with optimization capability for speech emotion identification. In the beginning, pre-processing is performed and the features are automatically extracted from the input speech signal by the DBN. 
Since speech signal features loses most of the information and the performance cannot be guaranteed because dynamic interactions can generate uncountable emotion-specific experiences that have the same core feeling state but different perceptual inclinations so DBN provides more robust features. The next step is to classify the emotions in the training phase; here the SVM classifier is chosen which performs dual classification. In order to enhance this classification process, defects must be reduced and the best discrimination of the extracted features should be obtained hence particle swarm optimization (PSO) technique is being added along with SVM classifier in the training phase. To reduce the over fitting problem and risks of a single classifier a DT is being used in the testing phase for the exact identification of emotions (anger, disgust, fear, happiness, neutral and sadness) and therefore it obtains better performance than a single classifier. The complication of the decision tool is that it can increase the computation time. Thus to eliminate this defect whale optimization (WO) technique is being added to the decision tree to reduce the complexity of the system, which in turn lessens the time taken for recognizing the emotion of the speech signal. This formulated proposed SDCN system improves the recognition rate accurately. In this work, theMATLAB environment is being preferred to perform speech emotion recognition. 
Using the proposed technique the achieved accuracy of emotion detection is above 95% and the identification of various emotions exceeds 98% recognition rate with a computation time of 23 seconds, which has not been achieved so far by any other existing techniques.<\/jats:p>","DOI":"10.3233\/jifs-191753","type":"journal-article","created":{"date-parts":[[2020,3,6]],"date-time":"2020-03-06T10:41:09Z","timestamp":1583491269000},"page":"5175-5190","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":7,"title":["A novel stochastic deep conviction network for emotion recognition in speech signal"],"prefix":"10.1177","volume":"38","author":[{"given":"Shilpi","family":"Shukla","sequence":"first","affiliation":[{"name":"Assistant Professor, Mahatma Gandhi Mission\u2019s College of Engineering and Technology, Noida, Uttar Pradesh, India"}]},{"given":"Madhu","family":"Jain","sequence":"additional","affiliation":[{"name":"Associate Professor, Department of Electronics and Communication Engineering, Jaypee Institute of Information Technology, Noida, (Uttar Pradesh), 
India"}]}],"member":"179","published-online":{"date-parts":[[2020,3,6]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2012.20"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1049\/iet-ipr.2018.5728"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1080\/08839514.2015.1051891"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-016-3487-y"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCE.2009.5278031"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/10.846676"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1631\/FITEE.1400323"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-014-1768-9"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2011.2116010"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205597"},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","unstructured":"DengL. HintonG. and KingsburyB. New types of deep neural network learning for speech recognition and related applications: Anoverview In 2013 IEEE International Conference on Acoustics Speech and Signal Processing (2013) 8599\u20138603.","DOI":"10.1109\/ICASSP.2013.6639344"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"DengL. LiJ. HuangJ.T. YaoK. YuD. SeideF. SeltzerM. ZweigG. HeX. WilliamsJ. and GongY. Recent advances in deep learning for speech research at Microsoft In 2013 IEEE International Conference on Acoustics Speech and Signal Processing (2013) 8604\u20138608.","DOI":"10.1109\/ICASSP.2013.6639345"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2012.2237151"},{"key":"e_1_3_1_15_2","doi-asserted-by":"crossref","unstructured":"CuiX. GoelV. and KingsburyB. 
Data augmentation for deep convolutional neural network acoustic modelling In 2015 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) (2015) 4545\u20134549.","DOI":"10.1109\/ICASSP.2015.7178831"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.1127647"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1113\/jphysiol.1962.sp006837"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1038\/505146a"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1561\/2200000006"},{"key":"e_1_3_1_20_2","doi-asserted-by":"crossref","unstructured":"BengioY. Deep learning of representations: Looking forward In International Conference on Statistical Language and Speech Processing (2009) 1\u201337.","DOI":"10.1007\/978-3-642-39593-2_1"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205597"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ress.2013.02.022"},{"key":"e_1_3_1_23_2","doi-asserted-by":"crossref","unstructured":"HintonG.E. A practical guide to training restricted Boltzmann machines In Neural networks: Tricks of the Trade (2012) 599\u2013619.","DOI":"10.1007\/978-3-642-35289-8_32"},{"issue":"8","key":"e_1_3_1_24_2","first-page":"2404","article-title":"Increasing the performance of speech recognition system by using different optimization techniques to redesign artificial neural network","volume":"97","author":"Shukla S.","year":"2019","unstructured":"ShuklaS., JainM. 
and DubeyR.K., Increasing the performance of speech recognition system by using different optimization techniques to redesign artificial neural network, Journal of Theoretical and Applied Information Technology 97(8) (2019), 2404\u20132415.","journal-title":"Journal of Theoretical and Applied Information Technology"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-019-09639-0"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1155\/2013\/493973"},{"issue":"2","key":"e_1_3_1_27_2","first-page":"712","article-title":"Linear phase second order recursive digital integrators and differentiators","volume":"21","author":"Jain M.","year":"2012","unstructured":"JainM., GuptaM. and JainN., Linear phase second order recursive digital integrators and differentiators, Radioengineering 21(2) (2012), 712\u2013717.","journal-title":"Radioengineering"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.4103\/0377-2063.96183"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.3745\/JIPS.02.0001"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1504\/IJSISE.2016.075006"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1049\/iet-spr.2009.0030"},{"issue":"3","key":"e_1_3_1_32_2","article-title":"Automatic speech based emotion recognition using paralinguistics features","volume":"67","author":"Hook J.","year":"2019","unstructured":"HookJ., NooroziF., ToygarO. and AnbarjafariG., Automatic speech based emotion recognition using paralinguistics features, Bulletin of the Polish Academy of Sciences Technical Sciences 67(3) (2019).","journal-title":"Bulletin of the Polish Academy of Sciences Technical Sciences"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.bspc.2018.08.035"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-017-5292-7"},{"key":"e_1_3_1_35_2","doi-asserted-by":"crossref","unstructured":"PuJ. PanagakisY. and PanticM. 
Learning Low Rank and Sparse Models via Robust Autoencoders. In ICASSP 2019 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) (3192\u20133196).","DOI":"10.1109\/ICASSP.2019.8682925"},{"key":"e_1_3_1_36_2","doi-asserted-by":"crossref","unstructured":"GuptaS. KishalayaD. DineshA. and ThenkanidiyoorV. Recognition from Varying Length Patterns of Speech using CNN-based Segment-Level Pyramid Match Kernel-based SVMs In 2019 National Conference on Communications (NCC). (2019) 1\u20136","DOI":"10.1109\/NCC.2019.8732191"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTARS.2019.2898348"},{"key":"e_1_3_1_38_2","unstructured":"WeiP. and ZhaoY. A novel speech emotion recognition algorithm based on wavelet kernel sparse classifier in stacked deep auto-encoder model Personal and Ubiquitous Computing (2019) 1\u20139."},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/s12652-017-0644-8"},{"key":"e_1_3_1_40_2","doi-asserted-by":"crossref","unstructured":"WenG. LiH. HuangJ. LiD. and XunE. Random deep belief networks for recognizing emotions from speech signals Computational Intelligence and Neuroscience (2017).","DOI":"10.1155\/2017\/1945630"},{"key":"e_1_3_1_41_2","unstructured":"WangL. and SungE. AdaBoost with SVM-based component classifiers Engineering Applications of Artificial Intelligence (2008)."},{"key":"e_1_3_1_42_2","doi-asserted-by":"crossref","unstructured":"LiliG. LongbiaoW. DangJ. ZhangL. GuanH. and LiX. Speech Emotion Recognition by Combining Amplitude and Phase Information Using Convolutional Neural Network Proc Interspeech (2018) 1611\u20131615.","DOI":"10.21437\/Interspeech.2018-2156"},{"key":"e_1_3_1_43_2","doi-asserted-by":"crossref","unstructured":"BadshahA.M. AhmadJ. RahimN. and BaikS.W. Speech emotion recognition from spectrograms with deep convolutional neural network 2017. 
In International conference on platform technology and service (PlatCon) Busan South Korea (2017) 1\u20135.","DOI":"10.1109\/PlatCon.2017.7883728"},{"key":"e_1_3_1_44_2","doi-asserted-by":"crossref","unstructured":"LimW. JangD. and LeeT. Speech emotion recognition using convolutional and recurrent neural networks In 2016 Asia-Pacific signal and Information Processing Association Annual Summit and Conference (APSIPA) Jeju (2016) 1\u20134.","DOI":"10.1109\/APSIPA.2016.7820699"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-018-09572-8"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sigpro.2009.09.009"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2012.03.003"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2014.2311435"},{"key":"e_1_3_1_49_2","first-page":"1","article-title":"Emotion classification using segmentation of vowel-like and non-vowel-like regions","volume":"99","author":"Deb S.","year":"2017","unstructured":"DebS. 
and DandapatS., Emotion classification using segmentation of vowel-like and non-vowel-like regions, IEEE Transactions on Affective Computing 99 (2017), 1.","journal-title":"IEEE Transactions on Affective Computing"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.14569\/IJACSA.2019.0101249"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-191753","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-191753","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-191753","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,3]],"date-time":"2026-02-03T10:25:27Z","timestamp":1770114327000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-191753"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,6]]},"references-count":49,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,4,30]]}},"alternative-id":["10.3233\/JIFS-191753"],"URL":"https:\/\/doi.org\/10.3233\/jifs-191753","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,6]]}}}