{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T06:57:06Z","timestamp":1760597826054,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":52,"publisher":"ACM","license":[{"start":{"date-parts":[[2018,10,2]],"date-time":"2018-10-02T00:00:00Z","timestamp":1538438400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Google Faculty Research Award"},{"name":"University at Albany Faculty Research Award"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2018,10,2]]},"DOI":"10.1145\/3242969.3242972","type":"proceedings-article","created":{"date-parts":[[2018,10,2]],"date-time":"2018-10-02T12:09:29Z","timestamp":1538482169000},"page":"366-375","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Joint Discrete and Continuous Emotion Prediction Using Ensemble and End-to-End Approaches"],"prefix":"10.1145","author":[{"given":"Ehab A.","family":"AlBadawy","sequence":"first","affiliation":[{"name":"University at Albany, SUNY, Albany, NY, USA"}]},{"given":"Yelin","family":"Kim","sequence":"additional","affiliation":[{"name":"University at Albany, SUNY, Albany, NY, USA"}]}],"member":"320","published-online":{"date-parts":[[2018,10,2]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"et almbox","author":"Abadi Mart\u00edn","year":"2016","unstructured":"
Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. OSDI, Vol. 16. 265--283."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988257.2988264"},{"key":"e_1_3_2_1_3_1","volume-title":"IEMOCAP: Interactive emotional dyadic motion capture database. Language resources and evaluation","author":"Busso Carlos","year":"2008","unstructured":"Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower, Samuel Kim, Jeannette N Chang, Sungbok Lee, and Shrikanth S Narayanan. 2008. IEMOCAP: Interactive emotional dyadic motion capture database. Language resources and evaluation Vol. 42, 4 (2008), 335."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2808196.2811634"},{"key":"e_1_3_2_1_5_1","volume-title":"et almbox","author":"Cohen Ira","year":"2000","unstructured":"Ira Cohen, Ashutosh Garg, Thomas S Huang, et al. 2000. Emotion recognition from facial expressions using multilevel HMM. In Neural information processing systems, Vol. 2. Citeseer."},{"key":"e_1_3_2_1_6_1","volume-title":"Fifteenth Annual Conference of the International Speech Communication Association.","author":"Deng Li","year":"2014","unstructured":"Li Deng and John C Platt. 2014. Ensemble deep learning for speech recognition. 
In Fifteenth Annual Conference of the International Speech Communication Association."},{"key":"e_1_3_2_1_7_1","volume-title":"Classification in the presence of label noise: a survey","author":"Fr\u00e9nay Beno\u00eet","year":"2014","unstructured":"Beno\u00eet Fr\u00e9nay and Michel Verleysen. 2014. Classification in the presence of label noise: a survey. IEEE transactions on neural networks and learning systems Vol. 25, 5 (2014), 845--869."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Sayan Ghosh, Eugene Laksana, Louis-Philippe Morency, and Stefan Scherer. 2016. Representation Learning for Speech Emotion Recognition. INTERSPEECH. 3603--3607.","DOI":"10.21437\/Interspeech.2016-692"},{"key":"e_1_3_2_1_9_1","unstructured":"Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the thirteenth international conference on artificial intelligence and statistics. 
249--256."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2012.06.016"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123383"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2808196.2811641"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654984"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2017.01.012"},{"key":"e_1_3_2_1_17_1","volume-title":"Capturing Long-term Temporal Dependencies with Convolutional Networks for Continuous Emotion Recognition. arXiv preprint arXiv:1708.07050","author":"Khorram Soheil","year":"2017","unstructured":"Soheil Khorram, Zakaria Aldeneh, Dimitrios Dimitriadis, Melvin McInnis, and Emily Mower Provost. 2017. Capturing Long-term Temporal Dependencies with Convolutional Networks for Continuous Emotion Recognition. arXiv preprint arXiv:1708.07050 (2017)."},{"key":"e_1_3_2_1_18_1","volume-title":"A concordance correlation coefficient to evaluate reproducibility. Biometrics","author":"Lin Lawrence I-Kuei","year":"1989","unstructured":"Lawrence I-Kuei Lin. 1989. A concordance correlation coefficient to evaluate reproducibility. 
Biometrics (1989), 255--268."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-94"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818346.2830587"},{"key":"e_1_3_2_1_21_1","first-page":"50","article-title":"iVectors for continuous emotion recognition","volume":"45","author":"Lopez-Otero Paula","year":"2014","unstructured":"Paula Lopez-Otero, Laura Docio-Fernandez, and Carmen Garcia-Mateo. 2014. iVectors for continuous emotion recognition. Training Vol. 45 (2014), 50.","journal-title":"Training"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988257.2988267"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2014.2360798"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2014.2334294"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2011.20"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2013.2253768"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2011.40"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2006.11.004"},{"key":"e_1_3_2_1_29_1","volume-title":"Say Wei Foo, and Liyanage C De Silva","author":"Nwe Tin Lay","year":"2003","unstructured":"Tin Lay Nwe, Say Wei Foo, and Liyanage C De Silva. 2003. Speech emotion recognition using hidden Markov models. Speech communication Vol. 41, 4 (2003), 603--623."},{"key":"e_1_3_2_1_30_1","volume-title":"The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. 
Development and psychopathology","author":"Posner Jonathan","year":"2005","unstructured":"Jonathan Posner, James A Russell, and Bradley S Peterson. 2005. The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and psychopathology Vol. 17, 3 (2005), 715--734."},{"key":"e_1_3_2_1_31_1","volume-title":"IEEE 2011 workshop on automatic speech recognition and understanding. IEEE Signal Processing Society.","author":"Povey Daniel","year":"2011","unstructured":"Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, et al. 2011. The Kaldi speech recognition toolkit. In IEEE 2011 workshop on automatic speech recognition and understanding. IEEE Signal Processing Society."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988257.2988268"},{"key":"e_1_3_2_1_33_1","unstructured":"Fabien Ringeval, Bj\u00f6rn Schuller, Michel Valstar, Shashank Jaiswal, Erik Marchi, Denis Lalanne, Roddy Cowie, and Maja Pantic. 2015. 
Av$^"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2808196.2811642"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2013.6553805"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2011.34"},{"key":"e_1_3_2_1_37_1","volume-title":"2011 IEEE International Conference on. IEEE, 646--646","author":"Schr\u00f6der Marc","year":"2011","unstructured":"Marc Schr\u00f6der, Sathish Pammi, Hatice Gunes, Maja Pantic, Michel F Valstar, Roddy Cowie, Gary McKeown, Dirk Heylen, Mark Ter Maat, Florian Eyben, et al. 2011. Come and have an emotional workout with sensitive artificial listeners! In Automatic Face & Gesture Recognition and Workshops (FG 2011), 2011 IEEE International Conference on. IEEE, 646--646."},{"key":"e_1_3_2_1_38_1","volume-title":"ICME 2005. IEEE International Conference on. IEEE, 864--867","author":"Schuller Bj\u00f6rn","year":"2005","unstructured":"Bj\u00f6rn Schuller, Stephan Reiter, Ronald Muller, Marc Al-Hames, Manfred Lang, and Gerhard Rigoll. 2005. Speaker independent speech emotion recognition by ensemble classification. In Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on. 
IEEE, 864--867."},{"key":"e_1_3_2_1_39_1","volume-title":"ICME'03","volume":"1","author":"Schuller Bj\u00f6rn","year":"2003","unstructured":"Bj\u00f6rn Schuller, Gerhard Rigoll, and Manfred Lang. 2003. Hidden Markov model-based speech emotion recognition. In Multimedia and Expo, 2003. ICME'03. Proceedings. 2003 International Conference on, Vol. 1. IEEE, I--401."},{"key":"e_1_3_2_1_40_1","volume-title":"2011 IEEE International Conference on. IEEE, 803--808","author":"Soleymani Mohammad","year":"2011","unstructured":"Mohammad Soleymani, Sander Koelstra, Ioannis Patras, and Thierry Pun. 2011. Continuous emotion detection in response to music videos. In Automatic Face & Gesture Recognition and Workshops (FG 2011), 2011 IEEE International Conference on. IEEE, 803--808."},{"key":"e_1_3_2_1_41_1","volume-title":"Louis Ten Bosch, and Lou Boves","author":"Sun Yang","year":"2010","unstructured":"Yang Sun, Louis Ten Bosch, and Lou Boves. 2010. Hybrid HMM\/BLSTM-RNN for robust speech recognition. In International Conference on Text, Speech and Dialogue. 
Springer, 400--407."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472669"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2017.2764438"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988257.2988258"},{"key":"e_1_3_2_1_45_1","volume-title":"Proc. 9th Interspeech 2008 incorp. 12th Australasian Int. Conf. on Speech Science and Technology SST 2008","author":"W\u00f6llmer Martin","year":"2008","unstructured":"Martin W\u00f6llmer, Florian Eyben, Stephan Reiter, Bj\u00f6rn Schuller, Cate Cox, Ellen Douglas-Cowie, and Roddy Cowie. 2008. Abandoning emotion classes-towards continuous emotion recognition with modelling of long-range dependencies. In Proc. 9th Interspeech 2008 incorp. 12th Australasian Int. Conf. on Speech Science and Technology SST 2008, Brisbane, Australia. 597--600."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2012.6288834"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2010.2057200"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2993148.2993184"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2011.2118752"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3136755.3136792"},{"key":"e_1_3_2_1_51_1","volume-title":"Robert Swedberg, and Georg Essl .","author":"Zhang Biqiao","year":"2015","unstructured":"Biqiao Zhang, Emily Mower Provost, Robert Swedberg, and Georg Essl. 2015. 
Predicting Emotion Perception Across Domains: A Study of Singing and Speaking. AAAI. 1328--1335."},{"key":"e_1_3_2_1_52_1","volume-title":"2008 IEEE International Conference on. IEEE, 1369--1372","author":"Zhang Shiliang","year":"2008","unstructured":"Shiliang Zhang, Qi Tian, Shuqiang Jiang, Qingming Huang, and Wen Gao. 2008. Affective MTV analysis based on arousal and valence features. In Multimedia and Expo, 2008 IEEE International Conference on. IEEE, 1369--1372."}],"event":{"name":"ICMI '18: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION","sponsor":["SIGCHI Specialist Interest Group in Computer-Human Interaction of the ACM"],"location":"Boulder CO USA","acronym":"ICMI '18"},"container-title":["Proceedings of the 20th ACM International Conference on Multimodal 
Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3242969.3242972","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3242969.3242972","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:39:24Z","timestamp":1750210764000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3242969.3242972"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,10,2]]},"references-count":52,"alternative-id":["10.1145\/3242969.3242972","10.1145\/3242969"],"URL":"https:\/\/doi.org\/10.1145\/3242969.3242972","relation":{},"subject":[],"published":{"date-parts":[[2018,10,2]]},"assertion":[{"value":"2018-10-02","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}