{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:57:21Z","timestamp":1760245041446,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":27,"publisher":"ACM","license":[{"start":{"date-parts":[[2016,10,16]],"date-time":"2016-10-16T00:00:00Z","timestamp":1476576000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Horizon 2020 - Mixed Bison","award":["645523"],"award-info":[{"award-number":["645523"]}]},{"name":"IARPA","award":["W911NF-12-C-0013"],"award-info":[{"award-number":["W911NF-12-C-0013"]}]},{"name":"Technology Agency of the Czech Republic - MINT'","award":["TA04011311"],"award-info":[{"award-number":["TA04011311"]}]},{"name":"Horizon 2020 - Mixed Emotions","award":["644632"],"award-info":[{"award-number":["644632"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2016,10,16]]},"DOI":"10.1145\/2988257.2988268","type":"proceedings-article","created":{"date-parts":[[2016,10,12]],"date-time":"2016-10-12T18:34:04Z","timestamp":1476297244000},"page":"75-82","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":33,"title":["Multimodal Emotion Recognition for AVEC 2016 Challenge"],"prefix":"10.1145","author":[{"given":"Filip","family":"Povolny","sequence":"first","affiliation":[{"name":"Phonexia, Brno, Czech Rep"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pavel","family":"Matejka","sequence":"additional","affiliation":[{"name":"Brno University of Technology, Brno, Czech Rep"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michal","family":"Hradis","sequence":"additional","affiliation":[{"name":"Brno University of Technology, Brno, Czech Rep"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anna","family":"Popkov\u00e1","sequence":"additional","affiliation":[{"name":"Brno University of Technology, Brno, Czech Rep"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lubomir","family":"Otrusina","sequence":"additional","affiliation":[{"name":"Brno University of Technology, Brno, Czech Rep"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pavel","family":"Smrz","sequence":"additional","affiliation":[{"name":"Brno University of Technology, Brno, Czech Rep"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ian","family":"Wood","sequence":"additional","affiliation":[{"name":"National University of Ireland Galway, Galway, Ireland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cecile","family":"Robin","sequence":"additional","affiliation":[{"name":"National University of Ireland Galway, Galway, Ireland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lori","family":"Lamel","sequence":"additional","affiliation":[{"name":"LIMSI, Paris, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,10,16]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"01600","article-title":"AVEC 2016 - depression, mood, and emotion recognition workshop and challenge","volume":"1605","author":"Valstar Michel F.","year":"2016","unstructured":"Michel F. Valstar , Jonathan Gratch , Bj\u00f6rn W. Schuller , Fabien Ringeval , Denis Lalanne , Mercedes Torres , Stefan Scherer , Giota Stratou , Roddy Cowie , and Maja Pantic , \" AVEC 2016 - depression, mood, and emotion recognition workshop and challenge ,\" CoRR , vol. abs\/ 1605 . 01600 , 2016 . Michel F. Valstar, Jonathan Gratch, Bj\u00f6rn W. Schuller, Fabien Ringeval, Denis Lalanne, Mercedes Torres, Stefan Scherer, Giota Stratou, Roddy Cowie, and Maja Pantic, \"AVEC 2016 - depression, mood, and emotion recognition workshop and challenge,\" CoRR, vol. abs\/1605.01600, 2016.","journal-title":"CoRR"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_2_1","DOI":"10.1016\/j.imavis.2012.06.016"},{"key":"e_1_3_2_1_3_1","first-page":"48","volume-title":"Further investigation into multilingual training and adaptation of stacked bottle-neck neural network structure,\" in Proceedings of 2014 Spoken Language Technology Workshop","author":"Gr\u00e9zl F.","year":"2014","unstructured":"F. Gr\u00e9zl , E. Egorova , and M. Karafi\u00e1t , \" Further investigation into multilingual training and adaptation of stacked bottle-neck neural network structure,\" in Proceedings of 2014 Spoken Language Technology Workshop , 2014 , pp. 48 -- 53 . F. Gr\u00e9zl, E. Egorova, and M. Karafi\u00e1t, \"Further investigation into multilingual training and adaptation of stacked bottle-neck neural network structure,\" in Proceedings of 2014 Spoken Language Technology Workshop, 2014, pp. 48--53."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"crossref","first-page":"299","DOI":"10.21437\/Odyssey.2014-45","article-title":"Neural network bottleneck features for language identification","volume":"2014","author":"Matejka P.","year":"2014","unstructured":"P. Matejka , L. Zhang , T. Ng , H. S. Mallidi , O. Glembek , J. Ma , and B. Zhang , \" Neural network bottleneck features for language identification ,\" in Proceedings of Odyssey 2014 , 2014 , pp. 299 -- 304 . P. Matejka, L. Zhang, T. Ng, H. S. Mallidi, O. Glembek, J. Ma, and B. Zhang, \"Neural network bottleneck features for language identification,\" in Proceedings of Odyssey 2014, 2014, pp. 299--304.","journal-title":"Proceedings of Odyssey"},{"key":"e_1_3_2_1_5_1","first-page":"389","article-title":"Multilingual bottleneck features for language recognition","volume":"2015","author":"F\u00e9r R.","year":"2015","unstructured":"R. F\u00e9r , P. Matejka , F. Gr\u00e9zl , O. Plchot , and J. Cernock\u00fd , \" Multilingual bottleneck features for language recognition ,\" in Proceedings of Interspeech 2015 , 2015 , pp. 389 -- 393 . R. F\u00e9r, P. Matejka, F. Gr\u00e9zl, O. Plchot, and J. Cernock\u00fd, \"Multilingual bottleneck features for language recognition,\" in Proceedings of Interspeech 2015, 2015, pp. 389--393.","journal-title":"Proceedings of Interspeech"},{"key":"e_1_3_2_1_6_1","first-page":"2015","article-title":"Speaker recognition by means of acoustic and phonetically informed gmms","author":"Cumani S.","year":"2015","unstructured":"S. Cumani , P. Laface , and F. Kulsoom , \" Speaker recognition by means of acoustic and phonetically informed gmms ,\" in Proceedings of Interspeech 2015 , 2015 . S. Cumani, P. Laface, and F. Kulsoom, \"Speaker recognition by means of acoustic and phonetically informed gmms,\" in Proceedings of Interspeech 2015, 2015.","journal-title":"Proceedings of Interspeech"},{"key":"e_1_3_2_1_7_1","first-page":"2015","article-title":"Insights into deep neural networks for speaker recognition","author":"Garcia-Romero D.","year":"2015","unstructured":"D. Garcia-Romero and A. McCree , \" Insights into deep neural networks for speaker recognition ,\" in Proceedings of Interspeech 2015 , 2015 . D. Garcia-Romero and A. McCree, \"Insights into deep neural networks for speaker recognition,\" in Proceedings of Interspeech 2015, 2015.","journal-title":"Proceedings of Interspeech"},{"doi-asserted-by":"crossref","unstructured":"Anna Popkov\u00e1 Filip Povoln\u00fd Pavel Matejka Ondrej Glembek Frantisek Gr\u00e9zl and Jan \"Honza\" Cernock\u00fd \"Investigation of bottle-neck features for emotion recognition \" in Text Speech and Dialog (TSD) 2016.  Anna Popkov\u00e1 Filip Povoln\u00fd Pavel Matejka Ondrej Glembek Frantisek Gr\u00e9zl and Jan \"Honza\" Cernock\u00fd \"Investigation of bottle-neck features for emotion recognition \" in Text Speech and Dialog (TSD) 2016.","key":"e_1_3_2_1_8_1","DOI":"10.1007\/978-3-319-45510-5_49"},{"key":"e_1_3_2_1_9_1","first-page":"04031","article-title":"Facial landmark detection with tweaked convolutional neural networks","volume":"1511","author":"Wu Yue","year":"2015","unstructured":"Yue Wu and Tal Hassner , \" Facial landmark detection with tweaked convolutional neural networks ,\" CoRR , vol. abs\/ 1511 . 04031 , 2015 . Yue Wu and Tal Hassner, \"Facial landmark detection with tweaked convolutional neural networks,\" CoRR, vol. abs\/1511.04031, 2015.","journal-title":"CoRR"},{"key":"e_1_3_2_1_10_1","first-page":"137","volume-title":"Neural Probabilistic Language Models","author":"Bengio Yoshua","year":"2006","unstructured":"Yoshua Bengio , Holger Schwenk , Jean-S\u00e9bastien Sen\u00e9cal , Fr\u00e9deric Morin , and Jean-Luc Gauvain , Neural Probabilistic Language Models , pp. 137 -- 186 , Springer Berlin Heidelberg, Berlin , Heidelberg , 2006 . Yoshua Bengio, Holger Schwenk, Jean-S\u00e9bastien Sen\u00e9cal, Fr\u00e9deric Morin, and Jean-Luc Gauvain, Neural Probabilistic Language Models, pp. 137--186, Springer Berlin Heidelberg, Berlin, Heidelberg, 2006."},{"key":"e_1_3_2_1_11_1","first-page":"3111","volume-title":"Eds.","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov , Ilya Sutskever , Kai Chen , Greg S Corrado , and Jeff Dean , \" Distributed Representations of Words and Phrases and their Compositionality,\" in Advances in Neural Information Processing Systems 26, C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger , Eds. , pp. 3111 -- 3119 . Curran Associates, Inc. , 2013 , 00754. Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean, \"Distributed Representations of Words and Phrases and their Compositionality,\" in Advances in Neural Information Processing Systems 26, C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, Eds., pp. 3111--3119. Curran Associates, Inc., 2013, 00754."},{"key":"e_1_3_2_1_12_1","volume-title":"FONETIK","author":"Kornel Laskowski Mattias Heldner","year":"2008","unstructured":"Mattias Heldner Kornel Laskowski and Jens Edlund , \"The fundamental frequency variation spectrum,\" in Proc . FONETIK , 2008 . Mattias Heldner Kornel Laskowski and Jens Edlund, \"The fundamental frequency variation spectrum,\" in Proc. FONETIK, 2008."},{"key":"e_1_3_2_1_13_1","volume-title":"USA","author":"Karafi\u00e1t M.","year":"2014","unstructured":"M. Karafi\u00e1t , K Vesel\u00fd , I. Szoke , L. Burget , F. Gr\u00e9zl , M. Hannemann , and J. Cernock\u00fd , \" But ASR system for BABEL surprise evaluation 2014,\" in Spoken Language Technology Workshop (SLT), 2014 IEEE, NV , USA , Dec 2014 . M. Karafi\u00e1t, K Vesel\u00fd, I. Szoke, L. Burget, F. Gr\u00e9zl, M. Hannemann, and J. Cernock\u00fd, \"But ASR system for BABEL surprise evaluation 2014,\" in Spoken Language Technology Workshop (SLT), 2014 IEEE, NV, USA, Dec 2014."},{"unstructured":"M. Harper \"The BABEL program and low resource speech technology \" in ASRU 2013 Dec 2013.  M. Harper \"The BABEL program and low resource speech technology \" in ASRU 2013 Dec 2013.","key":"e_1_3_2_1_14_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_15_1","DOI":"10.1016\/S0167-6393(01)00061-9"},{"issue":"5","key":"e_1_3_2_1_16_1","first-page":"1335","article-title":"Partitioning and transcription of broadcast news data","volume":"98","author":"Gauvain Jean-Luc","year":"1998","unstructured":"Jean-Luc Gauvain , Lori Lamel , and Gilles Adda , \" Partitioning and transcription of broadcast news data .,\" ICSLP , vol. 98 , no. 5 , pp. 1335 -- 1338 , 1998 . Jean-Luc Gauvain, Lori Lamel, and Gilles Adda, \"Partitioning and transcription of broadcast news data.,\" ICSLP, vol. 98, no. 5, pp. 1335--1338, 1998.","journal-title":"ICSLP"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"crossref","first-page":"1433","DOI":"10.21437\/Interspeech.2008-414","article-title":"Transcribing broadcast data using mlp features","volume":"8","author":"Fousek Petr","year":"2008","unstructured":"Petr Fousek , Lori Lamel , and Jean-Luc Gauvain , \" Transcribing broadcast data using mlp features .,\" InterSpeech , vol. 8 , pp. 1433 -- 1436 , 2008 . Petr Fousek, Lori Lamel, and Jean-Luc Gauvain, \"Transcribing broadcast data using mlp features.,\" InterSpeech, vol. 8, pp. 1433--1436, 2008.","journal-title":"InterSpeech"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.21437\/Interspeech.2005-544","article-title":"Where are we in transcribing french broadcast news?","author":"Gauvain Jean-Luc","year":"2005","unstructured":"Jean-Luc Gauvain , Gilles Adda , Martine Adda-Decker , Alexandre Allauzen , Veronique Gendner , Lori Lamel , and Holger Schwenk , \" Where are we in transcribing french broadcast news? ,\" in InterSpeech , 2005 , pp. 1665 -- 1668 . Jean-Luc Gauvain, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Veronique Gendner, Lori Lamel, and Holger Schwenk, \"Where are we in transcribing french broadcast news?,\" in InterSpeech, 2005, pp. 1665--1668.","journal-title":"InterSpeech"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_19_1","DOI":"10.1006\/csla.1995.0010"},{"key":"e_1_3_2_1_20_1","first-page":"1","article-title":"Multilingual speech processing activities in quaero: Application to multimedia search in unstructured data","author":"Lamel Lori","year":"2012","unstructured":"Lori Lamel , \" Multilingual speech processing activities in quaero: Application to multimedia search in unstructured data .,\" in Baltic HLT , 2012 , pp. 1 -- 8 . Lori Lamel, \"Multilingual speech processing activities in quaero: Application to multimedia search in unstructured data.,\" in Baltic HLT, 2012, pp. 1--8.","journal-title":"Baltic HLT"},{"volume-title":"a collection of very large linguistically processed web-crawled corpora,\" Language resources and evaluation","author":"Baroni Marco","unstructured":"Marco Baroni , Silvia Bernardini , Adriano Ferraresi , and Eros Zanchetta , \" The WaCky wide web : a collection of very large linguistically processed web-crawled corpora,\" Language resources and evaluation , vol. 43 , no. 3, pp. 209--226, 2009, 00587. Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta, \"The WaCky wide web: a collection of very large linguistically processed web-crawled corpora,\" Language resources and evaluation, vol. 43, no. 3, pp. 209--226, 2009, 00587.","key":"e_1_3_2_1_21_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_22_1","DOI":"10.1145\/2808196.2811642"},{"key":"e_1_3_2_1_23_1","volume-title":"Workshop on Emotion Representation, Analysis and Synthesis in Continuous Time and Space (EmoSPACE)","author":"Sonderegger F. Ringevaland A.","year":"2013","unstructured":"F. Ringevaland A. Sonderegger , J. Sauer , and D. Lalanne , \" Introducing the recola multimodal corpus of remote collaborative and affective interactions,\" in Proc. Face and Gestures 2013 , Workshop on Emotion Representation, Analysis and Synthesis in Continuous Time and Space (EmoSPACE) , 2013 . F. Ringevaland A. Sonderegger, J. Sauer, and D. Lalanne, \"Introducing the recola multimodal corpus of remote collaborative and affective interactions,\" in Proc. Face and Gestures 2013, Workshop on Emotion Representation, Analysis and Synthesis in Continuous Time and Space (EmoSPACE), 2013."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_24_1","DOI":"10.1145\/2808196.2811641"},{"key":"e_1_3_2_1_25_1","first-page":"5701","article-title":"ADADELTA: an adaptive learning rate method","volume":"1212","author":"Zeiler Matthew D.","year":"2012","unstructured":"Matthew D. Zeiler , \" ADADELTA: an adaptive learning rate method ,\" CoRR , vol. abs\/ 1212 . 5701 , 2012 . Matthew D. Zeiler, \"ADADELTA: an adaptive learning rate method,\" CoRR, vol. abs\/1212.5701, 2012.","journal-title":"CoRR"},{"doi-asserted-by":"crossref","unstructured":"G. Trigeorgis F. Ringeval R. Brueckner E. Marchi M. Nicolaou Schuller B. and S. Zafeiriou \"Adieu features' end-to-end speech emotion recognition using a deep convolutional recurrent network \" in Proceedings of the 41st IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) 2016.  G. Trigeorgis F. Ringeval R. Brueckner E. Marchi M. Nicolaou Schuller B. and S. Zafeiriou \"Adieu features' end-to-end speech emotion recognition using a deep convolutional recurrent network \" in Proceedings of the 41st IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) 2016.","key":"e_1_3_2_1_26_1","DOI":"10.1109\/ICASSP.2016.7472669"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_27_1","DOI":"10.1145\/2808196.2811634"}],"event":{"sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"acronym":"MM '16","name":"MM '16: ACM Multimedia Conference","location":"Amsterdam The Netherlands"},"container-title":["Proceedings of the 6th International Workshop on Audio\/Visual Emotion Challenge"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2988257.2988268","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2988257.2988268","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:50:36Z","timestamp":1750218636000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2988257.2988268"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,10,16]]},"references-count":27,"alternative-id":["10.1145\/2988257.2988268","10.1145\/2988257"],"URL":"https:\/\/doi.org\/10.1145\/2988257.2988268","relation":{},"subject":[],"published":{"date-parts":[[2016,10,16]]},"assertion":[{"value":"2016-10-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}