{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T10:48:48Z","timestamp":1761648528564,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":77,"publisher":"ACM","license":[{"start":{"date-parts":[[2018,2,26]],"date-time":"2018-02-26T00:00:00Z","timestamp":1519603200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/"}],"funder":[{"name":"ONRG","award":["N62909-17-1-2002"],"award-info":[{"award-number":["N62909-17-1-2002"]}]},{"name":"Conicyt-PCHA\/Doctorado","award":["2014-21140711"],"award-info":[{"award-number":["2014-21140711"]}]},{"name":"Conicyt-Fondecyt","award":["1151306"],"award-info":[{"award-number":["1151306"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2018,2,26]]},"DOI":"10.1145\/3171221.3171280","type":"proceedings-article","created":{"date-parts":[[2018,3,6]],"date-time":"2018-03-06T13:17:07Z","timestamp":1520342227000},"page":"150-159","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":35,"title":["DNN-HMM based Automatic Speech Recognition for HRI Scenarios"],"prefix":"10.1145","author":[{"given":"Jos\u00e9","family":"Novoa","sequence":"first","affiliation":[{"name":"University of Chile, Santiago, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jorge","family":"Wuth","sequence":"additional","affiliation":[{"name":"University of Chile, Santiago, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Juan Pablo","family":"Escudero","sequence":"additional","affiliation":[{"name":"University of Chile, Santiago, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Josu\u00e9","family":"Fredes","sequence":"additional","affiliation":[{"name":"University of Chile, Santiago, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rodrigo","family":"Mahu","sequence":"additional","affiliation":[{"name":"University of Chile, Santiago, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"N\u00e9stor Becerra","family":"Yoma","sequence":"additional","affiliation":[{"name":"University of Chile, Santiago, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,2,26]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1561\/1100000005"},{"volume-title":"Proceedings of IEEE\/RSJ International Conference on Intelligent Robots and Systems","author":"Lopes L. S.","key":"e_1_3_2_1_2_1","unstructured":"L. S. Lopes and A. Teixeira . 2000. Human-robot interaction through spoken language dialogue . In Proceedings of IEEE\/RSJ International Conference on Intelligent Robots and Systems , Takamatsu, Japan. L. S. Lopes and A. Teixeira. 2000. Human-robot interaction through spoken language dialogue. In Proceedings of IEEE\/RSJ International Conference on Intelligent Robots and Systems, Takamatsu, Japan."},{"volume-title":"Proceedings of ACM\/IEEE International Conference on Human-Robot Interaction (HRI)","author":"Hoffman G.","key":"e_1_3_2_1_3_1","unstructured":"G. Hoffman and K. Vanunu . 2013. Effects of robotic companionship on music enjoyment and agent perception . In Proceedings of ACM\/IEEE International Conference on Human-Robot Interaction (HRI) , Tokyo, Japan. G. Hoffman and K. Vanunu. 2013. Effects of robotic companionship on music enjoyment and agent perception. In Proceedings of ACM\/IEEE International Conference on Human-Robot Interaction (HRI), Tokyo, Japan."},{"volume-title":"Proceedings of 12th International Conference on Control, Automation and Systems, JeJu Island, South Korea.","author":"Lin C. Y.","key":"e_1_3_2_1_4_1","unstructured":"C. Y. Lin , K. T. Song , Y. W. Chen , S. C. Chien , S. H. Chen , C. Y. Chiang , J. H. Yang , Y. C. Wu and T. J. Liu . 2012. User identification design by fusion of face recognition and speaker recognition . In Proceedings of 12th International Conference on Control, Automation and Systems, JeJu Island, South Korea. C. Y. Lin, K. T. Song, Y. W. Chen, S. C. Chien, S. H. Chen, C. Y. Chiang, J. H. Yang, Y. C. Wu and T. J. Liu. 2012. User identification design by fusion of face recognition and speaker recognition. In Proceedings of 12th International Conference on Control, Automation and Systems, JeJu Island, South Korea."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCA.2012.2216870"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5898\/JHRI.2.1.Kondo"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIM.2010.2047551"},{"key":"e_1_3_2_1_8_1","volume-title":"Compensating for Limitations in Speech-Based Natural Language Processing with Multimodal Interfaces in UAV Operation. In Advances in Human Factors in Robots and Unmanned Systems. AHFE","author":"Meszaros E. L.","year":"2017","unstructured":"E. L. Meszaros , M. Chandarana , A. Trujillo and B. D. Allen . 2018 . Compensating for Limitations in Speech-Based Natural Language Processing with Multimodal Interfaces in UAV Operation. In Advances in Human Factors in Robots and Unmanned Systems. AHFE 2017 . Advances in Intelligent Systems and Computing, California, LA, USA. E. L. Meszaros, M. Chandarana, A. Trujillo and B. D. Allen. 2018. Compensating for Limitations in Speech-Based Natural Language Processing with Multimodal Interfaces in UAV Operation. In Advances in Human Factors in Robots and Unmanned Systems. AHFE 2017. Advances in Intelligent Systems and Computing, California, LA, USA."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCE.2010.5506027"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cognition.2011.05.005"},{"volume-title":"DARPA Robotics Challenge","author":"Polido H.","key":"e_1_3_2_1_11_1","unstructured":"H. Polido . 2014. DARPA Robotics Challenge . Worcester Polytechnic Institute . H. Polido. 2014. DARPA Robotics Challenge. Worcester Polytechnic Institute."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/267658.267738"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/MGRS.2016.2540798"},{"volume-title":"Digital image processing and analysis: human and computer vision applications with CVIPtools","author":"Umbaugh S. E.","key":"e_1_3_2_1_14_1","unstructured":"S. E. Umbaugh . 2010. Digital image processing and analysis: human and computer vision applications with CVIPtools . CRC press . S. E. Umbaugh. 2010. Digital image processing and analysis: human and computer vision applications with CVIPtools. CRC press."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"crossref","unstructured":"W. Burger and M. J. Burge. 2016. Digital image processing: an algorithmic introduction using Java Springer.   W. Burger and M. J. Burge. 2016. Digital image processing: an algorithmic introduction using Java Springer.","DOI":"10.1007\/978-1-4471-6684-9"},{"volume-title":"Image sensors and signal processing for digital still cameras","author":"Nakamura J.","key":"e_1_3_2_1_16_1","unstructured":"J. Nakamura . 2016. Image sensors and signal processing for digital still cameras , CRC press . J. Nakamura. 2016. Image sensors and signal processing for digital still cameras, CRC press."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-49127-9_27"},{"key":"e_1_3_2_1_18_1","unstructured":"X. D. Huang Y. Ariki and M. A. Jack. 1990. Hidden Markov models for speech recognition. Edinburgh university press Edinburgh vol. 2004.  X. D. Huang Y. Ariki and M. A. Jack. 1990. Hidden Markov models for speech recognition. Edinburgh university press Edinburgh vol. 2004."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10044-014-0436-0"},{"key":"e_1_3_2_1_20_1","first-page":"275","volume-title":"Proceedings of DARPA Broadcast News Transcription and Understanding Workshop","author":"Chen S. F.","unstructured":"S. F. Chen , D. Beeferman and R. Rosenfeld . 1998. Evaluation metrics for language models . In Proceedings of DARPA Broadcast News Transcription and Understanding Workshop , pp. 275 -- 280 . S. F. Chen, D. Beeferman and R. Rosenfeld. 1998. Evaluation metrics for language models. In Proceedings of DARPA Broadcast News Transcription and Understanding Workshop, pp. 275--280."},{"volume-title":"Proceedings of the 2002 International Joint Conference on Neural Networks","author":"Chetouani M.","key":"e_1_3_2_1_21_1","unstructured":"M. Chetouani , B. Gas and J. Zarader . 2002. Discriminative Training for Neural Predictive Coding Applied to Speech Features Extraction . In Proceedings of the 2002 International Joint Conference on Neural Networks , Honolulu, HI, USA. M. Chetouani, B. Gas and J. Zarader. 2002. Discriminative Training for Neural Predictive Coding Applied to Speech Features Extraction. In Proceedings of the 2002 International Joint Conference on Neural Networks, Honolulu, HI, USA."},{"issue":"6","key":"e_1_3_2_1_22_1","first-page":"1","article-title":"Feature Extraction Methods LPC, PLP and MFCC in Speech Recognition","volume":"1","author":"Dave N.","year":"2013","unstructured":"N. Dave . 2013 . Feature Extraction Methods LPC, PLP and MFCC in Speech Recognition . International Journal for Advance Research in Engineering and Technology , vol. 1 , no. 6 , pp. 1 -- 5 . N. Dave. 2013. Feature Extraction Methods LPC, PLP and MFCC in Speech Recognition. International Journal for Advance Research in Engineering and Technology, vol. 1, no. 6, pp. 1--5.","journal-title":"International Journal for Advance Research in Engineering and Technology"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1986.1164788"},{"issue":"7","key":"e_1_3_2_1_24_1","first-page":"3464","article-title":"Language-model\/acoustic channel balance mechanism","volume":"23","author":"Bahl L.","year":"1980","unstructured":"L. Bahl , R. Bakis , E. Jelinek and R. Mercer . 1980 . Language-model\/acoustic channel balance mechanism . IBM Technical Disclosure Bulletin , vol. 23 , no. 7 B, pp. 3464 -- 3465 . L. Bahl, R. Bakis, E. Jelinek and R. Mercer. 1980. Language-model\/acoustic channel balance mechanism. IBM Technical Disclosure Bulletin, vol. 23, no. 7B, pp. 3464--3465.","journal-title":"IBM Technical Disclosure Bulletin"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205597"},{"key":"e_1_3_2_1_26_1","unstructured":"J. Godfrey and E. Holliman. 1997. Switchboard-1 Release 2. Linguistic Data Consortium Philadelphia.  J. Godfrey and E. Holliman. 1997. Switchboard-1 Release 2. Linguistic Data Consortium Philadelphia."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.2006.18.7.1527"},{"volume-title":"Proceedings of Workshop on Detection and Classification of Acoustic Scenes and Events","author":"Schr\u00f6der J.","key":"e_1_3_2_1_28_1","unstructured":"J. Schr\u00f6der , J. Anem\u00fcller and S. Goetze . 2016. Performance comparison of GMM, HMM and DNN based approaches for acoustic event detection within Task 3 of the DCASE 2016 challenge . In Proceedings of Workshop on Detection and Classification of Acoustic Scenes and Events , Budapest, Hungary. J. Schr\u00f6der, J. Anem\u00fcller and S. Goetze. 2016. Performance comparison of GMM, HMM and DNN based approaches for acoustic event detection within Task 3 of the DCASE 2016 challenge. In Proceedings of Workshop on Detection and Classification of Acoustic Scenes and Events, Budapest, Hungary."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2339736"},{"volume-title":"Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Graves A.","key":"e_1_3_2_1_31_1","unstructured":"A. Graves , A. R. Mohamed and G. Hinton . 2013. Speech recognition with deep recurrent neural networks . In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing , Vancouver, BC, Canada. A. Graves, A. R. Mohamed and G. Hinton. 2013. Speech recognition with deep recurrent neural networks. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada."},{"volume-title":"Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Tang Z.","key":"e_1_3_2_1_32_1","unstructured":"Z. Tang , D. Wang and Z. Zhang . 2016. Recurrent neural network training with dark knowledge transfer . In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing , Shanghai, China. Z. Tang, D. Wang and Z. Zhang. 2016. Recurrent neural network training with dark knowledge transfer. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Shanghai, China."},{"volume-title":"IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)","author":"Li J.","key":"e_1_3_2_1_33_1","unstructured":"J. Li , A. Mohamed , G. Zweig and Y. Gong . 2015. LSTM time and frequency recurrence for automatic speech recognition . In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) , Scottsdale, AZ, USA. J. Li, A. Mohamed, G. Zweig and Y. Gong. 2015. LSTM time and frequency recurrence for automatic speech recognition. In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA."},{"volume-title":"Proceedings of INTERSPEECH","author":"Sainath T. N.","key":"e_1_3_2_1_34_1","unstructured":"T. N. Sainath and B. Li . 2016. Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks . In Proceedings of INTERSPEECH , San Francisco, USA. T. N. Sainath and B. Li. 2016. Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks. In Proceedings of INTERSPEECH, San Francisco, USA."},{"volume-title":"Proceedings of INTERSPEECH","author":"Liu Y.","key":"e_1_3_2_1_35_1","unstructured":"Y. Liu and K. Kirchhoff . 2016. Novel Front-End Features Based on Neural Graph Embeddings for DNN-HMM and LSTM-CTC Acoustic Modeling . In Proceedings of INTERSPEECH , San Francisco, USA. Y. Liu and K. Kirchhoff. 2016. Novel Front-End Features Based on Neural Graph Embeddings for DNN-HMM and LSTM-CTC Acoustic Modeling. In Proceedings of INTERSPEECH, San Francisco, USA."},{"volume-title":"Proceedings of INTERSPEECH","author":"Yu D.","key":"e_1_3_2_1_36_1","unstructured":"D. Yu , W. Xiong , J. Droppo , A. Stolcke , G. Ye , J. Li and G. Zweig . 2016. Deep convolutional neural networks with layer-wise context expansion and attention . In Proceedings of INTERSPEECH , San Francisco, USA. D. Yu, W. Xiong, J. Droppo, A. Stolcke, G. Ye, J. Li and G. Zweig. 2016. Deep convolutional neural networks with layer-wise context expansion and attention. In Proceedings of INTERSPEECH, San Francisco, USA."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2016.2602884"},{"volume-title":"Proceedings of INTERSPEECH","author":"Mitra V.","key":"e_1_3_2_1_38_1","unstructured":"V. Mitra and H. Franco . 2016. Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition . In Proceedings of INTERSPEECH , San Francisco, USA. V. Mitra and H. Franco. 2016. Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition. In Proceedings of INTERSPEECH, San Francisco, USA."},{"volume-title":"Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Weng C.","key":"e_1_3_2_1_39_1","unstructured":"C. Weng , D. Yu , M. L. Seltzer and J. Droppo . 2014. Single-channel mixed speech recognition using deep neural networks . In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing , Florence, Italy. C. Weng, D. Yu, M. L. Seltzer and J. Droppo. 2014. Single-channel mixed speech recognition using deep neural networks. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Florence, Italy."},{"key":"e_1_3_2_1_40_1","volume-title":"Povey and others","author":"Young S.","year":"2006","unstructured":"S. Young , G. Evermann , M. Gales , T. Hain , D. Kershaw , X. Liu , G. Moore , J. Odell , D. Ollason , D. Povey and others . 2006 . The HTK book. Cambridge university engineering department, vol. 3 , p. 175. S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey and others. 2006. The HTK book. Cambridge university engineering department, vol. 3, p. 175."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/29.45616"},{"key":"e_1_3_2_1_42_1","unstructured":"W. Walker P. Lamere P. Kwok B. Raj R. Singh E. Gouvea P. Wolf and J. Woelfel. 2004. Sphinx-4: A flexible open source framework for speech recognition. Sun Microsystems Inc.  W. Walker P. Lamere P. Kwok B. Raj R. Singh E. Gouvea P. Wolf and J. Woelfel. 2004. Sphinx-4: A flexible open source framework for speech recognition. Sun Microsystems Inc."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"crossref","unstructured":"A. Lee T. Kawahara and K. Shikano. 2001. JULIUS - an open source real-time large vocabulary recognition engine. In Proceeding of INTERSPEECH Aalborg Denmark.  A. Lee T. Kawahara and K. Shikano. 2001. JULIUS - an open source real-time large vocabulary recognition engine. In Proceeding of INTERSPEECH Aalborg Denmark.","DOI":"10.21437\/Eurospeech.2001-396"},{"volume-title":"Proceedings of ASRU","author":"Povey D.","key":"e_1_3_2_1_44_1","unstructured":"D. Povey , A. Ghoshal , G. Boulianne , N. Goel , M. Hannemann , Y. Qian , P. Schwarz and G. Stemmer . 2011. The Kaldi Speech Recognition Toolkit . In Proceedings of ASRU , Hawaii, USA, December. D. Povey, A. Ghoshal, G. Boulianne, N. Goel, M. Hannemann, Y. Qian, P. Schwarz and G. Stemmer. 2011. The Kaldi Speech Recognition Toolkit. In Proceedings of ASRU, Hawaii, USA, December."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT.2012.6424249"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12369-013-0217-8"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2666242.2666248"},{"volume-title":"Proceedings of INTERSPEECH","author":"Cutugno F.","key":"e_1_3_2_1_48_1","unstructured":"F. Cutugno , A. Finzi , M. Fiore , E. Leone and S. Rossi . 2013. Interacting with robots via speech and gestures, an integrated architecture . In Proceedings of INTERSPEECH , Lyon, France. F. Cutugno, A. Finzi, M. Fiore, E. Leone and S. Rossi. 2013. Interacting with robots via speech and gestures, an integrated architecture. In Proceedings of INTERSPEECH, Lyon, France."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2016.2625818"},{"volume-title":"Proceedings of the 28th National Conference on Artificial Intelligence, Qu\u00e9bec City","author":"Matuszek C.","key":"e_1_3_2_1_50_1","unstructured":"C. Matuszek , L. Bo , L. Zettlemoyer and D. Fox . 2014. Learning from Unscripted Deictic Gesture and Language for Human-Robot Interactions . In Proceedings of the 28th National Conference on Artificial Intelligence, Qu\u00e9bec City , Quebec, Canada. C. Matuszek, L. Bo, L. Zettlemoyer and D. Fox. 2014. Learning from Unscripted Deictic Gesture and Language for Human-Robot Interactions. In Proceedings of the 28th National Conference on Artificial Intelligence, Qu\u00e9bec City, Quebec, Canada."},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2909824.3020229"},{"volume-title":"Proceedings of the Conference on Electronic Speech Signal Processing","author":"Lange P.","key":"e_1_3_2_1_52_1","unstructured":"P. Lange and D. Suendermann-Oeft . 2014. Tuning Sphinx to Outperform Google's Speech Recognition API . In Proceedings of the Conference on Electronic Speech Signal Processing , Dresden, Germany. P. Lange and D. Suendermann-Oeft. 2014. Tuning Sphinx to Outperform Google's Speech Recognition API. In Proceedings of the Conference on Electronic Speech Signal Processing, Dresden, Germany."},{"volume-title":"Proceedings of 23rd IEEE International Symposium on Robot and Human Interactive Communication","author":"Mubin O.","key":"e_1_3_2_1_53_1","unstructured":"O. Mubin , J. Henderson and C. Bartneck . 2014. You just do not understand me! Speech Recognition in Human Robot Interaction . In Proceedings of 23rd IEEE International Symposium on Robot and Human Interactive Communication , Edinburgh, Scotland. O. Mubin, J. Henderson and C. Bartneck. 2014. You just do not understand me! Speech Recognition in Human Robot Interaction. In Proceedings of 23rd IEEE International Symposium on Robot and Human Interactive Communication, Edinburgh, Scotland."},{"key":"e_1_3_2_1_54_1","unstructured":"M. Marge C. Bonial B. Byrne T. Cassidy A. W. Evans S. G. Hill and C. Voss. 2017. Applying the Wizard-of-Oz technique to multimodal human-robot dialogue. arXiv preprint arXiv:1703.03714.  M. Marge C. Bonial B. Byrne T. Cassidy A. W. Evans S. G. Hill and C. Voss. 2017. Applying the Wizard-of-Oz technique to multimodal human-robot dialogue. arXiv preprint arXiv:1703.03714."},{"volume-title":"Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction (HRI)","author":"Sequeira P.","key":"e_1_3_2_1_55_1","unstructured":"P. Sequeira , P. Alves-Oliveira , T. Ribeiro , E. Di Tullio , S. Petisca , F. S. Melo , G. Castellano and A. Paiva . 2016. Discovering social interaction strategies for robots from restricted-perception Wizard-of-Oz studies . In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction (HRI) , Christchurch, New Zealand. P. Sequeira, P. Alves-Oliveira, T. Ribeiro, E. Di Tullio, S. Petisca, F. S. Melo, G. Castellano and A. Paiva. 2016. Discovering social interaction strategies for robots from restricted-perception Wizard-of-Oz studies. In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction (HRI), Christchurch, New Zealand."},{"volume-title":"Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction","author":"Hensby K.","key":"e_1_3_2_1_56_1","unstructured":"K. Hensby , J. Wiles , M. Boden , S. Heath , M. Nielsen , P. Pounds , J. Riddell , K. Rogers , N. Rybak , V. Slaughter , M. Smith , J. Taufatofua , P. Worthy and J. Weigel . 2016. Hand in hand: Tools and techniques for understanding children- touch with a social robot . In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction , Christchurch, New Zealand. K. Hensby, J. Wiles, M. Boden, S. Heath, M. Nielsen, P. Pounds, J. Riddell, K. Rogers, N. Rybak, V. Slaughter, M. Smith, J. Taufatofua, P. Worthy and J. Weigel. 2016. Hand in hand: Tools and techniques for understanding children- touch with a social robot. In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction, Christchurch, New Zealand."},{"volume-title":"Proceedings of AAAI Spring Symposium Series","author":"Hoffman G.","key":"e_1_3_2_1_57_1","unstructured":"G. Hoffman . OpenWoZ : A Runtime-Configurable Wizard-of-Oz Framework for Human-Robot Interaction. 2016 . In Proceedings of AAAI Spring Symposium Series , Palo Alto, CA, USA. G. Hoffman. OpenWoZ: A Runtime-Configurable Wizard-of-Oz Framework for Human-Robot Interaction. 2016. In Proceedings of AAAI Spring Symposium Series, Palo Alto, CA, USA."},{"key":"e_1_3_2_1_58_1","volume-title":"Proceedings of AAAI Spring Symposium Series","author":"Martelaro N.","year":"2016","unstructured":"N. Martelaro . 2016 . Wizard-of-Oz Interfaces as a Step Towards Autonomous HRI . In Proceedings of AAAI Spring Symposium Series , Palo Alto, CA, USA. N. Martelaro. 2016. Wizard-of-Oz Interfaces as a Step Towards Autonomous HRI. In Proceedings of AAAI Spring Symposium Series, Palo Alto, CA, USA."},{"volume-title":"Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction","author":"Pourmehr S.","key":"e_1_3_2_1_59_1","unstructured":"S. Pourmehr , J. Thomas and R. Vaughan . 2016. What untrained people do when asked \"make the robot come to you \". In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction , Christchurch, New Zealand. S. Pourmehr, J. Thomas and R. Vaughan. 2016. What untrained people do when asked \"make the robot come to you\". In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction, Christchurch, New Zealand."},{"volume-title":"Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction","author":"Senft E.","key":"e_1_3_2_1_60_1","unstructured":"E. Senft , P. Baxter , J. Kennedy , S. Lemaignan and T. Belpaeme . 2016. Providing a robot with learning abilities improves its perception by users . In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction , Christchurch, New Zealand. E. Senft, P. Baxter, J. Kennedy, S. Lemaignan and T. Belpaeme. 2016. Providing a robot with learning abilities improves its perception by users. In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction, Christchurch, New Zealand."},{"volume-title":"Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction","author":"Westlund J. M. K.","key":"e_1_3_2_1_61_1","unstructured":"J. M. K. Westlund and C. Breazeal . 2016. Transparency, teleoperation, and children' understanding of social robots . In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction , Christchurch, New Zealand. J. M. K. Westlund and C. Breazeal. 2016. Transparency, teleoperation, and children' understanding of social robots. In Proceedings of 11th ACM\/IEEE International Conference on Human-Robot Interaction, Christchurch, New Zealand."},{"volume-title":"Proceedings of Hands-free Speech Communications and Microphone Arrays","author":"L\u00f6llmann H. W.","key":"e_1_3_2_1_62_1","unstructured":"H. W. L\u00f6llmann , A. Moore , P. A. Naylor , B. Rafaely , R. Horaud , A. Mazel and W. Kellermann . 2017. Microphone array signal processing for robot audition . In Proceedings of Hands-free Speech Communications and Microphone Arrays , San Francisco, CA, USA. H. W. L\u00f6llmann, A. Moore, P. A. Naylor, B. Rafaely, R. Horaud, A. Mazel and W. Kellermann. 2017. Microphone array signal processing for robot audition. In Proceedings of Hands-free Speech Communications and Microphone Arrays, San Francisco, CA, USA."},{"volume-title":"Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Brisbane, QLD, Australia.","author":"Deleforge A.","key":"e_1_3_2_1_63_1","unstructured":"A. Deleforge and W. Kellermann . 2015. Phase-optimized K-SVD for signal extraction from underdetermined multichannel sparse mixtures . In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Brisbane, QLD, Australia. A. Deleforge and W. Kellermann. 2015. Phase-optimized K-SVD for signal extraction from underdetermined multichannel sparse mixtures. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Brisbane, QLD, Australia."},{"volume-title":"Proceedings of Interspeech","author":"Novoa J.","key":"e_1_3_2_1_64_1","unstructured":"J. Novoa , J. Wuth , J. P. Escudero , J. Fredes , R. Mahu , R. Stern and N. B. Yoma . 2017. Robustness over time-varying channels in DNN-HMM ASR based human-robot interaction . In Proceedings of Interspeech , Stockholm, Sweden. J. Novoa, J. Wuth, J. P. Escudero, J. Fredes, R. Mahu, R. Stern and N. B. Yoma. 2017. Robustness over time-varying channels in DNN-HMM ASR based human-robot interaction. In Proceedings of Interspeech, Stockholm, Sweden."},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/1121241.1121272"},{"key":"e_1_3_2_1_66_1","volume-title":"Multichannel Robot Speech Recognition Database: MChRSR. arXiv preprint arXiv","author":"Novoa J.","year":"1801","unstructured":"J. Novoa , J. Wuth , J. P. Escudero , J. Fredes , R. Mahu and N. Becerra Yoma . 2017. Multichannel Robot Speech Recognition Database: MChRSR. arXiv preprint arXiv : 1801 .00061. J. Novoa, J. Wuth, J. P. Escudero, J. Fredes, R. Mahu and N. Becerra Yoma. 2017. Multichannel Robot Speech Recognition Database: MChRSR. arXiv preprint arXiv: 1801.00061."},{"volume-title":"Proceedings of 108th Audio Engineering Society Convention","author":"Farina A.","key":"e_1_3_2_1_67_1","unstructured":"A. Farina . Simultaneous measurement of impulse response and distortion with a swept-sine technique. 2000 . In Proceedings of 108th Audio Engineering Society Convention , Paris, France. A. Farina. Simultaneous measurement of impulse response and distortion with a swept-sine technique. 2000. In Proceedings of 108th Audio Engineering Society Convention, Paris, France."},{"volume-title":"Version 2.0, AU\/417\/02","author":"Hirsch G.","key":"e_1_3_2_1_68_1","unstructured":"G. Hirsch . 2002. Experimental Framework for the Performance Evaluation of Speech Recognition Front-ends on a Large Vocabulary Task , Version 2.0, AU\/417\/02 . ETSI STQ Aurora DSR Working Group . G. Hirsch. 2002. Experimental Framework for the Performance Evaluation of Speech Recognition Front-ends on a Large Vocabulary Task, Version 2.0, AU\/417\/02. ETSI STQ Aurora DSR Working Group."},{"volume-title":"FaNT filtering and noise adding tool","author":"Hirsch G.","key":"e_1_3_2_1_69_1","unstructured":"G. Hirsch . 2005. FaNT filtering and noise adding tool . Niederrhein University of Applied Sciences . G. Hirsch. 2005. FaNT filtering and noise adding tool. Niederrhein University of Applied Sciences."},{"key":"e_1_3_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2017.02.003"},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2017.02.001"},{"key":"e_1_3_2_1_72_1","doi-asserted-by":"crossref","unstructured":"K. Vesel\u00fd A. Ghoshal L. Burget and D. Povey. 2013. Sequence-discriminative training of deep neural networks. In Proceeding of INTERSPEECH Lyon France.  K. Vesel\u00fd A. Ghoshal L. Burget and D. Povey. 2013. Sequence-discriminative training of deep neural networks. In Proceeding of INTERSPEECH Lyon France.","DOI":"10.21437\/Interspeech.2013-548"},{"key":"e_1_3_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1996.540293"},{"key":"e_1_3_2_1_74_1","volume-title":"Speech Recognition (Version 3.7)","author":"Zhang A.","year":"2017","unstructured":"A. Zhang . Speech Recognition (Version 3.7) . 2017 . {Online}. Available: https:\/\/github.com\/Uberi\/speech_recognition#readme. {Accessed 5th September 2017}. A. Zhang. Speech Recognition (Version 3.7). 2017. {Online}. Available: https:\/\/github.com\/Uberi\/speech_recognition#readme. {Accessed 5th September 2017}."},{"volume-title":"Proceedings of INTERSPEECH","author":"Li B.","key":"e_1_3_2_1_75_1","unstructured":"B. Li , T. Sainath , A. Narayanan , J. Caroselli , M. Bacchiani , A. Misra , I. Shafran , H. Sak , G. Pundak , K. Chin and others. 2017. Acoustic Modeling for Google Home . In Proceedings of INTERSPEECH , Stockholm, Sweden. B. Li, T. Sainath, A. Narayanan, J. Caroselli, M. Bacchiani, A. Misra, I. Shafran, H. Sak, G. Pundak, K. Chin and others. 2017. Acoustic Modeling for Google Home. In Proceedings of INTERSPEECH, Stockholm, Sweden."},{"volume-title":"Proceedings of INTERSPEECH","author":"Saon G.","key":"e_1_3_2_1_76_1","unstructured":"G. Saon , H.-K. J. Kuo , S. Rennie and M. Picheny . 2015. The IBM 2015 English Conversational Telephone Speech Recognition System . In Proceedings of INTERSPEECH , Dresden, Germany. G. Saon, H.-K. J. Kuo, S. Rennie and M. Picheny. 2015. The IBM 2015 English Conversational Telephone Speech Recognition System. In Proceedings of INTERSPEECH, Dresden, Germany."},{"volume-title":"Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Xiong W.","key":"e_1_3_2_1_77_1","unstructured":"W. Xiong , J. Droppo , X. Huang , F. Seide , M. Seltzer , A. Stolcke , D. Yu and G. Zweig . 2017. The microsoft 2016 conversational speech recognition system . In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing , New Orleans, LA, USA. W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu and G. Zweig. 2017. The microsoft 2016 conversational speech recognition system. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, LA, USA."}],"event":{"name":"HRI '18: ACM\/IEEE International Conference on Human-Robot Interaction","sponsor":["SIGAI ACM Special Interest Group on Artificial Intelligence","SIGCHI ACM Special Interest Group on Computer-Human Interaction","IEEE-RAS Robotics and Automation"],"location":"Chicago IL USA","acronym":"HRI '18"},"container-title":["Proceedings of the 2018 ACM\/IEEE International Conference on Human-Robot Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3171221.3171280","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3171221.3171280","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:02:53Z","timestamp":1750215773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3171221.3171280"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,2,26]]},"references-count":77,"alternative-id":["10.1145\/3171221.3171280","10.1145\/3171221"],"URL":"https:\/\/doi.org\/10.1145\/3171221.3171280","relation":{},"subject":[],"published":{"date-parts":[[2018,2,26]]},"assertion":[{"value":"2018-02-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}