{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T18:17:59Z","timestamp":1776104279905,"version":"3.50.1"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2022,1,5]],"date-time":"2022-01-05T00:00:00Z","timestamp":1641340800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,1,5]],"date-time":"2022-01-05T00:00:00Z","timestamp":1641340800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004359","name":"Vetenskapsr\u00e5det","doi-asserted-by":"publisher","award":["2016-03698"],"award-info":[{"award-number":["2016-03698"]}],"id":[{"id":"10.13039\/501100004359","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100011898","name":"Marcus and Amalia Wallenberg Foundation","doi-asserted-by":"crossref","award":["MAW 2020.0052"],"award-info":[{"award-number":["MAW 2020.0052"]}],"id":[{"id":"10.13039\/501100011898","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J of Soc Robotics"],"published-print":{"date-parts":[[2022,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The large majority of previous work on human-robot conversations in a second language has been performed with a human wizard-of-Oz. The reasons are that automatic speech recognition of non-native conversational speech is considered to be unreliable and that the dialogue management task of selecting robot utterances that are adequate at a given turn is complex in social conversations. This study therefore investigates if robot-led conversation practice in a second language with pairs of adult learners could potentially be managed by an autonomous robot. 
We first investigate how correct and understandable transcriptions of second language learner utterances are when made by a state-of-the-art speech recogniser. We find both a relatively high word error rate (41%) and that a substantial share (42%) of the utterances are judged to be incomprehensible or only partially understandable by a human reader. We then evaluate how adequate the robot utterance selection is, when performed manually based on the speech recognition transcriptions or autonomously using (a) predefined sequences of robot utterances, (b) a general state-of-the-art language model that selects utterances based on learner input or the preceding robot utterance, or (c) a custom-made statistical method that is trained on observations of the wizard\u2019s choices in previous conversations. It is shown that adequate or at least acceptable robot utterances are selected by the human wizard in most cases (96%), even though the ASR transcriptions have a high word error rate. Further, the custom-made statistical method performs as well as manual selection of robot utterances based on ASR transcriptions. It was also found that the interaction strategy that the robot employed, which differed regarding how much the robot maintained the initiative in the conversation and if the focus of the conversation was on the robot or the learners, had marginal effects on the word error rate and understandability of the transcriptions but larger effects on the adequateness of the utterance selection. 
Autonomous robot-led conversations may hence work better with some robot interaction strategies.<\/jats:p>","DOI":"10.1007\/s12369-021-00849-8","type":"journal-article","created":{"date-parts":[[2022,1,5]],"date-time":"2022-01-05T10:02:47Z","timestamp":1641376967000},"page":"1067-1085","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Is a Wizard-of-Oz Required for Robot-Led Conversation Practice in a Second Language?"],"prefix":"10.1007","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4532-014X","authenticated-orcid":false,"given":"Olov","family":"Engwall","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8773-9216","authenticated-orcid":false,"given":"Jos\u00e9","family":"Lopes","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4472-4732","authenticated-orcid":false,"given":"Ronald","family":"Cumbal","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,1,5]]},"reference":[{"key":"849_CR1","doi-asserted-by":"crossref","unstructured":"Zhen-Jia Y, Chi-Yuh S, Chih-Wei C, L, BJ, Gwo-Dong C (2006) A robot as a teaching assistant in an English class. In: Sixth IEEE International Conference on Advanced Learning Technologies (ICALT\u201906). pp 87\u201391","DOI":"10.1109\/ICALT.2006.1652373"},{"key":"849_CR2","doi-asserted-by":"crossref","unstructured":"Park S, Han J, Kang B, Shin K (2011) Teaching assistant robot, robosem, in English class and practical issues for its diffusion. In: Proceedings of Advanced Robotics and its Social Impacts. 
pp 8\u201311","DOI":"10.1109\/ARSO.2011.6301971"},{"issue":"1","key":"849_CR3","doi-asserted-by":"publisher","first-page":"78","DOI":"10.5898\/JHRI.1.1.Tanaka","volume":"1","author":"F Tanaka","year":"2012","unstructured":"Tanaka F, Matsuzoe S (2012) Children teach a care-receiving robot to promote their learning: field experiments in a classroom for vocabulary learning. J Human Robot Interaction 1(1):78\u201395","journal-title":"J Human Robot Interaction"},{"key":"849_CR4","doi-asserted-by":"crossref","unstructured":"Alemi M, Meghdari A, Basiri NM, Taheri A (2015) The effect of applying humanoid robots as teacher assistants to help iranian autistic pupils learn english as a foreign language. In: Social Robotics. ICSR 2015. Lecture Notes in Computer Science. Volume 9388","DOI":"10.1007\/978-3-319-25554-5_1"},{"issue":"4","key":"849_CR5","first-page":"474","volume":"18","author":"E Mazzoni","year":"2015","unstructured":"Mazzoni E, Benvenuti M (2015) A robot-partner for preschool children learning english using socio-cognitive conflict. Educ Technol Soc 18(4):474\u2013485","journal-title":"Educ Technol Soc"},{"key":"849_CR6","doi-asserted-by":"crossref","unstructured":"Kennedy J, Baxter P, Senft E, Belpaeme T (2016) Social robot tutoring for child second language learning. In: The Eleventh ACM\/IEEE International Conference on Human Robot Interation (HRI \u201816). pp 231\u2013238","DOI":"10.1109\/HRI.2016.7451757"},{"key":"849_CR7","doi-asserted-by":"crossref","unstructured":"Khalifa A, Kato T, Yamamoto S (2017) Measuring effect of repetitive queries and implicit learning with joining-in type robot assisted language learning system. In: ISCA workshop on Speech and Language Technology in Education. 
pp 13\u201317","DOI":"10.21437\/SLaTE.2017-3"},{"issue":"5","key":"849_CR8","doi-asserted-by":"publisher","first-page":"962","DOI":"10.1109\/TRO.2007.904904","volume":"23","author":"T Kanda","year":"2007","unstructured":"Kanda T, Sato R, Saiwaki N, Ishiguro H (2007) A two-month field trial in an elementary school for long-term human-robot interaction. IEEE Trans Rob 23(5):962\u2013971","journal-title":"IEEE Trans Rob"},{"issue":"12","key":"849_CR9","first-page":"159","volume":"4","author":"J Han","year":"2008","unstructured":"Han J, Jo M, Jones V, Jo J (2008) Comparative study on the educational use of home robots for children. JIPS 4(12):159\u2013168","journal-title":"JIPS"},{"key":"849_CR10","unstructured":"Mubin O, Shahid S, Bartneck C (2013) Robot assisted language learning through games: a comparison of two case studies. Austral J Intell Information Proc Syst. Vol. 13"},{"key":"849_CR11","doi-asserted-by":"crossref","unstructured":"Gordon G, Spaulding S, Westlund J, Lee J, Plummer L, Martinez M, Das M, Breazeal C (2016) Affective personalization of a social robot tutor for children\u2019s second language skills. In: Proceedings of AAAI Conference on Artificial Intelligence","DOI":"10.1609\/aaai.v30i1.9914"},{"key":"849_CR12","doi-asserted-by":"crossref","unstructured":"Pereira A, Oertel C, Fermoselle, L, Mendelson J, Gustafson J (2020) Effects of different interaction contexts when evaluating gaze models in hri. 2020 15th ACM\/IEEE International Conference on Human-Robot Interaction (HRI) pp 131\u2013139","DOI":"10.1145\/3319502.3374810"},{"key":"849_CR13","unstructured":"Zollo T (1999) A study of human dialogue strategies the presence of speech recognition errors. In: AAAI Technical Report FS-99-03"},{"key":"849_CR14","doi-asserted-by":"crossref","unstructured":"Cavazza M (2001) An empirical study of speech recognition errors in a task-oriented dialogue system. In: Proceedings of the Second SIGdial Workshop on Discourse and Dialogue. 
(09)","DOI":"10.3115\/1118078.1118084"},{"key":"849_CR15","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1007\/s12369-020-00635-y","volume":"13","author":"O Engwall","year":"2020","unstructured":"Engwall O, Lopes J, \u00c5hlund A (2020) Robot interaction styles for conversation practice in second language learning. Int J Soc Robot 13:251\u2013276","journal-title":"Int J Soc Robot"},{"key":"849_CR16","doi-asserted-by":"crossref","unstructured":"Cumbal R, Moell, B, Lopes, J, Engwall O (2021) You don\u2019t understand me! comparing asr results for L1 and L2 speakers of swedish. In: Interspeech","DOI":"10.21437\/Interspeech.2021-2140"},{"key":"849_CR17","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1007\/s12369-018-0468-5","volume":"10","author":"T Arimoto","year":"2018","unstructured":"Arimoto T, Yoshikawa Y, Ishiguro H (2018) Multiple-robot conversational patterns for concealing incoherent responses. Int J Soc Robot 10:583\u2013593","journal-title":"Int J Soc Robot"},{"key":"849_CR18","doi-asserted-by":"crossref","unstructured":"Engwall O, Lopes J (2020) Interaction and collaboration in robot-assisted language learning for adults. Computer Assisted Language Learning pp 1\u201337","DOI":"10.1080\/09588221.2020.1799821"},{"key":"849_CR19","unstructured":"You ZJ, Shen CY, Chang CW, Liu BJ, Chen GD (2006) A robot as a teaching assistant in an english class. In: Proceedings - Sixth International Conference on Advanced Learning Technologies, ICALT 2006. Volume 2006. (08) pp 87 \u2013 91"},{"issue":"4","key":"849_CR20","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1007\/s12369-015-0286-y","volume":"7","author":"M Alemi","year":"2015","unstructured":"Alemi M, Meghdari A, Ghazisaedy M (2015) The impact of social robotics on l2 learners\u2018 anxiety and attitude in english vocabulary acquisition. 
Int J Soc Robot 7(4):523\u2013535","journal-title":"Int J Soc Robot"},{"issue":"04","key":"849_CR21","first-page":"296","volume":"16","author":"Y Wang","year":"2013","unstructured":"Wang Y, Young S, Jang JS (2013) Using tangible companions for enhancing learning english conversation. Educ Technol Soc 16(04):296\u2013309","journal-title":"Educ Technol Soc"},{"issue":"12","key":"849_CR22","first-page":"1","volume":"23","author":"Wu Wc","year":"2013","unstructured":"Wc Wu, Wang RJ, Chen NS (2013) Instructional design using an in-house built teaching assistant robot to enhance elementary school english-as-a-foreign-language learning. Interact Learn Environ 23(12):1\u201319","journal-title":"Interact Learn Environ"},{"issue":"1","key":"849_CR23","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1017\/S0958344010000273","volume":"23","author":"S Lee","year":"2013","unstructured":"Lee S, Noh H, Lee J, Lee K, Lee G, Sagong S, Kim M (2013) On the effectiveness of robot-assisted language learning. ReCALL 23(1):25\u201358","journal-title":"ReCALL"},{"issue":"1","key":"849_CR24","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1207\/s15327051hci1901&2_4","volume":"19","author":"T Kanda","year":"2004","unstructured":"Kanda T, Hirano T, Eaton D, Ishiguro H (2004) Interactive robots as social partners and peer tutors for children: a field trial. Human Comput Interact 19(1):61\u201384","journal-title":"Human Comput Interact"},{"issue":"2","key":"849_CR25","doi-asserted-by":"publisher","first-page":"259","DOI":"10.3102\/0034654318821286","volume":"89","author":"R van den Berghe","year":"2018","unstructured":"van den Berghe R, Verhagen J, Oudgenoeg-Paz O, van der Ven S, Leseman P (2018) Social robots for language learning: a review. 
Rev Educ Res 89(2):259\u2013295","journal-title":"Rev Educ Res"},{"issue":"12","key":"849_CR26","first-page":"1","volume":"9","author":"N Randall","year":"2019","unstructured":"Randall N (2019) A survey of robot-assisted language learning (rall). ACM Transact Human Robot Interact 9(12):1\u201336","journal-title":"ACM Transact Human Robot Interact"},{"key":"849_CR27","unstructured":"Khalifa A, Kato T, Yamamoto S (2016) Joining-in-type humanoid robot assisted language learning system. In: Proceedings of LREC. pp 245\u2013249"},{"key":"849_CR28","unstructured":"Engwall O, Lopes J, Cumbal R, Berndtson G, Lindstr\u00f6m R, Jin E, Johnston E, Mekonnen M, Tahir G Learner and teacher perspectives on robot-led l2 conversation practice. ReCALL (Accepted)"},{"issue":"1","key":"849_CR29","doi-asserted-by":"publisher","first-page":"59","DOI":"10.29140\/jaltcall.v13n1.212","volume":"13","author":"T Ashwell","year":"2017","unstructured":"Ashwell T, Elam JR (2017) How accurately can the google web speech api recognize and transcribe japanese l2 english learners\u2018 oral production? Jalt Call J 13(1):59\u201376","journal-title":"Jalt Call J"},{"key":"849_CR30","first-page":"1","volume":"2019(1) 3","author":"K Radzikowski","year":"2019","unstructured":"Radzikowski K, Nowak R, Wang L, Yoshie O (2019) Dual supervised learning for non-native speech recognition. EURASIP J Audio Speech Music Process 2019(1) 3:1\u201310","journal-title":"EURASIP J Audio Speech Music Process"},{"key":"849_CR31","doi-asserted-by":"crossref","unstructured":"Lund W, Kennard D, Ringger E (2013) Combining multiple thresholding binarization values to improve ocr output. In: Document Recognition and Retrieval XX. (02)","DOI":"10.1117\/12.2006228"},{"key":"849_CR32","doi-asserted-by":"crossref","unstructured":"Matassoni M, Gretter R, Falavigna D, Giuliani D (2018) Non-native children speech recognition through transfer learning. 
In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp 6229\u20136233","DOI":"10.1109\/ICASSP.2018.8462059"},{"key":"849_CR33","doi-asserted-by":"crossref","unstructured":"Martinez VR, Kennedy J (2020) A multiparty chat-based dialogue system with concurrent conversation tracking and memory. In: Proceedings of the 2nd Conference on Conversational User Interfaces. CUI \u201920, Association for Computing Machinery","DOI":"10.1145\/3405755.3406121"},{"key":"849_CR34","unstructured":"Adiwardana D, Luong MT, So DR, Hall J, Fiedel N, Thoppilan R, Yang Z, Kulshreshtha A, Nemade G, Lu Y, Le QV (2020) Towards a human-like open-domain chatbot. arXiv preprint arXiv:2001.09977"},{"key":"849_CR35","doi-asserted-by":"crossref","unstructured":"Roller S, Dinan E, Goyal N, Ju D, Williamson M, Liu Y, Xu J, Ott M, Shuster K, Smith EM, Boureau YL, Weston J (2021) Recipes for building an open-domain chatbot. In: EACL","DOI":"10.18653\/v1\/2021.eacl-main.24"},{"key":"849_CR36","doi-asserted-by":"crossref","unstructured":"Wen TH, Vandyke D, Mrk\u0161i\u0107 N, Ga\u0161i\u0107 M, Rojas-Barahona LM, Su PH, Ultes S, Young S (2017) A network-based end-to-end trainable task-oriented dialogue system. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Valencia, Spain, Association for Computational Linguistics (April) 438\u2013449","DOI":"10.18653\/v1\/E17-1042"},{"key":"849_CR37","doi-asserted-by":"crossref","unstructured":"Budzianowski P, Wen TH, Tseng BH, Casanueva I, Ultes S, Ramadan O, Ga\u0161i\u0107 M (2018) MultiWOZ - a large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modelling. 
In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, Association for Computational Linguistics (October-November) pp 5016\u20135026","DOI":"10.18653\/v1\/D18-1547"},{"key":"849_CR38","unstructured":"Lopes J, Garcia FJC, Hastie H (2020) The lab vs the crowd: An investigation into data quality for neural dialogue models. arXiv preprint arXiv:2012.03855"},{"key":"849_CR39","doi-asserted-by":"crossref","unstructured":"Jonell P, Fallgren P, Do\u011fan FI, Lopes J, Wennberg U, Skantze G (2019) Crowdsourcing a self-evolving dialog graph. In: Proceedings of the 1st International Conference on Conversational User Interfaces. pp 1\u20138","DOI":"10.1145\/3342775.3342790"},{"key":"849_CR40","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K (2019) Bert: Pre-training of deep bidirectional transformers for language understanding"},{"key":"849_CR41","doi-asserted-by":"crossref","unstructured":"Al Moubayed S, Beskow J, Skantze G, Granstrom B (2012) Furhat: a back-projected human-like robot head for multiparty human-machine interaction. In: Cognitive behavioural systems. Springer pp 114\u2013130","DOI":"10.1007\/978-3-642-34584-5_9"},{"key":"849_CR42","unstructured":"Yamaguchi T, Inoue, K, Yoshino K, Takanashi K, Ward NG, Kawahara T (2015) Analysis and prediction of morphological patterns of backchannels for attentive listening agents"},{"key":"849_CR43","unstructured":"Malmsten M, B\u00f6rjeson L, Haffenden C Playing with words at the national library of sweden \u2013 making a swedish bert. arxiv"},{"key":"849_CR44","doi-asserted-by":"crossref","unstructured":"Higashinaka R, Araki M, Tsukahara H, Mizukami M (2019) Improving taxonomy of errors in chat-oriented dialogue systems. In: D\u2019Haro LF, Banchs RE, Li H (Eds) 9th International Workshop on Spoken Dialogue System Technology. 
Singapore, Springer Singapore, pp 331\u2013343","DOI":"10.1007\/978-981-13-9443-0_29"}],"container-title":["International Journal of Social Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12369-021-00849-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s12369-021-00849-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12369-021-00849-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,21]],"date-time":"2023-01-21T22:01:48Z","timestamp":1674338508000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s12369-021-00849-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,5]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,6]]}},"alternative-id":["849"],"URL":"https:\/\/doi.org\/10.1007\/s12369-021-00849-8","relation":{},"ISSN":["1875-4791","1875-4805"],"issn-type":[{"value":"1875-4791","type":"print"},{"value":"1875-4805","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,5]]},"assertion":[{"value":"19 November 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 January 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest. 
The original datasets analysed during the current study are not publicly available due to privacy issues, as the subjects have been guaranteed that their audio and video data will not be distributed. However, log files and ASR transcriptions are available from the corresponding author on reasonable request.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}