{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T22:04:54Z","timestamp":1780524294428,"version":"3.54.1"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2020,7,9]],"date-time":"2020-07-09T00:00:00Z","timestamp":1594252800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,9]],"date-time":"2020-07-09T00:00:00Z","timestamp":1594252800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["16SV7960"],"award-info":[{"award-number":["16SV7960"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Multimodal User Interfaces"],"published-print":{"date-parts":[[2021,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>While the research area of artificial intelligence benefited from increasingly sophisticated machine learning techniques in recent years, the resulting systems suffer from a loss of transparency and comprehensibility, especially for end-users. In this paper, we explore the effects of incorporating virtual agents into explainable artificial intelligence (XAI) designs on the perceived trust of end-users. For this purpose, we conducted a user study based on a simple speech recognition system for keyword classification. As a result of this experiment, we found that the integration of virtual agents leads to increased user trust in the XAI system. Furthermore, we found that the user\u2019s trust significantly depends on the modalities that are used within the user-agent interface design. The results of our study show a linear trend where the visual presence of an agent combined with a voice output resulted in greater trust than the output of text or the voice output alone. Additionally, we analysed the participants\u2019 feedback regarding the presented XAI visualisations. We found that increased human-likeness of and interaction with the virtual agent are the two most common mention points on how to improve the proposed XAI interaction design. Based on these results, we discuss current limitations and interesting topics for further research in the field of XAI. Moreover, we present design recommendations for virtual agents in XAI systems for future projects.<\/jats:p>","DOI":"10.1007\/s12193-020-00332-0","type":"journal-article","created":{"date-parts":[[2020,7,9]],"date-time":"2020-07-09T10:05:31Z","timestamp":1594289131000},"page":"87-98","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":103,"title":["\u201cLet me explain!\u201d: exploring the potential of virtual agents in explainable AI interaction design"],"prefix":"10.1007","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1001-2278","authenticated-orcid":false,"given":"Katharina","family":"Weitz","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7364-5772","authenticated-orcid":false,"given":"Dominik","family":"Schiller","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3516-6188","authenticated-orcid":false,"given":"Ruben","family":"Schlagowski","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5010-4006","authenticated-orcid":false,"given":"Tobias","family":"Huber","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2367-162X","authenticated-orcid":false,"given":"Elisabeth","family":"Andr\u00e9","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2020,7,9]]},"reference":[{"key":"332_CR1","doi-asserted-by":"crossref","unstructured":"Alqaraawi A, Schuessler M, Wei\u00df P, Costanza E, Berthouze N (2020) Evaluating saliency map explanations for convolutional neural networks: a user study. arXiv:2002.00772","DOI":"10.1145\/3377325.3377519"},{"issue":"7","key":"332_CR2","doi-asserted-by":"publisher","first-page":"e0130140","DOI":"10.1371\/journal.pone.0130140","volume":"10","author":"S Bach","year":"2015","unstructured":"Bach S, Binder A, Montavon G, Klauschen F, M\u00fcller KR, Samek W (2015) On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS One 10(7):e0130140. https:\/\/doi.org\/10.1371\/journal.pone.0130140","journal-title":"PloS One"},{"key":"332_CR3","doi-asserted-by":"crossref","unstructured":"Broekens J, Harbers M, Hindriks K, Van Den\u00a0Bosch K, Jonker C, Meyer JJ (2010) Do you get it? user-evaluated explainable BDI agents. In: German conference on multiagent system technologies. Springer, pp 28\u201339","DOI":"10.1007\/978-3-642-16178-0_5"},{"key":"332_CR4","doi-asserted-by":"crossref","unstructured":"Chen JYC, Procci K, Boyce M, Wright J, Garcia A, Barnes MJ (2014) Situation awareness-based agent transparency. US Army Research Laboratory","DOI":"10.21236\/ADA600351"},{"key":"332_CR5","doi-asserted-by":"publisher","unstructured":"Chiu CC, Marsella S (2011) How to train your avatar: a data driven approach to gesture generation. In: International workshop on intelligent virtual agents. Springer, pp 127\u2013140, https:\/\/doi.org\/10.1007\/978-3-642-23974-8_14","DOI":"10.1007\/978-3-642-23974-8_14"},{"key":"332_CR6","unstructured":"De Graaf MMA, Malle BF (2017) How people explain action (and autonomous intelligent systems should too). In: AAAI 2017 fall symposium on AI-HRI, pp 19\u201326"},{"issue":"2","key":"332_CR7","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1023\/B:VISI.0000022288.19776.77","volume":"59","author":"PF Felzenszwalb","year":"2004","unstructured":"Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comput Vis 59(2):167\u2013181. https:\/\/doi.org\/10.1023\/B:VISI.0000022288.19776.77","journal-title":"Int J Comput Vis"},{"issue":"1","key":"332_CR8","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1016\/j.tics.2003.10.016","volume":"8","author":"S Garrod","year":"2004","unstructured":"Garrod S, Pickering MJ (2004) Why is conversation so easy? Trends Cogn Sci 8(1):8\u201311. https:\/\/doi.org\/10.1016\/j.tics.2003.10.016","journal-title":"Trends Cogn Sci"},{"key":"332_CR9","unstructured":"Gatt A, Paggio P (2014) Learning when to point: a data-driven approach. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: Technical Papers, pp 2007\u20132017"},{"key":"332_CR10","doi-asserted-by":"crossref","unstructured":"Gilpin LH, Bau D, Yuan BZ, Bajwa A, Specter M, Kagal L (2018) Explaining explanations: an approach to evaluating interpretability of machine learning. arXiv:1806.00069","DOI":"10.1109\/DSAA.2018.00018"},{"key":"332_CR11","unstructured":"Gunning D (2017) Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA)"},{"issue":"3","key":"332_CR12","doi-asserted-by":"publisher","first-page":"407","DOI":"10.1177\/0018720814547570","volume":"57","author":"KA Hoff","year":"2015","unstructured":"Hoff KA, Bashir M (2015) Trust in automation: integrating empirical evidence on factors that influence trust. Hum Factors 57(3):407\u2013434. https:\/\/doi.org\/10.1177\/0018720814547570","journal-title":"Hum Factors"},{"key":"332_CR13","doi-asserted-by":"publisher","unstructured":"Hoffman JD, Patterson MJ, Lee JD, Crittendon ZB, Stoner HA, Seppelt BD, Linegang MP (2006) Human-automation collaboration in dynamic mission planning: a challenge requiring an ecological approach. In: Proceedings of the human factors and ergonomics society annual meeting, vol 50(23), pp 2482\u20132486. https:\/\/doi.org\/10.1177\/154193120605002304","DOI":"10.1177\/154193120605002304"},{"key":"332_CR14","doi-asserted-by":"publisher","DOI":"10.1207\/S15327566IJCE0401_04","author":"JY Jian","year":"2000","unstructured":"Jian JY, Bisantz AM, Drury CG (2000) Foundations for an empirically determined scale of trust in automated systems. Int J Cognit Ergon. https:\/\/doi.org\/10.1207\/S15327566IJCE0401_04","journal-title":"Int J Cognit Ergon"},{"key":"332_CR15","doi-asserted-by":"publisher","first-page":"326","DOI":"10.1016\/j.csl.2017.01.005","volume":"45","author":"T Kisler","year":"2017","unstructured":"Kisler T, Reichel U, Schiel F (2017) Multilingual processing of speech via web services. Comput Speech Lang 45:326\u2013347. https:\/\/doi.org\/10.1016\/j.csl.2017.01.005","journal-title":"Comput Speech Lang"},{"key":"332_CR16","doi-asserted-by":"crossref","unstructured":"Lane HC, Core MG, Van Lent M, Solomon S, Gomboc D (2005) Explainable artificial intelligence for training and tutoring. University of Southern California\/Institute for Creative Technologies, Tech. rep","DOI":"10.21236\/ADA459166"},{"issue":"1","key":"332_CR17","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1518\/hfes.46.1.50.30392","volume":"46","author":"JD Lee","year":"2004","unstructured":"Lee JD, See KA (2004) Trust in automation: designing for appropriate reliance. Hum Factors 46(1):50\u201380","journal-title":"Hum Factors"},{"issue":"10","key":"332_CR18","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1145\/3233231","volume":"61","author":"ZC Lipton","year":"2018","unstructured":"Lipton ZC (2018) The mythos of model interpretability. Commun ACM 61(10):36\u201343. https:\/\/doi.org\/10.1145\/3233231","journal-title":"Commun ACM"},{"issue":"3","key":"332_CR19","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1177\/0018720815621206","volume":"58","author":"JE Mercado","year":"2016","unstructured":"Mercado JE, Rupp MA, Chen JY, Barnes MJ, Barber D, Procci K (2016) Intelligent agent transparency in human-agent teaming for multi-UxV management. Hum Factors 58(3):401\u2013415. https:\/\/doi.org\/10.1177\/0018720815621206","journal-title":"Hum Factors"},{"key":"332_CR20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.artint.2018.07.007","volume":"267","author":"T Miller","year":"2018","unstructured":"Miller T (2018) Explanation in artificial intelligence: insights from the social sciences. Artif Intell 267:1\u201338. https:\/\/doi.org\/10.1016\/j.artint.2018.07.007","journal-title":"Artif Intell"},{"key":"332_CR21","unstructured":"Miller T, Howe P, Sonenberg L (2017) Explainable AI: beware of inmates running the asylum. In: IJCAI International joint conference on artificial intelligence, arXiv:1712.00547"},{"key":"332_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.dsp.2017.10.011","volume":"73","author":"G Montavon","year":"2017","unstructured":"Montavon G, Samek W, M\u00fcller KR (2017) Methods for interpreting and understanding deep neural networks. Digit Signal Process 73:1\u201315. https:\/\/doi.org\/10.1016\/j.dsp.2017.10.011","journal-title":"Digit Signal Process"},{"key":"332_CR23","doi-asserted-by":"publisher","unstructured":"Park DH, Hendricks LA, Akata Z, Rohrbach A, Schiele B, Darrell T, Rohrbach M (2018) Multimodal explanations: Justifying decisions and pointing to the evidence. In: 2018 IEEE conference on computer vision and pattern recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018, pp 8779\u20138788, https:\/\/doi.org\/10.1109\/CVPR.2018.00915","DOI":"10.1109\/CVPR.2018.00915"},{"issue":"1","key":"332_CR24","doi-asserted-by":"publisher","first-page":"20","DOI":"10.5334\/irsp.181","volume":"31","author":"M Perugini","year":"2018","unstructured":"Perugini M, Gallucci M, Costantini G (2018) A practical primer to power analysis for simple experimental designs. Int Rev Soc Psychol 31(1):20. https:\/\/doi.org\/10.5334\/irsp.181","journal-title":"Int Rev Soc Psychol"},{"key":"332_CR25","unstructured":"Ravenet B, Clavel C, Pelachaud C (2018) Automatic nonverbal behavior generation from image schemas. In: Proceedings of the 17th international conference on autonomous agents and multiagent systems, pp 1667\u20131674"},{"key":"332_CR26","doi-asserted-by":"publisher","unstructured":"Ribeiro MT, Singh S, Guestrin C (2016) Why should i trust you? Explaining the predictions of any classifier. In: Proceedings of the 22Nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1135\u20131144, https:\/\/doi.org\/10.1145\/2939672.2939778","DOI":"10.1145\/2939672.2939778"},{"key":"332_CR27","unstructured":"Richardson A, Rosenfeld A (2018) A survey of interpretability and explainability in human-agent systems. In: Proceedings of the 2nd workshop of explainable artificial intelligence, pp 137\u2013143"},{"key":"332_CR28","first-page":"1478","volume":"2015","author":"TN Sainath","year":"2015","unstructured":"Sainath TN, Parada C (2015) Convolutional neural networks for small-footprint keyword spotting. Proc Interspeech 2015:1478\u20131482","journal-title":"Proc Interspeech"},{"key":"332_CR29","unstructured":"Samek W, Wiegand T, M\u00fcller KR (2017) Explainable artificial intelligence: understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296 pp 1\u20138"},{"key":"332_CR30","unstructured":"Schmid U (2018) Inductive programming as approach to comprehensible machine learning. In: Proceedings of the 7th workshop on dynamics of knowledge and belief (DKB-2018) and the 6th workshop KI & Kognition (KIK-2018), co-located with 41st German conference on artificial intelligence, vol 2194"},{"key":"332_CR31","doi-asserted-by":"crossref","unstructured":"Selvaraju RR, Das A, Vedantam R, Cogswell M, Parikh D, Batra D (2017) Grad-cam: visual explanations from deep networks via gradient-based localization. In: The IEEE international conference on computer vision (ICCV) 2017, pp 618\u2013626","DOI":"10.1109\/ICCV.2017.74"},{"issue":"2","key":"332_CR32","doi-asserted-by":"publisher","first-page":"336","DOI":"10.1007\/s11263-019-01228-7","volume":"128","author":"RR Selvaraju","year":"2020","unstructured":"Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2020) Grad-cam: visual explanations from deep networks via gradient-based localization. Int J Comput Vis 128(2):336\u2013359","journal-title":"Int J Comput Vis"},{"key":"332_CR33","doi-asserted-by":"publisher","unstructured":"Siebers M, Schmid U (2018) Please delete that! why should I? Explaining learned irrelevance classifications of digital objects. KI - K\u00fcnstliche Intelligenz. https:\/\/doi.org\/10.1007\/s13218-018-0565-5","DOI":"10.1007\/s13218-018-0565-5"},{"key":"332_CR34","unstructured":"Simonyan K, Vedaldi A, Zisserman A (2013) Deep inside convolutional networks: visualising image classification models and saliency maps. http:\/\/arxiv.org\/abs\/1312.6034, arXiv:1312.6034"},{"issue":"2","key":"332_CR35","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1109\/MIS.2007.21","volume":"22","author":"K Stubbs","year":"2007","unstructured":"Stubbs K, Hinds PJ, Wettergreen D (2007) Autonomy and common ground in human-robot interaction: a field study. IEEE Intell Syst 22(2):42\u201350. https:\/\/doi.org\/10.1109\/MIS.2007.21","journal-title":"IEEE Intell Syst"},{"key":"332_CR36","unstructured":"Susan\u00a0Robinson MI David\u00a0Traum, Henderer J (2008) What would you ask a conversational agent? observations of human-agent dialogues in a museum setting. In: Proceedings of the 6th international conference on language resources and evaluation (LREC\u201908), European Language Resources Association"},{"issue":"1","key":"332_CR37","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1037\/h0071663","volume":"4","author":"EL Thorndike","year":"1920","unstructured":"Thorndike EL (1920) A constant error in psychological ratings. J Appl Psychol 4(1):25\u201329","journal-title":"J Appl Psychol"},{"key":"332_CR38","doi-asserted-by":"publisher","unstructured":"Van\u00a0Mulken S, Andr\u00e9 E, M\u00fcller J (1998) The persona effect: how substantial is it? In: People and computers XIII. Springer, pp 53\u201366, https:\/\/doi.org\/10.1007\/978-1-4471-36057_4","DOI":"10.1007\/978-1-4471-36057_4"},{"key":"332_CR39","unstructured":"Van\u00a0Mulken S, Andr\u00e9 E, M\u00fcller J (1999) An empirical study on the trustworthiness of life-like interface agents. In: Human\u2013Computer interaction: communication, cooperation, and application design, proceedings of 8th international conference on human\u2013computer interaction, 1999, pp 152\u2013156"},{"key":"332_CR40","unstructured":"Vinyals O, Le QV (2015) A neural conversational model. arXiv preprint arXiv:1506.05869"},{"key":"332_CR41","doi-asserted-by":"publisher","first-page":"147","DOI":"10.21437\/Interspeech.2018-1238","volume":"2018","author":"J Wagner","year":"2018","unstructured":"Wagner J, Schiller D, Seiderer A, Andr\u00e9 E (2018) Deep learning in paralinguistic recognition tasks: are hand-crafted features still relevant? Proc Interspeech 2018:147\u2013151","journal-title":"Proc Interspeech"},{"key":"332_CR42","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.patrec.2018.02.010","volume":"119","author":"J Wang","year":"2018","unstructured":"Wang J, Chen Y, Hao S, Peng X, Hu L (2018) Deep learning for sensor-based activity recognition: a survey. Pattern Recognit Lett 119:3\u201311. https:\/\/doi.org\/10.1016\/j.patrec.2018.02.010","journal-title":"Pattern Recognit Lett"},{"key":"332_CR43","unstructured":"Warden P (2018) Speech commands: a dataset for limited-vocabulary speech recognition. arXiv:1804.03209v1"},{"key":"332_CR44","doi-asserted-by":"publisher","unstructured":"Weitz K, Hassan T, Schmid U, Garbas JU (2019a) Deep-learned faces of pain and emotions: Elucidating the differences of facial expressions with the help of explainable ai methods. tm-Technisches Messen 86(7-8):404\u2013412, https:\/\/doi.org\/10.1515\/teme-2019-0024","DOI":"10.1515\/teme-2019-0024"},{"key":"332_CR45","doi-asserted-by":"publisher","unstructured":"Weitz K, Schiller D, Schlagowski R, Huber T, Andr\u00e9 E (2019b) \u201cDo you trust me?\u201d: Increasing user-trust by integrating virtual agents in explainable ai interaction design. In: Proceedings of the 19th ACM international conference on intelligent virtual agents. ACM, New York, NY, USA, IVA \u201919, pp 7\u20139, https:\/\/doi.org\/10.1145\/3308532.3329441","DOI":"10.1145\/3308532.3329441"},{"key":"332_CR46","doi-asserted-by":"publisher","unstructured":"Wu J, Ghosh S, Chollet M, Ly S, Mozgai S, Scherer S (2018) Nadia: Neural network driven virtual human conversation agents. In: Proceedings of the 18th international conference on intelligent virtual agents. ACM, pp 173\u2013178, https:\/\/doi.org\/10.1145\/3267851.3267860","DOI":"10.1145\/3267851.3267860"},{"key":"332_CR47","doi-asserted-by":"publisher","unstructured":"Zhang Z, Geiger J, Pohjalainen J, Mousa AED, Jin W, Schuller B (2018) Deep learning for environmentally robust speech recognition: an overview of recent developments. ACM Trans Intell Syst Technol (TIST) 9(5):49:1\u201349:28, https:\/\/doi.org\/10.1145\/3178115","DOI":"10.1145\/3178115"}],"container-title":["Journal on Multimodal User Interfaces"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12193-020-00332-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s12193-020-00332-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12193-020-00332-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,9]],"date-time":"2021-07-09T00:05:07Z","timestamp":1625789107000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s12193-020-00332-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,9]]},"references-count":47,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,6]]}},"alternative-id":["332"],"URL":"https:\/\/doi.org\/10.1007\/s12193-020-00332-0","relation":{},"ISSN":["1783-7677","1783-8738"],"issn-type":[{"value":"1783-7677","type":"print"},{"value":"1783-8738","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,9]]},"assertion":[{"value":"2 November 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 June 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 July 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflicts of interest"}}]}}