{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T02:04:58Z","timestamp":1774577098574,"version":"3.50.1"},"reference-count":71,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2024,9,30]],"date-time":"2024-09-30T00:00:00Z","timestamp":1727654400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Auton. Adapt. Syst."],"published-print":{"date-parts":[[2024,9,30]]},"abstract":"<jats:p>Online reinforcement learning (RL) is increasingly used for realizing adaptive systems in the presence of design time uncertainty because Online RL can leverage data only available at run time. With Deep RL gaining interest, the learned knowledge is no longer represented explicitly but hidden in the parameterization of the underlying artificial neural network. For a human, it thus becomes practically impossible to understand the decision-making of Deep RL, which makes it difficult for (1) software engineers to perform debugging, (2) system providers to comply with relevant legal frameworks, and (3) system users to build trust. The explainable RL technique XRL-DINE, introduced in earlier work, provides insights into why certain decisions were made at important time steps. Here, we perform an empirical user study concerning XRL-DINE involving 73 software engineers split into treatment and control groups. The treatment group is given access to XRL-DINE, while the control group is not. We analyze (1) the participants\u2019 performance in answering concrete questions related to the decision-making of Deep RL, (2) the participants\u2019 self-assessed confidence in giving the right answers, (3) the perceived usefulness and ease of use of XRL-DINE, and (4) the concrete usage of the XRL-DINE dashboard.<\/jats:p>","DOI":"10.1145\/3666005","type":"journal-article","created":{"date-parts":[[2024,7,22]],"date-time":"2024-07-22T11:56:58Z","timestamp":1721649418000},"page":"1-44","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["A User Study on Explainable Online Reinforcement Learning for Adaptive Systems"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4808-8297","authenticated-orcid":false,"given":"Andreas","family":"Metzger","sequence":"first","affiliation":[{"name":"paluno (Ruhr Institute for Software Technology), University of Duisburg-Essen, Essen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3339-1760","authenticated-orcid":false,"given":"Jan","family":"Laufer","sequence":"additional","affiliation":[{"name":"paluno (Ruhr Institute for Software Technology), University of Duisburg-Essen, Essen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-2585-0062","authenticated-orcid":false,"given":"Felix","family":"Feit","sequence":"additional","affiliation":[{"name":"paluno (Ruhr Institute for Software Technology), University of Duisburg-Essen, Essen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2199-5257","authenticated-orcid":false,"given":"Klaus","family":"Pohl","sequence":"additional","affiliation":[{"name":"paluno (Ruhr Institute for Software Technology), University of Duisburg-Essen, Essen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,9,30]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1186\/s13174-021-00145-8"},{"key":"e_1_3_2_3_2","first-page":"13544","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems 2019 (NeurIPS \u201919)","author":"Arjona-Medina Jose A.","year":"2019","unstructured":"Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, and Sepp Hochreiter. 2019. RUDDER: Return decomposition for delayed rewards. In Proceedings of the Annual Conference on Neural Information Processing Systems 2019 (NeurIPS \u201919). Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d\u2019Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.), 13544\u201313555."},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICECCS20050.2012.6299211"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACSOS49614.2020.00047"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-44482-6_4"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/2768829"},{"issue":"3","key":"e_1_3_2_8_2","first-page":"61:1","article-title":"A survey and taxonomy of self-aware and self-adaptive cloud autoscaling systems","volume":"51","author":"Chen Tao","year":"2018","unstructured":"Tao Chen, Rami Bahsoon, and Xin Yao. 2018. A survey and taxonomy of self-aware and self-adaptive cloud autoscaling systems. ACM Comput. Surv. 51, 3 (2018), 61:1\u201361:40.","journal-title":"ACM Comput. Surv."},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-09972-6"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.2307\/249008"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TEM.2003.822468"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2019.07.002"},{"key":"e_1_3_2_13_2","volume-title":"Proceedings of the 2014 AAAI Spring Symposia","author":"Dewey Daniel","year":"2014","unstructured":"Daniel Dewey. 2014. Reinforcement learning and the reward engineering principle. In Proceedings of the 2014 AAAI Spring Symposia. AAAI Press."},{"key":"e_1_3_2_14_2","unstructured":"Finale Doshi-Velez and Been Kim. 2017. Towards a rigorous science of interpretable machine learning. arXiv:1702.08608. Retrieved from http:\/\/arxiv.org\/abs\/1702.08608"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACSOS58161.2023.00028"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1201\/9780203485088"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACSOS55765.2022.00023"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-15565-9_3"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/FAS-W.2019.00018"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/s40860-022-00198-x"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3236009"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2020.106685"},{"key":"e_1_3_2_24_2","volume-title":"Proceedings of the IJCAI\/ECAI Workshop on Explainable Artificial Intelligence","author":"Juozapaitis Zoe","year":"2019","unstructured":"Zoe Juozapaitis, Anurag Koul, Alan Fern, Martin Erwig, and Finale Doshi-Velez. 2019. Explainable reinforcement learning via reward decomposition. In Proceedings of the IJCAI\/ECAI Workshop on Explainable Artificial Intelligence."},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/381641.381656"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2003.1160055"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.im.2006.05.003"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2002.1027796"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1037\/\/0022-3514.77.6.1121"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/METRIC.1998.731237"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/SEAMS51251.2021.00017"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/s12559-022-10067-7"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177730491"},{"key":"e_1_3_2_34_2","doi-asserted-by":"crossref","unstructured":"Nikola Maranguni\u0107 and Andrina Grani\u0107. 2015. Technology acceptance model: A literature review from 1986 to 2013. Univers. Access Inf. Soc. 14 1 (2015) 81\u201395.","DOI":"10.1007\/s10209-014-0348-1"},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","unstructured":"Alfonso Eduardo M\u00e1rquez-Chamorro Manuel Resinas and Antonio Ruiz-Cort\u00e9s. 2018. Predictive Monitoring of Business Processes: A Survey. IEEE Trans. Serv. Comput. 11 6 (2018) 962\u2013977.","DOI":"10.1109\/TSC.2017.2772256"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-48421-6_22"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2023.102254"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-24755-2_22"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00607-022-01052-x"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2018.07.007"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3387166"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/SEAMS.2017.2"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3194133.3194163"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/52.595956"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.5555\/2680842.2681219"},{"key":"e_1_3_2_47_2","first-page":"2772","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 12 (NIPS \u201917)","author":"Nachum Ofir","year":"2017","unstructured":"Ofir Nachum, Mohammad Norouzi, Kelvin Xu, and Dale Schuurmans. 2017. Bridging the gap between value and policy based reinforcement learning. In Proceedings of the Advances in Neural Information Processing Systems 12 (NIPS \u201917). 2772\u20132782."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11257-017-9195-0"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-49435-3_11"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/1414004.1414029"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACSOS49614.2020.00039"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-57321-8_5"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/SEAMS51251.2021.00014"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/3417990.3419503"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939778"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-019-0048-x"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/1516533.1516538"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-28954-6_1"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2020.103367"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2021.103457"},{"key":"e_1_3_2_61_2","volume-title":"Reinforcement Learning: An introduction","author":"Sutton Richard S.","year":"2018","unstructured":"Richard S. Sutton and Andrew G Barto. 2018. Reinforcement Learning: An introduction. MIT Press."},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10270-021-00952-4"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.05.020"},{"issue":"6","key":"e_1_3_2_64_2","first-page":"70","article-title":"The challenge of crafting intelligible intelligence","volume":"62","author":"Weld Daniel S.","year":"2019","unstructured":"Daniel S. Weld and Gagan Bansal. 2019. The challenge of crafting intelligible intelligence. Commun. ACM 62, 6 (2019), 70\u201379.","journal-title":"Commun."},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-44871-7_5"},{"key":"e_1_3_2_66_2","volume-title":"An Introduction to Self-Adaptive Systems: A Contemporary Software Engineering Perspective","author":"Weyns Danny","year":"2020","unstructured":"Danny Weyns. 2020. An Introduction to Self-Adaptive Systems: A Contemporary Software Engineering Perspective. John Wiley & Sons."},{"key":"e_1_3_2_67_2","first-page":"31","volume-title":"Software Engineering for Self-Adaptive Systems III. Assurances.","author":"Weyns Danny","year":"2013","unstructured":"Danny Weyns, Nelly Bencomo, Radu Calinescu, Javier C\u00e1mara, Carlo Ghezzi, Vincenzo Grassi, Lars Grunske, Paola Inverardi, Jean-Marc J\u00e9z\u00e9quel, Sam Malek, Raffaela Mirandola, Marco Mori, and Giordano Tamburrelli. 2013. Perpetual assurances for self-adaptive systems. In Software Engineering for Self-Adaptive Systems III. Assurances. Rog\u00e9rio de Lemos, David Garlan, Carlo Ghezzi, and Holger Giese (Eds.), Lecture Notes in Computer Science, Vol. 9640, Springer, 31\u201363."},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/3589227"},{"key":"e_1_3_2_69_2","first-page":"76","article-title":"On patterns for decentralized control in self-adaptive systems","volume":"7475","author":"Weyns Danny","year":"2010","unstructured":"Danny Weyns, Bradley R. Schmerl, Vincenzo Grassi, Sam Malek, Raffaela Mirandola, Christian Prehofer, Jochen Wuttke, Jesper Andersson, Holger Giese, and Karl M. G\u00f6schka. 2010. On patterns for decentralized control in self-adaptive systems. In Software Engineering for Self-Adaptive Systems II - International Seminar. Rog\u00e9rio de Lemos, Holger Giese, Hausi A. M\u00fcller, and Mary Shaw (Eds.), Lecture Notes in Computer Science, Vol. 7475, Springer, 76\u2013107.","journal-title":"Software Engineering for Self-Adaptive Systems II - International Seminar"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2022.111538"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACSOS52086.2021.00024"},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE51398.2021.9474232"}],"container-title":["ACM Transactions on Autonomous and Adaptive Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3666005","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3666005","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:53:46Z","timestamp":1750287226000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3666005"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,30]]},"references-count":71,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,9,30]]}},"alternative-id":["10.1145\/3666005"],"URL":"https:\/\/doi.org\/10.1145\/3666005","relation":{},"ISSN":["1556-4665","1556-4703"],"issn-type":[{"value":"1556-4665","type":"print"},{"value":"1556-4703","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,9,30]]},"assertion":[{"value":"2023-03-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-05-13","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-30","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}