{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,13]],"date-time":"2025-05-13T06:28:14Z","timestamp":1747117694692,"version":"3.28.0"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11]]},"abstract":"<jats:p>This paper presents PROB-IRM, an approach that learns robust reward machines (RMs) for reinforcement learning (RL) agents from noisy execution traces. The key aspect of RM-driven RL is the exploitation of a finite-state machine that decomposes the agent\u2019s task into different subtasks. PROB-IRM uses a state-of-the-art inductive logic programming framework robust to noisy examples to learn RMs from noisy traces using Bayesian posterior degrees of belief, thus ensuring robustness against inconsistencies. Pivotal for the results is the interleaving between RM learning and policy learning: a new RM is learned whenever the RL agent generates a trace that is believed not to be accepted by the current RM. To speed up the training of the RL agent, PROB-IRM employs a probabilistic formulation of reward shaping that uses the posterior Bayesian beliefs derived from the traces. Our experimental analysis shows that PROB-IRM can learn (potentially imperfect) RMs from noisy traces and exploit them to train an RL agent to solve its tasks successfully. Despite the complexity of learning the RM from noisy traces, agents trained with PROB-IRM perform comparably to agents provided with handcrafted RMs.<\/jats:p>","DOI":"10.24963\/kr.2024\/85","type":"proceedings-article","created":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T06:30:28Z","timestamp":1729924228000},"page":"909-919","source":"Crossref","is-referenced-by-count":1,"title":["Learning Robust Reward Machines from Noisy Labels"],"prefix":"10.24963","author":[{"given":"Roko","family":"Para\u0107","sequence":"first","affiliation":[{"name":"Imperial College London"}]},{"given":"Lorenzo","family":"Nodari","sequence":"additional","affiliation":[{"name":"University of Brescia"}]},{"given":"Leo","family":"Ardon","sequence":"additional","affiliation":[{"name":"Imperial College London"}]},{"given":"Daniel","family":"Furelos-Blanco","sequence":"additional","affiliation":[{"name":"Imperial College London"}]},{"given":"Federico","family":"Cerutti","sequence":"additional","affiliation":[{"name":"University of Brescia"},{"name":"Cardiff University"}]},{"given":"Alessandra","family":"Russo","sequence":"additional","affiliation":[{"name":"Imperial College London"}]}],"member":"10584","event":{"name":"21st International Conference on Principles of Knowledge Representation and Reasoning {KR-2024}","theme":"Artificial Intelligence","location":"Hanoi, Vietnam","acronym":"KR-2024","number":"21","sponsor":["Artificial Intelligence Journal","Principles of Knowledge Representation and Reasoning Inc.","Academic College of Tel-Aviv","European Association for Artificial Intelligence","National Science Foundation"],"start":{"date-parts":[[2024,11,1]]},"end":{"date-parts":[[2024,11,8]]}},"container-title":["Proceedings of the Twenty-First International Conference on Principles of Knowledge Representation and Reasoning"],"original-title":[],"deposited":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T06:30:46Z","timestamp":1729924246000},"score":1,"resource":{"primary":{"URL":"https:\/\/proceedings.kr.org\/2024\/85"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2024,11]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/kr.2024\/85","relation":{},"subject":[],"published":{"date-parts":[[2024,11]]}}}