{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,2]],"date-time":"2025-10-02T10:46:22Z","timestamp":1759401982095},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,8]]},"abstract":"<jats:p>We introduce a novel apprenticeship learning algorithm to learn an expert's underlying reward structure in off-policy model-free batch settings. Unlike existing methods that require hand-crafted features, on-policy evaluation, further data acquisition for evaluation policies or the knowledge of model dynamics, our algorithm requires only batch data (demonstrations) of the observed expert behavior.\u00a0 Such settings are common in many real-world tasks---health care, finance, or industrial process control---where accurate simulators do not exist and additional data acquisition is costly.\u00a0 We develop a transition-regularized imitation learning model to learn a rich feature representation and a near-expert initial policy that makes the subsequent batch inverse reinforcement learning process viable. We also introduce deep successor feature networks that perform off-policy evaluation to estimate feature expectations of candidate policies. Under the batch setting, our method achieves superior results on control benchmarks as well as a real clinical task of sepsis management in the Intensive Care Unit.<\/jats:p>","DOI":"10.24963\/ijcai.2019\/819","type":"proceedings-article","created":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T03:46:05Z","timestamp":1564285565000},"page":"5909-5915","source":"Crossref","is-referenced-by-count":6,"title":["Truly Batch Apprenticeship Learning with Deep Successor Features"],"prefix":"10.24963","author":[{"given":"Donghun","family":"Lee","sequence":"first","affiliation":[{"name":"SEAS, Harvard University"}]},{"given":"Srivatsan","family":"Srinivasan","sequence":"additional","affiliation":[{"name":"SEAS, Harvard University"}]},{"given":"Finale","family":"Doshi-Velez","sequence":"additional","affiliation":[{"name":"SEAS, Harvard University"}]}],"member":"10584","event":{"number":"28","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2019","name":"Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}","start":{"date-parts":[[2019,8,10]]},"theme":"Artificial Intelligence","location":"Macao, China","end":{"date-parts":[[2019,8,16]]}},"container-title":["Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T03:52:02Z","timestamp":1564285922000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2019\/819"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2019,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2019\/819","relation":{},"subject":[],"published":{"date-parts":[[2019,8]]}}}