{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T07:32:27Z","timestamp":1723015947240},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,8]]},"abstract":"<jats:p>Learning optimal policies in real-world domains with delayed rewards is a major challenge in Reinforcement Learning. We address the credit assignment problem by proposing a Gaussian Process (GP)-based immediate reward approximation algorithm and evaluate its effectiveness in 4 contexts where rewards can be delayed for long trajectories. In one GridWorld game and 8 Atari games, where immediate rewards are available, our results showed that on 7 out 9 games, the proposed GP-inferred reward policy performed at least as well as the immediate reward policy and significantly outperformed the corresponding delayed reward policy. In e-learning and healthcare applications, we combined GP-inferred immediate rewards with offline Deep Q-Network (DQN) policy induction and showed that the GP-inferred reward policies outperformed the policies induced using delayed rewards in both real-world contexts.<\/jats:p>","DOI":"10.24963\/ijcai.2019\/273","type":"proceedings-article","created":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T03:46:05Z","timestamp":1564285565000},"page":"1974-1980","source":"Crossref","is-referenced-by-count":9,"title":["Unobserved Is Not Equal to Non-existent: Using Gaussian Processes to Infer Immediate Rewards Across Contexts"],"prefix":"10.24963","author":[{"given":"Hamoon","family":"Azizsoltani","sequence":"first","affiliation":[{"name":"North Carolina State University"}]},{"given":"Yeo Jin","family":"Kim","sequence":"additional","affiliation":[{"name":"North Carolina State University"}]},{"given":"Markel Sanz","family":"Ausin","sequence":"additional","affiliation":[{"name":"North Carolina State University"}]},{"given":"Tiffany","family":"Barnes","sequence":"additional","affiliation":[{"name":"North Carolina State University"}]},{"given":"Min","family":"Chi","sequence":"additional","affiliation":[{"name":"North Carolina State University"}]}],"member":"10584","event":{"number":"28","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2019","name":"Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}","start":{"date-parts":[[2019,8,10]]},"theme":"Artificial Intelligence","location":"Macao, China","end":{"date-parts":[[2019,8,16]]}},"container-title":["Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T03:48:02Z","timestamp":1564285682000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2019\/273"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2019,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2019\/273","relation":{},"subject":[],"published":{"date-parts":[[2019,8]]}}}