{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T16:59:41Z","timestamp":1777654781076,"version":"3.51.4"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,7]]},"abstract":"<jats:p>Learning rational behaviors in open-world games like Minecraft remains to be challenging for Reinforcement Learning (RL) research due to the compound challenge of partial observability, high-dimensional visual perception and delayed reward. To address this, we propose JueWu-MC, a sample-efficient hierarchical RL approach equipped with representation learning and imitation learning to deal with perception and exploration. Specifically, our approach includes two levels of hierarchy, where the high-level controller learns a policy to control over options and the low-level workers learn to solve each sub-task. To boost the learning of sub-tasks, we propose a combination of techniques including 1) action-aware representation learning which captures underlying relations between action and representation, 2) discriminator-based self-imitation learning for efficient exploration, and 3) ensemble behavior cloning with consistency filtering for policy robustness. Extensive experiments show that JueWu-MC significantly improves sample efficiency and outperforms a set of baselines by a large margin. Notably, we won the championship of the NeurIPS MineRL 2021 research competition and achieved the highest performance score ever.<\/jats:p>","DOI":"10.24963\/ijcai.2022\/452","type":"proceedings-article","created":{"date-parts":[[2022,7,16]],"date-time":"2022-07-16T02:55:56Z","timestamp":1657940156000},"page":"3257-3263","source":"Crossref","is-referenced-by-count":9,"title":["JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning"],"prefix":"10.24963","author":[{"given":"Zichuan","family":"Lin","sequence":"first","affiliation":[{"name":"Tencent AI Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junyou","family":"Li","sequence":"additional","affiliation":[{"name":"Tencent AI Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianing","family":"Shi","sequence":"additional","affiliation":[{"name":"Tencent AI Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Deheng","family":"Ye","sequence":"additional","affiliation":[{"name":"Tencent AI Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qiang","family":"Fu","sequence":"additional","affiliation":[{"name":"Tencent AI Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Yang","sequence":"additional","affiliation":[{"name":"Tencent AI Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"name":"Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}","theme":"Artificial Intelligence","location":"Vienna, Austria","acronym":"IJCAI-2022","number":"31","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"start":{"date-parts":[[2022,7,23]]},"end":{"date-parts":[[2022,7,29]]}},"container-title":["Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2022,7,18]],"date-time":"2022-07-18T11:09:48Z","timestamp":1658142588000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2022\/452"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2022,7]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2022\/452","relation":{},"subject":[],"published":{"date-parts":[[2022,7]]}}}