{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:21:08Z","timestamp":1750220468044,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":26,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T00:00:00Z","timestamp":1634428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key R&D Program of China","award":["2020AAA0103500;2020AAA0103501"],"award-info":[{"award-number":["2020AAA0103500;2020AAA0103501"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62006006"],"award-info":[{"award-number":["62006006"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,17]]},"DOI":"10.1145\/3474085.3478323","type":"proceedings-article","created":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T20:00:05Z","timestamp":1634587205000},"page":"3759-3762","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Efficient Reinforcement Learning Development with RLzoo"],"prefix":"10.1145","author":[{"given":"Zihan","family":"Ding","sequence":"first","affiliation":[{"name":"Imperial College London, London, United Kingdom"}]},{"given":"Tianyang","family":"Yu","sequence":"additional","affiliation":[{"name":"Nanchang University, Nanchang, China"}]},{"given":"Hongming","family":"Zhang","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"}]},{"given":"Yanhua","family":"Huang","sequence":"additional","affiliation":[{"name":"Xiaohongshu Technology Co., Shanghai, 
China"}]},{"given":"Guo","family":"Li","sequence":"additional","affiliation":[{"name":"Imperial College London, London, United Kingdom"}]},{"given":"Quancheng","family":"Guo","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, United Kingdom"}]},{"given":"Luo","family":"Mai","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, United Kingdom"}]},{"given":"Hao","family":"Dong","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2021,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/3026877.3026899"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2966414"},{"key":"e_1_3_2_1_3_1","volume-title":"Openai gym. arXiv preprint arXiv:1606.01540","author":"Brockman Greg","year":"2016","unstructured":"Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. Openai gym. arXiv preprint arXiv:1606.01540 (2016)."},{"key":"#cr-split#-e_1_3_2_1_4_1.1","unstructured":"Itai Caspi Gal Leibovich Gal Novik and Shadi Endrawis. 2017. Reinforcement Learning Coach. https:\/\/doi.org\/10.5281\/zenodo.1134899"},{"key":"#cr-split#-e_1_3_2_1_4_1.2","unstructured":"Itai Caspi Gal Leibovich Gal Novik and Shadi Endrawis. 2017. Reinforcement Learning Coach. https:\/\/doi.org\/10.5281\/zenodo.1134899"},{"key":"e_1_3_2_1_5_1","unstructured":"Carlo D'Eramo Davide Tateo Andrea Bonarini Marcello Restelli and Jan Peters. 2020. MushroomRL: Simplifying Reinforcement Learning Research. https:\/\/github.com\/MushroomRL\/mushroom-rl."},{"key":"e_1_3_2_1_6_1","unstructured":"Prafulla Dhariwal Christopher Hesse Oleg Klimov Alex Nichol Matthias Plappert Alec Radford John Schulman Szymon Sidor Yuhuai Wu and Peter Zhokhov. 2017. OpenAI Baselines. https:\/\/github.com\/openai\/baselines."},{"volume-title":"Deep Reinforcement Learning","author":"Dong Hao","key":"e_1_3_2_1_7_1","unstructured":"Hao Dong, Zihan Ding, and Shanghang Zhang. 2020. Deep Reinforcement Learning. Springer."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3129391"},{"key":"e_1_3_2_1_9_1","volume-title":"International Conference on Machine Learning. PMLR, 1407--1416","author":"Espeholt Lasse","year":"2018","unstructured":"Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Vlad Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, et al. 2018. Impala: Scalable distributed deep-rl with importance weighted actor-learner architectures. In International Conference on Machine Learning. PMLR, 1407--1416."},{"key":"e_1_3_2_1_10_1","volume-title":"Garage: A toolkit for reproducible reinforcement learning research. https:\/\/github.com\/rlworkgroup\/garage.","author":"The","year":"2019","unstructured":"The garage contributors. 2019. 
Garage: A toolkit for reproducible reinforcement learning research. https:\/\/github.com\/rlworkgroup\/garage."},{"key":"e_1_3_2_1_11_1","volume-title":"Horizon: Facebook's Open Source Applied Reinforcement Learning Platform. arXiv preprint arXiv:1811.00260","author":"Gauci Jason","year":"2018","unstructured":"Jason Gauci, Edoardo Conti, Yitao Liang, Kittipat Virochsiri, Zhengxing Chen, Yuchen He, Zachary Kaden, Vivek Narayanan, and Xiaohui Ye. 2018. Horizon: Facebook's Open Source Applied Reinforcement Learning Platform. arXiv preprint arXiv:1811.00260 (2018)."},{"key":"e_1_3_2_1_12_1","unstructured":"Nicolas Heess Dhruva TB Srinivasan Sriram Jay Lemmon Josh Merel Greg Wayne Yuval Tassa Tom Erez Ziyu Wang SM Eslami et al. 2017. Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286 (2017)."},{"key":"e_1_3_2_1_13_1","volume-title":"Acme: A research framework for distributed reinforcement learning. arXiv preprint arXiv:2006.00979","author":"Hoffman Matt","year":"2020","unstructured":"Matt Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, et al. 2020. Acme: A research framework for distributed reinforcement learning. arXiv preprint arXiv:2006.00979 (2020). 
"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2974707"},{"key":"e_1_3_2_1_15_1","unstructured":"Dong Yan Hang Su Jun Zhu Jiayi Weng Minghao Zhang. 2020. Tianshou. https:\/\/github.com\/thu-ml\/tianshou."},{"key":"e_1_3_2_1_16_1","unstructured":"Alexander Kuhnle Michael Schaarschmidt and Kai Fricke. 2017. Tensorforce: a TensorFlow library for applied reinforcement learning. Web page. https:\/\/github.com\/tensorforce\/tensorforce"},{"key":"e_1_3_2_1_17_1","volume-title":"International Conference on Machine Learning. PMLR, 3053--3062","author":"Liang Eric","year":"2018","unstructured":"Eric Liang, Richard Liaw, Robert Nishihara, Philipp Moritz, Roy Fox, Ken Goldberg, Joseph Gonzalez, Michael Jordan, and Ion Stoica. 2018. RLlib: Abstractions for distributed reinforcement learning. In International Conference on Machine Learning. 
PMLR, 3053--3062."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352020.3352029"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/3488766.3488819"},{"key":"e_1_3_2_1_20_1","volume-title":"et almbox","author":"Mnih Volodymyr","year":"2015","unstructured":"Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. 2015. Human-level control through deep reinforcement learning. Nature, Vol. 518, 7540 (2015), 529--533."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/3291168.3291210"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00621"},{"key":"e_1_3_2_1_23_1","unstructured":"Matthias Plappert. 2016. keras-rl. 
https:\/\/github.com\/keras-rl\/keras-rl."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045118.3045319"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992696"}],"event":{"name":"MM '21: ACM Multimedia Conference","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Virtual Event China","acronym":"MM '21"},"container-title":["Proceedings of the 29th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3478323","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3474085.3478323","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:26Z","timestamp":1750193306000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3478323"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,17]]},"references-count":26,"alternative-id":["10.1145\/3474085.3478323","10.1145\/3474085"],"URL":"https:\/\/doi.org\/10.1145\/3474085.3478323","relation":{},"subject":[],"published":{"date-parts":[[2021,10,17]]},"assertion":[{"value":"2021-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}