{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:30:10Z","timestamp":1760711410222,"version":"3.44.0"},"publisher-location":"New York, NY, USA","reference-count":48,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,6,17]],"date-time":"2024-06-17T00:00:00Z","timestamp":1718582400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,6,17]]},"DOI":"10.1145\/3626183.3659967","type":"proceedings-article","created":{"date-parts":[[2024,6,4]],"date-time":"2024-06-04T18:23:04Z","timestamp":1717525384000},"page":"41-51","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Efficient Parallel Reinforcement Learning Framework Using the Reactor Model"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-1482-2768","authenticated-orcid":false,"given":"Jacky","family":"Kwok","sequence":"first","affiliation":[{"name":"UC Berkeley, Berkeley, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8833-4117","authenticated-orcid":false,"given":"Marten","family":"Lohstroh","sequence":"additional","affiliation":[{"name":"UC Berkeley, Berkeley, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5663-0584","authenticated-orcid":false,"given":"Edward A.","family":"Lee","sequence":"additional","affiliation":[{"name":"UC Berkeley, Berkeley, USA"}]}],"member":"320","published-online":{"date-parts":[[2024,6,17]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"TensorFlow: A System for Large-Scale Machine Learning. 
In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265--283."},{"key":"e_1_3_2_1_2_1","unstructured":"Ilge Akkaya Marcin Andrychowicz Maciek Chociej Mateusz Litwin Bob McGrew Arthur Petron Alex Paino Matthias Plappert Glenn Powell Raphael Ribas et al. 2019. Solving rubik's cube with a robot hand."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.97297"},{"key":"e_1_3_2_1_4_1","volume-title":"Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, et al.","author":"Bradbury James","year":"2018","unstructured":"James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, et al. 2018. JAX: composable transformations of Python NumPy programs."},{"key":"e_1_3_2_1_5_1","unstructured":"Greg Brockman Vicki Cheung Ludwig Pettersson Jonas Schneider John Schulman Jie Tang and Wojciech Zaremba. 2016. Openai gym."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230551"},{"key":"e_1_3_2_1_7_1","unstructured":"Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. https:\/\/arxiv.org\/abs\/1512.01274"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-21401-6_26"},{"key":"e_1_3_2_1_9_1","volume-title":"Edwards and John Hui","author":"Stephen","year":"2020","unstructured":"Stephen A. Edwards and John Hui. 2020. 
The Sparse Synchronous Model. In Forum on Specification and Design Languages (FDL). 1--8."},{"key":"e_1_3_2_1_10_1","unstructured":"Lasse Espeholt Rapha\u00ebl Marinier Piotr Stanczyk Ke Wang and Marcin Michalski. 2019. Seed rl: Scalable and efficient deep-rl with accelerated central inference."},{"key":"e_1_3_2_1_11_1","volume-title":"International conference on machine learning. PMLR, 1407--1416","author":"Espeholt Lasse","year":"2018","unstructured":"Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Vlad Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, et al. 2018. Impala: Scalable distributed deep-rl with importance weighted actor-learner architectures. In International conference on machine learning. PMLR, 1407--1416."},{"key":"e_1_3_2_1_12_1","volume-title":"Multithreaded Python without the GIL. https:\/\/docs.google.com\/document\/d\/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0\/edit. [Online","author":"Gross Sam","year":"2024","unstructured":"Sam Gross. 2021. Multithreaded Python without the GIL. https:\/\/docs.google.com\/document\/d\/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0\/edit. [Online; accessed 17-April-2024]."},{"key":"e_1_3_2_1_13_1","unstructured":"Ameer Haj-Ali Nesreen K. Ahmed Theodore L. Willke Joseph Gonzalez Krste Asanovic and Ion Stoica. 2019. Deep Reinforcement Learning in System Optimization. arXiv:1908.01275 http:\/\/arxiv.org\/abs\/1908.01275"},{"key":"e_1_3_2_1_14_1","unstructured":"Carl Hewitt. 2010. Actor Model for Discretionary Adaptive Concurrency. arXiv:1008.1459 http:\/\/arxiv.org\/abs\/1008.1459"},{"key":"e_1_3_2_1_15_1","volume-title":"Acme: A research framework for distributed reinforcement learning.","author":"Hoffman Matthew W","year":"2020","unstructured":"Matthew W Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Sta\u0144czyk, Sabela Ramos, Anton Raichuk, Damien Vincent, et al. 2020. 
Acme: A research framework for distributed reinforcement learning."},{"key":"e_1_3_2_1_16_1","unstructured":"Dan Horgan John Quan David Budden Gabriel Barth-Maron Matteo Hessel Hado van Hasselt and David Silver. 2018. Distributed Prioritized Experience Replay. arXiv:1803.00933 http:\/\/arxiv.org\/abs\/1803.00933"},{"key":"e_1_3_2_1_17_1","unstructured":"Ahmet Inci Evgeny Bolotin Yaosheng Fu Gal Dalal Shie Mannor David Nellans and Diana Marculescu. 2020. The architectural implications of distributed reinforcement learning on CPU-GPU systems."},{"key":"e_1_3_2_1_18_1","volume-title":"Nature","volume":"620","author":"Kaufmann Elia","year":"2023","unstructured":"Elia Kaufmann, Leonard Bauersfeld, Antonio Loquercio, Matthias M\u00fcller, Vladlen Koltun, and Davide Scaramuzza. 2023. Champion-level drone racing using deep reinforcement learning. Nature, Vol. 620, 7976 (2023), 982--987."},{"key":"e_1_3_2_1_19_1","unstructured":"Anurag Koul. 2019. ma-gym: Collection of multi-agent environments based on OpenAI gym. https:\/\/github.com\/koulanurag\/ma-gym."},{"key":"e_1_3_2_1_20_1","first-page":"5506","article-title":"RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem","volume":"34","author":"Liang Eric","year":"2021","unstructured":"Eric Liang, Zhanghao Wu, Michael Luo, Sven Mika, Joseph E Gonzalez, and Ion Stoica. 2021. RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem. Advances in Neural Information Processing Systems, Vol. 34 (2021), 5506--5517.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/11817949_1"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/AAI28263682"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448128"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3617687"},{"key":"e_1_3_2_1_25_1","volume-title":"International conference on machine learning. 
PMLR","author":"Mnih Volodymyr","year":"2016","unstructured":"Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In International conference on machine learning. PMLR, 1928--1937."},{"key":"e_1_3_2_1_26_1","volume-title":"Riedmiller","author":"Mnih Volodymyr","year":"2013","unstructured":"Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin A. Riedmiller. 2013. Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602 http:\/\/arxiv.org\/abs\/1312.5602"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1561\/9781638280576"},{"key":"e_1_3_2_1_28_1","volume-title":"13th USENIX symposium on operating systems design and implementation (OSDI 18)","author":"Moritz Philipp","year":"2018","unstructured":"Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I Jordan, et al. 2018. Ray: A distributed framework for emerging AI applications. In 13th USENIX symposium on operating systems design and implementation (OSDI 18). 561--577."},{"key":"e_1_3_2_1_29_1","unstructured":"OpenAI. 2023. GPT-4 Technical Report. https:\/\/arxiv.org\/abs\/2303.08774"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-022-04105-y"},{"key":"e_1_3_2_1_31_1","unstructured":"Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in pytorch."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein Luca Antiga et al. 2019. PyTorch: An Imperative Style High-Performance Deep Learning Library. 
https:\/\/doi.org\/10.48550\/arXiv.1912.01703","DOI":"10.48550\/arXiv.1912.01703"},{"key":"e_1_3_2_1_33_1","unstructured":"Xavi Puig Eric Undersander Andrew Szot Mikael Dallaire Cote Ruslan Partsey Jimmy Yang Ruta Desai Alexander William Clegg Michal Hlavac Tiffany Min Theo Gervet Vladim\u00edr Vondru\u0161 Vincent-Pierre Berges John Turner Oleksandr Maksymets Zsolt Kira Mrinal Kalakrishnan Jitendra Malik Devendra Singh Chaplot Unnat Jain Dhruv Batra Akshara Rai and Roozbeh Mottaghi. 2023. Habitat 3.0: A Co-Habitat for Humans Avatars and Robots."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3406703"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/947955.947961"},{"key":"e_1_3_2_1_36_1","volume-title":"Provable Determinism for Software in Cyber-Physical Systems. In 15th International Conference on Verified Software: Theories, Tools, and Experiments (VSTTE). 1--22","author":"Rossel Marcus","year":"2023","unstructured":"Marcus Rossel, Shaokai Lin, Marten Lohstroh, Jeronimo Castrillon, and Andres Goens. 2023. Provable Determinism for Software in Cyber-Physical Systems. In 15th International Conference on Verified Software: Theories, Tools, and Experiments (VSTTE). 1--22."},{"key":"e_1_3_2_1_37_1","unstructured":"Mohammad Reza Samsami and Hossein Alimadad. 2020. Distributed deep reinforcement learning: An overview."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00943"},{"key":"e_1_3_2_1_39_1","unstructured":"John Schulman Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. 2017. Proximal Policy Optimization Algorithms. 
arXiv:1707.06347 http:\/\/arxiv.org\/abs\/1707.06347"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3576914.3587498"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592427"},{"key":"e_1_3_2_1_42_1","volume-title":"Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al.","author":"Silver David","year":"2016","unstructured":"David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529, 7587 (2016), 484--489."},{"key":"e_1_3_2_1_43_1","unstructured":"Andrew Szot Alex Clegg Eric Undersander Erik Wijmans Yili Zhao John Turner Noah Maestre Mustafa Mukadam Devendra Chaplot Oleksandr Maksymets Aaron Gokaslan Vladimir Vondrus Sameer Dharur Franziska Meier Wojciech Galuba Angel Chang Zsolt Kira Vladlen Koltun Jitendra Malik Manolis Savva and Dhruv Batra. 2021. Habitat 2.0: Training Home Assistants to Rearrange their Habitat. In Advances in Neural Information Processing Systems (NeurIPS). 1--17."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Hado Van Hasselt Arthur Guez and David Silver. 2015. Deep Reinforcement Learning with Double Q-learning.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"e_1_3_2_1_45_1","volume-title":"Menger: Massively large-scale distributed reinforcement learning.","author":"Yazdanbakhsh Amir","year":"2020","unstructured":"Amir Yazdanbakhsh, Junchao Chen, and Yu Zheng. 2020. Menger: Massively large-scale distributed reinforcement learning."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2020\/8702962","article-title":"Visual navigation with asynchronous proximal policy optimization in artificial agents","volume":"2020","author":"Zeng Fanyu","year":"2020","unstructured":"Fanyu Zeng and Chen Wang. 
2020. Visual navigation with asynchronous proximal policy optimization in artificial agents. Journal of Robotics, Vol. 2020 (2020), 1--7.","journal-title":"Journal of Robotics"},{"key":"e_1_3_2_1_47_1","volume-title":"Multi-agent reinforcement learning: A selective overview of theories and algorithms. Handbook of reinforcement learning and control","author":"Zhang Kaiqing","year":"2021","unstructured":"Kaiqing Zhang, Zhuoran Yang, and Tamer Ba\u015far. 2021. Multi-agent reinforcement learning: A selective overview of theories and algorithms. Handbook of reinforcement learning and control, Vol. 325, 11 (2021), 321--384."},{"key":"e_1_3_2_1_48_1","volume-title":"MSRL: Distributed Reinforcement Learning with Dataflow Fragments. In 2023 USENIX Annual Technical Conference (USENIX ATC 23)","author":"Zhu Huanzhou","year":"2023","unstructured":"Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, and Lei Chen. 2023. MSRL: Distributed Reinforcement Learning with Dataflow Fragments. In 2023 USENIX Annual Technical Conference (USENIX ATC 23). 977--993. 
https:\/\/www.usenix.org\/conference\/atc23\/presentation\/zhu-huanzhou"}],"event":{"name":"SPAA '24: 36th ACM Symposium on Parallelism in Algorithms and Architectures","sponsor":["SIGACT ACM Special Interest Group on Algorithms and Computation Theory","SIGARCH ACM Special Interest Group on Computer Architecture","EATCS European Association for Theoretical Computer Science"],"location":"Nantes France","acronym":"SPAA '24"},"container-title":["Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626183.3659967","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3626183.3659967","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T16:24:39Z","timestamp":1755879879000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626183.3659967"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,17]]},"references-count":48,"alternative-id":["10.1145\/3626183.3659967","10.1145\/3626183"],"URL":"https:\/\/doi.org\/10.1145\/3626183.3659967","relation":{},"subject":[],"published":{"date-parts":[[2024,6,17]]},"assertion":[{"value":"2024-06-17","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}