{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T16:50:02Z","timestamp":1781542202586,"version":"3.54.5"},"reference-count":34,"publisher":"American Association for the Advancement of Science (AAAS)","issue":"106","content-domain":{"domain":["www.science.org"],"crossmark-restriction":true},"short-container-title":["Sci. Robot."],"published-print":{"date-parts":[[2025,9,3]]},"abstract":"<jats:p>Modern robotic manufacturing requires collision-free coordination of multiple robots to complete numerous tasks in shared, obstacle-rich workspaces. Although individual tasks may be simple in isolation, automated joint task allocation, scheduling, and motion planning under spatiotemporal constraints remain computationally intractable for classical methods at real-world scales. Existing multiarm systems deployed in industry rely on human intuition and experience to design feasible trajectories manually in a labor-intensive process. To address this challenge, we propose a reinforcement learning (RL) framework to achieve automated task and motion planning, tested in an obstacle-rich environment with eight robots performing 40 reaching tasks in a shared workspace, where any robot can perform any task in any order. Our approach builds on a graph neural network (GNN) policy trained via RL on procedurally generated environments with diverse obstacle layouts, robot configurations, and task distributions. It uses a graph representation of scenes and a graph policy neural network trained through RL to generate trajectories of multiple robots, jointly solving the subproblems of task allocation, scheduling, and motion planning. Trained on large randomly generated task sets in simulation, our policy generalizes zero-shot to unseen settings with varying robot placements, obstacle geometries, and task poses. We further demonstrate that the high-speed capability of our solution enables its use in workcell layout optimization, improving solution times. The speed and scalability of our planner also open the door to capabilities such as fault-tolerant planning and online perception-based replanning, where rapid adaptation to dynamic task sets is required.<\/jats:p>","DOI":"10.1126\/scirobotics.ads1204","type":"journal-article","created":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T17:59:14Z","timestamp":1756922354000},"update-policy":"https:\/\/doi.org\/10.34133\/aaas_crossmark","source":"Crossref","is-referenced-by-count":9,"title":["RoboBallet: Planning for multirobot reaching with graph neural networks and reinforcement learning"],"prefix":"10.1126","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-0888-0249","authenticated-orcid":true,"given":"Matthew","family":"Lai","sequence":"first","affiliation":[{"name":"Google DeepMind, London, UK."},{"name":"University College London, London, UK."}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-8137-596X","authenticated-orcid":true,"given":"Keegan","family":"Go","sequence":"additional","affiliation":[{"name":"Intrinsic, Mountain View, CA, USA."}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6357-7419","authenticated-orcid":true,"given":"Zhibin","family":"Li","sequence":"additional","affiliation":[{"name":"University College London, London, UK."}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9831-3384","authenticated-orcid":true,"given":"Torsten","family":"Kr\u00f6ger","sequence":"additional","affiliation":[{"name":"Intrinsic, Mountain View, CA, USA."}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5660-1874","authenticated-orcid":true,"given":"Stefan","family":"Schaal","sequence":"additional","affiliation":[{"name":"Intrinsic, Mountain View, CA, USA."}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kelsey","family":"Allen","sequence":"additional","affiliation":[{"name":"Google DeepMind, London, UK."}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jonathan","family":"Scholz","sequence":"additional","affiliation":[{"name":"Google DeepMind, London, UK."}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"221","reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.rcim.2016.08.006"},{"key":"e_1_3_2_3_2","doi-asserted-by":"crossref","unstructured":"Y. Yamada S. Nagamatsu Y. Sato \u201cDevelopment of multi-arm robots for automobile assembly\u201d in Proceedings of 1995 IEEE International Conference on Robotics and Automation (IEEE 1995) vol. 3 pp. 2224\u20132229.","DOI":"10.1109\/ROBOT.1995.525592"},{"key":"e_1_3_2_4_2","doi-asserted-by":"crossref","unstructured":"H. Chen T. Fuhlbrigge X. Li \u201cAutomated industrial robot path planning for spray painting process: A review\u201d in 2008 IEEE International Conference on Automation Science and Engineering (IEEE 2008) pp. 522\u2013527.","DOI":"10.1109\/COASE.2008.4626515"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2022.3198020"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2017.2708134"},{"key":"e_1_3_2_7_2","unstructured":"S. LaValle Rapidly-exploring random trees: A new tool for path planning (Tech. Rep. 98-11 Iowa State University 1998)."},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","unstructured":"S. M. LaValle J. J. Kuffner \u201cRapidly-exploring random trees: Progress and prospects\u201d in Algorithmic and Computational Robotics (Taylor-Francis 2001) pp. 303\u2013307.","DOI":"10.1201\/9781439864135-43"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911406761"},{"key":"e_1_3_2_10_2","doi-asserted-by":"crossref","unstructured":"J. Canny The Complexity of Robot Motion Planning (MIT Press 1988).","DOI":"10.1109\/SFCS.1988.21947"},{"key":"e_1_3_2_11_2","doi-asserted-by":"crossref","unstructured":"W. Vega-Brown N. Roy \u201cTask and motion planning is pspace-complete\u201d in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020) vol. 34 pp. 10385\u201310392.","DOI":"10.1609\/aaai.v34i06.6607"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1112\/plms\/s1-28.1.486"},{"key":"e_1_3_2_13_2","unstructured":"H. Ha J. Xu S. Song Learning a decentralized multi-arm motion planner. arXiv:2011.02608 [cs.RO] (2020)."},{"key":"e_1_3_2_14_2","doi-asserted-by":"crossref","unstructured":"T. Pan A. M. Wells R. Shome L. E. Kavraki \u201cA general task and motion planning framework for multiple manipulators\u201d 2021 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE 2021) pp. 3168\u20133174.","DOI":"10.1109\/IROS51168.2021.9636119"},{"key":"e_1_3_2_15_2","unstructured":"Y. LeCun B. E. Boser J. S. Denker D. Henderson R. E. Howard W. E. Hubbard L. D. Jackel \u201cHandwritten digit recognition with a back-propagation network\u201d in Advances in Neural Information Processing Systems (MIT Press 1989) pp. 396\u2013404."},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","unstructured":"J. J. Kuffner S. M. LaValle \u201cRRT-connect: An efficient approach to single-query path planning\u201d in Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No. 00CH37065) (IEEE 2000) vol. 2 pp. 995\u20131001.","DOI":"10.1109\/ROBOT.2000.844730"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"D. Golovin B. Solnik S. Moitra G. Kochanski J. Karro D. Sculley \u201cGoogle Vizier: A service for black-box optimization\u201d in KDD \u201917: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery 2017) pp. 1487\u20131495.","DOI":"10.1145\/3097983.3098043"},{"key":"e_1_3_2_18_2","unstructured":"X. Song Q. Zhang C. Lee E. Fertig T.-K. Huang L. Belenki G. Kochanski S. Ariafar S. Vasudevan S. Perel D. Golovin The Vizier gaussian process bandit algorithm. arXiv:2408.11527 [cs.LG] (2024)."},{"key":"e_1_3_2_19_2","first-page":"10","article-title":"Graph attention networks","volume":"1050","author":"Veli\u010dkovi\u0107 P.","year":"2017","unstructured":"P. Veli\u010dkovi\u0107, G. Cucurull, A. Casanova, A. Romero, P. Li\u00f2, Y. Bengio, Graph attention networks. stat 1050, 10 (2017).","journal-title":"stat"},{"key":"e_1_3_2_20_2","unstructured":"P. W. Battaglia J. B. Hamrick V. Bapst A. Sanchez-Gonzalez V. Zambaldi M. Malinowski A. Tacchetti D. Raposo A. Santoro R. Faulkner C. Gulcehre F. Song A. Ballard J. Gilmer G. Dahl A. Vaswani K. Allen C. Nash V. Langston C. Dyer N. Heess D. Wierstra P. Kohli M. Botvinick O. Vinyals Y. Li R. Pascanu Relational inductive biases deep learning and graph networks. arXiv:1806.01261 [cs.LG] (2018)."},{"key":"e_1_3_2_21_2","unstructured":"S. Fujimoto H. Hoof D. Meger \u201cAddressing function approximation error in actor-critic methods\u201d in Proceedings of the 35th International Conference on Machine Learning (PMLR 2018) pp. 1587\u20131596."},{"key":"e_1_3_2_22_2","unstructured":"J. W. Ratcliff K. Mamou Voxelized hierarchical convex decomposition - V-HACD version 4 GitHub (2023); https:\/\/github.com\/kmammou\/v-hacd."},{"key":"e_1_3_2_23_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2019627.2019641","article-title":"Fast oriented bounding box optimization on the rotation group SO(3, \u211d)","volume":"30","author":"Chang C.-T.","year":"2011","unstructured":"C.-T. Chang, B. Gorissen, S. Melchior, Fast oriented bounding box optimization on the rotation group SO(3, \u211d). ACM Trans. Graph. 30, 1\u201316 (2011).","journal-title":"ACM Trans. Graph."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1080\/00401706.1975.10489269"},{"key":"e_1_3_2_25_2","doi-asserted-by":"crossref","unstructured":"Y. Zhou C. Barnes J. Lu J. Yang H. Li \u201cOn the continuity of rotation representations in neural networks\u201d in Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (IEEE 2019) pp. 5745\u20135753.","DOI":"10.1109\/CVPR.2019.00589"},{"key":"e_1_3_2_26_2","unstructured":"M. Andrychowicz F. Wolski A. Ray J. Schneider R. Fong P. Welinder B. McGrew J. Tobin P. Abbeel W. Zaremba \u201cHindsight experience replay\u201d in Advances in Neural Information Processing Systems (Curran Associates 2017) pp. 5048\u20135058."},{"key":"e_1_3_2_27_2","unstructured":"D. Silver G. Lever N. Heess T. Degris D. Wierstra M. Riedmiller \u201cDeterministic policy gradient algorithms\u201d in Proceedings of the 31st International Conference on Machine Learning (MLResearchPress 2014) pp. 387\u2013395."},{"key":"e_1_3_2_28_2","first-page":"98","article-title":"A new method of stochastic approximation type","volume":"7","author":"Polyak B. T.","year":"1990","unstructured":"B. T. Polyak, A new method of stochastic approximation type. Avtom. Telemek. 7, 98\u2013107 (1990).","journal-title":"Avtom. Telemek."},{"key":"e_1_3_2_29_2","unstructured":"J. L. Ba J. R. Kiros G. E. Hinton Layer normalization. arXiv:1607.06450 [stat.ML] (2016)."},{"key":"e_1_3_2_30_2","unstructured":"D. Hendrycks K. Gimpel Gaussian error linear units (GELUs). arXiv:1606.08415 [cs.LG] (2016)."},{"key":"e_1_3_2_31_2","unstructured":"J. Bradbury R. Frostig P. Hawkins M. J. Johnson C. Leary D. Maclaurin G. Necula A. Paszke J. VanderPlas S. Wanderman-Milne Q. Zhang JAX: Composable transformations of Python+NumPy programs version 0.3.13 GitHub (2018); https:\/\/github.com\/jax-ml\/jax."},{"key":"e_1_3_2_32_2","unstructured":"J. Godwin T. Keck P. Battaglia V. Bapst T. Kipf Y. Li K. Stachenfeld P. Veli\u010dkovi\u0107 A. Sanchez-Gonzalez Jraph: A library for graph neural networks in jax version 0.0.1.dev GitHub (2020); http:\/\/github.com\/google-deepmind\/jraph."},{"key":"e_1_3_2_33_2","unstructured":"J. Heek A. Levskaya A. Oliver M. Ritter B. Rondepierre A. Steiner M. van Zee Flax: A neural network library and ecosystem for JAX version 0.11.1 GitHub (2023); http:\/\/github.com\/google\/flax."},{"key":"e_1_3_2_34_2","unstructured":"A. Cassirer G. Barth-Maron E. Brevdo S. Ramos T. Boyd T. Sottiaux M. Kroiss Reverb: A framework for experience replay. arXiv:2102.04736 [cs.LG] (2021)."},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","unstructured":"E. Todorov T. Erez Y. Tassa \u201cMujoco: A physics engine for model-based control\u201d in 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IEEE 2012) pp. 5026\u20135033.","DOI":"10.1109\/IROS.2012.6386109"}],"container-title":["Science Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.science.org\/doi\/pdf\/10.1126\/scirobotics.ads1204","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T17:59:19Z","timestamp":1756922359000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.science.org\/doi\/10.1126\/scirobotics.ads1204"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,3]]},"references-count":34,"journal-issue":{"issue":"106","published-print":{"date-parts":[[2025,9,3]]}},"alternative-id":["10.1126\/scirobotics.ads1204"],"URL":"https:\/\/doi.org\/10.1126\/scirobotics.ads1204","relation":{},"ISSN":["2470-9476"],"issn-type":[{"value":"2470-9476","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,3]]},"assertion":[{"value":"2024-11-04","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-05","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-03","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"eads1204"}}