{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T20:38:20Z","timestamp":1776458300519,"version":"3.51.2"},"reference-count":153,"publisher":"Annual Reviews","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Annu. Rev. Control Robot. Auton. Syst."],"published-print":{"date-parts":[[2022,5,3]]},"abstract":"<jats:p>The last half decade has seen a steep rise in the number of contributions on safe learning methods for real-world robotic deployments from both the control and reinforcement learning communities. This article provides a concise but holistic review of the recent advances made in using machine learning to achieve safe decision-making under uncertainties, with a focus on unifying the language and frameworks used in control theory and reinforcement learning research. It includes learning-based control approaches that safely improve performance by learning the uncertain dynamics, reinforcement learning approaches that encourage safety or robustness, and methods that can formally certify the safety of a learned control policy. As data- and learning-based robot control methods continue to gain traction, researchers must understand when and how to best leverage them in real-world scenarios where safety is imperative, such as when operating in close proximity to humans. 
We highlight some of the open challenges that will drive the field of robot learning in the coming years, and emphasize the need for realistic physics-based benchmarks to facilitate fair comparisons between control and reinforcement learning approaches.<\/jats:p>","DOI":"10.1146\/annurev-control-042920-020211","type":"journal-article","created":{"date-parts":[[2022,1,26]],"date-time":"2022-01-26T19:22:35Z","timestamp":1643224955000},"page":"411-444","source":"Crossref","is-referenced-by-count":583,"title":["Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning"],"prefix":"10.1146","volume":"5","author":[{"given":"Lukas","family":"Brunke","sequence":"first","affiliation":[{"name":"Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada"},{"name":"University of Toronto Robotics Institute, Toronto, Ontario, Canada"},{"name":"Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada"}]},{"given":"Melissa","family":"Greeff","sequence":"additional","affiliation":[{"name":"Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada"},{"name":"University of Toronto Robotics Institute, Toronto, Ontario, Canada"},{"name":"Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada"}]},{"given":"Adam W.","family":"Hall","sequence":"additional","affiliation":[{"name":"Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada"},{"name":"University of Toronto Robotics Institute, Toronto, Ontario, Canada"},{"name":"Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada"}]},{"given":"Zhaocong","family":"Yuan","sequence":"additional","affiliation":[{"name":"Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada"},{"name":"University of Toronto Robotics Institute, Toronto, Ontario, Canada"},{"name":"Vector Institute for Artificial Intelligence, Toronto, Ontario, 
Canada"}]},{"given":"Siqi","family":"Zhou","sequence":"additional","affiliation":[{"name":"Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada"},{"name":"University of Toronto Robotics Institute, Toronto, Ontario, Canada"},{"name":"Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada"}]},{"given":"Jacopo","family":"Panerati","sequence":"additional","affiliation":[{"name":"Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada"},{"name":"University of Toronto Robotics Institute, Toronto, Ontario, Canada"},{"name":"Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada"}]},{"given":"Angela P.","family":"Schoellig","sequence":"additional","affiliation":[{"name":"Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada"},{"name":"University of Toronto Robotics Institute, Toronto, Ontario, Canada"},{"name":"Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada"}]}],"member":"22","reference":[{"key":"B1","doi-asserted-by":"publisher","DOI":"10.1002\/rob.21958"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1161\/CIRCULATIONAHA.116.026318"},{"key":"B3","doi-asserted-by":"crossref","unstructured":"Dong K, Pereida K, Shkurti F, Schoellig AP. 2020. Catch the ball: accurate high-speed motions for mobile manipulators via inverse dynamics learning. arXiv:2003.07489 [cs.RO]","DOI":"10.1109\/IROS45743.2020.9341134"},{"key":"B4","first-page":"1437","volume":"16","author":"Garc\u00eda J","year":"2015","journal-title":"J. Mach. Learn. Res."},{"key":"B5","unstructured":"Dulac-Arnold G, Levine N, Mankowitz DJ, Li J, Paduraru C, et al. 2021. An empirical investigation of the challenges of real-world reinforcement learning. arXiv:2003.11881 [cs.LG]"},{"key":"B6","author":"Dyn. Syst. 
Lab","year":"2021","journal-title":"GitHub"},{"key":"B7","doi-asserted-by":"crossref","unstructured":"Yuan Z, Hall AW, Zhou S, Brunke L, Greeff M, et al. 2021. safe-control-gym: a unified benchmark suite for safe learning-based control and reinforcement learning. arXiv:2109.06325 [cs.RO]","DOI":"10.1109\/LRA.2022.3196132"},{"key":"B8","unstructured":"Dulac-Arnold G, Mankowitz D, Hester T. 2019. Challenges of real-world reinforcement learning. arXiv:1904.12901 [cs.LG]"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-control-090419-075625"},{"key":"B10","doi-asserted-by":"publisher","DOI":"10.1109\/MCS.2006.1636313"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCC.2007.905759"},{"key":"B12","doi-asserted-by":"publisher","DOI":"10.1007\/s10846-017-0468-y"},{"key":"B13","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2019.2958211"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-control-100819-063206"},{"key":"B15","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913495721"},{"key":"B16","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-control-053018-023825"},{"key":"B17","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2017.2773458"},{"key":"B18","doi-asserted-by":"publisher","DOI":"10.1109\/ICUAS51884.2021.9476765"},{"key":"B19","doi-asserted-by":"crossref","unstructured":"Tambon F, Laberge G, An L, Nikanjam A, Mindom PSN, et al. 2021. How to certify machine learning based safety-critical systems? A systematic literature review. arXiv:2107.12045 [cs.LG]","DOI":"10.1007\/s10515-022-00337-x"},{"key":"B20","volume-title":"Benchmarking safe exploration in deep reinforcement learning","author":"Ray A","year":"2019"},{"key":"B21","unstructured":"Leike J, Martic M, Krakovna V, Ortega PA, Everitt T, et al. 2017. AI safety gridworlds. 
arXiv:1711.09883 [cs.LG]"},{"key":"B22","volume-title":"Nonlinear Systems","author":"Khalil H.","year":"2002","edition":"3"},{"key":"B23","volume-title":"Adaptive Control: Stability, Convergence and Robustness","author":"Sastry S","year":"2011"},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.1007\/s10339-011-0404-1"},{"key":"B25","volume-title":"Robust and Optimal Control","author":"Zhou K","year":"1996"},{"key":"B26","volume-title":"A Course in Robust Control Theory: A Convex Approach","author":"Dullerud G","year":"2005"},{"key":"B27","volume-title":"Model Predictive Control: Theory, Computation, and Design","author":"Rawlings J","year":"2017"},{"key":"B28","doi-asserted-by":"publisher","DOI":"10.1016\/j.automatica.2004.08.019"},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2017.2743240"},{"key":"B30","first-page":"1125","volume-title":"Proceedings of the 35th International Conference on Machine Learning","author":"Dai B","year":"2018"},{"key":"B31","first-page":"1141","volume-title":"Proceedings of the 36th International Conference on Machine Learning","author":"Cheng R","year":"2019"},{"key":"B32","doi-asserted-by":"publisher","DOI":"10.1561\/2200000049"},{"key":"B33","volume-title":"Constrained Markov Decision Processes","author":"Altman E.","year":"1999"},{"key":"B34","first-page":"22","volume-title":"Proceedings of the 34th International Conference on Machine Learning","author":"Achiam J","year":"2017"},{"key":"B35","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1050.0216"},{"key":"B36","first-page":"2817","volume-title":"Proceedings of the 34th International Conference on Machine Learning","author":"Pinto L","year":"2017"},{"key":"B37","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794293"},{"key":"B38","unstructured":"Vinitsky E, Du Y, Parvate K, Jang K, Abbeel P, Bayen A. 2020. Robust reinforcement learning using adversarial populations. 
arXiv:2008.01825 [cs.LG]"},{"key":"B39","doi-asserted-by":"publisher","DOI":"10.1002\/acs.2397"},{"key":"B40","first-page":"826","volume-title":"Proceedings of the 2nd Conference on Learning for Dynamics and Control","author":"Gahlawat A","year":"2020"},{"key":"B41","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719376"},{"key":"B42","first-page":"565","volume":"11","author":"Grande RC","year":"2014","journal-title":"J. Aerosp. Inf. Syst."},{"key":"B43","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2014.2319052"},{"key":"B44","doi-asserted-by":"publisher","DOI":"10.1109\/CDC40024.2019.9029173"},{"key":"B45","doi-asserted-by":"crossref","unstructured":"Joshi G, Virdi J, Chowdhary G. 2020. Asynchronous deep model reference adaptive control. arXiv:2011.02920 [cs.RO]","DOI":"10.1109\/CDC40024.2019.9029173"},{"key":"B46","doi-asserted-by":"publisher","DOI":"10.1109\/ECC.2015.7330913"},{"key":"B47","doi-asserted-by":"publisher","DOI":"10.1109\/LCSYS.2020.3004506"},{"key":"B48","first-page":"324","volume-title":"Proceedings of the 3rd Conference on Learning for Dynamics and Control","author":"von Rohr A","year":"2021"},{"key":"B49","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2019.2896728"},{"key":"B50","doi-asserted-by":"publisher","DOI":"10.1109\/LCSYS.2020.3009177"},{"key":"B51","doi-asserted-by":"publisher","DOI":"10.1016\/j.automatica.2014.10.036"},{"key":"B52","doi-asserted-by":"publisher","DOI":"10.1016\/j.automatica.2019.02.023"},{"key":"B53","doi-asserted-by":"publisher","DOI":"10.23919\/ACC.2018.8431586"},{"key":"B54","unstructured":"Bujarbaruah M, Zhang X, Tanaskovic M, Borrelli F. 2019. Adaptive MPC under time varying uncertainty: robust and stochastic. 
arXiv:1909.13473 [eess.SY]"},{"key":"B55","doi-asserted-by":"publisher","DOI":"10.1016\/j.jprocont.2015.12.006"},{"key":"B56","doi-asserted-by":"publisher","DOI":"10.1002\/rnc.5147"},{"key":"B57","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.2017.2753460"},{"key":"B58","doi-asserted-by":"publisher","DOI":"10.1109\/CDC.2018.8618694"},{"key":"B59","doi-asserted-by":"publisher","DOI":"10.1002\/rnc.5712"},{"key":"B60","doi-asserted-by":"publisher","DOI":"10.1016\/j.automatica.2013.02.003"},{"key":"B61","doi-asserted-by":"publisher","DOI":"10.1016\/j.ifacol.2018.11.052"},{"key":"B62","doi-asserted-by":"publisher","DOI":"10.1177\/0278364916645661"},{"key":"B63","doi-asserted-by":"publisher","DOI":"10.1109\/TCST.2019.2949757"},{"key":"B64","first-page":"1701","volume-title":"Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics","author":"Kamthe S","year":"2018"},{"key":"B65","doi-asserted-by":"crossref","unstructured":"Koller T, Berkenkamp F, Turchetta M, Boedecker J, Krause A. 2019. Learning-based model predictive control for safe exploration and reinforcement learning. arXiv:1906.12189 [eess.SY]","DOI":"10.1109\/CDC.2018.8619572"},{"key":"B66","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2020.XVI.087"},{"key":"B67","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9197521"},{"key":"B68","first-page":"908","volume-title":"Advances in Neural Information Processing Systems 30","author":"Berkenkamp F","year":"2017"},{"key":"B69","first-page":"4312","volume-title":"Advances in Neural Information Processing Systems 29","author":"Turchetta M","year":"2016"},{"key":"B70","unstructured":"Dalal G, Dvijotham K, Vecerik M, Hester T, Paduraru C, Tassa Y. 2018. Safe exploration in continuous action spaces. 
arXiv:1801.08757 [cs.AI]"},{"key":"B71","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton RS","year":"2018","edition":"2"},{"key":"B72","first-page":"3207","volume-title":"The Thirty-Second AAAI Conference on Artificial Intelligence","author":"Henderson P","year":"2018"},{"key":"B73","first-page":"1451","volume-title":"Proceedings of the 29th International Conference on Machine Learning (ICML)","author":"Moldovan TM","year":"2012"},{"key":"B74","first-page":"213","volume":"3","author":"Brafman RI","year":"2002","journal-title":"J. Mach. Learn. Res."},{"key":"B75","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8460547"},{"key":"B76","doi-asserted-by":"crossref","unstructured":"Kim Y, Allmendinger R, L\u00f3pez-Ib\u00e1\u00f1ez M. 2021. Safe learning and optimization techniques: towards a survey of the state of the art. arXiv:2101.09505 [cs.LG]","DOI":"10.1007\/978-3-030-73959-1_12"},{"key":"B77","doi-asserted-by":"publisher","DOI":"10.1016\/j.ifacol.2017.08.1991"},{"key":"B78","first-page":"997","volume-title":"Proceedings of the 32nd International Conference on Machine Learning","author":"Sui Y","year":"2015"},{"key":"B79","doi-asserted-by":"crossref","unstructured":"Berkenkamp F, Krause A, Schoellig AP. 2020. Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics. arXiv:1602.04450 [cs.RO]","DOI":"10.1007\/s10994-021-06019-1"},{"key":"B80","first-page":"4781","volume-title":"Proceedings of the 35th International Conference on Machine Learning","author":"Sui Y","year":"2018"},{"key":"B81","doi-asserted-by":"crossref","unstructured":"Baumann D, Marco A, Turchetta M, Trimpe S. 2021. GoSafe: globally optimal safe robot learning. 
arXiv:2105.13281 [cs.RO]","DOI":"10.1109\/ICRA48506.2021.9560738"},{"key":"B82","first-page":"6548","volume-title":"The Thirty-Second AAAI Conference on Artificial Intelligence","author":"Wachi A","year":"2018"},{"key":"B83","unstructured":"Srinivasan K, Eysenbach B, Ha S, Tan J, Finn C. 2020. Learning to be safe: deep RL with a safety critic. arXiv:2010.14603 [cs.LG]"},{"key":"B84","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3070252"},{"key":"B85","unstructured":"Bharadhwaj H, Kumar A, Rhinehart N, Levine S, Shkurti F, Garg A. 2021. Conservative safety critics for exploration. arXiv:2010.14497 [cs.LG]"},{"key":"B86","unstructured":"Kumar A, Zhou A, Tucker G, Levine S. 2020. Conservative Q-learning for offline reinforcement learning. arXiv:2006.04779 [cs.LG]"},{"key":"B87","unstructured":"Kahn G, Villaflor A, Pong V, Abbeel P, Levine S. 2017. Uncertainty-aware reinforcement learning for collision avoidance. arXiv:1702.01182 [cs.LG]"},{"key":"B88","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793611"},{"key":"B89","first-page":"11055","volume-title":"Proceedings of the 37th International Conference on Machine Learning","author":"Zhang J","year":"2020"},{"key":"B90","first-page":"4759","volume-title":"Advances in Neural Information Processing Systems 31","author":"Chua K","year":"2018"},{"key":"B91","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2976272"},{"key":"B92","unstructured":"Urp\u00ed NA, Curi S, Krause A. 2021. Risk-averse offline reinforcement learning. arXiv:2102.05371 [cs.LG]"},{"key":"B93","first-page":"6070","volume":"18","author":"Chow Y","year":"2017","journal-title":"J. Mach. Learn. Res."},{"key":"B94","unstructured":"Liang Q, Que F, Modiano E. 2018. Accelerated primal-dual policy optimization for safe reinforcement learning. 
arXiv:1802.06480 [cs.AI]"},{"key":"B95","first-page":"1889","volume-title":"Proceedings of the 32nd International Conference on Machine Learning","author":"Schulman J","year":"2015"},{"key":"B96","first-page":"8103","volume-title":"Advances in Neural Information Processing Systems 31","author":"Chow Y","year":"2018"},{"key":"B97","unstructured":"Chow Y, Nachum O, Faust A, Duenez-Guzman E, Ghavamzadeh M. 2019. Lyapunov-based safe policy optimization for continuous control. arXiv:1901.10031 [cs.LG]"},{"key":"B98","first-page":"8502","volume-title":"Proceedings of the 37th International Conference on Machine Learning","author":"Satija H","year":"2020"},{"key":"B99","doi-asserted-by":"publisher","DOI":"10.1162\/0899766053011528"},{"key":"B100","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9197000"},{"key":"B101","unstructured":"Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, et al. 2014. Generative adversarial networks. arXiv:1406.2661 [stat.ML]"},{"key":"B102","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"B103","first-page":"1328","volume-title":"Proceedings of the Conference on Robot Learning","author":"L\u00fctjens B","year":"2020"},{"key":"B104","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2017.XIII.034"},{"key":"B105","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2019.2942989"},{"key":"B106","unstructured":"Rajeswaran A, Ghotra S, Ravindran B, Levine S. 2017. EPOpt: learning robust neural network policies using model ensembles. arXiv:1610.01283 [cs.LG]"},{"key":"B107","unstructured":"Mehta B, Diaz M, Golemo F, Pal CJ, Paull L. 2020. Active domain randomization. InProceedings of the Conference on Robot Learning, ed. LP Kaelbling, D Kragic, K Sugiura, pp. 1162\u201376. Proc. Mach. Learn. Res. 100. 
N.p.: PMLR"},{"key":"B108","doi-asserted-by":"publisher","DOI":"10.1177\/0278364920953902"},{"key":"B109","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3045114"},{"key":"B110","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794351"},{"key":"B111","unstructured":"Fazlyab M, Robey A, Hassani H, Morari M, Pappas GJ. 2019. Efficient and accurate estimation of Lipschitz constants for deep neural networks. arXiv:1906.04893 [cs.LG]"},{"key":"B112","first-page":"466","volume-title":"Proceedings of the 2nd Conference on Robot Learning","author":"Richards SM","year":"2018"},{"key":"B113","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2020.2992981"},{"key":"B114","doi-asserted-by":"publisher","DOI":"10.1109\/CDC.2003.1272309"},{"key":"B115","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-78841-6"},{"key":"B116","first-page":"2669","volume-title":"The Thirty-Second AAAI Conference on Artificial Intelligence","author":"Alshiekh M","year":"2018"},{"key":"B117","doi-asserted-by":"publisher","DOI":"10.23919\/ECC.2019.8796030"},{"key":"B118","doi-asserted-by":"publisher","DOI":"10.1109\/IROS40897.2019.8967820"},{"key":"B119","first-page":"708","volume-title":"Proceedings of the 2nd Conference on Learning for Dynamics and Control","author":"Taylor A","year":"2020"},{"key":"B120","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2019.2920206"},{"key":"B121","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2020.XVI.088"},{"key":"B122","doi-asserted-by":"publisher","DOI":"10.1109\/CDC40024.2019.9029226"},{"key":"B123","doi-asserted-by":"crossref","unstructured":"Taylor AJ, Singletary A, Yue Y, Ames AD. 2020. A control barrier perspective on episodic learning via projection-to-state safety. arXiv:2003.08028 [eess.SY]","DOI":"10.1109\/LCSYS.2020.3009082"},{"key":"B124","doi-asserted-by":"crossref","unstructured":"Taylor AJ, Dorobantu VD, Dean S, Recht B, Yue Y, Ames AD. 2020. 
Towards robust data-driven control synthesis for nonlinear systems with actuation uncertainty. arXiv:2011.10730 [eess.SY]","DOI":"10.1109\/CDC45484.2021.9683511"},{"key":"B125","doi-asserted-by":"publisher","DOI":"10.23919\/ACC45564.2020.9147463"},{"key":"B126","doi-asserted-by":"publisher","DOI":"10.1109\/LCSYS.2020.3005923"},{"key":"B127","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33013387"},{"key":"B128","doi-asserted-by":"crossref","unstructured":"Fan DD, Nguyen J, Thakker R, Alatur N, Agha-mohammadi A, Theodorou EA. 2019. Bayesian learning-based adaptive control for safety critical systems. arXiv:1910.02325 [eess.SY]","DOI":"10.1109\/ICRA40945.2020.9196709"},{"key":"B129","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8460471"},{"key":"B130","first-page":"781","volume-title":"Proceedings of the 2nd Conference on Learning for Dynamics and Control","author":"Khojasteh MJ","year":"2020"},{"key":"B131","unstructured":"Dean S, Taylor AJ, Cosner RK, Recht B, Ames AD. 2020. Guaranteeing safety of learned perception modules via measurement-robust control barrier functions. arXiv:2010.16001 [eess.SY]"},{"key":"B132","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.2005.851439"},{"key":"B133","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.2018.2876389"},{"key":"B134","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2012.6225136"},{"key":"B135","doi-asserted-by":"publisher","DOI":"10.1109\/CDC40024.2019.9030133"},{"key":"B136","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794107"},{"key":"B137","doi-asserted-by":"crossref","unstructured":"Choi JJ, Lee D, Sreenath K, Tomlin CJ, Herbert SL. 2021. Robust control barrier-value functions for safety-critical control. arXiv:2104.02808 [eess.SY]","DOI":"10.1109\/CDC45484.2021.9683085"},{"key":"B138","doi-asserted-by":"crossref","unstructured":"Herbert S, Choi JJ, Sanjeev S, Gibson M, Sreenath K, Tomlin CJ. 2021. 
Scalable learning of safety guarantees for autonomous systems using Hamilton-Jacobi reachability. arXiv:2101.05916 [cs.RO]","DOI":"10.1109\/ICRA48506.2021.9561561"},{"key":"B139","doi-asserted-by":"publisher","DOI":"10.1109\/CDC.2018.8619829"},{"key":"B140","doi-asserted-by":"crossref","unstructured":"Wabersich KP, Hewing L, Carron A, Zeilinger MN. 2019. Probabilistic model predictive safety certification for learning-based control. arXiv:1906.10417 [eess.SY]","DOI":"10.1109\/CDC.2018.8619829"},{"key":"B141","doi-asserted-by":"publisher","DOI":"10.1016\/j.automatica.2021.109597"},{"key":"B142","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2017.2654539"},{"key":"B143","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-control-072220-093055"},{"key":"B144","unstructured":"Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, et al. 2016. OpenAI Gym. arXiv:1606.01540 [cs.LG]"},{"key":"B145","doi-asserted-by":"publisher","DOI":"10.1109\/IROS51168.2021.9635857"},{"key":"B146","unstructured":"Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. 2017. Proximal policy optimization algorithms. arXiv:1707.06347 [cs.LG]"},{"key":"B147","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-32552-1_48"},{"key":"B148","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2018.8593882"},{"key":"B149","first-page":"9156","volume-title":"Advances in Neural Information Processing Systems 33","author":"Chandak Y","year":"2020"},{"key":"B150","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2015.2489500"},{"key":"B151","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6385647"},{"key":"B152","doi-asserted-by":"crossref","unstructured":"Dean S, Tu S, Matni N, Recht B. 2018. Safely learning to control the constrained linear quadratic regulator. 
pp. 5582\u201388","DOI":"10.23919\/ACC.2019.8814865"},{"key":"B153","doi-asserted-by":"publisher","DOI":"10.23919\/ECC.2019.8796295"}],"container-title":["Annual Review of Control, Robotics, and Autonomous Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.annualreviews.org\/doi\/pdf\/10.1146\/annurev-control-042920-020211","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T21:05:44Z","timestamp":1674594344000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.annualreviews.org\/doi\/10.1146\/annurev-control-042920-020211"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,3]]},"references-count":153,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,5,3]]}},"alternative-id":["10.1146\/annurev-control-042920-020211"],"URL":"https:\/\/doi.org\/10.1146\/annurev-control-042920-020211","relation":{},"ISSN":["2573-5144","2573-5144"],"issn-type":[{"value":"2573-5144","type":"print"},{"value":"2573-5144","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,3]]}}}