{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,14]],"date-time":"2026-01-14T00:34:57Z","timestamp":1768350897559,"version":"3.49.0"},"publisher-location":"Cham","reference-count":70,"publisher":"Springer Nature Switzerland","isbn-type":[{"value":"9783031308222","type":"print"},{"value":"9783031308239","type":"electronic"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,4,22]],"date-time":"2023-04-22T00:00:00Z","timestamp":1682121600000},"content-version":"vor","delay-in-days":111,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Deep reinforcement learning (DRL) has become a dominant deep-learning paradigm for tasks where complex policies are learned within reactive systems. Unfortunately, these policies are known to be susceptible to bugs. Despite significant progress in DNN verification, there has been little work demonstrating the use of modern verification tools on real-world, DRL-controlled systems. In this case study, we attempt to begin bridging this gap, and focus on the important task of mapless robotic navigation \u2014 a classic robotics problem, in which a robot, usually controlled by a DRL agent, needs to efficiently and safely navigate through an unknown arena towards a target. We demonstrate how modern verification engines can be used for effective <jats:italic>model selection<\/jats:italic>, i.e., selecting the best available policy for the robot in question from a pool of candidate policies. Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior, such as collisions and infinite loops. We also apply verification to identify models with overly conservative behavior, thus allowing users to choose superior policies, which might be better at finding shorter paths to a target. To validate our work, we conducted extensive experiments on an actual robot, and confirmed that the suboptimal policies detected by our method were indeed flawed. We also demonstrate the superiority of our verification-driven approach over state-of-the-art, gradient attacks. Our work is the first to establish the usefulness of DNN verification in identifying and filtering out suboptimal DRL policies in real-world robots, and we believe that the methods presented here are applicable to a wide range of systems that incorporate deep-learning-based agents.<\/jats:p>","DOI":"10.1007\/978-3-031-30823-9_31","type":"book-chapter","created":{"date-parts":[[2023,4,21]],"date-time":"2023-04-21T20:19:12Z","timestamp":1682108352000},"page":"607-627","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Verifying Learning-Based Robotic Navigation Systems"],"prefix":"10.1007","author":[{"given":"Guy","family":"Amir","sequence":"first","affiliation":[]},{"given":"Davide","family":"Corsi","sequence":"additional","affiliation":[]},{"given":"Raz","family":"Yerushalmi","sequence":"additional","affiliation":[]},{"given":"Luca","family":"Marzari","sequence":"additional","affiliation":[]},{"given":"David","family":"Harel","sequence":"additional","affiliation":[]},{"given":"Alessandro","family":"Farinelli","sequence":"additional","affiliation":[]},{"given":"Guy","family":"Katz","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,4,22]]},"reference":[{"key":"31_CR1","unstructured":"J.\u00a0Achiam, D.\u00a0Held, A.\u00a0Tamar, and P.\u00a0Abbeel. Constrained Policy Optimization. In Proc. 34th Int. Conf. on Machine Learning (ICML), pages 22\u201331, 2017."},{"key":"31_CR2","doi-asserted-by":"crossref","unstructured":"M.\u00a0Alshiekh, R.\u00a0Bloem, R.\u00a0Ehlers, B.\u00a0K\u00f6nighofer, S.\u00a0Niekum, and U.\u00a0Topcu. Safe Reinforcement Learning via Shielding. In Proc. 32th AAAI Conf. on Artificial Intelligence (AAAI), pages 2669\u20132678, 2018.","DOI":"10.1609\/aaai.v32i1.11797"},{"key":"31_CR3","unstructured":"G.\u00a0Amir, D.\u00a0Corsi, R.\u00a0Yerushalmi, L.\u00a0Marzari, D.\u00a0Harel, A.\u00a0Farinelli, and G.\u00a0Katz. Supplementary Artifact, 2022. https:\/\/doi.org\/10.5281\/zenodo.7496352."},{"key":"31_CR4","unstructured":"G.\u00a0Amir, D.\u00a0Corsi, R.\u00a0Yerushalmi, L.\u00a0Marzari, D.\u00a0Harel, A.\u00a0Farinelli, and G.\u00a0Katz. Supplementary Video, 2022. https:\/\/youtu.be\/QIZqOgxLkAE."},{"key":"31_CR5","doi-asserted-by":"crossref","unstructured":"G.\u00a0Amir, D.\u00a0Corsi, R.\u00a0Yerushalmi, L.\u00a0Marzari, D.\u00a0Harel, A.\u00a0Farinelli, and G.\u00a0Katz.Verifying Learning-Based Robotic Navigation Systems, 2023. Technical Report. https:\/\/arxiv.org\/abs\/2205.13536.","DOI":"10.1007\/978-3-031-30823-9_31"},{"key":"31_CR6","unstructured":"G.\u00a0Amir, M.\u00a0Schapira, and G.\u00a0Katz. Towards Scalable Verification of Deep Reinforcement Learning. In Proc. 21st Int. Conf. on Formal Methods in Computer-Aided Design (FMCAD), pages 193\u2013203, 2021."},{"key":"31_CR7","doi-asserted-by":"crossref","unstructured":"R.\u00a0Amsters and P.\u00a0Slaets. Turtlebot 3 as a Robotics Education Platform. In Proc. 10th Int. Conf. on Robotics in Education (RiE), pages 170\u2013181, 2019.","DOI":"10.1007\/978-3-030-26945-6_16"},{"key":"31_CR8","doi-asserted-by":"crossref","unstructured":"G.\u00a0Avni, R.\u00a0Bloem, K.\u00a0Chatterjee, T.\u00a0Henzinger, B.\u00a0Konighofer, and S.\u00a0Pranger. Run-Time Optimization for Learned Controllers through Quantitative Games. In Proc. 31st Int. Conf. on Computer Aided Verification (CAV), pages 630\u2013649, 2019.","DOI":"10.1007\/978-3-030-25540-4_36"},{"key":"31_CR9","doi-asserted-by":"crossref","unstructured":"E.\u00a0Bacci, M.\u00a0Giacobbe, and D.\u00a0Parker. Verifying Reinforcement Learning Up to Infinity. In Proc. 30th Int. Joint Conf. on Artificial Intelligence(IJCAI), 2021.","DOI":"10.24963\/ijcai.2021\/297"},{"key":"31_CR10","doi-asserted-by":"crossref","unstructured":"T.\u00a0Baluta, S.\u00a0Shen, S.\u00a0Shinde, K.\u00a0Meel, and P.\u00a0Saxena. Quantitative Verification of Neural Networks and its Security Applications. In Proc. ACM SIGSAC Conf. on Computer and Communications Security (CCS), pages 1249\u20131264, 2019.","DOI":"10.1145\/3319535.3354245"},{"key":"31_CR11","unstructured":"M.\u00a0Bojarski, D.\u00a0Del\u00a0Testa, D.\u00a0Dworakowski, B.\u00a0Firner, B.\u00a0Flepp, P.\u00a0Goyal,L.\u00a0Jackel, M.\u00a0Monfort, U.\u00a0Muller, J.\u00a0Zhang, X.\u00a0Zhang, J.\u00a0Zhao, and K.\u00a0Zieba. End to End Learning for Self-Driving Cars, 2016. Technical Report. http:\/\/arxiv.org\/abs\/1604.07316."},{"key":"31_CR12","doi-asserted-by":"crossref","unstructured":"L.\u00a0Brunke, M.\u00a0Greeff, A.\u00a0Hall, Z.\u00a0Yuan, S.\u00a0Zhou, J.\u00a0Panerati, and A.\u00a0Schoellig. Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning. Annual Review of Control, Robotics, and Autonomous Systems, 5, 2021.","DOI":"10.1146\/annurev-control-042920-020211"},{"key":"31_CR13","doi-asserted-by":"crossref","unstructured":"H.\u00a0Chiang, A.\u00a0Faust, M.\u00a0Fiser, and A.\u00a0Francis. Learning Navigation Behaviors End-to-End with AutoRL. IEEE Robotics and Automation Letters (RA-L\/ICRA),4(2):2007\u20132014, 2019.","DOI":"10.1109\/LRA.2019.2899918"},{"key":"31_CR14","doi-asserted-by":"crossref","unstructured":"E.\u00a0Clarke, T.\u00a0Henzinger, H.\u00a0Veith, and R.\u00a0Bloem. Handbook of Model Checking, volume\u00a010. Springer, 2018.","DOI":"10.1007\/978-3-319-10575-8"},{"key":"31_CR15","unstructured":"D.\u00a0Corsi, E.\u00a0Marchesini, and A.\u00a0Farinelli. Formal Verification of Neural Networks for Safety-Critical Tasks in Deep Reinforcement Learning. In Proc. 37th Conf. on Uncertainty in Artificial Intelligence(UAI), pages 333\u2013343, 2021."},{"key":"31_CR16","doi-asserted-by":"crossref","unstructured":"L.\u00a0Deng and Y.\u00a0Liu. Deep Learning in Natural Language Processing. Springer, 2018.","DOI":"10.1007\/978-981-10-5209-5"},{"key":"31_CR17","doi-asserted-by":"crossref","unstructured":"S.\u00a0Dutta, X.\u00a0Chen, and S.\u00a0Sankaranarayanan. Reachability Analysis for Neural Feedback Systems using Regressive Polynomial Rule Inference. In Proc. 22nd ACM Int. Conf. on Hybrid Systems: Computation andControl (HSCC), pages 157\u2013168, 2019.","DOI":"10.1145\/3302504.3311807"},{"key":"31_CR18","doi-asserted-by":"crossref","unstructured":"S.\u00a0Dutta, S.\u00a0Jha, S.\u00a0Sankaranarayanan, and A.\u00a0Tiwari. Learning and Verification of Feedback Control Systems using Feedforward Neural Networks. IFAC-PapersOnLine, 51(16):151\u2013156, 2018.","DOI":"10.1016\/j.ifacol.2018.08.026"},{"key":"31_CR19","doi-asserted-by":"crossref","unstructured":"S.\u00a0Dutta, S.\u00a0Jha, S.\u00a0Sankaranarayanan, and A.\u00a0Tiwari. Output Range Analysis for Deep Feed forward Neural Networks. In Proc. 10th NASA Formal Methods Symposium (NFM), pages 121\u2013138, 2018.","DOI":"10.1007\/978-3-319-77935-5_9"},{"key":"31_CR20","doi-asserted-by":"crossref","unstructured":"R.\u00a0Ehlers. Formal Verification of Piece-Wise Linear Feed-Forward Neural Networks. In Proc. 15th Int. Symp. on Automated Technology for Verification and Analysis (ATVA), pages 269\u2013286, 2017.","DOI":"10.1007\/978-3-319-68167-2_19"},{"key":"31_CR21","doi-asserted-by":"crossref","unstructured":"T.\u00a0Eliyahu, Y.\u00a0Kazak, G.\u00a0Katz, and M.\u00a0Schapira. Verifying Learning-Augmented Systems. In Proc. Conf. of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM), pages 305\u2013318, 2021.","DOI":"10.1145\/3452296.3472936"},{"key":"31_CR22","doi-asserted-by":"crossref","unstructured":"N.\u00a0Fulton and A.\u00a0Platzer. Safe Reinforcement Learning via Formal Methods: Toward Safe Control through Proof and Learning. In Proc. 32nd AAAI Conf. on Artificial Intelligence (AAAI), 2018.","DOI":"10.1609\/aaai.v32i1.12107"},{"key":"31_CR23","unstructured":"J.\u00a0Garc\u0131a and F.\u00a0Fern\u00e1ndez. A Comprehensive Survey on Safe Reinforcement Learning. Journal of Machine Learning Research, 16(1):1437\u20131480, 2015."},{"key":"31_CR24","doi-asserted-by":"crossref","unstructured":"T.\u00a0Gehr, M.\u00a0Mirman, D.\u00a0Drachsler-Cohen, E.\u00a0Tsankov, S.\u00a0Chaudhuri, and M.\u00a0Vechev. AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation. In Proc. 39th IEEE Symposium on Security and Privacy (S &P), 2018.","DOI":"10.1109\/SP.2018.00058"},{"key":"31_CR25","unstructured":"I.\u00a0Goodfellow, Y.\u00a0Bengio, and A.\u00a0Courville. Deep Learning. MIT Press, 2016."},{"key":"31_CR26","doi-asserted-by":"crossref","unstructured":"D.\u00a0Gopinath, G.\u00a0Katz, C.\u00a0P\u01ces\u01cereanu, and C.\u00a0Barrett. DeepSafe: A Data-driven Approach for Assessing Robustness of Neural Networks. In Proc. 16th. Int. Symposium on Automated Technology for Verification and Analysis (ATVA), pages 3\u201319, 2018.","DOI":"10.1007\/978-3-030-01090-4_1"},{"key":"31_CR27","unstructured":"D.\u00a0Gunning. Explainable Artificial Intelligence (XAI), 2017. Defense Advanced Research Projects Agency (DARPA) Project."},{"key":"31_CR28","doi-asserted-by":"crossref","unstructured":"X.\u00a0Huang, M.\u00a0Kwiatkowska, S.\u00a0Wang, and M.\u00a0Wu. Safety Verification of Deep Neural Networks.In Proc. 29th Int. Conf. on Computer Aided Verification (CAV), pages 3\u201329, 2017.","DOI":"10.1007\/978-3-319-63387-9_1"},{"key":"31_CR29","doi-asserted-by":"crossref","unstructured":"R.\u00a0Ivanov, T.\u00a0Carpenter, J.\u00a0Weimer, R.\u00a0Alur, G.\u00a0Pappas, and I.\u00a0Lee. Verifying the Safety of Autonomous Systems with Neural Network Controllers.ACM Transactions on Embedded Computing Systems (TECS), 20(1):1\u201326, 2020.","DOI":"10.1145\/3419742"},{"key":"31_CR30","doi-asserted-by":"crossref","unstructured":"P.\u00a0Jin, J.\u00a0Tian, D.\u00a0Zhi, X.\u00a0Wen, and M.\u00a0Zhang. Trainify: A CEGAR-Driven Training and Verification Framework for Safe Deep Reinforcement Learning. In Proc. 34th Int. Conf. on Computer Aided Verification (CAV), pages 193\u2013218, 2022.","DOI":"10.1007\/978-3-031-13185-1_10"},{"key":"31_CR31","doi-asserted-by":"crossref","unstructured":"G.\u00a0Katz, C.\u00a0Barrett, D.\u00a0Dill, K.\u00a0Julian, and M.\u00a0Kochenderfer. Reluplex: An Efficient SMT Solver for Verifying Deep NeuralNetworks. In Proc. 29th Int. Conf. on Computer Aided Verification (CAV), pages 97\u2013117, 2017.","DOI":"10.1007\/978-3-319-63387-9_5"},{"key":"31_CR32","doi-asserted-by":"crossref","unstructured":"G.\u00a0Katz, C.\u00a0Barrett, D.\u00a0Dill, K.\u00a0Julian, and M.\u00a0Kochenderfer. Reluplex: a Calculus for Reasoning about Deep Neural Networks. Formal Methods in System Design (FMSD), 2021.","DOI":"10.1007\/s10703-021-00363-7"},{"key":"31_CR33","doi-asserted-by":"crossref","unstructured":"G.\u00a0Katz, D.\u00a0Huang, D.\u00a0Ibeling, K.\u00a0Julian, C.\u00a0Lazarus, R.\u00a0Lim, P.\u00a0Shah,S.\u00a0Thakoor, H.\u00a0Wu, A.\u00a0Zelji\u0107, D.\u00a0Dill, M.\u00a0Kochenderfer, and C.\u00a0Barrett. The Marabou Framework for Verification and Analysis of Deep Neural Networks. In Proc. 31st Int. Conf. on Computer Aided Verification (CAV), pages 443\u2013452, 2019.","DOI":"10.1007\/978-3-030-25540-4_26"},{"key":"31_CR34","doi-asserted-by":"crossref","unstructured":"Y.\u00a0Kazak, C.\u00a0Barrett, G.\u00a0Katz, and M.\u00a0Schapira. Verifying Deep-RL-Driven Systems. In Proc. 1st ACM SIGCOMM Workshop on Network Meets AI & ML(NetAI), pages 83\u201389, 2019.","DOI":"10.1145\/3341216.3342218"},{"key":"31_CR35","doi-asserted-by":"crossref","unstructured":"B.\u00a0K\u00f6nighofer, F.\u00a0Lorber, N.\u00a0Jansen, and R.\u00a0Bloem. Shield Synthesis for Reinforcement Learning. In Proc. Int. Symposium on Leveraging Applications of Formal Methods, Verification and Validation (ISoLA), pages 290\u2013306, 2020.","DOI":"10.1007\/978-3-030-61362-4_16"},{"key":"31_CR36","unstructured":"L.\u00a0Kuper, G.\u00a0Katz, J.\u00a0Gottschlich, K.\u00a0Julian, C.\u00a0Barrett, and M.\u00a0Kochenderfer. Toward Scalable Verification for Safety-Critical Deep Networks, 2018. Technical Report. https:\/\/arxiv.org\/abs\/1801.05950."},{"key":"31_CR37","unstructured":"Y.\u00a0Li. Deep Reinforcement Learning: An Overview, 2017. Technical Report. http:\/\/arxiv.org\/abs\/1701.07274."},{"key":"31_CR38","doi-asserted-by":"crossref","unstructured":"Y.\u00a0Liu, J.\u00a0Ding, and X.\u00a0Liu. Ipo: Interior-Point Policy Optimization under Constraints. In Proc. 34th AAAI Conf. on Artificial Intelligence (AAAI), pages 4940\u20134947, 2020.","DOI":"10.1609\/aaai.v34i04.5932"},{"key":"31_CR39","unstructured":"A.\u00a0Lomuscio and L.\u00a0Maganti. An Approach to Reachability Analysis for Feed-Forward ReLU NeuralNetworks, 2017. Technical Report. http:\/\/arxiv.org\/abs\/1706.07351."},{"key":"31_CR40","doi-asserted-by":"crossref","unstructured":"Z.\u00a0Lyu, C.\u00a0Y. Ko, Z.\u00a0Kong, N.\u00a0Wong, D.\u00a0Lin, and L.\u00a0Daniel. Fastened Crown: Tightened Neural Network Robustness Certificates. In Proc. 34th AAAI Conf. on Artificial Intelligence (AAAI), pages 5037\u20135044, 2020.","DOI":"10.1609\/aaai.v34i04.5944"},{"key":"31_CR41","doi-asserted-by":"crossref","unstructured":"E.\u00a0Marchesini, D.\u00a0Corsi, and A.\u00a0Farinelli. Benchmarking Safe Deep Reinforcement Learning in AquaticNavigation. In Proc. IEEE\/RSJ Int. Conf on Intelligent Robots and Systems(IROS), 2021.","DOI":"10.1109\/IROS51168.2021.9635925"},{"key":"31_CR42","doi-asserted-by":"crossref","unstructured":"E.\u00a0Marchesini, D.\u00a0Corsi, and A.\u00a0Farinelli. Exploring Safer Behaviors for Deep Reinforcement Learning. In Proc. 35th AAAI Conf. on Artificial Intelligence (AAAI), 2021.","DOI":"10.1609\/aaai.v36i7.20737"},{"key":"31_CR43","doi-asserted-by":"crossref","unstructured":"E.\u00a0Marchesini and A.\u00a0Farinelli. Discrete Deep Reinforcement Learning for Mapless Navigation. In Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), pages 10688\u201310694, 2020.","DOI":"10.1109\/ICRA40945.2020.9196739"},{"key":"31_CR44","unstructured":"V.\u00a0Mnih, K.\u00a0Kavukcuoglu, D.\u00a0Silver, A.\u00a0Graves, I.\u00a0Antonoglou, D.\u00a0Wierstra, and M.\u00a0Riedmiller. Playing Atari with Deep Reinforcement Learning, 2013. Technical Report. https:\/\/arxiv.org\/abs\/1312.5602."},{"key":"31_CR45","doi-asserted-by":"crossref","unstructured":"S.\u00a0M. Moosavi-Dezfooli, A.\u00a0Fawzi, O.\u00a0Fawzi, and P.\u00a0Frossard. Universal Adversarial Perturbations. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 1765\u20131773, 2017.","DOI":"10.1109\/CVPR.2017.17"},{"key":"31_CR46","doi-asserted-by":"crossref","unstructured":"C.\u00a0Nandkumar, P.\u00a0Shukla, and V.\u00a0Varma. Simulation of Indoor Localization and Navigation of Turtlebot 3using Real Time Object Detection. In Proc. Int. Conf. on Disruptive Technologies for Multi-Disciplinary Research and Applications (CENTCON), 2021.","DOI":"10.1109\/CENTCON52345.2021.9687937"},{"key":"31_CR47","doi-asserted-by":"crossref","unstructured":"M.\u00a0Pfeiffer, S.\u00a0Shukla, M.\u00a0Turchetta, C.\u00a0Cadena, A.\u00a0Krause, R.\u00a0Siegwart, and J.\u00a0Nieto. Reinforced Imitation: Sample Efficient Deep Reinforcement Learning for Mapless Navigation by Leveraging Prior Demonstrations. IEEE Robotics and Automation Letters, 3(4):4423\u20134430, 2018.","DOI":"10.1109\/LRA.2018.2869644"},{"key":"31_CR48","unstructured":"A.\u00a0Ray, J.\u00a0Achiam, and D.\u00a0Amodei. Benchmarking Safe Exploration in Deep Reinforcement Learning, 2019. Technical Report. https:\/\/cdn.openai.com\/safexp-short.pdf."},{"key":"31_CR49","unstructured":"J.\u00a0Roy, R.\u00a0Girgis, J.\u00a0Romoff, P.\u00a0Bacon, and C.\u00a0Pal. Direct Behavior Specification via Constrained Reinforcement Learning, 2021. Technical Report. https:\/\/arxiv.org\/abs\/2112.12228."},{"key":"31_CR50","unstructured":"J.\u00a0Schulman, F.\u00a0Wolski, P.\u00a0Dhariwal, A.\u00a0Radford, and O.\u00a0Klimov. Proximal Policy Optimization Algorithms, 2017. Technical Report. http:\/\/arxiv.org\/abs\/1707.06347."},{"key":"31_CR51","unstructured":"K.\u00a0Simonyan and A.\u00a0Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014. Technical Report. http:\/\/arxiv.org\/abs\/1409.1556."},{"key":"31_CR52","doi-asserted-by":"crossref","unstructured":"G.\u00a0Singh, T.\u00a0Gehr, M.\u00a0Puschel, and M.\u00a0Vechev. An Abstract Domain for Certifying Neural Networks. In Proc. 46th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL), 2019.","DOI":"10.1145\/3290354"},{"key":"31_CR53","unstructured":"A.\u00a0Stooke, J.\u00a0Achiam, and P.\u00a0Abbeel. Responsive Safety in Reinforcement Learning by Pid LagrangianMethods. In Proc. 37th Int. Conf. on Machine Learning (ICML), pages 9133\u20139143, 2020."},{"key":"31_CR54","doi-asserted-by":"crossref","unstructured":"X.\u00a0Sun, H.\u00a0Khedr, and Y.\u00a0Shoukry. Formal Verification of Neural Network Controlled AutonomousSystems. In Proc. 22nd ACM Int. Conf. on Hybrid Systems: Computation andControl (HSCC), 2019.","DOI":"10.1145\/3302504.3311802"},{"key":"31_CR55","unstructured":"R.\u00a0Sutton and A.\u00a0Barto. Reinforcement Learning: An Introduction. MIT press, 2018."},{"key":"31_CR56","unstructured":"R.\u00a0Sutton, D.\u00a0McAllester, S.\u00a0Singh, and Y.\u00a0Mansour. Policy Gradient Methods for Reinforcement Learning with Function Approximation. In Proc. Advances in Neural Information Processing Systems(NeurIPS), 1999."},{"key":"31_CR57","unstructured":"C.\u00a0Szegedy, W.\u00a0Zaremba, I.\u00a0Sutskever, J.\u00a0Bruna, D.\u00a0Erhan, I.\u00a0Goodfellow, and R.\u00a0Fergus. Intriguing Properties of Neural Networks, 2013. Technical Report. http:\/\/arxiv.org\/abs\/1312.6199."},{"key":"31_CR58","doi-asserted-by":"crossref","unstructured":"L.\u00a0Tai, G.\u00a0Paolo, and M.\u00a0Liu. Virtual-to-Real Deep Reinforcement Learning: Continuous Control ofMobile Robots for Mapless Navigation. In Proc. IEEE\/RSJ Int. Conf. on Intelligent Robots and Systems(IROS), pages 31\u201336, 2017.","DOI":"10.1109\/IROS.2017.8202134"},{"key":"31_CR59","unstructured":"V.\u00a0Tjeng, K.\u00a0Xiao, and R.\u00a0Tedrake. Evaluating Robustness of Neural Networks with Mixed Integer Programming, 2017. Technical Report. http:\/\/arxiv.org\/abs\/1711.07356."},{"key":"31_CR60","doi-asserted-by":"crossref","unstructured":"H.\u00a0Van\u00a0Hasselt, A.\u00a0Guez, and D.\u00a0Silver. Deep Reinforcement Learning with Double Q-Learning. In Proc. 30th AAAI Conf. on Artificial Intelligence (AAAI), 2016.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"31_CR61","doi-asserted-by":"crossref","unstructured":"M.\u00a0Vasi\u0107, A.\u00a0Petrovi\u0107, K.\u00a0Wang, M.\u00a0Nikoli\u0107, R.\u00a0Singh, and S.\u00a0Khurshid. Mo\u00cbT: Mixture of Expert Trees and its Application to Verifiable Reinforcement Learning. Neural Networks, 151:34\u201347, 2022.","DOI":"10.1016\/j.neunet.2022.03.022"},{"key":"31_CR62","unstructured":"A.\u00a0Wachi and Y.\u00a0Sui. Safe Reinforcement Learning in Constrained Markov Decision Processes. In Proc. 37th Int. Conf. on Machine Learning (ICML), pages 9797\u20139806, 2020."},{"key":"31_CR63","doi-asserted-by":"crossref","unstructured":"A.\u00a0Wahid, A.\u00a0Toshev, M.\u00a0Fiser, and T.\u00a0Lee. Long Range Neural Navigation Policies for the Real World. In Proc. IEEE\/RSJ Int. Conf. on Intelligent Robots and Systems(IROS), pages 82\u201389, 2019.","DOI":"10.1109\/IROS40897.2019.8968004"},{"key":"31_CR64","unstructured":"S.\u00a0Wang, K.\u00a0Pei, J.\u00a0Whitehouse, J.\u00a0Yang, and S.\u00a0Jana. Formal Security Analysis of Neural Networks using Symbolic Intervals. In Proc. 27th USENIX Security Symposium, pages 1599\u20131614, 2018."},{"key":"31_CR65","doi-asserted-by":"crossref","unstructured":"K.\u00a0Yoneda, H.\u00a0Tehrani, T.\u00a0Ogawa, N.\u00a0Hukuyama, and S.\u00a0Mita. Lidar Scan Feature for Localization with Highly Precise 3-D Map. In Proc. IEEE Intelligent Vehicles Symposium (IV), pages 1345\u20131350, 2014.","DOI":"10.1109\/IVS.2014.6856596"},{"key":"31_CR66","unstructured":"H.\u00a0Zhang, M.\u00a0Shinn, A.\u00a0Gupta, A.\u00a0Gurfinkel, N.\u00a0Le, and N.\u00a0Narodytska. Verification of Recurrent Neural Networks for Cognitive Tasks viaReachability Analysis. In Proc. 24th European Conf. on Artificial Intelligence (ECAI), pages 1690\u20131697, 2020."},{"key":"31_CR67","unstructured":"J.\u00a0Zhang, J.\u00a0Kim, B.\u00a0O\u2019Donoghue, and S.\u00a0Boyd. Sample Efficient Reinforcement Learning with REINFORCE, 2020. Technical Report. https:\/\/arxiv.org\/abs\/2010.11364."},{"key":"31_CR68","doi-asserted-by":"crossref","unstructured":"J.\u00a0Zhang, J.\u00a0Springenberg, J.\u00a0Boedecker, and W.\u00a0Burgard. Deep Reinforcement Learning with Successor Features for Navigationacross Similar Environments. In Proc. IEEE\/RSJ Int. Conf. on Intelligent Robots and Systems(IROS), 2017.","DOI":"10.1109\/IROS.2017.8206049"},{"key":"31_CR69","doi-asserted-by":"crossref","unstructured":"L.\u00a0Zhang, R.\u00a0Zhang, T.\u00a0Wu, R.\u00a0Weng, M.\u00a0Han, and Y.\u00a0Zhao. Safe Reinforcement Learning with Stability Guarantee for MotionPlanning of Autonomous Vehicles.IEEE Transactions on Neural Networks and Learning Systems,32(12): 5435\u20135444, 2021.","DOI":"10.1109\/TNNLS.2021.3084685"},{"key":"31_CR70","unstructured":"O.\u00a0Zhelo, J.\u00a0Zhang, L.\u00a0Tai, M.\u00a0Liu, and W. Burgard. Curiosity-Driven Exploration for Mapless Navigation with Deep Reinforcement Learning, 2018. Technical Report. https:\/\/arxiv.org\/abs\/1804.00456."}],"container-title":["Lecture Notes in Computer Science","Tools and Algorithms for the Construction and Analysis of Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-30823-9_31","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,4]],"date-time":"2023-08-04T10:06:46Z","timestamp":1691143606000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-30823-9_31"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031308222","9783031308239"],"references-count":70,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-30823-9_31","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"22 April 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"TACAS","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Tools and Algorithms for the Construction and Analysis of Systems","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Paris","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"France","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2023","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"22 April 2023","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"27 April 2023","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"29","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"tacas2023","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/etaps.org\/2023\/tacas","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Double-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"EasyChair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"169","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"56","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"6","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"33% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"11","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Yes","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}