{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,30]],"date-time":"2025-06-30T16:05:23Z","timestamp":1751299523219,"version":"3.40.3"},"publisher-location":"Cham","reference-count":59,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783031656323"},{"type":"electronic","value":"9783031656330"}],"license":[{"start":{"date-parts":[[2024,1,1]],"date-time":"2024-01-01T00:00:00Z","timestamp":1704067200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,26]],"date-time":"2024-07-26T00:00:00Z","timestamp":1721952000000},"content-version":"vor","delay-in-days":207,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In reinforcement learning, an agent incrementally refines a behavioral policy through a series of episodic interactions with its environment. This process can be characterized as <jats:italic>explicit<\/jats:italic> reinforcement learning, as it deals with explicit states and concrete transitions. Building upon the concept of symbolic model checking, we propose a <jats:italic>symbolic<\/jats:italic> variant of reinforcement learning, in which sets of states are represented through predicates and transitions are represented by predicate transformers. Drawing inspiration from regular model checking, we choose <jats:italic>regular languages<\/jats:italic> over the states as our predicates, and <jats:italic>rational transductions<\/jats:italic> as predicate transformations. We refer to this framework as <jats:italic>regular reinforcement learning<\/jats:italic>, and study its utility as a symbolic approach to reinforcement learning. Theoretically, we establish results around decidability, approximability, and efficient learnability in the context of regular reinforcement learning. Towards practical applications, we develop a deep regular reinforcement learning algorithm, enabled by the use of graph neural networks. We showcase the applicability and effectiveness of (deep) regular reinforcement learning through empirical evaluation on a diverse set of case studies.<\/jats:p>","DOI":"10.1007\/978-3-031-65633-0_9","type":"book-chapter","created":{"date-parts":[[2024,7,25]],"date-time":"2024-07-25T07:03:08Z","timestamp":1721890988000},"page":"184-208","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Regular Reinforcement Learning"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5722-4847","authenticated-orcid":false,"given":"Taylor","family":"Dohmen","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4220-3212","authenticated-orcid":false,"given":"Mateo","family":"Perez","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2085-2003","authenticated-orcid":false,"given":"Fabio","family":"Somenzi","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9346-0126","authenticated-orcid":false,"given":"Ashutosh","family":"Trivedi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,7,26]]},"reference":[{"key":"9_CR1","doi-asserted-by":"publisher","unstructured":"Abadi, E., Brafman, R.I.: Learning and solving regular decision processes. In: International Joint Conference on Artificial Intelligence, IJCAI, pp. 1948\u20131954. ijcai.org (2020). https:\/\/doi.org\/10.24963\/ijcai.2020\/270","DOI":"10.24963\/ijcai.2020\/270"},{"issue":"2","key":"9_CR2","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1007\/S10009-011-0216-8","volume":"14","author":"PA Abdulla","year":"2012","unstructured":"Abdulla, P.A.: Regular model checking. Int. J. Softw. Tools Technol. Transfer 14(2), 109\u2013118 (2012). https:\/\/doi.org\/10.1007\/S10009-011-0216-8","journal-title":"Int. J. Softw. Tools Technol. Transfer"},{"key":"9_CR3","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1007\/978-3-030-91384-7_5","volume-title":"Model Checking, Synthesis, and Learning","author":"PA Abdulla","year":"2021","unstructured":"Abdulla, P.A.: Regular model checking: evolution and perspectives. In: Olderog, E.-R., Steffen, B., Yi, W. (eds.) Model Checking, Synthesis, and Learning. LNCS, vol. 13030, pp. 78\u201396. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-91384-7_5"},{"key":"9_CR4","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1007\/3-540-45657-0_47","volume-title":"Computer Aided Verification","author":"PA Abdulla","year":"2002","unstructured":"Abdulla, P.A., Jonsson, B., Mahata, P., d\u2019Orso, J.: Regular tree model checking. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp. 555\u2013568. Springer, Heidelberg (2002). https:\/\/doi.org\/10.1007\/3-540-45657-0_47"},{"key":"9_CR5","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1007\/978-3-540-28644-8_3","volume-title":"CONCUR 2004 - Concurrency Theory","author":"PA Abdulla","year":"2004","unstructured":"Abdulla, P.A., Jonsson, B., Nilsson, M., Saksena, M.: A survey of regular model checking. In: Gardner, P., Yoshida, N. (eds.) CONCUR 2004. LNCS, vol. 3170, pp. 35\u201348. Springer, Heidelberg (2004). https:\/\/doi.org\/10.1007\/978-3-540-28644-8_3"},{"issue":"1\u20132","key":"9_CR6","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1016\/j.jlap.2006.02.001","volume":"69","author":"PA Abdulla","year":"2006","unstructured":"Abdulla, P.A., Legay, A., d\u2019Orso, J., Rezine, A.: Tree regular model checking: a simulation-based approach. J. Logic Algebraic Program. 69(1\u20132), 93\u2013121 (2006). https:\/\/doi.org\/10.1016\/j.jlap.2006.02.001","journal-title":"J. Logic Algebraic Program."},{"key":"9_CR7","unstructured":"Agarwal, A., Jiang, N., Kakade, S.M., Sun, W.: Reinforcement learning: Theory and algorithms. CS Department, UW Seattle, Seattle, WA, USA, Technical Report 32, 96 (2019)"},{"key":"9_CR8","series-title":"Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence)","doi-asserted-by":"publisher","first-page":"347","DOI":"10.1007\/11916277_24","volume-title":"Logic for Programming, Artificial Intelligence, and Reasoning","author":"C Baier","year":"2006","unstructured":"Baier, C., Bertrand, N., Schnoebelen, P.: On computing fixpoints in well-structured regular model checking, with applications to lossy channel systems. In: Hermann, M., Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 347\u2013361. Springer, Heidelberg (2006). https:\/\/doi.org\/10.1007\/11916277_24"},{"key":"9_CR9","doi-asserted-by":"crossref","unstructured":"Bause, F., Kritzinger, P.S.: Stochastic Petri Nets - an Introduction to the Theory, 2nd edn. Vieweg (2002)","DOI":"10.1007\/978-3-322-86501-4"},{"key":"9_CR10","unstructured":"Berner, C., Brockman, G., Chan, B., Cheung, V., D\u0119biak, P., Dennison, C., Farhi, D., Fischer, Q., Hashme, S., Hesse, C., et\u00a0al.: Dota 2 with large scale deep reinforcement learning. arXiv preprint arXiv:1912.06680 (2019)"},{"issue":"1\u20133","key":"9_CR11","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1016\/S0304-3975(03)00314-1","volume":"309","author":"B Boigelot","year":"2003","unstructured":"Boigelot, B.: On iterating linear transformations over recognizable sets of integers. Theoret. Comput. Sci. 309(1\u20133), 413\u2013468 (2003). https:\/\/doi.org\/10.1016\/S0304-3975(03)00314-1","journal-title":"Theoret. Comput. Sci."},{"key":"9_CR12","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1007\/978-3-540-45069-6_24","volume-title":"Computer Aided Verification","author":"B Boigelot","year":"2003","unstructured":"Boigelot, B., Legay, A., Wolper, P.: Iterating transducers in the large. In: Hunt, W.A., Somenzi, F. (eds.) CAV 2003. LNCS, vol. 2725, pp. 223\u2013235. Springer, Heidelberg (2003). https:\/\/doi.org\/10.1007\/978-3-540-45069-6_24"},{"key":"9_CR13","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1007\/978-3-540-24730-2_41","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"B Boigelot","year":"2004","unstructured":"Boigelot, B., Legay, A., Wolper, P.: Omega-regular model checking. In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 561\u2013575. Springer, Heidelberg (2004). https:\/\/doi.org\/10.1007\/978-3-540-24730-2_41"},{"key":"9_CR14","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1007\/978-3-540-31980-1_2","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"A Bouajjani","year":"2005","unstructured":"Bouajjani, A., Habermehl, P., Moro, P., Vojnar, T.: Verifying programs with dynamic 1-selector-linked structures in regular model checking. In: Halbwachs, N., Zuck, L.D. (eds.) TACAS 2005. LNCS, vol. 3440, pp. 13\u201329. Springer, Heidelberg (2005). https:\/\/doi.org\/10.1007\/978-3-540-31980-1_2"},{"key":"9_CR15","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1007\/11823230_5","volume-title":"Static Analysis","author":"A Bouajjani","year":"2006","unstructured":"Bouajjani, A., Habermehl, P., Rogalewicz, A., Vojnar, T.: Abstract regular tree model checking of complex dynamic data structures. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 52\u201370. Springer, Heidelberg (2006). https:\/\/doi.org\/10.1007\/11823230_5"},{"issue":"2","key":"9_CR16","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1007\/s10009-011-0205-y","volume":"14","author":"A Bouajjani","year":"2012","unstructured":"Bouajjani, A., Habermehl, P., Rogalewicz, A., Vojnar, T.: Abstract regular (tree) model checking. Int. J. Softw. Tools Technol. Transfer 14(2), 167\u2013191 (2012). https:\/\/doi.org\/10.1007\/s10009-011-0205-y","journal-title":"Int. J. Softw. Tools Technol. Transfer"},{"key":"9_CR17","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"372","DOI":"10.1007\/978-3-540-27813-9_29","volume-title":"Computer Aided Verification","author":"A Bouajjani","year":"2004","unstructured":"Bouajjani, A., Habermehl, P., Vojnar, T.: Abstract regular model checking. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 372\u2013386. Springer, Heidelberg (2004). https:\/\/doi.org\/10.1007\/978-3-540-27813-9_29"},{"key":"9_CR18","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1007\/10722167_31","volume-title":"Computer Aided Verification","author":"A Bouajjani","year":"2000","unstructured":"Bouajjani, A., Jonsson, B., Nilsson, M., Touili, T.: Regular model checking. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 403\u2013418. Springer, Heidelberg (2000). https:\/\/doi.org\/10.1007\/10722167_31"},{"issue":"2","key":"9_CR19","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1007\/s10009-011-0208-8","volume":"14","author":"A Bouajjani","year":"2012","unstructured":"Bouajjani, A., Touili, T.: Widening techniques for regular tree model checking. Int. J. Softw. Tools Technol. Transfer 14(2), 145\u2013165 (2012). https:\/\/doi.org\/10.1007\/s10009-011-0208-8","journal-title":"Int. J. Softw. Tools Technol. Transfer"},{"key":"9_CR20","doi-asserted-by":"publisher","unstructured":"Brafman, R.I., Giacomo, G.D.: Regular decision processes: A model for non-markovian domains. In: International Joint Conference on Artificial Intelligence, IJCAI, pp. 5516\u20135522. ijcai.org (2019). https:\/\/doi.org\/10.24963\/ijcai.2019\/766","DOI":"10.24963\/ijcai.2019\/766"},{"key":"9_CR21","unstructured":"Brafman, R.I., Giacomo, G.D.: Regular decision processes: modelling dynamic systems without using hidden variables. In: International Conference on Autonomous Agents and MultiAgent Systems, AAMAS, pp. 1844\u20131846. International Foundation for Autonomous Agents and Multiagent Systems (2019). http:\/\/dl.acm.org\/citation.cfm?id=3331938"},{"key":"9_CR22","doi-asserted-by":"publisher","unstructured":"Camacho, A., Varley, J., Zeng, A., Jain, D., Iscen, A., Kalashnikov, D.: Reward machines for vision-based robotic manipulation. In: International Conference on Robotics and Automation, ICRA, pp. 14284\u201314290. IEEE (2021). https:\/\/doi.org\/10.1109\/ICRA48506.2021.9561927","DOI":"10.1109\/ICRA48506.2021.9561927"},{"key":"9_CR23","doi-asserted-by":"publisher","unstructured":"Chatterjee, K., Fu, H., Goharshady, A.K., Okati, N.: Computational approaches for stochastic shortest path on succinct MDPs. In: International Joint Conference on Artificial Intelligence, IJCAI, pp. 4700\u20134707. ijcai.org (2018). https:\/\/doi.org\/10.24963\/ijcai.2018\/653","DOI":"10.24963\/ijcai.2018\/653"},{"key":"9_CR24","doi-asserted-by":"publisher","unstructured":"Corazza, J., Gavran, I., Neider, D.: Reinforcement learning with stochastic reward machines. In: Conference on Artificial Intelligence, AAAI vol. 36, no. 6, pp. 6429\u20136436 (2022). https:\/\/doi.org\/10.1609\/aaai.v36i6.20594","DOI":"10.1609\/aaai.v36i6.20594"},{"key":"9_CR25","doi-asserted-by":"publisher","unstructured":"Dann, M., Yao, Y., Alechina, N., Logan, B., Thangarajah, J.: Multi-agent intention progression with reward machines. In: International Joint Conference on Artificial Intelligence, IJCAI, pp. 215\u2013222. ijcai.org (2022). https:\/\/doi.org\/10.24963\/ijcai.2022\/31","DOI":"10.24963\/ijcai.2022\/31"},{"key":"9_CR26","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1007\/978-3-030-86593-1_13","volume-title":"Fundamentals of Computation Theory","author":"V Dave","year":"2021","unstructured":"Dave, V., Dohmen, T., Krishna, S.N., Trivedi, A.: Regular model checking with regular relations. In: Bampis, E., Pagourtzis, A. (eds.) FCT 2021. LNCS, vol. 12867, pp. 190\u2013203. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-86593-1_13"},{"issue":"9\u201310","key":"9_CR27","doi-asserted-by":"publisher","first-page":"1498","DOI":"10.1016\/j.artint.2011.01.001","volume":"175","author":"KV Delgado","year":"2011","unstructured":"Delgado, K.V., Sanner, S., De Barros, L.N.: Efficient solutions to factored MDPs with imprecise transition probabilities. Artif. Intell. 175(9\u201310), 1498\u20131527 (2011). https:\/\/doi.org\/10.1016\/j.artint.2011.01.001","journal-title":"Artif. Intell."},{"key":"9_CR28","unstructured":"Dijkstra, E.W.: Algol 60 translation: An algol 60 translator for the x1 and making a translator for algol 60. Stichting Mathematisch Centrum. Rekenafdeling (MR 34\/61) (1961)"},{"key":"9_CR29","doi-asserted-by":"publisher","unstructured":"Dohmen, T., Topper, N., Atia, G.K., Beckus, A., Trivedi, A., Velasquez, A.: Inferring probabilistic reward machines from non-Markovian reward signals for reinforcement learning. In: International Conference on Automated Planning and Scheduling, ICAPS, pp. 574\u2013582. AAAI Press (2022). https:\/\/doi.org\/10.1609\/icaps.v32i1.19844","DOI":"10.1609\/icaps.v32i1.19844"},{"key":"9_CR30","unstructured":"Feinberg, E.A.: Total expected discounted reward MDPs: existence of optimal policies (2011). https:\/\/onlinelibrary.wiley.com\/doi\/abs\/10.1002\/9780470400531.eorms0906"},{"key":"9_CR31","unstructured":"Goodfellow, I.J., Bengio, Y., Courville, A.C.: Deep Learning. Adaptive Computation and Machine Learning, MIT Press (2016). http:\/\/www.deeplearningbook.org\/"},{"key":"9_CR32","doi-asserted-by":"publisher","first-page":"399","DOI":"10.1613\/jair.1000","volume":"19","author":"C Guestrin","year":"2003","unstructured":"Guestrin, C., Koller, D., Parr, R., Venkataraman, S.: Efficient solution algorithms for factored MDPs. J. Artif. Intell. Res. 19, 399\u2013468 (2003). https:\/\/doi.org\/10.1613\/jair.1000","journal-title":"J. Artif. Intell. Res."},{"key":"9_CR33","doi-asserted-by":"publisher","unstructured":"Habermehl, P., Vojnar, T.: Regular model checking using inference of regular languages. In: International Workshop on Verification of Infinite-State Systems, INFINITY, pp. 21\u201336. Electronic Notes in Theoretical Computer Science, Elsevier (2004). https:\/\/doi.org\/10.1016\/j.entcs.2005.01.044","DOI":"10.1016\/j.entcs.2005.01.044"},{"key":"9_CR34","unstructured":"Icarte, R.T.: Reward Machines. Ph.D. thesis, University of Toronto, Canada (2022). http:\/\/hdl.handle.net\/1807\/110754"},{"key":"9_CR35","unstructured":"Icarte, R.T., Klassen, T.Q., Valenzano, R.A., McIlraith, S.A.: Using reward machines for high-level task specification and decomposition in reinforcement learning. In: International Conference on Machine Learning, ICML. Proceedings of Machine Learning Research, vol.\u00a080, pp. 2112\u20132121. PMLR (2018). http:\/\/proceedings.mlr.press\/v80\/icarte18a.html"},{"key":"9_CR36","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1613\/jair.1.12440","volume":"73","author":"RT Icarte","year":"2022","unstructured":"Icarte, R.T., Klassen, T.Q., Valenzano, R.A., McIlraith, S.A.: Reward machines: exploiting reward function structure in reinforcement learning. J. Artif. Intell. Res. 73, 173\u2013208 (2022). https:\/\/doi.org\/10.1613\/jair.1.12440","journal-title":"J. Artif. Intell. Res."},{"key":"9_CR37","unstructured":"Icarte, R.T., Waldie, E., Klassen, T.Q., Valenzano, R.A., Castro, M.P., McIlraith, S.A.: Learning reward machines for partially observable reinforcement learning. In: Conference on Neural Information Processing Systems, NeurIPS, pp. 15497\u201315508 (2019). https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/532435c44bec236b471a47a88d63513d-Abstract.html"},{"key":"9_CR38","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/978-3-540-73368-3_16","volume-title":"Computer Aided Verification","author":"B Jonsson","year":"2007","unstructured":"Jonsson, B., Saksena, M.: Systematic acceleration in regular model checking. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 131\u2013144. Springer, Heidelberg (2007). https:\/\/doi.org\/10.1007\/978-3-540-73368-3_16"},{"key":"9_CR39","unstructured":"Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations, ICLR. OpenReview.net (2017). https:\/\/openreview.net\/forum?id=SJU4ayYgl"},{"issue":"2","key":"9_CR40","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1007\/s10009-011-0209-7","volume":"14","author":"A Legay","year":"2012","unstructured":"Legay, A.: Extrapolating (omega-)regular model checking. Int. J. Softw. Tools Technol. Transfer 14(2), 119\u2013143 (2012). https:\/\/doi.org\/10.1007\/s10009-011-0209-7","journal-title":"Int. J. Softw. Tools Technol. Transfer"},{"key":"9_CR41","doi-asserted-by":"publisher","unstructured":"Legay, A., Wolper, P.: On (omega-)regular model checking. ACM Trans. Computat. Logic 12(1), 2:1\u20132:46 (2010). https:\/\/doi.org\/10.1145\/1838552.1838554","DOI":"10.1145\/1838552.1838554"},{"key":"9_CR42","doi-asserted-by":"publisher","unstructured":"Lenaers, N., van Otterlo, M.: Regular decision processes for grid worlds. In: Leiva, L.A., Pruski, C., Markovich, R., Najjar, A., Schommer, C. (eds.) Benelux Conference on Artificial Intelligence, BNAIC\/Benelearn. Communications in Computer and Information Science, vol.\u00a01530, pp. 218\u2013238. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-93842-0_13","DOI":"10.1007\/978-3-030-93842-0_13"},{"key":"9_CR43","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1007\/978-3-030-91384-7_6","volume-title":"Model Checking, Synthesis, and Learning","author":"AW Lin","year":"2021","unstructured":"Lin, A.W., R\u00fcmmer, P.: Regular model checking revisited. In: Olderog, E.-R., Steffen, B., Yi, W. (eds.) Model Checking, Synthesis, and Learning. LNCS, vol. 13030, pp. 97\u2013114. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-91384-7_6"},{"key":"9_CR44","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","volume":"518","author":"V Mnih","year":"2015","unstructured":"Mnih, V., et al.: Human-level control through reinforcement learning. Nature 518, 529\u2013533 (2015)","journal-title":"Nature"},{"key":"9_CR45","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1007\/978-3-642-38088-4_2","volume-title":"NASA Formal Methods","author":"D Neider","year":"2013","unstructured":"Neider, D., Jansen, N.: Regular model checking using solver technologies and automata learning. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871, pp. 16\u201331. Springer, Heidelberg (2013). https:\/\/doi.org\/10.1007\/978-3-642-38088-4_2"},{"issue":"2","key":"9_CR46","doi-asserted-by":"publisher","first-page":"358","DOI":"10.1090\/S0002-9947-1953-0053041-6","volume":"74","author":"HG Rice","year":"1953","unstructured":"Rice, H.G.: Classes of recursively enumerable sets and their decision problems. Trans. Am. Math. Soc. 74(2), 358\u2013366 (1953)","journal-title":"Trans. Am. Math. Soc."},{"key":"9_CR47","doi-asserted-by":"publisher","unstructured":"Ronca, A., Giacomo, G.D.: Efficient PAC reinforcement learning in regular decision processes. In: International Joint Conference on Artificial Intelligence, IJCAI, pp. 2026\u20132032. ijcai.org (2021). https:\/\/doi.org\/10.24963\/ijcai.2021\/279","DOI":"10.24963\/ijcai.2021\/279"},{"key":"9_CR48","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs\/1707.06347 arxiv:1707.06347 (2017)"},{"key":"9_CR49","doi-asserted-by":"publisher","first-page":"484","DOI":"10.1038\/nature16961","volume":"529","author":"D Silver","year":"2016","unstructured":"Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529, 484\u2013489 (2016)","journal-title":"Nature"},{"key":"9_CR50","unstructured":"Sipser, M.: Introduction to the Theory of Computation, chap.\u00a01. PWS Publishing Company (1997)"},{"key":"9_CR51","unstructured":"Sutton, R.S., Barto, A.G.: Reinforcement learning - an Introduction. Adaptive Computation and Machine Learning, MIT Press (1998). https:\/\/www.worldcat.org\/oclc\/37293240"},{"key":"9_CR52","doi-asserted-by":"publisher","unstructured":"Touili, T.: Regular model checking using widening techniques. In: Verification of Parameterized Systems, VEPAS 2001, Satellite Workshop of ICALP, pp. 342\u2013356. Electronic Notes in Theoretical Computer Science, Elsevier (2001). https:\/\/doi.org\/10.1016\/S1571-0661(04)00187-2","DOI":"10.1016\/S1571-0661(04)00187-2"},{"key":"9_CR53","doi-asserted-by":"publisher","unstructured":"Valiant, L.G.: A theory of the learnable. In: Symposium on Theory of Computing, STOC, pp. 436\u2013445. ACM (1984). https:\/\/doi.org\/10.1145\/800057.808710","DOI":"10.1145\/800057.808710"},{"key":"9_CR54","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/BF00992698","volume":"8","author":"CJCH Watkins","year":"1992","unstructured":"Watkins, C.J.C.H., Dayan, P.: Technical note q-learning. Mach. Learn. 8, 279\u2013292 (1992). https:\/\/doi.org\/10.1007\/BF00992698","journal-title":"Mach. Learn."},{"key":"9_CR55","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"88","DOI":"10.1007\/BFb0028736","volume-title":"Computer Aided Verification","author":"P Wolper","year":"1998","unstructured":"Wolper, P., Boigelot, B.: Verifying systems with infinite but regular state spaces. In: Hu, A.J., Vardi, M.Y. (eds.) CAV 1998. LNCS, vol. 1427, pp. 88\u201397. Springer, Heidelberg (1998). https:\/\/doi.org\/10.1007\/BFb0028736"},{"issue":"1","key":"9_CR56","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1109\/TNNLS.2020.2978386","volume":"32","author":"Z Wu","year":"2021","unstructured":"Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Philip, S.Y.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32(1), 4\u201324 (2021). https:\/\/doi.org\/10.1109\/TNNLS.2020.2978386","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"9_CR57","doi-asserted-by":"publisher","unstructured":"Xu, Z., Gavran, I., Ahmad, Y., Majumdar, R., Neider, D., Topcu, U., Wu, B.: Joint inference of reward machines and policies for reinforcement learning. In: International Conference on Automated Planning and Scheduling, Nancy, France, October 26-30, 2020, pp. 590\u2013598. AAAI Press (2020). https:\/\/doi.org\/10.1609\/icaps.v30i1.6756","DOI":"10.1609\/icaps.v30i1.6756"},{"key":"9_CR58","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1007\/978-3-030-84060-0_8","volume-title":"Machine Learning and Knowledge Extraction","author":"Z Xu","year":"2021","unstructured":"Xu, Z., Wu, B., Ojha, A., Neider, D., Topcu, U.: Active finite reward automaton inference and reinforcement learning using queries and counterexamples. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2021. LNCS, vol. 12844, pp. 115\u2013135. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-84060-0_8"},{"key":"9_CR59","unstructured":"Zhou, W., Li, W.: A hierarchical Bayesian approach to inverse reinforcement learning with symbolic reward machines. In: International Conference on Machine Learning, ICML. Proceedings of Machine Learning Research, vol.\u00a0162, pp. 27159\u201327178. PMLR (2022). https:\/\/proceedings.mlr.press\/v162\/zhou22b.html"}],"container-title":["Lecture Notes in Computer Science","Computer Aided Verification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-65633-0_9","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,25]],"date-time":"2024-07-25T07:04:36Z","timestamp":1721891076000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-65633-0_9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"ISBN":["9783031656323","9783031656330"],"references-count":59,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-65633-0_9","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2024]]},"assertion":[{"value":"26 July 2024","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"CAV","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Computer Aided Verification","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Montreal, QC","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Canada","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2024","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"24 July 2024","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"27 July 2024","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"36","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"cav2024","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"http:\/\/i-cav.org\/2024\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}