{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T17:13:56Z","timestamp":1765041236440,"version":"3.37.3"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T00:00:00Z","timestamp":1648598400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T00:00:00Z","timestamp":1648598400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100011914","name":"M\u00e4lardalen University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100011914","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Softw Tools Technol Transfer"],"published-print":{"date-parts":[[2022,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Path planning and task scheduling are two challenging problems in the design of multiple autonomous agents. Both problems can be solved by the use of exhaustive search techniques such as model checking and algorithmic game theory. However, model checking suffers from the infamous state-space explosion problem that makes it inefficient at solving the problems when the number of agents is large, which is often the case in realistic scenarios. In this paper, we propose a new version of our novel approach called MCRL that integrates model checking and reinforcement learning to alleviate this scalability limitation. We apply this new technique to synthesize path planning and task scheduling strategies for multiple autonomous agents. Our method is capable of handling a larger number of agents if compared to what is feasibly handled by the model-checking technique alone. Additionally, MCRL also guarantees the correctness of the synthesis results via post-verification. The method is implemented in UPPAAL STRATEGO and leverages our tool <jats:italic>MALTA<\/jats:italic> for model generation, such that one can use the method with less effort of model construction and higher efficiency of learning than those of the original MCRL. We demonstrate the feasibility of our approach on an industrial case study: an autonomous quarry, and discuss the strengths and weaknesses of the methods.<\/jats:p>","DOI":"10.1007\/s10009-022-00657-z","type":"journal-article","created":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T07:03:31Z","timestamp":1648623811000},"page":"395-414","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Verifiable strategy synthesis for multiple autonomous agents: a scalable approach"],"prefix":"10.1007","volume":"24","author":[{"given":"Rong","family":"Gu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter G.","family":"Jensen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Danny B.","family":"Poulsen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cristina","family":"Seceleanu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eduard","family":"Enoiu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kristina","family":"Lundqvist","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,3,30]]},"reference":[{"issue":"2","key":"657_CR1","doi-asserted-by":"publisher","first-page":"272","DOI":"10.1016\/j.tcs.2005.11.018","volume":"354","author":"Y Abdedda\u0131","year":"2006","unstructured":"Abdedda\u0131, Y., Asarin, E., Maler, O., et al.: Scheduling with timed automata. Theor. Comput. Sci. 354(2), 272\u2013300 (2006)","journal-title":"Theor. Comput. Sci."},{"key":"657_CR2","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1016\/0304-3975(94)90010-8","volume":"126","author":"R Alur","year":"1994","unstructured":"Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. 126, 183\u2013235 (1994)","journal-title":"Theor. Comput. Sci."},{"key":"657_CR3","doi-asserted-by":"crossref","unstructured":"Basile, D., ter Beek, M.H., Legay, A.: Strategy synthesis for autonomous driving in a moving block railway system with uppaal stratego. In: International Conference on Formal Techniques for Distributed Objects, Components, and Systems. Springer (2020)","DOI":"10.1007\/978-3-030-50086-3_1"},{"key":"657_CR4","doi-asserted-by":"crossref","unstructured":"Behjati, R., Sirjani, M., Ahmadabadi, M.N.: Bounded rational search for on-the-fly model checking of LTL properties. In: Symposium on the Foundations of Software Engineering. Springer (2009)","DOI":"10.1007\/978-3-642-11623-0_17"},{"key":"657_CR5","doi-asserted-by":"crossref","unstructured":"Behrmann, G., David, A., Fleury, E., Larsen, K., Lime, D., Nantes, E.: Uppaal-Tiga: Time for playing games! (tool paper). In: International Conference on Computer Aided Verification. Springer Berlin Heidelberg (2007)","DOI":"10.1007\/978-3-540-73368-3_14"},{"key":"657_CR6","doi-asserted-by":"crossref","unstructured":"Bengtsson, J., Yi, W.: Timed automata: Semantics, algorithms and tools. Lecture Notes in Computer Science (2004)","DOI":"10.1007\/978-3-540-27755-2_3"},{"key":"657_CR7","doi-asserted-by":"crossref","unstructured":"Bersani, M.M., Soldo, M., Menghi, C., Pelliccione, P., Rossi, M.: Pursue-from specification of robotic environments to synthesis of controllers. Formal Aspects of Computing (2020)","DOI":"10.1007\/s00165-020-00509-0"},{"key":"657_CR8","doi-asserted-by":"crossref","unstructured":"Biere, A., Cimatti, A., Clarke, E.M., Strichman, O., Zhu, Y.: Bounded Model Checking. Carnegie Mellon University (2003)","DOI":"10.1016\/S0065-2458(03)58003-2"},{"key":"657_CR9","unstructured":"B\u00f8nneland, F.M., Jensen, P.G., Larsen, K.G., Mu\u00f1iz, M., Srba, J.: Stubborn set reduction for two-player reachability games. Preprint arXiv:1912.09875 (2019)"},{"key":"657_CR10","doi-asserted-by":"crossref","unstructured":"Bouton, M., Cosgun, A., Kochenderfer, M.J.: Belief state planning for autonomously navigating urban intersections. In: Intelligent Vehicles Symposium. IEEE (2017)","DOI":"10.1109\/IVS.2017.7995818"},{"key":"657_CR11","unstructured":"Bouton, M., Karlsson, J., Nakhaei, A., Fujimura, K., Kochenderfer, M.J., Tumova, J.: Reinforcement learning with probabilistic guarantees for autonomous driving. Preprint arXiv:1904.07189 (2019)"},{"key":"657_CR12","doi-asserted-by":"crossref","unstructured":"Br\u00e1zdil, T., Chatterjee, K., Chmelik, M., Forejt, V., K\u0159et\u00ednsk\u1ef3, J., Kwiatkowska, M., Parker, D., Ujma, M.: Verification of markov decision processes using learning algorithms. In: International Symposium on Automated Technology for Verification and Analysis. Springer (2014)","DOI":"10.1007\/978-3-319-11936-6_8"},{"key":"657_CR13","doi-asserted-by":"crossref","unstructured":"Cassez, F., David, A., Fleury, E., Larsen, K.G., Lime, D.: Efficient on-the-fly algorithms for the analysis of timed games. In: International Conference on Concurrency Theory. Springer (2005)","DOI":"10.1007\/11539452_9"},{"key":"657_CR14","doi-asserted-by":"crossref","unstructured":"Clarke, E.M., Klieber, W., Nov\u00e1\u010dek, M., Zuliani, P.: Model checking and the state explosion problem. In: LASER Summer School. Springer (2011)","DOI":"10.1007\/978-3-642-35746-6_1"},{"key":"657_CR15","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1613\/jair.2994","volume":"39","author":"K Daniel","year":"2010","unstructured":"Daniel, K., Nash, A., Koenig, S., Felner, A.: Theta*: any-angle path planning on grids. J. Artif. Intell. Res. 39, 533\u201379 (2010)","journal-title":"J. Artif. Intell. Res."},{"key":"657_CR16","doi-asserted-by":"crossref","unstructured":"David, A., Du, D., Larsen, K.G., Legay, A., Miku\u010dionis, M., Poulsen, D.B., Sedwards, S.: Statistical model checking for stochastic hybrid systems. Preprint arXiv:1208.3856 (2012)","DOI":"10.4204\/EPTCS.92.9"},{"key":"657_CR17","doi-asserted-by":"crossref","unstructured":"David, A., Jensen, P.G., Larsen, K.G., Legay, A., Lime, D., S\u00f8rensen, M.G., Taankvist, J.H.: On time with minimal expected cost! In: International Symposium on Automated Technology for Verification and Analysis. Springer (2014)","DOI":"10.1007\/978-3-319-11936-6_10"},{"key":"657_CR18","doi-asserted-by":"crossref","unstructured":"David, A., Jensen, P.G., Larsen, K.G., Miku\u010dionis, M., Taankvist, J.H.: Uppaal Stratego. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer (2015)","DOI":"10.1007\/978-3-662-46681-0_16"},{"key":"657_CR19","unstructured":"Fisher, H.: Probabilistic learning combinations of local job-shop scheduling rules. In: Industrial Scheduling. Prentice Hall, Englewood Cliffs (1963)"},{"key":"657_CR20","doi-asserted-by":"crossref","unstructured":"Franklin, S., Graesser, A.: Is it an agent, or just a program?: A taxonomy for autonomous agents. In: International Workshop on Agent Theories, Architectures, and Languages. Springer (1996)","DOI":"10.1007\/BFb0013570"},{"key":"657_CR21","doi-asserted-by":"crossref","unstructured":"Gleirscher, M., Calinescu, R., Douthwaite, J., Lesage, B., Paterson, C., Aitken, J., Alexander, R., Law, J.: Verified synthesis of optimal safety controllers for human-robot collaboration. Preprint arXiv:2106.06604 (2021)","DOI":"10.1016\/j.scico.2022.102809"},{"key":"657_CR22","doi-asserted-by":"crossref","unstructured":"Gu, R., Enoiu, E.P., Seceleanu, C.: TAMAA: UPPAAL-based mission planning for autonomous agents. In: ACM\/SIGAPP Symposium On Applied Computing (2020)","DOI":"10.1145\/3341105.3374001"},{"key":"657_CR23","doi-asserted-by":"crossref","unstructured":"Gu, R., Enoiu, E.P., Seceleanu, C., Lundqvist, K.: Verifiable and scalable mission-plan synthesis for multiple autonomous agents. In: International Conference on Formal Methods for Industrial Critical Systems. Springer (2020)","DOI":"10.1007\/978-3-030-58298-2_2"},{"key":"657_CR24","doi-asserted-by":"crossref","unstructured":"Gu, R., Marinescu, R., Seceleanu, C., Lundqvist, K.: Towards a two-layer framework for verifying autonomous vehicles. In: NASA Formal Methods Symposium. Springer (2019)","DOI":"10.1007\/978-3-030-20652-9_12"},{"key":"657_CR25","doi-asserted-by":"crossref","unstructured":"Gu, R., Seceleanu, C., Enoiu, E.P., Lundqvist, K.: Model checking collision avoidance of nonlinear autonomous vehicle models. In: Formal Methods 2021 (2021)","DOI":"10.1007\/978-3-030-90870-6_37"},{"key":"657_CR26","doi-asserted-by":"crossref","unstructured":"Jaeger, M., Jensen, P.G., Larsen, K.G., Legay, A., Sedwards, S., Taankvist, J.H.: Teaching stratego to play ball: Optimal synthesis for continuous space MDPs. In: International Symposium on Automated Technology for Verification and Analysis. Springer (2019)","DOI":"10.1007\/978-3-030-31784-3_5"},{"key":"657_CR27","doi-asserted-by":"crossref","unstructured":"Kempf, J.F., Bozga, M., Maler, O.: As soon as probable: Optimal scheduling under stochastic uncertainty. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer (2013)","DOI":"10.1007\/978-3-642-36742-7_27"},{"key":"657_CR28","unstructured":"Konda, V.R., Tsitsiklis, J.N.: Actor-critic algorithms. In: Advances in neural information processing systems (2000)"},{"key":"657_CR29","unstructured":"LaValle, S.M.: Rapidly-exploring random trees: a new tool for path planning. In: Technical Report (1998)"},{"key":"657_CR30","doi-asserted-by":"crossref","unstructured":"Legay, A., Sedwards, S., Traonouez, L.M.: Scalable verification of markov decision processes. In: International Conference on Software Engineering and Formal Methods. Springer (2014)","DOI":"10.1007\/978-3-319-15201-1_23"},{"key":"657_CR31","doi-asserted-by":"crossref","unstructured":"Li, X., Serlin, Z., Yang, G., Belta, C.: A formal methods approach to interpretable reinforcement learning for robotic planning. Science Robotics (2019)","DOI":"10.1126\/scirobotics.aay6276"},{"key":"657_CR32","doi-asserted-by":"crossref","unstructured":"Nikou, A., Boskos, D., Tumova, J., Dimarogonas, D.V.: On the timed temporal logic planning of coupled multi-agent systems. Automatica (2018)","DOI":"10.1016\/j.automatica.2018.08.023"},{"key":"657_CR33","doi-asserted-by":"crossref","unstructured":"Palopoli, L., Argyros, A., Birchbauer, J., Colombo, A., Fontanelli, D., Legay, A., Garulli, A., Giannitrapani, A., Macii, D., Moro, F., et\u00a0al.: Navigation assistance and guidance of older adults across complex public spaces: the DALi approach. Intelligent Service Robotics (2015)","DOI":"10.1007\/s11370-015-0169-y"},{"key":"657_CR34","doi-asserted-by":"crossref","unstructured":"Pel\u00e1nek, R.: Fighting state space explosion: Review and evaluation. In: International Conference on Formal Methods for Industrial Critical Systems. Springer (2008)","DOI":"10.1007\/978-3-642-03240-0_7"},{"key":"657_CR35","unstructured":"Rabin, S.: Game Programming Gems, Chapter a* Aesthetic Optimizations. Charles River Media (2000)"},{"key":"657_CR36","unstructured":"Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT press (2018)"},{"key":"657_CR37","unstructured":"Wang, Y., Chaudhuri, S., Kavraki, L.E.: Bounded policy synthesis for POMDPs with safe-reachability objectives. In: International Conference on Autonomous Agents and Multi Agent Systems. Springer (2018)"},{"key":"657_CR38","unstructured":"Watkins, C.J.C.H.: Learning from Delayed Rewards. King\u2019s College, Cambridge United Kingdom (1989)"}],"container-title":["International Journal on Software Tools for Technology Transfer"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10009-022-00657-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10009-022-00657-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10009-022-00657-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,11]],"date-time":"2022-05-11T10:08:53Z","timestamp":1652263733000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10009-022-00657-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,30]]},"references-count":38,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,6]]}},"alternative-id":["657"],"URL":"https:\/\/doi.org\/10.1007\/s10009-022-00657-z","relation":{},"ISSN":["1433-2779","1433-2787"],"issn-type":[{"type":"print","value":"1433-2779"},{"type":"electronic","value":"1433-2787"}],"subject":[],"published":{"date-parts":[[2022,3,30]]},"assertion":[{"value":"8 March 2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 March 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}