{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,26]],"date-time":"2025-03-26T18:28:53Z","timestamp":1743013733560,"version":"3.40.3"},"publisher-location":"Cham","reference-count":29,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783031377082"},{"type":"electronic","value":"9783031377099"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,7,17]],"date-time":"2023-07-17T00:00:00Z","timestamp":1689552000000},"content-version":"vor","delay-in-days":197,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>This paper marries two state-of-the-art controller synthesis methods for partially observable Markov decision processes (POMDPs), a prominent model in sequential decision making under uncertainty. A central issue is to find a POMDP controller\u2014that solely decides based on the observations seen so far\u2014to achieve a total expected reward objective. As finding optimal controllers is undecidable, we concentrate on synthesising good finite-state controllers (FSCs). We do so by tightly integrating two modern, orthogonal methods for POMDP controller\u00a0synthesis: a belief-based and an inductive approach. The former method obtains an FSC from a finite fragment of the so-called belief MDP, an MDP that keeps track of the probabilities of equally observable POMDP states. The latter is an inductive search technique over a set of FSCs, e.g., controllers with a fixed memory size. 
The key result of this paper is a symbiotic anytime algorithm that tightly integrates both approaches such that each profits from the controllers constructed by the other. Experimental results indicate a substantial improvement in the value of the controllers while significantly reducing the synthesis time and memory\u00a0footprint.<\/jats:p>","DOI":"10.1007\/978-3-031-37709-9_6","type":"book-chapter","created":{"date-parts":[[2023,7,16]],"date-time":"2023-07-16T10:01:21Z","timestamp":1689501681000},"page":"113-135","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Search and\u00a0Explore: Symbiotic Policy Synthesis in\u00a0POMDPs"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1286-934X","authenticated-orcid":false,"given":"Roman","family":"Andriushchenko","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7026-228X","authenticated-orcid":false,"given":"Alexander","family":"Bork","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0300-9727","authenticated-orcid":false,"given":"Milan","family":"\u010ce\u0161ka","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0978-8466","authenticated-orcid":false,"given":"Sebastian","family":"Junges","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6143-1926","authenticated-orcid":false,"given":"Joost-Pieter","family":"Katoen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0004-4277-2751","authenticated-orcid":false,"given":"Filip","family":"Mac\u00e1k","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,7,17]]},"reference":[{"issue":"3","key":"6_CR1","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1007\/s10458-009-9103-z","volume":"21","author":"C Amato","year":"2010","unstructured":"Amato, C., Bernstein, D.S., 
Zilberstein, S.: Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs. Auton. Agent. Multi-Agent Syst. 21(3), 293\u2013320 (2010)","journal-title":"Auton. Agent. Multi-Agent Syst."},{"key":"6_CR2","doi-asserted-by":"crossref","unstructured":"Amato, C., Bonet, B., Zilberstein, S.: Finite-state controllers based on Mealy machines for centralized and decentralized POMDPs. In: AAAI, pp. 1052\u20131058. AAAI Press (2010)","DOI":"10.1609\/aaai.v24i1.7748"},{"key":"6_CR3","doi-asserted-by":"crossref","unstructured":"Andriushchenko, R., Bork, A., \u010ce\u0161ka, M., Junges, S., Katoen, J.P., Mac\u00e1k, F.: Search and explore: symbiotic policy synthesis in POMDPs. arXiv preprint arXiv:2305.14149 (2023)","DOI":"10.1007\/978-3-031-37709-9_6"},{"key":"6_CR4","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1007\/978-3-030-72016-2_11","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"R Andriushchenko","year":"2021","unstructured":"Andriushchenko, R., \u010ce\u0161ka, M., Junges, S., Katoen, J.-P.: Inductive synthesis for probabilistic programs reaches new horizons. In: TACAS 2021. LNCS, vol. 12651, pp. 191\u2013209. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-72016-2_11"},{"key":"6_CR5","unstructured":"Andriushchenko, R., \u010ce\u0161ka, M., Junges, S., Katoen, J.P.: Inductive synthesis of finite-state controllers for POMDPs. In: UAI, vol. 180, pp. 85\u201395. PMLR (2022)"},{"key":"6_CR6","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"856","DOI":"10.1007\/978-3-030-81685-8_40","volume-title":"Computer Aided Verification","author":"R Andriushchenko","year":"2021","unstructured":"Andriushchenko, R., \u010ce\u0161ka, M., Junges, S., Katoen, J.-P., Stupinsk\u00fd, \u0160.: PAYNT: a tool for inductive synthesis of probabilistic programs. In: Silva, A., Leino, K.R.M. (eds.) CAV 2021. 
LNCS, vol. 12759, pp. 856\u2013869. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-81685-8_40"},{"key":"6_CR7","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1007\/978-3-030-59152-6_16","volume-title":"Automated Technology for Verification and Analysis","author":"A Bork","year":"2020","unstructured":"Bork, A., Junges, S., Katoen, J.-P., Quatmann, T.: Verification of indefinite-horizon POMDPs. In: Hung, D.V., Sokolsky, O. (eds.) ATVA 2020. LNCS, vol. 12302, pp. 288\u2013304. Springer, Cham (2020). https:\/\/doi.org\/10.1007\/978-3-030-59152-6_16"},{"key":"6_CR8","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1007\/978-3-030-99527-0_2","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"A Bork","year":"2022","unstructured":"Bork, A., Katoen, J.-P., Quatmann, T.: Under-approximating expected total rewards in POMDPs. In: TACAS 2022. LNCS, vol. 13244, pp. 22\u201340. Springer, Cham (2022). https:\/\/doi.org\/10.1007\/978-3-030-99527-0_2"},{"key":"6_CR9","doi-asserted-by":"publisher","first-page":"819","DOI":"10.1613\/jair.1.12963","volume":"72","author":"S Carr","year":"2021","unstructured":"Carr, S., Jansen, N., Topcu, U.: Task-aware verifiable RNN-based policies for partially observable Markov decision processes. J. Artif. Intell. Res. 72, 819\u2013847 (2021)","journal-title":"J. Artif. Intell. Res."},{"key":"6_CR10","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1007\/978-3-030-17465-1_10","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"M \u010ce\u0161ka","year":"2019","unstructured":"\u010ce\u0161ka, M., Jansen, N., Junges, S., Katoen, J.-P.: Shepherding hordes of Markov chains. In: Vojnar, T., Zhang, L. (eds.) TACAS 2019. LNCS, vol. 11428, pp. 172\u2013190. Springer, Cham (2019). 
https:\/\/doi.org\/10.1007\/978-3-030-17465-1_10"},{"issue":"1","key":"6_CR11","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1007\/s00165-017-0432-4","volume":"30","author":"P Chrszon","year":"2018","unstructured":"Chrszon, P., Dubslaff, C., Kl\u00fcppelholz, S., Baier, C.: ProFeat: feature-oriented engineering for family-based probabilistic model checking. Formal Aspects Comput. 30(1), 45\u201375 (2018)","journal-title":"Formal Aspects Comput."},{"key":"6_CR12","doi-asserted-by":"crossref","unstructured":"Cubuktepe, M., Jansen, N., Junges, S., Marandi, A., Suilen, M., Topcu, U.: Robust finite-state controllers for uncertain POMDPs. In: AAAI, pp. 11792\u201311800. AAAI Press (2021)","DOI":"10.1609\/aaai.v35i13.17401"},{"key":"6_CR13","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"592","DOI":"10.1007\/978-3-319-63390-9_31","volume-title":"Computer Aided Verification","author":"C Dehnert","year":"2017","unstructured":"Dehnert, C., Junges, S., Katoen, J.-P., Volk, M.: A storm is coming: a modern probabilistic model checker. In: Majumdar, R., Kun\u010dak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 592\u2013600. Springer, Cham (2017). https:\/\/doi.org\/10.1007\/978-3-319-63390-9_31"},{"key":"6_CR14","unstructured":"Hansen, E.A.: Solving POMDPs by searching in policy space. In: UAI, pp. 211\u2013219. Morgan Kaufmann (1998)"},{"key":"6_CR15","doi-asserted-by":"publisher","unstructured":"Hartmanns, A., Junges, S., Quatmann, T., Weininger, M.: A practitioner\u2019s guide to MDP model checking algorithms. In: Sankaranarayanan, S., Sharygina, N. (eds.) Tools and Algorithms for the Construction and Analysis of Systems. TACAS 2023. Lecture Notes in Computer Science, vol. 13993, pp. 469\u2013488. Springer, Cham (2023). 
https:\/\/doi.org\/10.1007\/978-3-031-30823-9_24","DOI":"10.1007\/978-3-031-30823-9_24"},{"key":"6_CR16","unstructured":"Hauskrecht, M.: Incremental methods for computing bounds in partially observable Markov decision processes. In: AAAI\/IAAI, pp. 734\u2013739 (1997)"},{"key":"6_CR17","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1007\/978-3-030-94583-1_7","volume-title":"Verification, Model Checking, and Abstract Interpretation","author":"L Heck","year":"2022","unstructured":"Heck, L., Spel, J., Junges, S., Moerman, J., Katoen, J.-P.: Gradient-descent for randomized controllers under partial observability. In: Finkbeiner, B., Wies, T. (eds.) VMCAI 2022. LNCS, vol. 13182, pp. 127\u2013150. Springer, Cham (2022). https:\/\/doi.org\/10.1007\/978-3-030-94583-1_7"},{"key":"6_CR18","doi-asserted-by":"crossref","unstructured":"Horak, K., Bosansky, B., Chatterjee, K.: Goal-HSVI: heuristic search value iteration for Goal POMDPs. In: IJCAI, pp. 4764\u20134770. AAAI Press (2018)","DOI":"10.24963\/ijcai.2018\/662"},{"key":"6_CR19","unstructured":"Junges, S., et al.: Finite-state controllers of POMDPs via parameter synthesis. In: UAI, pp. 519\u2013529 (2018)"},{"key":"6_CR20","doi-asserted-by":"crossref","unstructured":"Kurniawati, H., Hsu, D., Lee, W.S.: SARSOP: efficient point-based POMDP planning by approximating optimally reachable belief spaces. In: Robotics: Science and Systems. MIT Press (2008)","DOI":"10.15607\/RSS.2008.IV.009"},{"key":"6_CR21","doi-asserted-by":"crossref","unstructured":"Kwiatkowska, M.Z., Norman, G., Parker, D.: Game-based abstraction for Markov decision processes. In: QEST, pp. 157\u2013166. 
IEEE Computer Society (2006)","DOI":"10.1109\/QEST.2006.19"},{"key":"6_CR22","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"585","DOI":"10.1007\/978-3-642-22110-1_47","volume-title":"Computer Aided Verification","author":"M Kwiatkowska","year":"2011","unstructured":"Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585\u2013591. Springer, Heidelberg (2011). https:\/\/doi.org\/10.1007\/978-3-642-22110-1_47"},{"issue":"1","key":"6_CR23","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1016\/S0004-3702(02)00378-8","volume":"147","author":"O Madani","year":"2003","unstructured":"Madani, O., Hanks, S., Condon, A.: On the undecidability of probabilistic planning and related stochastic optimization problems. Artif. Intell. 147(1), 5\u201334 (2003)","journal-title":"Artif. Intell."},{"key":"6_CR24","unstructured":"Meuleau, N., Kim, K., Kaelbling, L.P., Cassandra, A.R.: Solving POMDPs by searching the space of finite policies. In: UAI, pp. 417\u2013426. Morgan Kaufmann (1999)"},{"issue":"3","key":"6_CR25","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1007\/s11241-017-9269-4","volume":"53","author":"G Norman","year":"2017","unstructured":"Norman, G., Parker, D., Zou, X.: Verification and control of partially observable probabilistic systems. Real-Time Syst. 53(3), 354\u2013402 (2017). https:\/\/doi.org\/10.1007\/s11241-017-9269-4","journal-title":"Real-Time Syst."},{"key":"6_CR26","doi-asserted-by":"crossref","unstructured":"Puterman, M.L.: Markov decision processes: discrete stochastic dynamic programming. 
John Wiley & Sons (1994)","DOI":"10.1002\/9780470316887"},{"issue":"5","key":"6_CR27","doi-asserted-by":"publisher","first-page":"1071","DOI":"10.1287\/opre.21.5.1071","volume":"21","author":"RD Smallwood","year":"1973","unstructured":"Smallwood, R.D., Sondik, E.J.: The optimal control of partially observable Markov processes over a finite horizon. Oper. Res. 21(5), 1071\u20131088 (1973)","journal-title":"Oper. Res."},{"key":"6_CR28","unstructured":"Verma, A., Murali, V., Singh, R., Kohli, P., Chaudhuri, S.: Programmatically interpretable reinforcement learning. In: ICML, vol. 80, pp. 5052\u20135061. PMLR (2018)"},{"key":"6_CR29","unstructured":"Wang, Y., Chaudhuri, S., Kavraki, L.E.: Bounded policy synthesis for POMDPs with safe-reachability objectives. In: AAMAS, pp. 238\u2013246. International Foundation for Autonomous Agents and Multiagent Systems Richland, SC, USA\/ACM (2018)"}],"container-title":["Lecture Notes in Computer Science","Computer Aided Verification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-37709-9_6","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T06:56:37Z","timestamp":1729752997000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-37709-9_6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031377082","9783031377099"],"references-count":29,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-37709-9_6","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"17 July 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter 
History"}},{"value":"CAV","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Computer Aided Verification","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Paris","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"France","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2023","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"17 July 2023","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"22 July 2023","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"35","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"cav2023","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"http:\/\/www.i-cav.org\/2023\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Double-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"hotcrp","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference 
organizers)"}},{"value":"261","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"67","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"0","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"26% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"11","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Yes","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}