{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T16:29:32Z","timestamp":1759336172661,"version":"3.40.3"},"publisher-location":"Cham","reference-count":37,"publisher":"Springer International Publishing","isbn-type":[{"type":"print","value":"9783031131875"},{"type":"electronic","value":"9783031131882"}],"license":[{"start":{"date-parts":[[2022,1,1]],"date-time":"2022-01-01T00:00:00Z","timestamp":1640995200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,8,6]],"date-time":"2022-08-06T00:00:00Z","timestamp":1659744000000},"content-version":"vor","delay-in-days":217,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Markov decision processes (MDP) and continuous-time MDP (CTMDP) are the fundamental models for non-deterministic systems with probabilistic uncertainty. Mean payoff (a.k.a. long-run average reward) is one of the most classic objectives considered in their context. We provide the first algorithm to compute mean payoff probably approximately correctly in unknown MDP; further, we extend it to unknown CTMDP. We do not require any knowledge of the state space, only a lower bound on the minimum transition probability, which has been advocated in literature. In addition to providing probably approximately correct (PAC) bounds for our algorithm, we also demonstrate its practical nature by running experiments on standard benchmarks.<\/jats:p>","DOI":"10.1007\/978-3-031-13188-2_1","type":"book-chapter","created":{"date-parts":[[2022,8,5]],"date-time":"2022-08-05T08:16:57Z","timestamp":1659687417000},"page":"3-25","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["PAC Statistical Model Checking of\u00a0Mean Payoff in\u00a0Discrete- and Continuous-Time MDP"],"prefix":"10.1007","author":[{"given":"Chaitanya","family":"Agarwal","sequence":"first","affiliation":[]},{"given":"Shibashis","family":"Guha","sequence":"additional","affiliation":[]},{"given":"Jan","family":"K\u0159et\u00ednsk\u00fd","sequence":"additional","affiliation":[]},{"given":"Pazhamalai","family":"Muruganandham","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,8,6]]},"reference":[{"key":"1_CR1","unstructured":"Agarwal, C., Guha, S., Pazhamalai, M., K\u0159et\u00ednsk\u00fd, J.: Pac statistical model checking of mean payoff in discrete- and continuous-time mdp (2022). CoRR, abs\/2206.01465"},{"key":"1_CR2","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1007\/978-3-319-63387-9_10","volume-title":"Computer Aided Verification","author":"P Ashok","year":"2017","unstructured":"Ashok, P., Chatterjee, K., Daca, P., K\u0159et\u00ednsk\u00fd, J., Meggendorfer, T.: Value iteration for long-run average reward in markov decision processes. In: Majumdar, R., Kun\u010dak, V. (eds.) CAV 2017. LNCS, vol. 10426, pp. 201\u2013221. Springer, Cham (2017). https:\/\/doi.org\/10.1007\/978-3-319-63387-9_10"},{"key":"1_CR3","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"497","DOI":"10.1007\/978-3-030-25540-4_29","volume-title":"Computer Aided Verification","author":"P Ashok","year":"2019","unstructured":"Ashok, P., K\u0159et\u00ednsk\u00fd, J., Weininger, M.: PAC statistical model checking for markov decision processes and stochastic games. In: Dillig, I., Tasiran, S. (eds.) CAV 2019. LNCS, vol. 11561, pp. 497\u2013519. Springer, Cham (2019). https:\/\/doi.org\/10.1007\/978-3-030-25540-4_29"},{"key":"1_CR4","doi-asserted-by":"crossref","unstructured":"Auer, P., Ortner, R.: Logarithmic online regret bounds for undiscounted reinforcement learning. In: NIPS, pp. 49\u201356. MIT Press (2006)","DOI":"10.7551\/mitpress\/7503.003.0011"},{"issue":"9","key":"1_CR5","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1145\/1810891.1810912","volume":"53","author":"C Baier","year":"2010","unstructured":"Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.-P.: Performance evaluation and model checking join forces. Commun. ACM 53(9), 76\u201385 (2010)","journal-title":"Commun. ACM"},{"key":"1_CR6","unstructured":"Baier, C., Katoen, J-P.: Principles of Model Checking. MIT Press (2008)"},{"key":"1_CR7","unstructured":"Bertsekas, D.P.: Dynamic Programming and Optimal Control, vol. II. Athena Scientific (1995)"},{"issue":"1","key":"1_CR8","first-page":"1","volume":"10","author":"T Br\u00e1zdil","year":"2014","unstructured":"Br\u00e1zdil, T., Bro\u017eek, V., Chatterjee, K., Forejt, V., Ku\u010dera, A.: Two views on multiple mean-payoff objectives in Markov decision processes. LMCS 10(1), 1\u201329 (2014)","journal-title":"LMCS"},{"key":"1_CR9","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1007\/978-3-319-11936-6_8","volume-title":"Automated Technology for Verification and Analysis","author":"T Br\u00e1zdil","year":"2014","unstructured":"Br\u00e1zdil, T., et al.: Verification of markov decision processes using learning algorithms. In: Cassez, F., Raskin, J.-F. (eds.) ATVA 2014. LNCS, vol. 8837, pp. 98\u2013114. Springer, Cham (2014). https:\/\/doi.org\/10.1007\/978-3-319-11936-6_8"},{"issue":"1","key":"1_CR10","doi-asserted-by":"publisher","first-page":"100","DOI":"10.1145\/322234.322242","volume":"28","author":"JL Bruno","year":"1981","unstructured":"Bruno, J.L., Downey, P.J., Frederickson, G.N.: Sequencing tasks with exponential service times to minimize the expected flow time or makespan. J. ACM 28(1), 100\u2013113 (1981)","journal-title":"J. ACM"},{"key":"1_CR11","unstructured":"Butkova, Y.: Towards efficient analysis of Markov automata. PhD thesis, Saarland University, Saarbr\u00fccken, Germany (2020)"},{"key":"1_CR12","doi-asserted-by":"crossref","unstructured":"Chatterjee, K.: Robustness of structurally equivalent concurrent parity games. In: FOSSACS, pp. 270\u2013285 (2012)","DOI":"10.1007\/978-3-642-28729-9_18"},{"key":"1_CR13","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1007\/978-3-662-49674-9_7","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"P Daca","year":"2016","unstructured":"Daca, P., Henzinger, T.A., K\u0159et\u00ednsk\u00fd, J., Petrov, T.: Faster statistical model checking for unbounded temporal properties. In: Chechik, M., Raskin, J.-F. (eds.) TACAS 2016. LNCS, vol. 9636, pp. 112\u2013129. Springer, Heidelberg (2016). https:\/\/doi.org\/10.1007\/978-3-662-49674-9_7"},{"key":"1_CR14","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"592","DOI":"10.1007\/978-3-319-63390-9_31","volume-title":"Computer Aided Verification","author":"C Dehnert","year":"2017","unstructured":"Dehnert, C., Junges, S., Katoen, J.-P., Volk, M.: A storm is coming: a modern probabilistic model checker. In: Majumdar, R., Kun\u010dak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 592\u2013600. Springer, Cham (2017). https:\/\/doi.org\/10.1007\/978-3-319-63390-9_31"},{"key":"1_CR15","doi-asserted-by":"publisher","unstructured":"Dembo, A., Zeitouni, O.: Large deviations techniques and applications. Springer, Cham (2010). https:\/\/doi.org\/10.1007\/978-3-642-03311-7","DOI":"10.1007\/978-3-642-03311-7"},{"issue":"3","key":"1_CR16","doi-asserted-by":"publisher","first-page":"492","DOI":"10.1287\/moor.1040.0089","volume":"29","author":"EA Feinberg","year":"2004","unstructured":"Feinberg, E.A.: Continuous time discounted jump markov decision processes: a discrete-event approach. Math. Oper. Res. 29(3), 492\u2013524 (2004)","journal-title":"Math. Oper. Res."},{"key":"1_CR17","doi-asserted-by":"publisher","unstructured":"Feinberg, E.A., Shwartz, A.: Handbook of Markov decision processes: methods and applications, volume 40. Springer Science & Business Media, New York (2012). https:\/\/doi.org\/10.1007\/978-1-4615-0805-2","DOI":"10.1007\/978-1-4615-0805-2"},{"key":"1_CR18","doi-asserted-by":"crossref","unstructured":"Fu, J., Topcu, U.: Probably approximately correct MDP learning and control with temporal logic constraints. Science and Systems, In Robotics (2014)","DOI":"10.15607\/RSS.2014.X.039"},{"key":"1_CR19","doi-asserted-by":"crossref","unstructured":"Ghemawat, S., Gobioff, H., Leung, S.: The google file system. In: SOSP (2003)","DOI":"10.1145\/945445.945450"},{"key":"1_CR20","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1007\/978-3-030-17462-0_27","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"EM Hahn","year":"2019","unstructured":"Hahn, E.M., Perez, M., Schewe, S., Somenzi, F., Trivedi, A., Wojtczak, D.: Omega-regular objectives in model-free reinforcement learning. In: Vojnar, T., Zhang, L. (eds.) TACAS 2019. LNCS, vol. 11427, pp. 395\u2013412. Springer, Cham (2019). https:\/\/doi.org\/10.1007\/978-3-030-17462-0_27"},{"key":"1_CR21","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"344","DOI":"10.1007\/978-3-030-17462-0_20","volume-title":"Tools and Algorithms for the Construction and Analysis of Systems","author":"A Hartmanns","year":"2019","unstructured":"Hartmanns, A., Klauck, M., Parker, D., Quatmann, T., Ruijters, E.: The quantitative verification benchmark set. In: Vojnar, T., Zhang, L. (eds.) TACAS 2019. LNCS, vol. 11427, pp. 344\u2013350. Springer, Cham (2019). https:\/\/doi.org\/10.1007\/978-3-030-17462-0_20"},{"key":"1_CR22","unstructured":"Haverkort, B.R., Hermanns, H., Katoen, J-P.: On the use of model checking techniques for dependability evaluation. In: SRDS 2000 (2000)"},{"key":"1_CR23","doi-asserted-by":"crossref","unstructured":"Henriques, D., Martins, J.G., Zuliani, P., Platzer, A., Clarke, E.M.: Statistical model checking for markov decision processes. In: QEST, pp. 84\u201393. IEEE Computer Society (2012)","DOI":"10.1109\/QEST.2012.19"},{"key":"1_CR24","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"623","DOI":"10.1007\/978-3-319-96145-3_36","volume-title":"Computer Aided Verification","author":"E Kelmendi","year":"2018","unstructured":"Kelmendi, E., Kr\u00e4mer, J., K\u0159et\u00ednsk\u00fd, J., Weininger, M.: Value iteration for simple stochastic games: stopping criterion and learning algorithm. In: Chockler, H., Weissenbacher, G. (eds.) CAV 2018. LNCS, vol. 10981, pp. 623\u2013642. Springer, Cham (2018). https:\/\/doi.org\/10.1007\/978-3-319-96145-3_36"},{"key":"1_CR25","unstructured":"J. Kret\u00ednsk\u00fd, Michel, F., Michel, L., P\u00e9rez, G.A.: Finite-memory near-optimal learning for markov decision processes with long-run average reward. In: UAI of Proceedings of Machine Learning Research, vol. 124, pp. 1149\u20131158. AUAI Press (2020)"},{"key":"1_CR26","unstructured":"K\u0159et\u00ednsk\u00fd, J., P\u00e9rez, G.A., Raskin, J.-F.: Learning-based mean-payoff optimization in an unknown MDP under omega-regular constraints. In: CONCUR, Dagstuhl, pp. 8:1\u20138:18 (2018)"},{"key":"1_CR27","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"200","DOI":"10.1007\/3-540-46029-2_13","volume-title":"Computer Performance Evaluation: Modelling Techniques and Tools","author":"M Kwiatkowska","year":"2002","unstructured":"Kwiatkowska, M., Norman, G., Parker, D.: PRISM: probabilistic symbolic model checker. In: Field, T., Harrison, P.G., Bradley, J., Harder, U. (eds.) TOOLS 2002. LNCS, vol. 2324, pp. 200\u2013204. Springer, Heidelberg (2002). https:\/\/doi.org\/10.1007\/3-540-46029-2_13"},{"key":"1_CR28","doi-asserted-by":"crossref","unstructured":"Kwiatkowska, M.Z., Norman, G., Parker, D.: The PRISM benchmark suite. In: QEST, pp. 203\u2013204. IEEE Computer Society (2012)","DOI":"10.1109\/QEST.2012.14"},{"key":"1_CR29","doi-asserted-by":"crossref","unstructured":"Lassaigne, R., Peyronnet, S.: Approximate planning and verification for large Markov decision processes. In: SAC, pp. 1314\u20131319. ACM (2012)","DOI":"10.1145\/2245276.2231984"},{"key":"1_CR30","doi-asserted-by":"crossref","unstructured":"Puterman, M.L.: Markov decision processes: Discrete stochastic dynamic programming. John Wiley and Sons (1994)","DOI":"10.1002\/9780470316887"},{"key":"1_CR31","doi-asserted-by":"crossref","unstructured":"Qiu, Q., Qu, Q., Pedram, M.: Stochastic modeling of a power-managed system-construction and optimization. IEEE Trans. CAD Integrated Circuits Syst. 20(10), 1200\u20131217 (2001)","DOI":"10.1109\/43.952737"},{"key":"1_CR32","volume-title":"Stochastic Dynamic Programming and the Control of Queueing Systems","author":"LI Sennott","year":"1999","unstructured":"Sennott, L.I.: Stochastic Dynamic Programming and the Control of Queueing Systems. Wiley-Interscience, New York (1999)"},{"key":"1_CR33","doi-asserted-by":"publisher","first-page":"831","DOI":"10.1023\/B:JOTP.0000011995.28536.ef","volume":"16","author":"E Solan","year":"2003","unstructured":"Solan, E.: Continuity of the value of competitive markov decision processes. J. Theor. Probab. 16, 831\u2013845 (2003)","journal-title":"J. Theor. Probab."},{"key":"1_CR34","doi-asserted-by":"crossref","unstructured":"Strehl, A.L., Li, L., Wiewiora, E., Langford, J., Littman, M.L.: PAC model-free reinforcement learning. In: ICML, pp. 881\u2013888. ACM (2006)","DOI":"10.1145\/1143844.1143955"},{"key":"1_CR35","doi-asserted-by":"crossref","unstructured":"Sutton, R.S., Barto, A.G.: Reinforcement learning - an introduction. Adaptive computation and machine learning. MIT Press (1998)","DOI":"10.1109\/TNN.1998.712192"},{"key":"1_CR36","first-page":"1563","volume":"11","author":"R Ortner","year":"2010","unstructured":"Ortner, R., Jaksch, T., Auer, P.: Near-optimal regret bounds for reinforcement learning. J. Mach. Learn. Res. 11, 1563\u20131600 (2010)","journal-title":"J. Mach. Learn. Res."},{"key":"1_CR37","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1007\/3-540-45657-0_17","volume-title":"Computer Aided Verification","author":"HLS Younes","year":"2002","unstructured":"Younes, H.L.S., Simmons, R.G.: Probabilistic verification of discrete event systems using acceptance sampling. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp. 223\u2013235. Springer, Heidelberg (2002). https:\/\/doi.org\/10.1007\/3-540-45657-0_17"}],"container-title":["Lecture Notes in Computer Science","Computer Aided Verification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-13188-2_1","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,25]],"date-time":"2023-11-25T09:53:47Z","timestamp":1700906027000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-13188-2_1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"ISBN":["9783031131875","9783031131882"],"references-count":37,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-13188-2_1","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2022]]},"assertion":[{"value":"6 August 2022","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"CAV","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Computer Aided Verification","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Haifa","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Israel","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2022","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"7 August 2022","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"10 August 2022","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"34","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"cav2022","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"http:\/\/i-cav.org\/2022\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Double-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"EasyChair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"209","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"40","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"11","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"19% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3.9","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"9.7","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Yes","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}