{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:24:54Z","timestamp":1760235894195,"version":"build-2065373602"},"reference-count":28,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2021,10,4]],"date-time":"2021-10-04T00:00:00Z","timestamp":1633305600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>In industrial applications, the processes of optimal sequential decision making are naturally formulated and optimized within a standard setting of Markov decision theory. In practice, however, decisions must be made under incomplete and uncertain information about parameters and transition probabilities. This situation occurs when a system may suffer a regime switch changing not only the transition probabilities but also the control costs. After such an event, the effect of the actions may turn to the opposite, meaning that all strategies must be revised. Due to the practical importance of this problem, a variety of methods have been suggested, ranging from incorporating regime switches into Markov dynamics to numerous concepts addressing model uncertainty. 
In this work, we suggest a pragmatic and practical approach: using a natural re-formulation of this problem as a so-called convex switching system, we make efficient numerical algorithms applicable.<\/jats:p>","DOI":"10.3390\/a14100291","type":"journal-article","created":{"date-parts":[[2021,10,4]],"date-time":"2021-10-04T09:24:15Z","timestamp":1633339455000},"page":"291","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["An Algorithm for Making Regime-Changing Markov Decisions"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4076-8818","authenticated-orcid":false,"given":"Juri","family":"Hinz","sequence":"first","affiliation":[{"name":"School of Mathematical and Physical Sciences, University of Technology Sydney, P.O. Box 123, Ultimo, NSW 2007, Australia"}]}],"member":"1968","published-online":{"date-parts":[[2021,10,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Puterman, M. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley.","DOI":"10.1002\/9780470316887"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1071","DOI":"10.1287\/opre.21.5.1071","article-title":"The Optimal Control of Partially Observable Markov Processes over a Finite Horizon","volume":"21","author":"Smallwood","year":"1973","journal-title":"Oper. Res."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1007\/s10479-018-2910-3","article-title":"Efficient algorithms of pathwise dynamic programming for decision optimization in mining operations","volume":"286","author":"Hinz","year":"2020","journal-title":"Ann. Oper. Res."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Tsiropoulou, E.E., Kastrinogiannis, T., and Symeon, P. (2009). Uplink Power Control in QoS-aware Multi-Service CDMA Wireless Networks. J. 
Commun., 4.","DOI":"10.4304\/jcm.4.9.654-668"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3018105:1","DOI":"10.1155\/2018\/3018105","article-title":"Machine Learning for Communication Performance Enhancement","volume":"2018","author":"Huang","year":"2018","journal-title":"Wirel. Commun. Mob. Comput."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"B\u00e4uerle, N., and Rieder, U. (2011). Markov Decision Processes with Applications to Finance, Springer.","DOI":"10.1007\/978-3-642-18324-9"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Powell, W.B. (2007). Approximate Dynamic Programming: Solving the Curses of Dimensionality, Wiley.","DOI":"10.1002\/9780470182963"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"3562","DOI":"10.1137\/090752651","article-title":"Regression methods for stochastic control problems and their convergence analysis","volume":"48","author":"Belomestny","year":"2010","journal-title":"SIAM J. Control Optim."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/S0167-6687(96)00004-2","article-title":"Valuation of the Early-Exercise Price for Options Using Simulations and Nonparametric Regression","volume":"19","author":"Carriere","year":"1996","journal-title":"Insur. Math. Econ."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1214\/105051607000000249","article-title":"A dynamic look-ahead Monte Carlo algorithm","volume":"17","author":"Egloff","year":"2007","journal-title":"Ann. Appl. Probab."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"694","DOI":"10.1109\/72.935083","article-title":"Regression Methods for Pricing Complex American-Style Options","volume":"12","author":"Tsitsiklis","year":"2001","journal-title":"IEEE Trans. 
Neural Netw."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1840","DOI":"10.1109\/9.793723","article-title":"Optimal Stopping of Markov Processes: Hilbert Space, Theory, Approximation Algorithms, and an Application to Pricing High-Dimensional Financial Derivatives","volume":"44","author":"Tsitsiklis","year":"1999","journal-title":"IEEE Trans. Automat. Contr."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1093\/rfs\/14.1.113","article-title":"Valuing American options by simulation: A simple least-squares approach","volume":"14","author":"Longstaff","year":"2001","journal-title":"Rev. Financ. Stud."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1023\/A:1017928328829","article-title":"Kernel-Based Reinforcement Learning","volume":"49","author":"Ormoneit","year":"2002","journal-title":"Mach. Learn."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1624","DOI":"10.1109\/TAC.2002.803530","article-title":"Kernel-Based Reinforcement Learning in Average-Cost Problems","volume":"47","author":"Ormoneit","year":"2002","journal-title":"IEEE Trans. Automat. Contr."},{"key":"ref_16","unstructured":"Fan, J., and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications, Chapman and Hall."},{"key":"ref_17","unstructured":"Bertsekas, D.P., and Tsitsiklis, J.N. (1996). Neuro-Dynamic Programming, Athena Scientific."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1016\/S0004-3702(98)00023-X","article-title":"Planning and acting in partially observable stochastic domains","volume":"101","author":"Kaelbling","year":"1998","journal-title":"Artific. Intell."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1287\/opre.39.1.162","article-title":"Computationally Feasible Bounds for Partially Observed Markov Decision Processes","volume":"39","author":"Lovejoy","year":"1991","journal-title":"Operat. 
Res."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10458-012-9200-2","article-title":"A survey of point-based POMDP solvers","volume":"27","author":"Shani","year":"2013","journal-title":"Autonom. Agents Multi-Agent Syst."},{"key":"ref_21","unstructured":"Pineau, J., Gordon, G., and Thrun, S. (2003, January 9\u201315). Point-based value iteration: An anytime algorithm for POMDPs. Proceedings of the 18th International IJCAI\u201903: Joint Conference on Artificial Intelligence, Acapulco, Mexico."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1137\/13091333X","article-title":"Optimal stochastic switching under convexity assumptions","volume":"52","author":"Hinz","year":"2014","journal-title":"SIAM J. Contr. Optim."},{"key":"ref_23","first-page":"770","article-title":"Algorithms for optimal control of stochastic switching systems","volume":"60","author":"Hinz","year":"2015","journal-title":"Theor. Prob. Appl."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Hinz, J., and Yee, J. (2017). Optimal forward trading and battery control under renewable electricity generation. J. Bank. Financ.","DOI":"10.1016\/j.jbankfin.2017.06.006"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1080\/00207179.2016.1186841","article-title":"Stochastic switching for partially observable dynamics and optimal asset allocation","volume":"90","author":"Hinz","year":"2017","journal-title":"Int. J. Contr."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hinz, J., and Yee, J. (2018). Rcss: R package for optimal convex stochastic switching. R J.","DOI":"10.32614\/RJ-2018-054"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Hinz, J., and Yee, J. (2016, January 15\u201318). Algorithmic Solutions for Optimal Switching Problems. 
Proceedings of the 2016 Second International Symposium on Stochastic Models in Reliability Engineering, Life Science and Operations Management (SMRLO), Beer Sheva, Israel.","DOI":"10.1109\/SMRLO.2016.102"},{"key":"ref_28","first-page":"466","article-title":"Generalization of Faustmann\u2019s Formula for Stochastic Forest Growth and Prices with Markov Decision Process Models","volume":"47","author":"Buongiorno","year":"2001","journal-title":"For. Sci."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/10\/291\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:09:18Z","timestamp":1760166558000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/10\/291"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,4]]},"references-count":28,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2021,10]]}},"alternative-id":["a14100291"],"URL":"https:\/\/doi.org\/10.3390\/a14100291","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2021,10,4]]}}}