{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T00:42:02Z","timestamp":1760229722952,"version":"build-2065373602"},"reference-count":35,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2022,6,27]],"date-time":"2022-06-27T00:00:00Z","timestamp":1656288000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Fundamental Research Funds for the Central Universities, JLU","award":["93K172020K26"],"award-info":[{"award-number":["93K172020K26"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>In this paper, we present an iteration algorithm for the pricing of American options based on reinforcement learning. At each iteration, the method approximates the expected discounted payoff of stopping times and produces those closer to optimal. In the convergence analysis, a finite sample bound of the algorithm is derived. 
The algorithm is evaluated on a multi-dimensional Black-Scholes model and a symmetric stochastic volatility model; the numerical results indicate that our algorithm is accurate and efficient for pricing high-dimensional American options.<\/jats:p>","DOI":"10.3390\/sym14071324","type":"journal-article","created":{"date-parts":[[2022,6,28]],"date-time":"2022-06-28T00:07:02Z","timestamp":1656374822000},"page":"1324","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["An Iteration Algorithm for American Options Pricing Based on Reinforcement Learning"],"prefix":"10.3390","volume":"14","author":[{"given":"Nan","family":"Li","sequence":"first","affiliation":[{"name":"School of Mathematics, Jilin University, Changchun 130012, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,6,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"012165","DOI":"10.1088\/1742-6596\/1661\/1\/012165","article-title":"Forming a taxi service order price using neural networks with multi-parameter training","volume":"1661","author":"Andriyanov","year":"2020","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Ullrich, T. (2021). On the Autoregressive Time Series Model Using Real and Complex Analysis. Forecasting, 3.","DOI":"10.3390\/forecast3040044"},{"key":"ref_3","unstructured":"Hull, J.C. (2018). Options, Futures, and Other Derivatives, Pearson Education."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Achdou, Y., and Pironneau, O. (2005). Computational Methods for Option Pricing, Society for Industrial and Applied Mathematics.","DOI":"10.1137\/1.9780898717495"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1016\/0304-405X(79)90015-1","article-title":"Option pricing: A simplified approach","volume":"7","author":"Cox","year":"1979","journal-title":"J. Financ. 
Econ."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1007\/s10614-020-10005-5","article-title":"Exploring option pricing and hedging via volatility asymmetry","volume":"57","author":"Casas","year":"2021","journal-title":"Comput. Econ."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1093\/rfs\/14.1.113","article-title":"Valuing American Options by Simulation: A Simple Least-Squares Approach","volume":"14","author":"Longstaff","year":"2001","journal-title":"Rev. Financ. Stud."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"694","DOI":"10.1109\/72.935083","article-title":"Regression methods for pricing complex American-style options","volume":"12","author":"Tsitsiklis","year":"2001","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1111\/j.1467-9965.2010.00404.x","article-title":"Pricing of High-Dimensional American Options by Neural Networks","volume":"20","author":"Kohler","year":"2010","journal-title":"Math. Financ."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1080\/14697688.2019.1701698","article-title":"Machine learning for pricing American options in high-dimensional Markovian and non-Markovian models","volume":"20","author":"Goudenege","year":"2020","journal-title":"Quant. Financ."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"851","DOI":"10.1080\/14697688.2020.1713393","article-title":"Pricing high-dimensional American options by kernel ridge regression","volume":"20","author":"Hu","year":"2020","journal-title":"Quant. Financ."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1111\/1467-9965.02010","article-title":"Monte Carlo valuation of American options","volume":"12","author":"Rogers","year":"2002","journal-title":"Math. 
Financ."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1287\/opre.1030.0070","article-title":"Pricing American options: A duality approach","volume":"52","author":"Haugh","year":"2004","journal-title":"Oper. Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"5","DOI":"10.21314\/JCF.1999.041","article-title":"A simple approach to the pricing of Bermudan swaptions in the multifactor LIBOR market model","volume":"3","author":"Andersen","year":"2000","journal-title":"J. Comput. Financ."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1214\/10-AAP692","article-title":"On the rates of convergence of simulation-based optimization algorithms for optimal stopping problems","volume":"21","author":"Belomestny","year":"2011","journal-title":"Ann. Appl. Probab."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1201","DOI":"10.1137\/20M1373876","article-title":"Randomized optimal stopping algorithms and their convergence analysis","volume":"12","author":"Bayer","year":"2021","journal-title":"SIAM J. Financ. Math."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"470","DOI":"10.1017\/S0956792521000073","article-title":"Solving high-dimensional optimal stopping problems using deep learning","volume":"32","author":"Becker","year":"2021","journal-title":"Eur. J. Appl. Math."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1855","DOI":"10.1016\/S0165-1889(02)00086-6","article-title":"Convergence and biases of Monte Carlo estimates of American option prices using a parametric exercise rule","volume":"27","year":"2003","journal-title":"J. Econ. Dyn. Control"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1749","DOI":"10.1080\/14697688.2020.1750678","article-title":"Pricing American options by exercise rate optimization","volume":"20","author":"Bayer","year":"2020","journal-title":"Quant. Financ."},{"key":"ref_20","unstructured":"Sutton, R.S., and Barto, A.G. (2018). 
Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_21","first-page":"6123","article-title":"Reinforcement learning with general value function approximation: Provably efficient approach via bounded eluder dimension","volume":"33","author":"Wang","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_22","first-page":"17816","article-title":"On reward-free reinforcement learning with linear function approximation","volume":"33","author":"Wang","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1007\/s10994-007-5038-2","article-title":"Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path","volume":"71","author":"Antos","year":"2008","journal-title":"Mach. Learn."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Yu, H., and Bertsekas, D.P. (2007, January 2\u20135). Q-learning algorithms for optimal stopping based on least squares. Proceedings of the European Control Conference, Kos, Greece.","DOI":"10.23919\/ECC.2007.7068523"},{"key":"ref_25","unstructured":"Li, Y., Szepesvari, C., and Schuurmans, D. (2009, January 16\u201318). Learning policies for American options. Proceedings of the Conference on Artificial Intelligence and Statistics, Clearwater Beach, FL, USA."},{"key":"ref_26","first-page":"74","article-title":"Deep optimal stopping","volume":"20","author":"Becker","year":"2019","journal-title":"J. Mach. Learn. Res."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Chen, S., Devraj, A.M., Bu\u0161i\u0107, A., and Meyn, S. (2020, January 1\u20133). Zap Q-Learning for optimal stopping. Proceedings of the American Control Conference, Denver, CO, USA.","DOI":"10.23919\/ACC45564.2020.9147481"},{"key":"ref_28","unstructured":"Herrera, C., Krach, F., Ruyssen, P., and Teichmann, J. (2021). Optimal Stopping via Randomized Neural Networks. 
arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Glasserman, P. (2003). Monte Carlo Methods in Financial Engineering, Springer Science & Business Media.","DOI":"10.1007\/978-0-387-21617-1"},{"key":"ref_30","unstructured":"Kohler, M., and Langer, S. (2019). On the rate of convergence of fully connected very deep neural network regression estimates. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1007\/s00780-013-0204-9","article-title":"Quantitative error estimates for a least-squares Monte Carlo algorithm for American option pricing","volume":"17","author":"Zanger","year":"2013","journal-title":"Financ. Stoch."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1563","DOI":"10.1007\/s00332-018-9525-3","article-title":"Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations","volume":"29","author":"Beck","year":"2019","journal-title":"J. Nonlinear Sci."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1093\/rfs\/6.2.327","article-title":"A closed-form solution for options with stochastic volatility with applications to bond and currency options","volume":"6","author":"Heston","year":"1993","journal-title":"Rev. Financ. Stud."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"923","DOI":"10.1287\/moor.2019.1017","article-title":"General error estimates for the Longstaff\u2013Schwartz least-squares Monte Carlo algorithm","volume":"45","author":"Zanger","year":"2020","journal-title":"Math. Oper. Res."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2261","DOI":"10.1214\/18-AOS1747","article-title":"On deep learning as a remedy for the curse of dimensionality in nonparametric regression","volume":"47","author":"Bauer","year":"2019","journal-title":"Ann. 
Stat."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/7\/1324\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:38:52Z","timestamp":1760139532000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/7\/1324"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,27]]},"references-count":35,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["sym14071324"],"URL":"https:\/\/doi.org\/10.3390\/sym14071324","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2022,6,27]]}}}