{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T17:03:39Z","timestamp":1776877419060,"version":"3.51.2"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2020,5,18]],"date-time":"2020-05-18T00:00:00Z","timestamp":1589760000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,5,18]],"date-time":"2020-05-18T00:00:00Z","timestamp":1589760000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000769","name":"University of Oxford","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000769","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Found Comput Math"],"published-print":{"date-parts":[[2021,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this work, we propose a class of numerical schemes for solving semilinear Hamilton\u2013Jacobi\u2013Bellman\u2013Isaacs (HJBI) boundary value problems which arise naturally from exit time problems of diffusion processes with controlled drift. We exploit policy iteration to reduce the semilinear problem into a sequence of linear Dirichlet problems, which are subsequently approximated by a multilayer feedforward neural network ansatz. We establish that the numerical solutions converge globally in the <jats:inline-formula><jats:alternatives><jats:tex-math>$$H^2$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msup>\n                    <mml:mi>H<\/mml:mi>\n                    <mml:mn>2<\/mml:mn>\n                  <\/mml:msup>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>-norm and further demonstrate that this convergence is superlinear, by interpreting the algorithm as an inexact Newton iteration for the HJBI equation. Moreover, we construct the optimal feedback controls from the numerical value functions and deduce convergence. The numerical schemes and convergence results are then extended to oblique derivative boundary conditions. Numerical experiments on the stochastic Zermelo navigation problem are presented to illustrate the theoretical results and to demonstrate the effectiveness of the method.<\/jats:p>","DOI":"10.1007\/s10208-020-09460-1","type":"journal-article","created":{"date-parts":[[2020,5,18]],"date-time":"2020-05-18T21:02:50Z","timestamp":1589835770000},"page":"331-374","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":27,"title":["A Neural Network-Based Policy Iteration Algorithm with Global $$H^2$$-Superlinear Convergence for Stochastic Games on Domains"],"prefix":"10.1007","volume":"21","author":[{"given":"Kazufumi","family":"Ito","sequence":"first","affiliation":[]},{"given":"Christoph","family":"Reisinger","sequence":"additional","affiliation":[]},{"given":"Yufei","family":"Zhang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,5,18]]},"reference":[{"key":"9460_CR1","volume-title":"Infinite Dimensional Analysis: A Hitchhiker\u2019s Guide","author":"CD Aliprantis","year":"2006","unstructured":"C. D. Aliprantis and K. C. Border, Infinite Dimensional Analysis: A Hitchhiker\u2019s Guide, 3rd ed., Springer-Verlag, Berlin, 2006.","edition":"3"},{"key":"9460_CR2","doi-asserted-by":"publisher","first-page":"A181","DOI":"10.1137\/130932284","volume":"37","author":"A Alla","year":"2015","unstructured":"A. Alla, M. Falcone, and D. Kalise, An efficient policy iteration algorithm for dynamic programming equations, SIAM J. Sci. Comput., 37 (2015), pp. A181\u2013A200,","journal-title":"SIAM J. Sci. Comput."},{"key":"9460_CR3","volume-title":"Set-Valued Analysis","author":"J-P Aubin","year":"1990","unstructured":"J.-P. Aubin and H. Frankowska, Set-Valued Analysis, Birkh\u00e4user, Basel, 1990."},{"key":"9460_CR4","doi-asserted-by":"publisher","first-page":"2159","DOI":"10.1016\/S0005-1098(97)00128-3","volume":"33","author":"RW Beard","year":"1997","unstructured":"R. W. Beard, G. N. Saridis, and J. T. Wen, Galerkin approximation of the Generalized Hamilton\u2013Jacobi\u2013Bellman equation, Automatica, (33) 1997, pp.\u00a02159\u20132177.","journal-title":"Automatica"},{"key":"9460_CR5","doi-asserted-by":"publisher","first-page":"717","DOI":"10.1080\/002071798221542","volume":"71","author":"RW Beard","year":"1998","unstructured":"R. W. Beard and T. W. Mclain, Successive Galerkin approximation algorithms for nonlinear optimal and robust control, Internat. J. Control, (71) 1998, pp.\u00a0717\u2013743.","journal-title":"Internat. J. Control"},{"key":"9460_CR6","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1016\/j.neucom.2018.06.056","volume":"317","author":"J Berg","year":"2018","unstructured":"J. Berg and K. Nystr\u00f6m, A unified deep artificial neural network approach to partial differential equations in complex geometries, Neurocomputing, (317) 2018, pp. 28\u201341.","journal-title":"Neurocomputing"},{"key":"9460_CR7","volume-title":"Least-Squares Finite Element Methods","author":"PB Bochev","year":"2009","unstructured":"P. B. Bochev and M. D. Gunzburger, Least-Squares Finite Element Methods, Springer, New York, 2009."},{"key":"9460_CR8","doi-asserted-by":"publisher","first-page":"3001","DOI":"10.1137\/08073041X","volume":"47","author":"O Bokanowski","year":"2009","unstructured":"O. Bokanowski, S. Maroso, and H. Zidani, Some convergence results for Howard\u2019s algorithm, SIAM J. Numer. Anal., 47 (2009), pp. 3001\u20133026.","journal-title":"SIAM J. Numer. Anal."},{"key":"9460_CR9","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-4338-8","volume-title":"The Mathematical Theory of Finite Element Methods","author":"SC Brenner","year":"1994","unstructured":"S. C. Brenner and L. R. Scott, The Mathematical Theory of Finite Element Methods, Springer-Verlag, New York, 1994."},{"key":"9460_CR10","doi-asserted-by":"publisher","first-page":"602","DOI":"10.1137\/140998160","volume":"54","author":"R Buckdahn","year":"2016","unstructured":"R. Buckdahn and T. Y. Nie, Generalized Hamilton\u2013Jacobi\u2013Bellman equations with Dirichlet boundary condition and stochastic exit time optimal control problem, SIAM J. Control Optim., 54 (2016), pp. 602\u2013631.","journal-title":"SIAM J. Control Optim."},{"key":"9460_CR11","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1109\/TNN.2004.824413","volume":"15","author":"C Cervellera","year":"2004","unstructured":"C. Cervellera and M. Muselli, Deterministic design for neural network learning: An approach based on discrepancy, IEEE Trans. Neural Networks, 15 (2004), pp. 533\u2013544.","journal-title":"IEEE Trans. Neural Networks"},{"key":"9460_CR12","doi-asserted-by":"publisher","first-page":"1200","DOI":"10.1137\/S0036142999356719","volume":"38","author":"X Chen","year":"2000","unstructured":"X. Chen, Z. Nashed, and L. Qi, Smoothing methods and semismooth methods for nondifferentiable operator equations, SIAM J. Numer. Anal., 38 (2000), pp. 1200\u20131216.","journal-title":"SIAM J. Numer. Anal."},{"key":"9460_CR13","doi-asserted-by":"publisher","first-page":"614","DOI":"10.1137\/16M1072863","volume":"56","author":"KC Cheung","year":"2018","unstructured":"K.C. Cheung, L. Ling, R. Schaback, $$H^2$$-convergence of least-squares kernel collocation methods, SIAM J. Numer. Anal. 56 (2018) 614\u2013633.","journal-title":"SIAM J. Numer. Anal."},{"key":"9460_CR14","doi-asserted-by":"publisher","first-page":"821","DOI":"10.1137\/110833567","volume":"22","author":"AL Dontchev","year":"2012","unstructured":"A. L. Dontchev, Generalizations of the Dennis\u2013Mor\u00e9 theorem, SIAM J. Optim., 22 (2012), pp. 821\u2013830.","journal-title":"SIAM J. Optim."},{"key":"9460_CR15","doi-asserted-by":"crossref","unstructured":"W. E, J. Han, and A. Jentzen, Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Commun. Math. Stat., 5 (2017), pp. 349\u2013380.","DOI":"10.1007\/s40304-017-0117-6"},{"key":"9460_CR16","doi-asserted-by":"crossref","unstructured":"W. E and B. Yu, The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems, Commun. Math. Stat., 6 (2018), pp. 1\u201312.","DOI":"10.1007\/s40304-018-0127-z"},{"key":"9460_CR17","doi-asserted-by":"crossref","unstructured":"H. Faure and C. Lemieux, Generalized Halton sequence in 2008: a comparative study, ACM Trans. Model. Comput. Simul., 19 (2009).","DOI":"10.1145\/1596519.1596520"},{"key":"9460_CR18","doi-asserted-by":"crossref","unstructured":"P. Forsyth and G. Labahn, Numerical methods for controlled Hamilton\u2013Jacobi\u2013Bellman PDEs in finance, J. Computational Finance, 11 (2007\/2008, Winter), pp. 1\u201343.","DOI":"10.21314\/JCF.2007.163"},{"key":"9460_CR19","doi-asserted-by":"publisher","DOI":"10.1201\/9781420035797","volume-title":"Second order elliptic integro-differential problems","author":"MG Garroni","year":"2002","unstructured":"M. G. Garroni and J. L. Menaldi, Second order elliptic integro-differential problems, Chapman & Hall\/CRC, Boca Raton, FL, 2002."},{"key":"9460_CR20","volume-title":"Elliptic Partial Differential Equations of Second Order","author":"D Gilbarg","year":"1983","unstructured":"D. Gilbarg and N. Trudinger, Elliptic Partial Differential Equations of Second Order, 2nd edition, Springer-Verlag, Berlin, New York, 1983.","edition":"2"},{"key":"9460_CR21","volume-title":"Elliptic problems in nonsmooth domains","author":"E Grisvard","year":"1985","unstructured":"E. Grisvard, Elliptic problems in nonsmooth domains, Pitman, Boston, MA, 1985."},{"key":"9460_CR22","unstructured":"J. Han and W. E, Deep learning approximation for stochastic control problems, preprint, arXiv:1611.07422, 2016."},{"key":"9460_CR23","unstructured":"J. Han and J. Long, Convergence of the deep BSDE method for coupled FBSDEs, preprint, arXiv:1811.01165v1, 2018."},{"key":"9460_CR24","doi-asserted-by":"publisher","first-page":"865","DOI":"10.1137\/S1052623401383558","volume":"13","author":"M Hinterm\u00fcller","year":"2002","unstructured":"M. Hinterm\u00fcller, K. Ito, and K. Kunisch, The primal-dual active set strategy as a semismooth Newton method, SIAM J. Optim., 13 (2002), pp. 865\u2013888.","journal-title":"SIAM J. Optim."},{"key":"9460_CR25","doi-asserted-by":"publisher","first-page":"551","DOI":"10.1016\/0893-6080(90)90005-6","volume":"3","author":"K Hornik","year":"1990","unstructured":"K. Hornik, M. Stinchcombe, and H. White, Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks, Neural Networks, 3 (1990), pp. 551\u2013560.","journal-title":"Neural Networks"},{"key":"9460_CR26","unstructured":"C. Hur\u00e9, H. Pham, A. Bachouch, and N. Langren\u00e9, Deep neural networks algorithms for stochastic control problems on finite horizon, part I: convergence analysis, preprint, arXiv:1812.04300, 2018."},{"key":"9460_CR27","doi-asserted-by":"crossref","unstructured":"K. Ito and K. Kunisch, Semismooth Newton methods for variational inequalities of the first kind, M2AN Math. Model. Numer. Anal., 37 (2003), pp. 41\u201362.","DOI":"10.1051\/m2an:2003021"},{"key":"9460_CR28","doi-asserted-by":"publisher","first-page":"A629","DOI":"10.1137\/17M1116635","volume":"40","author":"D Kalise","year":"2018","unstructured":"D. Kalise and K. Kunisch, Polynomial approximation of high-dimensional Hamilton\u2013Jacobi\u2013Bellman equations and applications to feedback control of semilinear parabolic PDEs, SIAM J. Sci. Comput., 40 (2018), A629\u2013A652.","journal-title":"SIAM J. Sci. Comput."},{"key":"9460_CR29","doi-asserted-by":"crossref","unstructured":"D. Kalise, S. Kundu, and K. Kunisch, Robust feedback control of nonlinear PDEs by numerical approximation of high-dimensional Hamilton\u2013Jacobi\u2013Isaacs equations, preprint, arXiv:1905.06276, 2019.","DOI":"10.1002\/pamm.201900333"},{"key":"9460_CR30","unstructured":"B. Kerimkulov, D. \u0160i\u0161ka, and \u0141. Szpruch, Exponential convergence and stability of Howard\u2019s policy improvement algorithm for controlled diffusions, preprint, arXiv:1812.07846, 2018."},{"key":"9460_CR31","unstructured":"D. P. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, CoRR preprint, arxiv:1412.6980, 2014."},{"key":"9460_CR32","doi-asserted-by":"publisher","first-page":"751","DOI":"10.1007\/s00440-013-0495-y","volume":"158","author":"NV Krylov","year":"2014","unstructured":"N. V. Krylov, On the dynamic programming principle for uniformly nondegenerate stochastic differential games in domains and the Isaacs equations, Probab. Theory Related Fields, 158 (2014), pp. 751\u2013783.","journal-title":"Probab. Theory Related Fields"},{"key":"9460_CR33","doi-asserted-by":"publisher","DOI":"10.1090\/surv\/233","volume-title":"Sobolev and Viscosity Solutions for Fully Nonlinear Elliptic and Parabolic Equations, Mathematical Surveys and Monographs, 233","author":"NV Krylov","year":"2018","unstructured":"N.V. Krylov, Sobolev and Viscosity Solutions for Fully Nonlinear Elliptic and Parabolic Equations, Mathematical Surveys and Monographs, 233, Amer. Math. Soc., Providence, RI, 2018."},{"key":"9460_CR34","volume-title":"Numerical Methods for Stochastic Control Problems in Continuous Time","author":"H Kushner","year":"1991","unstructured":"H. Kushner and P. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, Springer-Verlag, New York, 1991."},{"key":"9460_CR35","doi-asserted-by":"publisher","first-page":"987","DOI":"10.1109\/72.712178","volume":"9","author":"E Lagaris","year":"1998","unstructured":"E. Lagaris, A. Likas, and D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Trans. Neural Netw., 9 (1998) pp. 987\u20131000.","journal-title":"IEEE Trans. Neural Netw."},{"key":"9460_CR36","doi-asserted-by":"publisher","first-page":"511","DOI":"10.1002\/cpa.3160370408","volume":"37","author":"P-L Lions","year":"1984","unstructured":"P.-L. Lions and A.-S. Sznitman, Stochastic differential equations with reflecting boundary conditions, Comm. Pure Appl. Math., 37 (1984) pp. 511\u2013 553.","journal-title":"Comm. Pure Appl. Math."},{"key":"9460_CR37","doi-asserted-by":"crossref","unstructured":"P. Mohajerin Esfahani, D. Chatterjee, and J. Lygeros, The stochastic reach-avoid problem and set characterization for diffusions, Automatica, 70 (2016), pp. 43\u201356.","DOI":"10.1016\/j.automatica.2016.03.016"},{"key":"9460_CR38","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1287\/moor.4.1.60","volume":"4","author":"ML Puterman","year":"1979","unstructured":"M.L. Puterman and S.L. Brumelle, On the convergence of policy iteration in stationary dynamic programming, Math. Oper. Res., 4 (1979), pp. 60\u201369.","journal-title":"Math. Oper. Res."},{"key":"9460_CR39","unstructured":"C. Reisinger and Y. Zhang, A penalty scheme and policy iteration for nonlocal HJB variational inequalities with monotone drivers, preprint, arXiv:1805.06255, 2018."},{"key":"9460_CR40","doi-asserted-by":"crossref","unstructured":"C. Reisinger and Y. Zhang, Error estimates of penalty schemes for quasi-variational inequalities arising from impulse control problems, preprint, arXiv:1901.07841, 2019.","DOI":"10.1137\/19M124040X"},{"key":"9460_CR41","doi-asserted-by":"publisher","first-page":"1358","DOI":"10.1016\/j.spa.2006.02.009","volume":"116","author":"M Royer","year":"2006","unstructured":"M. Royer, Backward stochastic differential equations with jumps and related non-linear expectations, Stochastic Process. Appl., 116 (2006), pp. 1358\u20131376.","journal-title":"Stochastic Process. Appl."},{"key":"9460_CR42","doi-asserted-by":"publisher","first-page":"2094","DOI":"10.1137\/S0363012902399824","volume":"42","author":"MS Santos","year":"2004","unstructured":"M.S. Santos and J. Rust, Convergence properties of policy iteration, SIAM J. Control Optim., 42 (2004), pp. 2094\u20132115.","journal-title":"SIAM J. Control Optim."},{"key":"9460_CR43","unstructured":"O. Shamir and T. Zhang, Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes, in Proceedings of the International Conference on Machine Learning, 2013."},{"key":"9460_CR44","doi-asserted-by":"publisher","first-page":"1339","DOI":"10.1016\/j.jcp.2018.08.029","volume":"375","author":"J Sirignano","year":"2018","unstructured":"J.\u00a0Sirignano and K. Spiliopoulos, DGM: A deep learning algorithm for solving partial differential equations, J. Comput. Phys., 375 (2018), pp. 1339\u20131364.","journal-title":"J. Comput. Phys."},{"key":"9460_CR45","doi-asserted-by":"publisher","first-page":"993","DOI":"10.1137\/130909536","volume":"52","author":"I Smears","year":"2014","unstructured":"I.\u00a0Smears and E.\u00a0S\u00fcli, Discontinuous Galerkin finite element approximation of Hamilton-Jacobi-Bellman equations with Cordes coefficients, SIAM J. Numer. Anal., 52 (2014), pp. 993\u20131016,","journal-title":"SIAM J. Numer. Anal."},{"key":"9460_CR46","doi-asserted-by":"crossref","unstructured":"M. Ulbrich, Semismooth Newton Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces, MOS-SIAM Ser. Optim. 11, SIAM, Philadelphia, 2011.","DOI":"10.1137\/1.9781611970692"},{"key":"9460_CR47","doi-asserted-by":"publisher","first-page":"595","DOI":"10.1137\/110835840","volume":"50","author":"JH Witte","year":"2012","unstructured":"J. H. Witte and C. Reisinger, Penalty methods for the solution of discrete HJB equations: Continuous control and obstacle problems, SIAM J. Numer. Anal., 50 (2012), pp. 595\u2013625.","journal-title":"SIAM J. Numer. Anal."}],"container-title":["Foundations of Computational Mathematics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-020-09460-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10208-020-09460-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-020-09460-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,17]],"date-time":"2021-05-17T23:04:50Z","timestamp":1621292690000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10208-020-09460-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,18]]},"references-count":47,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,4]]}},"alternative-id":["9460"],"URL":"https:\/\/doi.org\/10.1007\/s10208-020-09460-1","relation":{},"ISSN":["1615-3375","1615-3383"],"issn-type":[{"value":"1615-3375","type":"print"},{"value":"1615-3383","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,18]]},"assertion":[{"value":"14 June 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 January 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 March 2020","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 May 2020","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}