{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T07:01:56Z","timestamp":1762326116357,"version":"build-2065373602"},"reference-count":18,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2025,5,13]],"date-time":"2025-05-13T00:00:00Z","timestamp":1747094400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Foundation for Science and Technology (FCT, Portugal) in the framework","award":["2021.07608.BD","DOI 10.54499\/LA\/P\/0112\/2020","UIDB\/00147\/2020","UIDP\/00147\/2020"],"award-info":[{"award-number":["2021.07608.BD","DOI 10.54499\/LA\/P\/0112\/2020","UIDB\/00147\/2020","UIDP\/00147\/2020"]}]},{"name":"FCT in the framework of ARISE","award":["2021.07608.BD","DOI 10.54499\/LA\/P\/0112\/2020","UIDB\/00147\/2020","UIDP\/00147\/2020"],"award-info":[{"award-number":["2021.07608.BD","DOI 10.54499\/LA\/P\/0112\/2020","UIDB\/00147\/2020","UIDP\/00147\/2020"]}]},{"name":"R&D Unit SYSTEC","award":["2021.07608.BD","DOI 10.54499\/LA\/P\/0112\/2020","UIDB\/00147\/2020","UIDP\/00147\/2020"],"award-info":[{"award-number":["2021.07608.BD","DOI 10.54499\/LA\/P\/0112\/2020","UIDB\/00147\/2020","UIDP\/00147\/2020"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>We investigate the convergence properties of policy iteration and value iteration algorithms in reinforcement learning by leveraging fixed-point theory, with a focus on mappings that exhibit weak contractive behavior. Unlike traditional studies that rely on strong contraction properties, such as those defined by the Banach contraction principle, we consider a more general class of mappings that includes weak contractions. 
Employing Zamfirescu\u2019s fixed-point theorem, we establish sufficient conditions for norm convergence in infinite-dimensional policy spaces under broad assumptions. Our approach extends the applicability of these algorithms to feedback control problems in reinforcement learning, where standard contraction conditions may not hold. Through illustrative examples, we demonstrate that this framework encompasses a wider range of operators, offering new insights into the robustness and flexibility of iterative methods in dynamic programming.<\/jats:p>","DOI":"10.3390\/sym17050750","type":"journal-article","created":{"date-parts":[[2025,5,13]],"date-time":"2025-05-13T11:31:49Z","timestamp":1747135909000},"page":"750","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Convergence Analysis of Reinforcement Learning Algorithms Using Generalized Weak Contraction Mappings"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4272-6232","authenticated-orcid":false,"given":"Abdelkader","family":"Belhenniche","sequence":"first","affiliation":[{"name":"Research Center for Systems and Technologies SYSTEC-ARISE, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias s\/n, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5179-4344","authenticated-orcid":false,"given":"Roman","family":"Chertovskih","sequence":"additional","affiliation":[{"name":"Research Center for Systems and Technologies SYSTEC-ARISE, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias s\/n, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0249-816X","authenticated-orcid":false,"given":"Rui","family":"Gon\u00e7alves","sequence":"additional","affiliation":[{"name":"Research Center for Systems and Technologies SYSTEC-ARISE, Faculty of Engineering, University of Porto, Rua Dr. 
Roberto Frias s\/n, 4200-465 Porto, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2025,5,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1090\/S0002-9904-1954-09848-8","article-title":"The theory of dynamic programming","volume":"60","author":"Bellman","year":"1954","journal-title":"Bull. Am. Math. Soc."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1023\/A:1022111402527","article-title":"Second order necessary conditions for optimal impulsive control problems","volume":"9","author":"Arutyunov","year":"2003","journal-title":"J. Dyn. Control Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1109\/TAC.2011.2167822","article-title":"Hamilton-Jacobi-Bellman equation and feedback synthesis for impulsive control","volume":"57","author":"Fraga","year":"2011","journal-title":"IEEE Trans. Autom. Control"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Chertovskih, R., Ribeiro, V.M., Gon\u00e7alves, R., and Aguiar, A.P. (2024). Sixty Years of the Maximum Principle in Optimal Control: Historical Roots and Content Classification. Symmetry, 16.","DOI":"10.3390\/sym16101398"},{"key":"ref_5","unstructured":"Bertsekas, D., and Tsitsiklis, J.N. (1996). Neuro-Dynamic Programming, Athena Scientific."},{"key":"ref_6","unstructured":"Bertsekas, D.P., and Ioffe, S. (1996). Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming, MIT. Laboratory for Information and Decision Systems Report LIDS-P-2349."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"709","DOI":"10.1007\/s10589-018-9990-5","article-title":"Proximal algorithms and temporal difference methods for solving fixed point problems","volume":"70","author":"Bertsekas","year":"2018","journal-title":"Comput. Optim. Appl."},{"key":"ref_8","unstructured":"Li, Y., Johansson, K.H., and M\u00e5rtensson, J. (2020, January 10\u201311). 
Lambda-policy iteration with randomization for contractive models with infinite policies: Well-posedness and convergence. Proceedings of the Learning for Dynamics and Control, Online."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"511","DOI":"10.24193\/fpt-ro.2021.2.34","article-title":"Extension of \u03bb-PIR for weakly contractive operators via fixed point theory","volume":"22","author":"Belhenniche","year":"2021","journal-title":"Fixed Point Theory"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"133","DOI":"10.4064\/fm-3-1-133-181","article-title":"On operations in abstract sets and their application to integral equations","volume":"3","author":"Banach","year":"1922","journal-title":"Fundam. Math."},{"key":"ref_11","first-page":"405","article-title":"Some results on fixed points\u2014II","volume":"76","author":"Kannan","year":"1969","journal-title":"Am. Math. Mon."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"531","DOI":"10.15388\/NA.2016.4.7","article-title":"F-Contractions of Hardy\u2013Rogers Type and Application to Multistage Decision","volume":"21","author":"Vetro","year":"2016","journal-title":"Nonlinear Anal. Model. Control"},{"key":"ref_13","first-page":"15","article-title":"Fixed point theorems for a sequence of mappings with contractive iterates","volume":"14","author":"Chatterjea","year":"1972","journal-title":"Publ. l\u2019Institut Math\u00e9matique"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1007\/BF01304884","article-title":"Fix point theorems in metric spaces","volume":"23","author":"Zamfirescu","year":"1972","journal-title":"Arch. Math."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1137\/24M1631602","article-title":"Policy Iteration for the Deterministic Control Problems: A Viscosity Approach","volume":"63","author":"Tang","year":"2025","journal-title":"SIAM J. 
Control Optim."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1007\/s10589-021-00278-3","article-title":"Policy iteration for Hamilton\u2013Jacobi\u2013Bellman equations with control constraints","volume":"87","author":"Kundu","year":"2024","journal-title":"Comput. Optim. Appl."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1007\/BF01472580","article-title":"Completeness and fixed-points","volume":"80","author":"Subrahmanyam","year":"1975","journal-title":"Monatshefte Math."},{"key":"ref_18","unstructured":"Bertsekas, D. (2022). Abstract Dynamic Programming, Athena Scientific."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/5\/750\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:32:02Z","timestamp":1760031122000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/5\/750"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,13]]},"references-count":18,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2025,5]]}},"alternative-id":["sym17050750"],"URL":"https:\/\/doi.org\/10.3390\/sym17050750","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2025,5,13]]}}}