{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T03:24:24Z","timestamp":1768965864123,"version":"3.49.0"},"reference-count":34,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T00:00:00Z","timestamp":1750464000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Portfolio optimisation is a crucial decision-making task. Traditionally static, this problem is more realistically addressed as dynamic, reflecting frequent trading within financial markets. The dynamic nature of the portfolio optimisation problem makes it susceptible to rapid market changes or financial contagions, which may cause drifts in historical data. While reinforcement learning (RL) offers a framework that allows for the formulation of portfolio optimisation as a dynamic problem, existing RL approaches lack the ability to adapt to rapid market changes, such as pandemics, and fail to capture the resulting concept drift. This study introduces a recurrent proximal policy optimisation (PPO) algorithm, leveraging recurrent neural networks (RNNs), specifically the long short-term memory network (LSTM) for pattern recognition. Initial results conclusively demonstrate the recurrent PPO\u2019s efficacy in generating quality portfolios. However, its performance declined during the COVID-19 pandemic, highlighting susceptibility to rapid market changes. To address this, an incremental recurrent PPO is developed, leveraging incremental learning to adapt to concept drift triggered by the pandemic. This enhanced algorithm not only learns from ongoing market data but also consistently identifies optimal portfolios despite significant market volatility, offering a robust tool for real-time financial decision-making.<\/jats:p>","DOI":"10.3390\/computers14070242","type":"journal-article","created":{"date-parts":[[2025,6,23]],"date-time":"2025-06-23T04:50:05Z","timestamp":1750654205000},"page":"242","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Incremental Reinforcement Learning for Portfolio Optimisation"],"prefix":"10.3390","volume":"14","author":[{"given":"Refiloe","family":"Shabe","sequence":"first","affiliation":[{"name":"Department of Industrial Engineering, Stellenbosch University, Stellenbosch 7600, South Africa"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0242-3539","authenticated-orcid":false,"given":"Andries","family":"Engelbrecht","sequence":"additional","affiliation":[{"name":"Department of Industrial Engineering, Stellenbosch University, Stellenbosch 7600, South Africa"},{"name":"GUST Engineering and Applied Innovation Research Center, Gulf University for Science and Technology, Mubarak Al-Abdullah 32093, Kuwait"},{"name":"Computer Science Division, Stellenbosch University, Stellenbosch 7600, South Africa"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5443-6801","authenticated-orcid":false,"given":"Kian","family":"Anderson","sequence":"additional","affiliation":[{"name":"Computer Science Division, Stellenbosch University, Stellenbosch 7600, South Africa"}]}],"member":"1968","published-online":{"date-parts":[[2025,6,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Jiang, Z., and Liang, J. (2017, January 21\u201323). Cryptocurrency portfolio management with deep reinforcement learning. Proceedings of the IEEE Intelligent Systems Conference, Glasgow, UK.","DOI":"10.1109\/IntelliSys.2017.8324237"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1002\/(SICI)1099-131X(1998090)17:5\/6<441::AID-FOR707>3.0.CO;2-#","article-title":"Performance functions and reinforcement learning for trading systems and portfolios","volume":"17","author":"Moody","year":"1998","journal-title":"J. Forecast."},{"key":"ref_3","unstructured":"Moody, J., and Saffell, M. (1998, January 27\u201331). Reinforcement Learning for Trading Systems and Portfolios. Proceedings of the Knowledge Discovery and Data Mining, Montreal, QC, Canada."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1109\/72.935097","article-title":"Learning to trade via direct reinforcement","volume":"12","author":"Moody","year":"2001","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_5","unstructured":"Agarwal, A., Hazan, E., Kale, S., and Schapire, R.E. (2006, January 25\u201329). Algorithms for portfolio management based on the newton method. Proceedings of the International Conference on Machine Learning, Pittsburgh, PA, USA."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Engelbrecht, A.P. (2007). Computational Intelligence: An Introduction, John Wiley & Sons.","DOI":"10.1002\/9780470512517"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Gabrielsson, P., and Johansson, U. (2015, January 8\u201310). High-frequency equity index futures trading using recurrent reinforcement learning with candlesticks. Proceedings of the IEEE Symposium Series on Computational Intelligence, Singapore.","DOI":"10.1109\/SSCI.2015.111"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhang, J., and Maringer, D. (2013, January 6\u201310). Indicator selection for daily equity trading with recurrent reinforcement learning. Proceedings of the Genetic and Evolutionary Computation Conference Companion, Amsterdam, The Netherlands.","DOI":"10.1145\/2464576.2480773"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, J., and Maringer, D. (2014, January 6\u201311). Two parameter update schemes for recurrent reinforcement learning. Proceedings of the IEEE Congress on Evolutionary Computation, Beijing, China.","DOI":"10.1109\/CEC.2014.6900330"},{"key":"ref_11","unstructured":"Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., and Riedmiller, M. (2014, January 21\u201326). Deterministic policy gradient algorithms. Proceedings of the International Conference on Machine Learning, Beijing, China."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Hegde, S., Kumar, V., and Singh, A. (2018, January 6\u20138). Risk aware portfolio construction using deep deterministic policy gradients. Proceedings of the IEEE Symposium Series on Computational Intelligence, Rio de Janeiro, Brazil.","DOI":"10.1109\/SSCI.2018.8628791"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"27","DOI":"10.3905\/jpm.1991.409343","article-title":"Downside risk","volume":"17","author":"Sortino","year":"1991","journal-title":"J. Portf. Manag."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1016\/j.jfds.2020.06.002","article-title":"Deep deterministic portfolio optimization","volume":"6","author":"Chaouki","year":"2020","journal-title":"J. Financ. Data Sci."},{"key":"ref_15","first-page":"21","article-title":"Deep reinforcement learning for optimal portfolio allocation: A comparative study with mean-variance optimization","volume":"2023","author":"Sood","year":"2023","journal-title":"Plan. Sched. Financ. Serv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Yue, H., Liu, J., Tian, D., and Zhang, Q. (2022). A Novel Anti-Risk Method for Portfolio Trading Using Deep Reinforcement Learning. Electronics, 11.","DOI":"10.3390\/electronics11091506"},{"key":"ref_17","first-page":"77","article-title":"Portfolio Selection","volume":"7","author":"Markowitz","year":"1952","journal-title":"J. Financ."},{"key":"ref_18","first-page":"425","article-title":"Capital asset prices: A theory of market equilibrium under conditions of risk","volume":"19","author":"Sharpe","year":"1964","journal-title":"J. Financ."},{"key":"ref_19","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_20","first-page":"679","article-title":"A Markovian decision process","volume":"6","author":"Bellman","year":"1957","journal-title":"J. Math. Mech."},{"key":"ref_21","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1007\/s40745-020-00253-5","article-title":"A Comprehensive Survey of Loss Functions in Machine Learning","volume":"9","author":"Wang","year":"2022","journal-title":"Ann. Data Sci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1550","DOI":"10.1109\/5.58337","article-title":"Backpropagation Through Time: What It Does and How to Do It","volume":"78","author":"Werbos","year":"1990","journal-title":"Proc. IEEE"},{"key":"ref_24","unstructured":"Pascanu, R., Mikolov, T., and Bengio, Y. (2013, January 16\u201321). On the difficulty of training recurrent neural networks. Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"5937","DOI":"10.1080\/00207543.2021.1975057","article-title":"Multi-resource constrained dynamic workshop scheduling based on proximal policy optimisation","volume":"60","author":"Luo","year":"2022","journal-title":"Int. J. Prod. Res."},{"key":"ref_27","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv."},{"key":"ref_28","unstructured":"Huang, S., Dossa, R.F.J., Raffin, A., Kanervisto, A., and Wang, W. (2022, January 25\u201329). The 37 Implementation Details of Proximal Policy Optimization. Proceedings of the International Conference on Learning Representations Blog Track, Virtually."},{"key":"ref_29","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_30","unstructured":"Cobbe, K., Klimov, O., Hesse, C., Kim, T., and Schulman, J. (2019, January 9\u201315). Quantifying generalization in reinforcement learning. Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3066","DOI":"10.21105\/joss.03066","article-title":"PyPortfolioOpt: Portfolio optimization in Python","volume":"6","author":"Martin","year":"2021","journal-title":"J. Open Source Softw."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Todorov, E., Erez, T., and Tassa, Y. (2012, January 7\u201312). MuJoCo: A physics engine for model-based control. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems, Lisbon, Portugal.","DOI":"10.1109\/IROS.2012.6386109"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1609","DOI":"10.1007\/s00521-019-04212-x","article-title":"Stock price prediction based on deep neural networks","volume":"32","author":"Yu","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_34","first-page":"281","article-title":"Random search for hyper-parameter optimization","volume":"13","author":"Bergstra","year":"2012","journal-title":"J. Mach. Learn. Res."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/7\/242\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:56:26Z","timestamp":1760032586000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/14\/7\/242"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,21]]},"references-count":34,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["computers14070242"],"URL":"https:\/\/doi.org\/10.3390\/computers14070242","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,21]]}}}