{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T06:00:40Z","timestamp":1769925640189,"version":"3.49.0"},"reference-count":35,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T00:00:00Z","timestamp":1769731200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:sec>\n                    <jats:title>Introduction<\/jats:title>\n                    <jats:p>In the face of high uncertainty and complexity in financial markets, achieving portfolio return maximization while effectively controlling risk remains a critical challenge.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>We propose a novel portfolio management framework based on the value distribution maximum entropy actor-critic (VD-MEAC) reinforcement learning algorithm. We establish a framework where the agent\u2019s actions represent portfolio weight adjustments and stock factors serve as state observations. For risk management, the critic network learns the complete distribution of future returns. For return enhancement, we incorporate entropy regularization.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We conduct extensive experiments using real market data from the Chinese stock market. 
Results demonstrate that our VD-MEAC strategy achieves an average return of 2.490 and an average Sharpe ratio of 2.978, significantly outperforming benchmark strategies.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion<\/jats:title>\n                    <jats:p>These results validate the effectiveness of our approach in practical portfolio management scenarios.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.3389\/frai.2025.1709493","type":"journal-article","created":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T15:00:26Z","timestamp":1769785226000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Portfolio management based on value distribution reinforcement learning algorithm"],"prefix":"10.3389","volume":"8","author":[{"given":"Yan","family":"Yang","sequence":"first","affiliation":[{"name":"Strategic Development Department Southern Power Grid Capital Holding Co., Ltd.","place":["Guangzhou, China"]}]},{"given":"Tian","family":"Wang","sequence":"additional","affiliation":[{"name":"Strategic Development Department Southern Power Grid Capital Holding Co., Ltd.","place":["Guangzhou, China"]}]},{"given":"Yiding","family":"Fu","sequence":"additional","affiliation":[{"name":"Strategic Development Department Southern Power Grid Capital Holding Co., Ltd.","place":["Guangzhou, China"]}]},{"given":"Jingna","family":"Huang","sequence":"additional","affiliation":[{"name":"Southern Power Grid Financial Leasing Co., Ltd.","place":["Guangzhou, China"]}]},{"given":"Dong","family":"Zhou","sequence":"additional","affiliation":[{"name":"Southern Power Grid Private Fund Management Co., Ltd., Southern Power Grid Jianxin Fund Management","place":["Guangzhou, 
China"]}]}],"member":"1965","published-online":{"date-parts":[[2026,1,30]]},"reference":[{"key":"ref1","doi-asserted-by":"publisher","first-page":"126430","DOI":"10.1016\/j.eswa.2025.126430","article-title":"Optimizing portfolio selection through stock ranking and matching: a reinforcement learning approach","volume":"269","author":"Alzaman","year":"2025","journal-title":"Expert. Syst. Appl."},{"key":"ref2","doi-asserted-by":"publisher","first-page":"855","DOI":"10.1007\/s12667-021-00448-6","article-title":"A review of power system protection and asset management with machine learning techniques","volume":"13","author":"Aminifar","year":"2022","journal-title":"Energy Syst."},{"key":"ref3","doi-asserted-by":"publisher","first-page":"200467","DOI":"10.1016\/j.iswa.2024.200467","article-title":"Hidden-layer configurations in reinforcement learning models for stock portfolio optimization","volume":"25","author":"Aritonang","year":"2025","journal-title":"Intell Syst. Appl."},{"key":"ref4","first-page":"189047","article-title":"Adaptive algorithm for selecting the optimal trading strategy based on reinforcement learning for managing a hedge fund","volume":"12","author":"Belyakov","year":"2024","journal-title":"IEEE Access"},{"key":"ref5","doi-asserted-by":"publisher","first-page":"114002","DOI":"10.1016\/j.eswa.2020.114002","article-title":"Deep reinforcement learning for portfolio management of markets with a dynamic number of assets","volume":"164","author":"Betancourt","year":"2021","journal-title":"Expert. Syst. 
Appl."},{"key":"ref6","doi-asserted-by":"publisher","first-page":"127800","DOI":"10.1016\/j.neucom.2024.127800","article-title":"Multiagent-based deep reinforcement learning framework for multi-asset adaptive trading and portfolio management","volume":"594","author":"Cheng","year":"2024","journal-title":"Neurocomputing"},{"key":"ref7","doi-asserted-by":"publisher","first-page":"426","DOI":"10.1109\/OJCS.2025.3543450","article-title":"FraudGNN-RL: a graph neural network with reinforcement learning for adaptive financial fraud detection","volume":"6","author":"Cui","year":"2025","journal-title":"IEEE Open J. Comput. Soc."},{"key":"ref8","doi-asserted-by":"publisher","first-page":"124959","DOI":"10.1016\/j.eswa.2024.124959","article-title":"Mobile robot sequential decision making using a deep reinforcement learning hyper-heuristic approach","volume":"257","author":"Cui","year":"2024","journal-title":"Expert. Syst. Appl."},{"key":"ref9","doi-asserted-by":"crossref","first-page":"8715","DOI":"10.1007\/s00500-023-08973-5","article-title":"Portfolio dynamic trading strategies using deep reinforcement learning","volume":"28","author":"Day","year":"2024","journal-title":"Soft. Comput. J."},{"key":"ref10","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1007\/s10479-024-06257-1","article-title":"Enhancing Markowitz's portfolio selection paradigm with machine learning","volume":"346","author":"de L\u00f3pez Prado","year":"2025","journal-title":"Ann. Oper. 
Res."},{"key":"ref11","doi-asserted-by":"crossref","first-page":"4918","DOI":"10.1145\/3637528.3671629","article-title":"Fnspid: a comprehensive financial news dataset in time series","volume-title":"Proceedings of the 30th ACM SIGKDD conference on knowledge discovery and data mining","author":"Dong","year":"2024"},{"key":"ref12","doi-asserted-by":"publisher","first-page":"102221","DOI":"10.1016\/j.strusafe.2022.102221","article-title":"Parameterized deep reinforcement learning-enabled maintenance decision-support and life-cycle risk assessment for highway bridge portfolios","volume":"97","author":"Du","year":"2022","journal-title":"Struct. Saf."},{"key":"ref13","doi-asserted-by":"publisher","first-page":"113264","DOI":"10.1016\/j.asoc.2025.113264","article-title":"Optimizing stock investment strategies with double deep Q-networks: exploring the impact of oil and gold price signals","volume":"180","author":"Fu","year":"2025","journal-title":"Appl. Soft. Comput."},{"key":"ref14","doi-asserted-by":"publisher","first-page":"102365","DOI":"10.1016\/j.pacfin.2024.102365","article-title":"The enhanced benefits of ESG in portfolios: a multi-factor model perspective based on LightGBM","volume":"85","author":"Gong","year":"2024","journal-title":"Pac. Basin Financ. J."},{"key":"ref15","first-page":"1861","article-title":"Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor","volume-title":"International conference on machine learning","author":"Haarnoja","year":"2018"},{"key":"ref16","doi-asserted-by":"publisher","first-page":"119556","DOI":"10.1016\/j.eswa.2023.119556","article-title":"Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory","volume":"218","author":"Jang","year":"2023","journal-title":"Expert. Syst. 
Appl."},{"key":"ref17","doi-asserted-by":"publisher","first-page":"103810","DOI":"10.1109\/ACCESS.2024.3434528","article-title":"A deep learning based expert framework for portfolio prediction and forecasting","volume":"12","author":"Jeribi","year":"2024","journal-title":"IEEE Access"},{"key":"ref18","doi-asserted-by":"publisher","first-page":"101016","DOI":"10.1016\/j.gfj.2024.101016","article-title":"Deep reinforcement learning for portfolio selection","volume":"62","author":"Jiang","year":"2024","journal-title":"Glob. Financ. J."},{"key":"ref19","article-title":"A deep reinforcement learning framework for the financial portfolio management problem","author":"Jiang","year":"2017"},{"key":"ref20","doi-asserted-by":"publisher","first-page":"27794","DOI":"10.1109\/ACCESS.2024.3366905","article-title":"Deep machine learning-based asset management approach for oil-immersed power transformers using dissolved gas analysis","volume":"12","author":"Jin","year":"2024","journal-title":"IEEE Access"},{"key":"ref21","doi-asserted-by":"crossref","first-page":"2792","DOI":"10.1002\/for.3155","article-title":"Portfolio management based on a reinforcement learning framework","volume":"43","author":"Junfeng","year":"2024","journal-title":"J. Forecast."},{"key":"ref22","doi-asserted-by":"publisher","first-page":"122763","DOI":"10.1016\/j.eswa.2023.122763","article-title":"A deep reinforcement learning system for the allocation of epidemic prevention materials based on DDPG","volume":"242","author":"Kitchat","year":"2024","journal-title":"Expert. Syst. Appl."},{"key":"ref23","doi-asserted-by":"publisher","first-page":"848","DOI":"10.1016\/j.jestch.2021.01.007","article-title":"Market sentiment-aware deep reinforcement learning approach for stock portfolio allocation","volume":"24","author":"Koratamaddi","year":"2021","journal-title":"Eng. Sci. Technol. Int. 
J."},{"key":"ref24","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1007\/s11063-024-11582-4","article-title":"Deep reinforcement learning model for stock portfolio management based on data fusion","volume":"56","author":"Li","year":"2024","journal-title":"Neural Process. Lett."},{"key":"ref25","first-page":"56","article-title":"Q-ADER: an effective Q-learning for recommendation with diminishing action space","author":"Li","year":"2024","journal-title":"IEEE Trans. Neural. Netw. Learn. Syst."},{"key":"ref26","doi-asserted-by":"publisher","first-page":"2803","DOI":"10.1016\/j.matpr.2021.07.042","article-title":"Impact of machine learning on management, healthcare and agriculture","volume":"80","author":"Pallathadka","year":"2023","journal-title":"Mater. Today Proc."},{"key":"ref27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3733714","article-title":"The evolution of reinforcement learning in quantitative finance: a survey","volume":"57","author":"Pippas","year":"2025","journal-title":"ACM Comput. Surv."},{"key":"ref28","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1007\/s10462-024-11066-w","article-title":"A taxonomy of literature reviews and experimental study of deep reinforcement learning in portfolio management","volume":"58","author":"Rezaei","year":"2025","journal-title":"Artif. Intell. 
Rev."},{"key":"ref29","doi-asserted-by":"publisher","first-page":"42813","DOI":"10.1109\/ACCESS.2025.3546099","article-title":"A novel RMS-driven deep reinforcement learning for optimized portfolio management in stock trading","volume":"13","author":"Sattar","year":"2025","journal-title":"IEEE Access"},{"key":"ref30","doi-asserted-by":"publisher","first-page":"2087","DOI":"10.1109\/JIOT.2021.3050441","article-title":"IoT and fog-computing-based predictive maintenance model for effective asset management in industry 4.0 using machine learning","volume":"10","author":"Teoh","year":"2021","journal-title":"IEEE Internet Things J."},{"key":"ref31","doi-asserted-by":"publisher","first-page":"8119","DOI":"10.1007\/s10489-021-02262-0","article-title":"Portfolio management system in equity market neutral using reinforcement learning","volume":"51","author":"Wu","year":"2021","journal-title":"Appl. Intell."},{"key":"ref32","doi-asserted-by":"publisher","first-page":"101125","DOI":"10.1016\/j.suscom.2025.101125","article-title":"Deep reinforcement learning-driven intelligent portfolio management with green computing: sustainable portfolio optimization and management","volume":"46","author":"Xu","year":"2025","journal-title":"Sustain. Comput. Inf. Syst."},{"key":"ref33","article-title":"Fully parameterized quantile function for distributional reinforcement learning","volume":"32","author":"Yang","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref34","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1007\/s11063-024-11611-2","article-title":"An overestimation reduction method based on the multi-step weighted double estimation using value-decomposition multi-agent reinforcement learning","volume":"56","author":"Zhao","year":"2024","journal-title":"Neural Process. 
Lett."},{"key":"ref35","doi-asserted-by":"publisher","first-page":"2194","DOI":"10.1109\/TSTE.2024.3406590","article-title":"Cooperative dispatch of renewable-penetrated microgrids alliances using risk-sensitive reinforcement learning","volume":"15","author":"Zhu","year":"2024","journal-title":"IEEE Trans. Sustain. Energy"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2025.1709493\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T15:00:28Z","timestamp":1769785228000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2025.1709493\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,30]]},"references-count":35,"alternative-id":["10.3389\/frai.2025.1709493"],"URL":"https:\/\/doi.org\/10.3389\/frai.2025.1709493","relation":{},"ISSN":["2624-8212"],"issn-type":[{"value":"2624-8212","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,30]]},"article-number":"1709493"}}