{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T05:05:40Z","timestamp":1750309540086,"version":"3.41.0"},"reference-count":203,"publisher":"Association for Computing Machinery (ACM)","issue":"10","license":[{"start":{"date-parts":[[2025,5,6]],"date-time":"2025-05-06T00:00:00Z","timestamp":1746489600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2025,10,31]]},"abstract":"<jats:p>Many sequential decision-making problems in finance like trading, portfolio optimisation, and so on have been modelled using reinforcement learning (RL) and evolutionary computation (EC). Recent studies on problems from various domains have shown that EC can be used to improve the performance of RL and vice versa. Over the years, researchers have proposed different ways of hybridising RL and EC for trading and portfolio optimisation. However, there is a lack of a thorough survey in this research area, which lies at the intersection of RL, EC, and finance. This article surveys hybrid techniques combining EC and RL for financial applications and presents a novel taxonomy. Research gaps have been discovered in existing works and some open problems have been identified for future works. 
A detailed discussion about different design choices made in the existing literature is also included.<\/jats:p>","DOI":"10.1145\/3728634","type":"journal-article","created":{"date-parts":[[2025,4,8]],"date-time":"2025-04-08T11:44:54Z","timestamp":1744112694000},"page":"1-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Hybrids of Reinforcement Learning and Evolutionary Computation in Finance: A Survey"],"prefix":"10.1145","volume":"57","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-5472-3015","authenticated-orcid":false,"given":"Sandarbh","family":"Yadav","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai, India and Centre for AI and ML, Institute for Development and Research in Banking Technology, Hyderabad, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0082-6227","authenticated-orcid":false,"given":"Vadlamani","family":"Ravi","sequence":"additional","affiliation":[{"name":"Centre for AI and ML, Institute for Development and Research in Banking Technology, Hyderabad, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-7707-6056","authenticated-orcid":false,"given":"Shivaram","family":"Kalyanakrishnan","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,5,6]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2019.112891"},{"key":"e_1_3_2_3_2","volume-title":"Technical Analysis from A to Z","author":"Achelis Steven B.","year":"2001","unstructured":"Steven B. Achelis. 2001. Technical Analysis from A to Z. 
McGraw Hill Education."},{"key":"e_1_3_2_4_2","first-page":"487","article-title":"Interactions between learning and evolution","volume":"10","author":"Ackley David","year":"1991","unstructured":"David Ackley and Michael Littman. 1991. Interactions between learning and evolution. Artificial Life II 10 (1991), 487\u2013509.","journal-title":"Artificial Life II"},{"key":"e_1_3_2_5_2","unstructured":"Wilhelm Ala-Krekola. 2021. Financial portfolio management with evolution strategies-based reinforcement learning. Master\u2019s thesis. Aalto University School of Business."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2017.06.023"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2019.04.013"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1561\/0500000003"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1080\/14697680400008593"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/NAFIPS.2016.7851630"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.34133\/icomputing.0025"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1177\/1073858406293182"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/CIFER.2003.1196282"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1287\/opre.30.5.961"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/72.279181"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.5555\/375108"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1015059928466"},{"key":"e_1_3_2_18_2","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1007\/978-1-4842-5127-0_4","article-title":"Market making via reinforcement learning","author":"II Taweh Beysolow","year":"2019","unstructured":"Taweh Beysolow II. 2019. Market making via reinforcement learning. 
Applied Reinforcement Learning with Python: With OpenAI Gym, Tensorflow, and Keras (2019), 77\u201394.","journal-title":"Applied Reinforcement Learning with Python: With OpenAI Gym, Tensorflow, and Keras"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC55065.2022.9870311"},{"issue":"2","key":"e_1_3_2_20_2","first-page":"47","article-title":"Using bollinger bands","volume":"10","author":"Bollinger John","year":"1992","unstructured":"John Bollinger. 1992. Using bollinger bands. Stocks and Commodities 10, 2 (1992), 47\u201351.","journal-title":"Stocks and Commodities"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/CIFEr.2012.6327770"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2010.5586067"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2009.02.062"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2016.09.016"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1540-6288.2001.tb00024.x"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cor.2009.12.003"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/SICE.2007.4421448"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2007.4424475"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/1276958.1277232"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2009.4983238"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/1389095.1389413"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2009.05.054"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1541\/ieejeiss.129.344"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/SICE.2008.4654739"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2009.02.049"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.20
14.07.034"},{"key":"e_1_3_2_37_2","article-title":"The use of ESG scores in academic literature: A systematic literature review","author":"Cl\u00e9ment Alexandre","year":"2023","unstructured":"Alexandre Cl\u00e9ment, \u00c9lisabeth Robinot, and L\u00e9o Trespeuch. 2023. The use of ESG scores in academic literature: A systematic literature review. Journal of Enterprising Communities: People and Places in the Global Economy 19, 1 (2023), 92\u2013110.","journal-title":"Journal of Enterprising Communities: People and Places in the Global Economy"},{"key":"e_1_3_2_38_2","unstructured":"Alfredo V. Clemente Humberto N. Castej\u00f3n and Arjun Chandra. 2017. Efficient parallel methods for deep reinforcement learning. arXiv:1705.04862. Retrieved from https:\/\/arxiv.org\/abs\/1705.04862"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3383455.3422544"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1093\/rfs\/11.3.489"},{"key":"e_1_3_2_41_2","article-title":"Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents","volume":"31","author":"Conti Edoardo","year":"2018","unstructured":"Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth Stanley, and Jeff Clune. 2018. Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.irfa.2018.09.003"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.nonrwa.2008.04.023"},{"key":"e_1_3_2_44_2","volume-title":"Proceedings of the Congresso Brasileiro de Autom\u00e1tica-CBA","author":"Costa Guinther K. da","year":"2020","unstructured":"Guinther K. da Costa, Leandro dos S. Coelho, and Roberto Z. Freire. 2020. 
Image representation of time series for reinforcement learning trading agent. In Proceedings of the Congresso Brasileiro de Autom\u00e1tica-CBA."},{"key":"e_1_3_2_45_2","unstructured":"Sanjiv Ranjan Das Daniel N. Ostrov Anand Radhakrishnan and Deep Srivastav. 2018. A new approach to goals-based wealth management. Journal of Investment Management 16 3 (2018) 4\u201330."},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1540-6261.1985.tb05004.x"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2005.10.012"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/72.935088"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1088\/1469-7688\/1\/4\/301"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45675-9_52"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2522401"},{"key":"e_1_3_2_52_2","article-title":"The impact of the Russian-Ukraine war on the stock market: A causal analysis","author":"K\u00f6seo\u011flu Sinem Derindere","year":"2024","unstructured":"Sinem Derindere K\u00f6seo\u011flu, Burcu Ad\u0131g\u00fczel Mercang\u00f6z, Khalid Khan, and Suleman Sarwar. 2024. The impact of the Russian-Ukraine war on the stock market: A causal analysis. Applied Economics 56, 21 (2023), 2509\u20132519.","journal-title":"Applied Economics"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.swevo.2018.03.011"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-44874-8"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2014.04.011"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15461-4_17"},{"key":"e_1_3_2_57_2","volume-title":"Reinforcement Learning in Financial Markets-a Survey","author":"Fischer Thomas G.","year":"2018","unstructured":"Thomas G. Fischer. 2018. Reinforcement Learning in Financial Markets-a Survey. Technical Report. 
FAU Discussion Papers in Economics."},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1561\/2200000071"},{"key":"e_1_3_2_59_2","unstructured":"Sumitra Ganesh Nelson Vadori Mengda Xu Hua Zheng Prashant Reddy and Manuela Veloso. 2019. Reinforcement learning for market making in a multi-agent dealer market. arXiv:1911.05892. Retrieved from https:\/\/arxiv.org\/abs\/1911.05892"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1989.1.4.502"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1080\/10920277.1998.10595728"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2016.01.018"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1093\/rfs\/14.1.1"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/CIFER.2003.1196283"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(00)00081-3"},{"key":"e_1_3_2_66_2","unstructured":"Denise Gorse. 2011. Application of stochastic recurrent reinforcement learning to index trading. In Proceedings of the 19th European Symposium on Artificial Neural Networks Computational Intelligence and Machine Learning. 123\u2013128."},{"key":"e_1_3_2_67_2","volume-title":"The Intelligent Investor","author":"Graham Benjamin","year":"2017","unstructured":"Benjamin Graham. 2017. The Intelligent Investor. HarperCollins Publishers Inc."},{"key":"e_1_3_2_68_2","first-page":"143","volume-title":"Proceedings of the SICE Annual Conference 2011","author":"Gu Yunqing","year":"2011","unstructured":"Yunqing Gu, Shingo Mabu, Yang Yang, Jianhua Li, and Kotaro Hirasawa. 2011. Trading rules on stock markets using genetic network programming-sarsa learning with plural subroutines. In Proceedings of the SICE Annual Conference 2011. 
IEEE, 143\u2013148."},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3449639.3459386"},{"key":"e_1_3_2_70_2","first-page":"1861","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Haarnoja Tuomas","year":"2018","unstructured":"Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. 2018. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the International Conference on Machine Learning. PMLR, 1861\u20131870."},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1111\/mafi.12382"},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10458-022-09552-y"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87700-4_43"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1145\/3319619.3326755"},{"issue":"3","key":"e_1_3_2_75_2","first-page":"495","article-title":"How learning can guide evolution","volume":"1","author":"Hinton Geoffrey E.","year":"1987","unstructured":"Geoffrey E. Hinton and Steven J. Nowlan. 1987. How learning can guide evolution. 
Complex Systems 1, 3 (1987), 495\u2013502.","journal-title":"Complex Systems"},{"key":"e_1_3_2_76_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2001.934337"},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1145\/3487664.3487698"},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.26421\/JDI3.3-3"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.5555\/2613686"},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_81_2","doi-asserted-by":"publisher","DOI":"10.1038\/scientificamerican0792-66"},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1080\/00207720412331303697"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1007\/11508069_76"},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.4018\/978-1-59140-702-7.ch020"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2015.07.008"},{"key":"e_1_3_2_86_2","doi-asserted-by":"publisher","DOI":"10.5555\/2331300"},{"key":"e_1_3_2_87_2","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyg.2021.763346"},{"issue":"8938","key":"e_1_3_2_88_2","first-page":"8938","article-title":"Hybrid Forex prediction model using multiple regression, simulated annealing, reinforcement learning and technical analysis","volume":"2252","author":"Jamali Hana","year":"2023","unstructured":"Hana Jamali, Younes Chihab, Iv\u00e1n Garc\u00eda-Magari\u00f1o, and Omar Bencharef. 2023. Hybrid Forex prediction model using multiple regression, simulated annealing, reinforcement learning and technical analysis. 
IAES International Journal of Artificial Intelligence 2252, 8938 (2023), 8938.","journal-title":"IAES International Journal of Artificial Intelligence"},{"key":"e_1_3_2_89_2","doi-asserted-by":"publisher","DOI":"10.23919\/CCC55666.2022.9901620"},{"key":"e_1_3_2_90_2","doi-asserted-by":"publisher","DOI":"10.7232\/iems.2012.11.3.215"},{"key":"e_1_3_2_91_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(98)00023-X"},{"key":"e_1_3_2_92_2","doi-asserted-by":"publisher","DOI":"10.1142\/9789814417358_0006"},{"key":"e_1_3_2_93_2","doi-asserted-by":"publisher","DOI":"10.2307\/2330874"},{"key":"e_1_3_2_94_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICNN.1995.488968"},{"key":"e_1_3_2_95_2","unstructured":"Harshad Khadilkar. 2023. Supplementing gradient-based reinforcement learning with simple evolutionary ideas. arXiv:2305.07571. Retrieved from https:\/\/arxiv.org\/abs\/2305.07571"},{"key":"e_1_3_2_96_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.220.4598.671"},{"key":"e_1_3_2_97_2","article-title":"Actor-critic algorithms","volume":"12","author":"Konda Vijay","year":"1999","unstructured":"Vijay Konda and John Tsitsiklis. 1999. Actor-critic algorithms. Advances in Neural Information Processing Systems 12 (1999).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_98_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00175355"},{"key":"e_1_3_2_99_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0165-1889(02)00122-7"},{"key":"e_1_3_2_100_2","doi-asserted-by":"publisher","DOI":"10.2307\/2491270"},{"key":"e_1_3_2_101_2","first-page":"3084","volume-title":"Proceedings of the SICE Annual Conference 2010","author":"Li JianHua","year":"2010","unstructured":"JianHua Li, QinBiao Meng, Yang Yang, Shingo Mabu, Yifei Wang, and Kotaro Hirasawa. 2010. Trading rules on stock markets using genetic network programming with subroutines. In Proceedings of the SICE Annual Conference 2010. 
IEEE, 3084\u20133088."},{"key":"e_1_3_2_102_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2014.6900421"},{"key":"e_1_3_2_103_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10479-016-2377-z"},{"key":"e_1_3_2_104_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSYST.2020.3034416"},{"issue":"3","key":"e_1_3_2_105_2","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1109\/TSMC.2014.2358639","article-title":"Multiobjective reinforcement learning: A comprehensive overview","volume":"45","author":"Liu Chunming","year":"2014","unstructured":"Chunming Liu, Xin Xu, and Dewen Hu. 2014. Multiobjective reinforcement learning: A comprehensive overview. IEEE Transactions on Systems, Man, and Cybernetics: Systems 45, 3 (2014), 385\u2013398.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics: Systems"},{"key":"e_1_3_2_106_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2023.110680"},{"key":"e_1_3_2_107_2","doi-asserted-by":"publisher","DOI":"10.3905\/jfds.2023.1.124"},{"key":"e_1_3_2_108_2","doi-asserted-by":"publisher","DOI":"10.1145\/3490354.3494366"},{"key":"e_1_3_2_109_2","article-title":"The adaptive markets hypothesis: Market efficiency from an evolutionary perspective","author":"Lo Andrew W.","year":"2004","unstructured":"Andrew W. Lo. 2004. The adaptive markets hypothesis: Market efficiency from an evolutionary perspective. Journal of Portfolio Management. 15\u201329.","journal-title":"Journal of Portfolio Management"},{"key":"e_1_3_2_110_2","article-title":"Multi-agent actor-critic for mixed cooperative-competitive environments","volume":"30","author":"Lowe Ryan","year":"2017","unstructured":"Ryan Lowe, Yi I. Wu, Aviv Tamar, Jean Harb, and Igor Mordatch. 2017. Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in Neural Information Processing Systems 30 (2017).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_111_2","unstructured":"David W. Lu. 2017. 
Agent inspired trading using recurrent reinforcement learning and lstm neural networks. arXiv:1707.07338. Retrieved from https:\/\/arxiv.org\/abs\/1707.07338"},{"key":"e_1_3_2_112_2","doi-asserted-by":"publisher","DOI":"10.1145\/1276958.1277398"},{"key":"e_1_3_2_113_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2007.4424513"},{"key":"e_1_3_2_114_2","doi-asserted-by":"publisher","DOI":"10.1145\/2001576.2001800"},{"key":"e_1_3_2_115_2","doi-asserted-by":"publisher","DOI":"10.1162\/evco.2007.15.3.369"},{"key":"e_1_3_2_116_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2013.05.037"},{"key":"e_1_3_2_117_2","first-page":"1164","volume-title":"Proceedings of SICE Annual Conference 2010","author":"Mabu Shingo","year":"2010","unstructured":"Shingo Mabu, Yuzhu Lian, Yan Chen, and Kotaro Hirasawa. 2010. Generating stock trading signals based on matching degree with extracted rules by genetic network programming. In Proceedings of SICE Annual Conference 2010. IEEE, 1164\u20131169."},{"key":"e_1_3_2_118_2","doi-asserted-by":"publisher","DOI":"10.1109\/CIFEr52523.2022.9776048"},{"key":"e_1_3_2_119_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10287-011-0131-1"},{"key":"e_1_3_2_120_2","doi-asserted-by":"publisher","DOI":"10.1109\/CIFEr.2014.6924102"},{"issue":"1","key":"e_1_3_2_121_2","first-page":"77","article-title":"Portfolio selection","volume":"7","author":"Markowitz Harry","year":"1952","unstructured":"Harry Markowitz. 1952. Portfolio selection. 
The Journal of Finance 7, 1 (1952), 77\u201391.","journal-title":"The Journal of Finance"},{"key":"e_1_3_2_122_2","doi-asserted-by":"publisher","DOI":"10.1111\/joes.12429"},{"key":"e_1_3_2_123_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cor.2021.105400"},{"key":"e_1_3_2_124_2","doi-asserted-by":"publisher","DOI":"10.3390\/data4030110"},{"key":"e_1_3_2_125_2","first-page":"1928","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Mnih Volodymyr","year":"2016","unstructured":"Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning. PMLR, 1928\u20131937."},{"key":"e_1_3_2_126_2","unstructured":"Volodymyr Mnih Koray Kavukcuoglu David Silver Alex Graves Ioannis Antonoglou Daan Wierstra and Martin Riedmiller. 2013. Playing atari with deep reinforcement learning. arXiv:1312.5602. Retrieved from https:\/\/arxiv.org\/abs\/1312.5602"},{"key":"e_1_3_2_127_2","doi-asserted-by":"publisher","DOI":"10.3905\/jpm.23.2.45"},{"key":"e_1_3_2_128_2","article-title":"Reinforcement learning for trading","volume":"11","author":"Moody John","year":"1998","unstructured":"John Moody and Matthew Saffell. 1998. Reinforcement learning for trading. 
Advances in Neural Information Processing Systems 11 (1998).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_129_2","doi-asserted-by":"publisher","DOI":"10.1109\/72.935097"},{"key":"e_1_3_2_130_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-5625-1_10"},{"key":"e_1_3_2_131_2","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1099-131X(1998090)17:5\/6<441::AID-FOR707>3.0.CO;2-#"},{"key":"e_1_3_2_132_2","doi-asserted-by":"publisher","DOI":"10.3390\/math8101640"},{"key":"e_1_3_2_133_2","volume-title":"Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications","author":"Murphy John J.","year":"1999","unstructured":"John J. Murphy. 1999. Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications. Penguin."},{"key":"e_1_3_2_134_2","doi-asserted-by":"publisher","DOI":"10.2307\/2331231"},{"key":"e_1_3_2_135_2","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/7.4.308"},{"key":"e_1_3_2_136_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC55065.2022.9870209"},{"key":"e_1_3_2_137_2","doi-asserted-by":"publisher","DOI":"10.5555\/AAI28022764"},{"key":"e_1_3_2_138_2","volume-title":"Japanese Candlestick Charting Techniques: A Contemporary Guide to the Ancient Investment Techniques of the Far East","author":"Nison Steve","year":"2001","unstructured":"Steve Nison. 2001. Japanese Candlestick Charting Techniques: A Contemporary Guide to the Ancient Investment Techniques of the Far East. Penguin."},{"key":"e_1_3_2_139_2","doi-asserted-by":"crossref","unstructured":"Michael O\u2019Neill Leonardo Vanneschi Steven Gustafson and Wolfgang Banzhaf. 2010. Open issues in genetic programming. 
Genetic Programming and Evolvable Machines 11 (2010) 339\u2013363.","DOI":"10.1007\/s10710-010-9113-2"},{"key":"e_1_3_2_140_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2020.113573"},{"key":"e_1_3_2_141_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1000206"},{"key":"e_1_3_2_142_2","first-page":"36","article-title":"FinRDDL: Can AI planning be used for quantitative finance problems?","author":"Patra Sunandita","year":"2023","unstructured":"Sunandita Patra, Mahmoud Mahfouz, Sriram Gopalakrishnan, Daniele Magazzeni, and Manuela Veloso. 2023. FinRDDL: Can AI planning be used for quantitative finance problems? FinPlan 2023 (2023), 36\u201356.","journal-title":"FinPlan 2023"},{"key":"e_1_3_2_143_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijforecast.2021.11.001"},{"key":"e_1_3_2_144_2","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2012.2196800"},{"key":"e_1_3_2_145_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-6419.2011.00692.x"},{"key":"e_1_3_2_146_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-020-0241-4"},{"key":"e_1_3_2_147_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2021.106836"},{"key":"e_1_3_2_148_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-023-10470-y"},{"key":"e_1_3_2_149_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC48606.2020.9185844"},{"key":"e_1_3_2_150_2","doi-asserted-by":"publisher","DOI":"10.1142\/S0219649224500801"},{"key":"e_1_3_2_151_2","doi-asserted-by":"publisher","DOI":"10.1201\/9781003229193"},{"key":"e_1_3_2_152_2","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-financial-110311-101813"},{"key":"e_1_3_2_153_2","volume-title":"On-line Q-learning Using Connectionist Systems","author":"Rummery Gavin A.","year":"1994","unstructured":"Gavin A. Rummery and Mahesan Niranjan. 1994. On-line Q-learning Using Connectionist Systems. 
University of Cambridge, Department of Engineering Cambridge, UK."},{"key":"e_1_3_2_154_2","unstructured":"Tim Salimans Jonathan Ho Xi Chen Szymon Sidor and Ilya Sutskever. 2017. Evolution strategies as a scalable alternative to reinforcement learning. arXiv:1703.03864. Retrieved from https:\/\/arxiv.org\/abs\/1703.03864"},{"key":"e_1_3_2_155_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.2007.19.3.757"},{"key":"e_1_3_2_156_2","first-page":"1889","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Schulman John","year":"2015","unstructured":"John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. 2015. Trust region policy optimization. In Proceedings of the International Conference on Machine Learning. PMLR, 1889\u20131897."},{"key":"e_1_3_2_157_2","unstructured":"John Schulman Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv:1707.06347. Retrieved from https:\/\/arxiv.org\/abs\/1707.06347"},{"key":"e_1_3_2_158_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-92007-8_26"},{"key":"e_1_3_2_159_2","doi-asserted-by":"publisher","DOI":"10.1086\/294846"},{"key":"e_1_3_2_160_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature16961"},{"key":"e_1_3_2_161_2","unstructured":"David Silver Thomas Hubert Julian Schrittwieser Ioannis Antonoglou Matthew Lai Arthur Guez Marc Lanctot Laurent Sifre Dharshan Kumaran Thore Graepel et\u00a0al. 2017. Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv:1712.01815. 
Retrieved from https:\/\/arxiv.org\/abs\/1712.01815"},{"key":"e_1_3_2_162_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.aar6404"},{"key":"e_1_3_2_163_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2007.4424514"},{"key":"e_1_3_2_164_2","volume-title":"A Forex Trading System Using Evolutionary Reinforcement Learning","author":"Song Yupu","year":"2017","unstructured":"Yupu Song. 2017. A Forex Trading System Using Evolutionary Reinforcement Learning. Ph. D. Dissertation. Worcester Polytechnic Institute."},{"key":"e_1_3_2_165_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.swevo.2023.101236"},{"key":"e_1_3_2_166_2","doi-asserted-by":"crossref","unstructured":"Yanjie Song Yutong Wu Yangyang Guo Ran Yan Ponnuthurai Nagaratnam Suganthan Yue Zhang Witold Pedrycz Swagatam Das Rammohan Mallipeddi Oladayo Solomon Ajani and Qiang Feng. 2024. Reinforcement learning-assisted evolutionary algorithm: A survey and research opportunities. Swarm and Evolutionary Computation 86 (2024) 101517.","DOI":"10.1016\/j.swevo.2024.101517"},{"issue":"2023","key":"e_1_3_2_167_2","first-page":"21","article-title":"Deep reinforcement learning for optimal portfolio allocation: A comparative study with mean-variance optimization","volume":"2023","author":"Sood Srijan","year":"2023","unstructured":"Srijan Sood, Kassiani Papasotiriou, Marius Vaiciulis, and Tucker Balch. 2023. Deep reinforcement learning for optimal portfolio allocation: A comparative study with mean-variance optimization. 
FinPlan 2023, 2023 (2023), 21.","journal-title":"FinPlan"},{"key":"e_1_3_2_168_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.tics.2004.07.008"},{"key":"e_1_3_2_169_2","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_3_2_170_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1008202821328"},{"key":"e_1_3_2_171_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2018.8477732"},{"key":"e_1_3_2_172_2","unstructured":"Felipe Petroski Such Vashisht Madhavan Edoardo Conti Joel Lehman Kenneth O. Stanley and Jeff Clune. 2017. Deep neuroevolution: Genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. arXiv:1712.06567. Retrieved from https:\/\/arxiv.org\/abs\/1712.06567"},{"key":"e_1_3_2_173_2","doi-asserted-by":"publisher","DOI":"10.1145\/3582560"},{"key":"e_1_3_2_174_2","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton Richard S.","year":"2018","unstructured":"Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. MIT Press."},{"key":"e_1_3_2_175_2","article-title":"Policy gradient methods for reinforcement learning with function approximation","volume":"12","author":"Sutton Richard S.","year":"1999","unstructured":"Richard S. Sutton, David McAllester, Satinder Singh, and Yishay Mansour. 1999. Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems 12 (1999).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_176_2","article-title":"Transfer learning for reinforcement learning domains: A survey.","volume":"10","author":"Taylor Matthew E.","year":"2009","unstructured":"Matthew E. Taylor and Peter Stone. 2009. Transfer learning for reinforcement learning domains: A survey. 
Journal of Machine Learning Research 10 (2009), 1633\u20131685.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_177_2","doi-asserted-by":"publisher","DOI":"10.1145\/3520304.3533983"},{"key":"e_1_3_2_178_2","doi-asserted-by":"publisher","DOI":"10.1155\/2009\/736398"},{"key":"e_1_3_2_179_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-61037-0_9"},{"key":"e_1_3_2_180_2","volume-title":"Augmenting traders with Learning Machines","author":"Vittori Edoardo","year":"2022","unstructured":"Edoardo Vittori. 2022. Augmenting traders with Learning Machines. Ph. D. Dissertation. Politecnico di Milano."},{"key":"e_1_3_2_181_2","doi-asserted-by":"publisher","DOI":"10.1145\/3490354.3494402"},{"key":"e_1_3_2_182_2","first-page":"1","article-title":"Optimal technical indicator based trading strategies using evolutionary multi objective optimization algorithms","author":"Vivek Yelleti","year":"2024","unstructured":"Yelleti Vivek, P. Shanmukh Kali Prasad, Vadlamani Madhav, Ramanuj Lal, and Vadlamani Ravi. 2024. Optimal technical indicator based trading strategies using evolutionary multi objective optimization algorithms. Computational Economics (2024), 1\u201351.","journal-title":"Computational Economics"},{"key":"e_1_3_2_183_2","first-page":"213","article-title":"Stock portfolio evaluation: An application of genetic-programming-based technical analysis","volume":"2003","author":"Wagman Liad","year":"2003","unstructured":"Liad Wagman. 2003. Stock portfolio evaluation: An application of genetic-programming-based technical analysis. Genetic Algorithms and Genetic Programming at Stanford 2003 (2003), 213\u2013220.","journal-title":"Genetic Algorithms and Genetic Programming at Stanford"},{"key":"e_1_3_2_184_2","doi-asserted-by":"crossref","unstructured":"Shuyang Wang and Diego Klabjan. 2024. An ensemble method of deep reinforcement learning for automated cryptocurrency trading. 
In 2024 IEEE International Conference on Blockchain and Cryptocurrency (ICBC). IEEE 461\u2013463.","DOI":"10.1109\/ICBC59979.2024.10634436"},{"key":"e_1_3_2_185_2","volume-title":"Learning from Delayed Rewards","author":"Watkins Christopher John Cornish Hellaby","year":"1989","unstructured":"Christopher John Cornish Hellaby Watkins. 1989. Learning from Delayed Rewards. Ph. D. Dissertation. King\u2019s College, Cambridge United Kingdom."},{"key":"e_1_3_2_186_2","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-016-0043-6"},{"key":"e_1_3_2_187_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.58337"},{"key":"e_1_3_2_188_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2010.5586457"},{"key":"e_1_3_2_189_2","article-title":"Evolutionary function approximation for reinforcement learning","volume":"7","author":"Whiteson Shimon","year":"2006","unstructured":"Shimon Whiteson. 2006. Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research 7 (2006), 877\u2013917.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_190_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-27645-3_10"},{"key":"e_1_3_2_191_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992696"},{"key":"e_1_3_2_192_2","doi-asserted-by":"publisher","DOI":"10.2991\/978-94-6463-198-2_18"},{"key":"e_1_3_2_193_2","first-page":"1769","volume-title":"2012 Proceedings of the SICE Annual Conference (SICE\u201912)","author":"Xu Wei","year":"2012","unstructured":"Wei Xu, Lutao Wang, Shingo Mabu, and Kotaro Hirasawa. 2012. Genetic network programming with credit. In 2012 Proceedings of the SICE Annual Conference (SICE\u201912). 
IEEE, 1769\u20131777."},{"key":"e_1_3_2_194_2","doi-asserted-by":"publisher","DOI":"10.1145\/3383455.3422540"},{"key":"e_1_3_2_195_2","doi-asserted-by":"publisher","DOI":"10.1541\/ieejeiss.132.439"},{"key":"e_1_3_2_196_2","doi-asserted-by":"publisher","DOI":"10.20965\/jaciii.2012.p0581"},{"key":"e_1_3_2_197_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSMC.2010.5642366"},{"key":"e_1_3_2_198_2","doi-asserted-by":"publisher","DOI":"10.1145\/2464576.2480773"},{"key":"e_1_3_2_199_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2014.6900330"},{"key":"e_1_3_2_200_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10614-015-9490-y"},{"key":"e_1_3_2_201_2","doi-asserted-by":"crossref","unstructured":"Zihao Zhang Stefan Zohren and Stephen Roberts. 2020. Deep reinforcement learning for trading. The Journal of Financial Data Science 2 2 (2020) 25\u201340.","DOI":"10.3905\/jfds.2020.1.030"},{"key":"e_1_3_2_202_2","doi-asserted-by":"crossref","unstructured":"Zihao Zhang Stefan Zohren and Stephen Roberts. 2020. Deep learning for portfolio optimization. The Journal of Financial Data Science 2 4 (2020) 8\u201320.","DOI":"10.3905\/jfds.2020.1.042"},{"key":"e_1_3_2_203_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2023.126628"},{"key":"e_1_3_2_204_2","volume-title":"Introduction to Computational Finance and Financial Econometrics","author":"Zivot Eric","year":"2017","unstructured":"Eric Zivot. 2017. Introduction to Computational Finance and Financial Econometrics. 
Chapman and Hall Crc."}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3728634","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3728634","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:36Z","timestamp":1750295916000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3728634"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,6]]},"references-count":203,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,10,31]]}},"alternative-id":["10.1145\/3728634"],"URL":"https:\/\/doi.org\/10.1145\/3728634","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"type":"print","value":"0360-0300"},{"type":"electronic","value":"1557-7341"}],"subject":[],"published":{"date-parts":[[2025,5,6]]},"assertion":[{"value":"2023-11-16","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-01","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-06","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}