{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T10:09:17Z","timestamp":1768817357052,"version":"3.49.0"},"reference-count":58,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2021,4,13]],"date-time":"2021-04-13T00:00:00Z","timestamp":1618272000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Research Foundation of Korea","award":["2017R1E1A1A03070652, 2020R1F1A1072772"],"award-info":[{"award-number":["2017R1E1A1A03070652, 2020R1F1A1072772"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The problem of finding adequate population models in ecology is important for understanding essential aspects of their dynamic nature. Since analyzing and accurately predicting the intelligent adaptation of multiple species is difficult due to their complex interactions, the study of population dynamics still remains a challenging task in computational biology. In this paper, we use a modern deep reinforcement learning (RL) approach to explore a new avenue for understanding predator-prey ecosystems. Recently, reinforcement learning methods have achieved impressive results in areas, such as games and robotics. RL agents generally focus on building strategies for taking actions in an environment in order to maximize their expected returns. Here we frame the co-evolution of predators and preys in an ecosystem as allowing agents to learn and evolve toward better ones in a manner appropriate for multi-agent reinforcement learning. Recent significant advancements in reinforcement learning allow for new perspectives on these types of ecological issues. 
Our simulation results show that throughout the scenarios with RL agents, predators can achieve a reasonable level of sustainability, along with their preys.<\/jats:p>","DOI":"10.3390\/e23040461","type":"journal-article","created":{"date-parts":[[2021,4,13]],"date-time":"2021-04-13T22:55:09Z","timestamp":1618354509000},"page":"461","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Co-Evolution of Predator-Prey Ecosystems by Reinforcement Learning Agents"],"prefix":"10.3390","volume":"23","author":[{"given":"Jeongho","family":"Park","sequence":"first","affiliation":[{"name":"Department of Control and Instrumentation Engineering, Korea University, 2511 Sejong-ro, Sejong-City 30019, Korea"}]},{"given":"Juwon","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Control and Instrumentation Engineering, Korea University, 2511 Sejong-ro, Sejong-City 30019, Korea"}]},{"given":"Taehwan","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Control and Instrumentation Engineering, Korea University, 2511 Sejong-ro, Sejong-City 30019, Korea"}]},{"given":"Inkyung","family":"Ahn","sequence":"additional","affiliation":[{"name":"Department of Mathematics, College of Science and Technology, Korea University, 2511 Sejong-ro, Sejong-City 30019, Korea"}]},{"given":"Jooyoung","family":"Park","sequence":"additional","affiliation":[{"name":"Department of Control and Instrumentation Engineering, Korea University, 2511 Sejong-ro, Sejong-City 30019, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2021,4,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Averill, I., Lam, K.Y., and Lou, Y. (2017). 
The Role of Advection in a Two-Species Competition Model: A Bifurcation Approach, American Mathematical Society.","DOI":"10.1090\/memo\/1161"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2745","DOI":"10.3934\/dcdsb.2012.17.2745","article-title":"On limit systems for some population models with cross-diffusion","volume":"17","author":"Kuto","year":"2012","journal-title":"Discret. Contin. Dyn. Syst. B"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"435","DOI":"10.3934\/dcds.2004.10.435","article-title":"On a limiting system in the Lotka\u2013Volterra competition with cross-diffusion","volume":"10","author":"Lou","year":"2004","journal-title":"Discret. Contin. Dyn. Syst. A"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"5160","DOI":"10.1016\/j.jde.2017.01.017","article-title":"Nonexistence of nonconstant steady-state solutions in a triangular cross-diffusion model","volume":"262","author":"Lou","year":"2017","journal-title":"J. Differ. Equ."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1341","DOI":"10.1007\/s00285-013-0674-6","article-title":"Global asymptotic stability and the ideal free distribution in a starvation driven diffusion","volume":"68","author":"Kim","year":"2014","journal-title":"J. Math. Biol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1007\/s11538-016-0142-8","article-title":"Evolution of dispersal with starvation measure and coexistence","volume":"78","author":"Kim","year":"2016","journal-title":"Bull. Math. Biol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1016\/j.jmaa.2019.06.027","article-title":"Non-uniform dispersal of logistic population models with free boundaries in a spatially heterogeneous environment","volume":"479","author":"Choi","year":"2019","journal-title":"J. Math. Anal. 
Appl."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2141","DOI":"10.1007\/s00285-019-01336-5","article-title":"Intraguild predation with evolutionary dispersal in a spatially heterogeneous environment","volume":"78","author":"Choi","year":"2019","journal-title":"J. Math. Biol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1016\/j.aml.2018.08.014","article-title":"Strong competition model with non-uniform dispersal in a heterogeneous environment","volume":"88","author":"Choi","year":"2019","journal-title":"Appl. Math. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"123860","DOI":"10.1016\/j.jmaa.2020.123860","article-title":"Predator-prey interaction systems with non-uniform dispersal in a spatially heterogeneous environment","volume":"485","author":"Choi","year":"2020","journal-title":"J. Math. Anal. Appl."},{"key":"ref_11","unstructured":"Skellam, J.G. (1973). The formulation and interpretation of mathematical models of diffusional process in population biology. The Mathematical Theory of The Dynamic of Biological Populations, Springer."},{"key":"ref_12","unstructured":"Okubo, A., and Levin, S.A. (2013). Diffusion and Ecological Problems: Modern Perspectives, Springer Science & Business Media."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1016\/0040-5809(91)90041-D","article-title":"Dispersal in patchy environments: The effects of temporal and spatial structure","volume":"39","author":"Cohen","year":"1991","journal-title":"Theor. Popul. Biol."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1146\/annurev.es.21.110190.002313","article-title":"Evolution of dispersal: Theoretical models and empirical tests using birds and mammals","volume":"21","author":"Johnson","year":"1990","journal-title":"Annu. Rev. Ecol. Syst."},{"key":"ref_15","unstructured":"Nagylaki, T. (2013). 
Introduction to Theoretical Population Genetics, Springer Science & Business Media."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Cantrell, R.S., and Cosner, C. (2004). Spatial Ecology Via Reaction-Diffusion Equations, John Wiley & Sons.","DOI":"10.1002\/0470871296"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1016\/j.aml.2019.06.021","article-title":"Effect of prey-taxis on predator\u2019s invasion in a spatially heterogeneous environment","volume":"98","author":"Choi","year":"2019","journal-title":"Appl. Math. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"4222","DOI":"10.1016\/j.jde.2019.10.019","article-title":"Global well-posedness and stability analysis of prey-predator model with indirect prey-taxis","volume":"268","author":"Ahn","year":"2020","journal-title":"J. Differ. Equ."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"5847","DOI":"10.1016\/j.jde.2015.12.024","article-title":"Global existence of solutions and uniform persistence of a diffusive predator-prey model with prey-taxis","volume":"260","author":"Wu","year":"2016","journal-title":"J. Differ. Equ."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1257","DOI":"10.1016\/j.jde.2016.10.010","article-title":"Global stability of prey-taxis systems","volume":"262","author":"Jin","year":"2017","journal-title":"J. Differ. Equ."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2056","DOI":"10.1016\/j.nonrwa.2009.05.005","article-title":"Global existence of classical solutions to a predator-prey model with nonlinear prey-taxis","volume":"11","author":"Tao","year":"2010","journal-title":"Nonlinear Anal. Real World Appl."},{"key":"ref_22","first-page":"365","article-title":"Artificial adaptive agents in economic theory","volume":"81","author":"Holland","year":"1991","journal-title":"Am. Econ. Rev."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Macal, C., and North, M. 
(2014, January 7\u201310). Introductory tutorial: Agent-based modeling and simulation. Proceedings of the Winter Simulation Conference 2014, Savannah, GA, USA.","DOI":"10.1109\/WSC.2014.7019874"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Sutton, R.S., and Barto, A.G. (1998). Introduction to Reinforcement Learning, MIT Press.","DOI":"10.1109\/TNN.1998.712192"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/nature24270","article-title":"Mastering the game of go without human knowledge","volume":"550","author":"Silver","year":"2017","journal-title":"Nature"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1038\/s41586-020-03051-4","article-title":"Mastering atari, go, chess and shogi by planning with a learned model","volume":"588","author":"Schrittwieser","year":"2020","journal-title":"Nature"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","article-title":"Grandmaster level in StarCraft II using multi-agent reinforcement learning","volume":"575","author":"Vinyals","year":"2019","journal-title":"Nature"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Hahn, C., Ritz, F., Wikidal, P., Phan, T., Gabor, T., and Linnhoff-Popien, C. (2020). Foraging swarms using multi-agent reinforcement learning. Artificial Life Conference Proceedings, MIT Press.","DOI":"10.1162\/isal_a_00267"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Ritz, F., Hohnstein, F., M\u00fcller, R., Phan, T., Gabor, T., Hahn, C., and Linnhoff-Popien, C. (2020). Towards ecosystem management from greedy reinforcement learning in a predator-prey setting. 
Artificial Life Conference Proceedings, MIT Press.","DOI":"10.1162\/isal_a_00273"},{"key":"ref_31","unstructured":"Phan, T., Belzner, L., Schmid, K., Gabor, T., Ritz, F., Feld, S., and Linnhoff-Popien, C. (2021, April 13). A Distributed Policy Iteration Scheme for Cooperative Multi-Agent Policy Approximation. Available online: https:\/\/ala2020.vub.ac.be\/papers\/ALA2020_paper_36.pdf."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hahn, C., Phan, T., Gabor, T., Belzner, L., and Linnhoff-Popien, C. (2019). Emergent escape-based flocking behavior using multi-agent reinforcement learning. Artificial Life Conference Proceedings, MIT Press.","DOI":"10.1162\/isal_a_00226.xml"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Gabor, T., Sedlmeier, A., Kiermeier, M., Phan, T., Henrich, M., Pichlmair, M., Kempter, B., Klein, C., Sauer, H., and Wieghardt, J. (2019, January 13\u201317). Scenario co-evolution for reinforcement learning on a grid world smart factory domain. Proceedings of the Genetic and Evolutionary Computation Conference, New York, NY, USA.","DOI":"10.1145\/3321707.3321831"},{"key":"ref_34","first-page":"1","article-title":"Deep reinforcement learning for swarm systems","volume":"20","author":"Adrian","year":"2019","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1038\/s41586-019-1857-0","article-title":"Long-term cyclic persistence in an experimental predator & prey system","volume":"577","author":"Blasius","year":"2020","journal-title":"Nature"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"773","DOI":"10.3390\/e21080773","article-title":"Deep-reinforcement learning-based co-evolution in a predator & prey system","volume":"21","author":"Wang","year":"2019","journal-title":"Entropy"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"100815","DOI":"10.1016\/j.ecocom.2020.100815","article-title":"A reinforcement learning-based predator-prey model","volume":"42","author":"Wang","year":"2020","journal-title":"Ecol. Complex."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"750","DOI":"10.1007\/s10458-019-09421-1","article-title":"A survey and critique of multiagent deep reinforcement learning","volume":"33","author":"Kartal","year":"2019","journal-title":"Auton. Agents -Multi-Agent Syst."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1021\/j150111a004","article-title":"Contribution to the theory of periodic reactions","volume":"14","author":"Lotka","year":"2002","journal-title":"J. Phys. Chem."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Allman, E.S., Allman, E.S., and Rhodes, J.A. (2004). Mathematical Models in Biology: An Introduction, Cambridge University Press.","DOI":"10.1017\/CBO9780511790911"},{"key":"ref_41","first-page":"061902","article-title":"Spontaneous emergence of spatial patterns in a predator-prey model","volume":"76","author":"Carneiro","year":"2007","journal-title":"Phys. Rev."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Gupta, J.K., Egorov, M., and Kochenderfer, M. (2017). Cooperative multi-agent control using deep reinforcement learning. 
International Conference on Autonomous Agents and Multiagent Systems, Springer.","DOI":"10.1007\/978-3-319-71682-4_5"},{"key":"ref_43","unstructured":"Papoudakis, G., Christianos, F., Rahman, A., and Albrecht, S.V. (2019). Dealing with non-stationarity in multi-agent deep reinforcement learning. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Zhang, Q., Dong, H., and Pan, W. (2020). Lyapunov-based reinforcement learning for decentralized multi-agent control. International Conference on Distributed Artificial Intelligence, Springer.","DOI":"10.1007\/978-3-030-64096-5_5"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Lockhart, E., Lanctot, M., P\u00e9rolat, J., Lespiau, J.B., Morrill, D., Timbers, F., and Tuyls, K. (2019). Computing approximate equilibria in sequential adversarial games by exploitability descent. arXiv.","DOI":"10.24963\/ijcai.2019\/66"},{"key":"ref_46","unstructured":"Timbers, F., Lockhart, E., Schmid, M., Lanctot, M., and Bowling, M. (2020). Approximate exploitability: Learning a best response in large games. arXiv."},{"key":"ref_47","unstructured":"Tang, J., Paster, K., and Abbeel, P. (2021, April 13). Equilibrium Finding via Asymmetric Self-Play Reinforcement Learning. Available online: https:\/\/drive.google.com\/file\/d\/0B_utB5Y8Y6D5eWJ4Vk1hSDZzZDhwMFlDYjlRVGpmWGlZVWJB\/view."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1016\/S0927-0507(05)80172-0","article-title":"Markov decision processes","volume":"Volume 2","author":"Puterman","year":"1990","journal-title":"Handbooks in Operations Research and Management Science"},{"key":"ref_49","unstructured":"Nachum, O., and Dai, B. (2020). Reinforcement learning via Fenchel-Rockafellar duality. arXiv."},{"key":"ref_50","unstructured":"Belousov, B., and Peters, J. (2017). f-Divergence constrained policy improvement. arXiv."},{"key":"ref_51","unstructured":"Nachum, O., Dai, B., Kostrikov, I., Chow, Y., Li, L., and Schuurmans, D. (2019). 
Algaedice: Policy gradient from arbitrary experience. arXiv."},{"key":"ref_52","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Belousov, B., and Peters, J. (2019). Entropic regularization of Markov decision processes. Entropy, 21.","DOI":"10.3390\/e21070674"},{"key":"ref_54","unstructured":"Loshchilov, I., and Hutter, F. (2017). Decoupled weight decay regularization. arXiv."},{"key":"ref_55","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019). PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32, Curran Associates, Inc."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/MCSE.2007.55","article-title":"Matplotlib: A 2D graphics environment","volume":"9","author":"Hunter","year":"2007","journal-title":"Comput. Sci. Eng."},{"key":"ref_57","unstructured":"Yu, L., Song, J., and Ermon, S. (2019). Multi-agent adversarial inverse reinforcement learning. arXiv."},{"key":"ref_58","unstructured":"Riasanow, T., Fl\u00f6tgen, R.J., Greineder, M., M\u00f6slein, D., B\u00f6hm, M., and Krcmar, H. (2019, January 15\u201317). Co-evolution in business ecosystems: Findings from literature. 
Proceedings of the 40 Years EMISA 2019, Tutzing, Germany."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/4\/461\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:47:42Z","timestamp":1760161662000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/4\/461"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,13]]},"references-count":58,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,4]]}},"alternative-id":["e23040461"],"URL":"https:\/\/doi.org\/10.3390\/e23040461","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,13]]}}}