{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,18]],"date-time":"2025-09-18T10:39:04Z","timestamp":1758191944035,"version":"3.44.0"},"publisher-location":"New York, NY, USA","reference-count":55,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,12,18]]},"DOI":"10.1145\/3719545.3719556","type":"proceedings-article","created":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T09:38:41Z","timestamp":1758015521000},"page":"81-93","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["RegFTRL: Efficient Equilibrium Learning in Two-Player Zero-Sum Games"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-9897-4370","authenticated-orcid":false,"given":"Zijian","family":"Fang","sequence":"first","affiliation":[{"name":"Sun Yat-Sen University, Guangzhou, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-5655-9691","authenticated-orcid":false,"given":"Zongkai","family":"Liu","sequence":"additional","affiliation":[{"name":"Sun Yat-Sen University, Guangzhou, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4371-3663","authenticated-orcid":false,"given":"Chao","family":"Yu","sequence":"additional","affiliation":[{"name":"Sun Yat-Sen University, Guangzhou, Guangdong, China"}]}],"member":"320","published-online":{"date-parts":[[2025,9,16]]},"reference":[{"key":"e_1_3_3_2_2_2","unstructured":"Kenshi Abe Kaito Ariu Mitsuki Sakamoto and Atsushi Iwasaki. 2023. A Slingshot Approach to Learning in Monotone Games. arXiv abs\/2305.16610 (2023)."},{"key":"e_1_3_3_2_3_2","unstructured":"Kenshi Abe Kaito Ariu Mitsuki Sakamoto Kentaro Toyoshima and Atsushi Iwasaki. 2022. Last-Iterate Convergence with Full- and Noisy-Information Feedback in Two-Player Zero-Sum Games. arXiv abs\/2208.09855 (2022)."},{"key":"e_1_3_3_2_4_2","series-title":"Proceedings of Machine Learning Research","first-page":"1","volume-title":"Uncertainty in Artificial Intelligence","volume":"180","author":"Abe Kenshi","year":"2022","unstructured":"Kenshi Abe, Mitsuki Sakamoto, and Atsushi Iwasaki. 2022. Mutation-driven follow the regularized leader for last-iterate convergence in zero-sum games. In Uncertainty in Artificial Intelligence(Proceedings of Machine Learning Research, Vol.\u00a0180). 1\u201310."},{"key":"e_1_3_3_2_5_2","first-page":"27","volume-title":"Proceedings of the 24th Annual Conference on Learning Theory","author":"Abernethy Jacob","year":"2011","unstructured":"Jacob Abernethy, Peter\u00a0L Bartlett, and Elad Hazan. 2011. Blackwell approachability and no-regret learning are equivalent. In Proceedings of the 24th Annual Conference on Learning Theory. JMLR Workshop and Conference Proceedings, 27\u201346."},{"key":"e_1_3_3_2_6_2","first-page":"263","volume-title":"Annual Conference on Learning Theory","author":"Abernethy Jacob","year":"2008","unstructured":"Jacob Abernethy, Elad Hazan, and Alexander Rakhlin. 2008. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization. In Annual Conference on Learning Theory. 263\u2013274."},{"key":"e_1_3_3_2_7_2","unstructured":"Yu Bai Chi Jin and Tiancheng Yu. 2020. Near-optimal reinforcement learning with self-play. Advances in Neural Information Processing Systems 33 (2020) 2159\u20132170."},{"key":"e_1_3_3_2_8_2","series-title":"Proceedings of Machine Learning Research","first-page":"363","volume-title":"International Conference on Machine Learning","volume":"80","author":"Balduzzi David","year":"2018","unstructured":"David Balduzzi, S\u00e9bastien Racani\u00e8re, James Martens, Jakob\u00a0N. Foerster, Karl Tuyls, and Thore Graepel. 2018. The Mechanics of n-Player Differentiable Games. In International Conference on Machine Learning(Proceedings of Machine Learning Research, Vol.\u00a080). 363\u2013372."},{"key":"e_1_3_3_2_9_2","doi-asserted-by":"crossref","unstructured":"David Blackwell. 1956. An analog of the minimax theorem for vector payoffs. Pacific J. Math. 6 (1956) 1\u20138.","DOI":"10.2140\/pjm.1956.6.1"},{"key":"e_1_3_3_2_10_2","doi-asserted-by":"crossref","unstructured":"Michael Bowling Neil Burch Michael Johanson and Oskari Tammelin. 2017. Heads-up limit hold\u2019em poker is solved. Commun. ACM 60 11 (2017) 81\u201388.","DOI":"10.1145\/3131284"},{"key":"e_1_3_3_2_11_2","volume-title":"Some notes on computation of games solutions","author":"Brown George\u00a0W","year":"1949","unstructured":"George\u00a0W Brown. 1949. Some notes on computation of games solutions. Technical Report. RAND CORP SANTA MONICA CA."},{"key":"e_1_3_3_2_12_2","volume-title":"Advances in Neural Information Processing Systems","author":"Cai Yang","year":"2022","unstructured":"Yang Cai, Argyris Oikonomou, and Weiqiang Zheng. 2022. Finite-Time Last-Iterate Convergence for Learning in Multi-Player Games. In Advances in Neural Information Processing Systems."},{"key":"e_1_3_3_2_13_2","first-page":"27952","volume-title":"Advances in Neural Information Processing Systems","author":"Cen Shicong","year":"2021","unstructured":"Shicong Cen, Yuting Wei, and Yuejie Chi. 2021. Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization. In Advances in Neural Information Processing Systems. 27952\u201327964."},{"key":"e_1_3_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511546921"},{"key":"e_1_3_3_2_15_2","volume-title":"International Conference on Learning Representations","author":"Daskalakis Constantinos","year":"2018","unstructured":"Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, and Haoyang Zeng. 2018. Training GANs with Optimism. In International Conference on Learning Representations."},{"key":"e_1_3_3_2_16_2","series-title":"LIPIcs","first-page":"27:1\u201327:18","volume-title":"Information Technology Convergence and Services","author":"Daskalakis Constantinos","year":"2019","unstructured":"Constantinos Daskalakis and Ioannis Panageas. 2019. Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization. In Information Technology Convergence and Services(LIPIcs, Vol.\u00a0124). 27:1\u201327:18."},{"key":"e_1_3_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3485447.3512059"},{"key":"e_1_3_3_2_18_2","first-page":"5222","volume-title":"Advances in Neural Information Processing Systems","author":"Farina Gabriele","year":"2019","unstructured":"Gabriele Farina, Christian Kroer, and Tuomas Sandholm. 2019. Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions. In Advances in Neural Information Processing Systems. 5222\u20135232."},{"key":"e_1_3_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.5555\/3237383.3237408"},{"key":"e_1_3_3_2_20_2","doi-asserted-by":"crossref","unstructured":"Daniel Friedman. 1991. Evolutionary games in economics. Econometrica: journal of the econometric society (1991) 637\u2013666.","DOI":"10.2307\/2938222"},{"key":"e_1_3_3_2_21_2","volume-title":"Advances in Neural Information Processing Systems","author":"Gorbunov Eduard","year":"2022","unstructured":"Eduard Gorbunov, Adrien\u00a0B. Taylor, and Gauthier Gidel. 2022. Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities. In Advances in Neural Information Processing Systems."},{"key":"e_1_3_3_2_22_2","first-page":"805","volume-title":"International Conference on Machine Learning","author":"Heinrich Johannes","year":"2015","unstructured":"Johannes Heinrich, Marc Lanctot, and David Silver. 2015. Fictitious self-play in extensive-form games. In International Conference on Machine Learning. 805\u2013813."},{"key":"e_1_3_3_2_23_2","unstructured":"Johannes Heinrich and David Silver. 2016. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. arXiv abs\/1603.01121 (2016)."},{"key":"e_1_3_3_2_24_2","first-page":"6369","volume-title":"Advances in Neural Information Processing Systems","author":"H\u00e9liou Am\u00e9lie","year":"2017","unstructured":"Am\u00e9lie H\u00e9liou, Johanne Cohen, and Panayotis Mertikopoulos. 2017. Learning with Bandit Feedback in Potential Games. In Advances in Neural Information Processing Systems. 6369\u20136378."},{"key":"e_1_3_3_2_25_2","doi-asserted-by":"crossref","unstructured":"Samid Hoda Andrew Gilpin Javier\u00a0F. Pena and Tuomas Sandholm. 2010. Smoothing Techniques for Computing Nash Equilibria of Sequential Games. Mathematics of Operations Research 35 (2010) 494\u2013512.","DOI":"10.1287\/moor.1100.0452"},{"key":"e_1_3_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781139173179"},{"key":"e_1_3_3_2_27_2","unstructured":"Marc Lanctot Edward Lockhart Jean-Baptiste Lespiau Vin\u00edcius\u00a0Flores Zambaldi Satyaki Upadhyay Julien P\u00e9rolat Sriram Srinivasan Finbarr Timbers Karl Tuyls Shayegan Omidshafiei Daniel Hennes Dustin Morrill Paul Muller Timo Ewalds Ryan Faulkner J\u00e1nos Kram\u00e1r Bart\u00a0De Vylder Brennan Saeta James Bradbury David Ding Sebastian Borgeaud Matthew Lai Julian Schrittwieser Thomas\u00a0W. Anthony Edward Hughes Ivo Danihelka and Jonah Ryan-Davis. 2019. OpenSpiel: A Framework for Reinforcement Learning in Games. arXiv abs\/1908.09453 (2019)."},{"key":"e_1_3_3_2_28_2","first-page":"4190","volume-title":"Advances in Neural Information Processing Systems","author":"Lanctot Marc","year":"2017","unstructured":"Marc Lanctot, Vin\u00edcius\u00a0Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien P\u00e9rolat, David Silver, and Thore Graepel. 2017. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. In Advances in Neural Information Processing Systems. 4190\u20134203."},{"key":"e_1_3_3_2_29_2","first-page":"14293","volume-title":"Advances in Neural Information Processing Systems","author":"Lee Chung-Wei","year":"2021","unstructured":"Chung-Wei Lee, Christian Kroer, and Haipeng Luo. 2021. Last-iterate Convergence in Extensive-Form Games. In Advances in Neural Information Processing Systems. 14293\u201314305."},{"key":"e_1_3_3_2_30_2","unstructured":"Alistair Letcher David Balduzzi S\u00e9bastien Racani\u00e8re James Martens Jakob\u00a0N. Foerster Karl Tuyls and Thore Graepel. 2019. Differentiable Game Mechanics. The Journal of Machine Learning Research 20 (2019) 3032\u20133071."},{"key":"e_1_3_3_2_31_2","series-title":"Proceedings of Machine Learning Research","first-page":"907","volume-title":"International Conference on Artificial Intelligence and Statistics","volume":"89","author":"Liang Tengyuan","year":"2019","unstructured":"Tengyuan Liang and James Stokes. 2019. Interaction Matters: A Note on Non-asymptotic Local Convergence of Generative Adversarial Networks. In International Conference on Artificial Intelligence and Statistics(Proceedings of Machine Learning Research, Vol.\u00a089). 907\u2013915."},{"key":"e_1_3_3_2_32_2","unstructured":"Mingyang Liu Asuman\u00a0E. Ozdaglar Tiancheng Yu and Kaiqing Zhang. 2022. The Power of Regularization in Solving Extensive-Form Games. arXiv abs\/2206.09495 (2022)."},{"key":"e_1_3_3_2_33_2","volume-title":"International Conference on Learning Representations","author":"McAleer Stephen\u00a0Marcus","year":"2023","unstructured":"Stephen\u00a0Marcus McAleer, Gabriele Farina, Marc Lanctot, and Tuomas Sandholm. 2023. ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret. In International Conference on Learning Representations."},{"key":"e_1_3_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611975031.172"},{"key":"e_1_3_3_2_35_2","series-title":"Proceedings of Machine Learning Research","first-page":"1497","volume-title":"International Conference on Artificial Intelligence and Statistics","volume":"108","author":"Mokhtari Aryan","year":"2020","unstructured":"Aryan Mokhtari, Asuman\u00a0E. Ozdaglar, and Sarath Pattathil. 2020. A Unified Analysis of Extra-gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach. In International Conference on Artificial Intelligence and Statistics(Proceedings of Machine Learning Research, Vol.\u00a0108). 1497\u20131507."},{"key":"e_1_3_3_2_36_2","doi-asserted-by":"crossref","unstructured":"Dov Monderer and Lloyd\u00a0S Shapley. 1996. Potential games. Games and Economic Behavior 14 1 (1996) 124\u2013143.","DOI":"10.1006\/game.1996.0044"},{"key":"e_1_3_3_2_37_2","doi-asserted-by":"crossref","unstructured":"Shayegan Omidshafiei Christos\u00a0H. Papadimitriou Georgios Piliouras Karl Tuyls Mark Rowland Jean-Baptiste Lespiau Wojciech\u00a0M. Czarnecki Marc Lanctot Julien P\u00e9rolat and R\u00e9mi Munos. 2019. \u03b1 -Rank: Multi-Agent Evaluation by Evolution. Scientific reports 9 1 (2019) 9937.","DOI":"10.1038\/s41598-019-45619-9"},{"key":"e_1_3_3_2_38_2","series-title":"Proceedings of Machine Learning Research","first-page":"8525","volume-title":"International Conference on Machine Learning","volume":"139","author":"P\u00e9rolat Julien","year":"2021","unstructured":"Julien P\u00e9rolat, R\u00e9mi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro\u00a0A. Ortega, Neil Burch, Thomas\u00a0W. Anthony, David Balduzzi, Bart\u00a0De Vylder, Georgios Piliouras, Marc Lanctot, and Karl Tuyls. 2021. From Poincar\u00e9 Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. In International Conference on Machine Learning(Proceedings of Machine Learning Research, Vol.\u00a0139). 8525\u20138535."},{"key":"e_1_3_3_2_39_2","volume-title":"Advances in Neural Information Processing System","author":"Rakhlin Alexander","year":"2013","unstructured":"Alexander Rakhlin and Karthik Sridharan. 2013. Optimization, Learning, and Games with Predictable Sequences. In Advances in Neural Information Processing System, Vol.\u00a026."},{"key":"e_1_3_3_2_40_2","doi-asserted-by":"crossref","unstructured":"Julia Robinson. 1951. An iterative method of solving a game. Annals of Mathematics (1951) 296\u2013301.","DOI":"10.2307\/1969530"},{"key":"e_1_3_3_2_41_2","doi-asserted-by":"crossref","unstructured":"Yuzuru Sato Eizo Akiyama and J\u00a0Doyne Farmer. 2002. Chaos in learning a simple two-person game. Proceedings of the National Academy of Sciences 99 7 (2002) 4748\u20134751.","DOI":"10.1073\/pnas.032086299"},{"key":"e_1_3_3_2_42_2","first-page":"7623","volume-title":"Advances in Neural Information Processing Systems","author":"Sch\u00e4fer Florian","year":"2019","unstructured":"Florian Sch\u00e4fer and Anima Anandkumar. 2019. Competitive Gradient Descent. In Advances in Neural Information Processing Systems. 7623\u20137633."},{"key":"e_1_3_3_2_43_2","unstructured":"John Schulman Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. 2017. Proximal Policy Optimization Algorithms. arXiv abs\/1707.06347 (2017)."},{"key":"e_1_3_3_2_44_2","doi-asserted-by":"crossref","unstructured":"Shai Shalev-Shwartz. 2012. Online Learning and Online Convex Optimization. Foundations and Trends\u00ae in Machine Learning 4 (2012) 107\u2013194.","DOI":"10.1561\/2200000018"},{"key":"e_1_3_3_2_45_2","volume-title":"International Conference on Learning Representations","author":"Sokota Samuel","year":"2023","unstructured":"Samuel Sokota, Ryan D\u2019Orazio, J.\u00a0Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, and Christian Kroer. 2023. A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games. In International Conference on Learning Representations."},{"key":"e_1_3_3_2_46_2","unstructured":"Oskari Tammelin. 2014. Solving Large Imperfect Information Games Using CFR+. arXiv abs\/1407.5042 (2014)."},{"key":"e_1_3_3_2_47_2","first-page":"10279","volume-title":"International Conference on Machine Learning","author":"Tian Yi","year":"2021","unstructured":"Yi Tian, Yuanhao Wang, Tiancheng Yu, and Suvrit Sra. 2021. Online Learning in Unknown Markov Games. In International Conference on Machine Learning. 10279\u201310288."},{"key":"e_1_3_3_2_48_2","unstructured":"Emmanouil-Vasileios Vlatakis-Gkaragkounis Lampros Flokas and Georgios Piliouras. 2019. Poincar\u00e9 recurrence cycles and spurious equilibria in gradient-descent-ascent for non-convex non-concave zero-sum games. Advances in Neural Information Processing Systems 32 (2019)."},{"key":"e_1_3_3_2_49_2","unstructured":"Michael Walton and Viliam Lis\u00fd. 2021. Multi-agent Reinforcement Learning in OpenSpiel: A Reproduction Report. arXiv abs\/2103.00187 (2021)."},{"key":"e_1_3_3_2_50_2","unstructured":"Zifan Wang Yi Shen Michael Zavlanos and Karl\u00a0Henrik Johansson. 2022. No-Regret Learning in Strongly Monotone Games Converges to a Nash Equilibrium. (2022)."},{"key":"e_1_3_3_2_51_2","volume-title":"International Symposium on Artificial Intelligence and Mathematics","author":"Warmuth Manfred\u00a0K","year":"1997","unstructured":"Manfred\u00a0K Warmuth, Arun\u00a0K Jagota, et\u00a0al. 1997. Continuous and discrete-time nonlinear gradient descent: Relative loss bounds and convergence. In International Symposium on Artificial Intelligence and Mathematics, Vol.\u00a0326. Citeseer."},{"key":"e_1_3_3_2_52_2","volume-title":"International Conference on Learning Representations","author":"Wei Chen-Yu","year":"2021","unstructured":"Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, and Haipeng Luo. 2021. Linear Last-iterate Convergence in Constrained Saddle-point Optimization. In International Conference on Learning Representations."},{"key":"e_1_3_3_2_53_2","volume-title":"International Conference on Learning Representations","author":"Yadav Abhay\u00a0Kumar","year":"2018","unstructured":"Abhay\u00a0Kumar Yadav, Sohil Shah, Zheng Xu, David\u00a0W. Jacobs, and Tom Goldstein. 2018. Stabilizing Adversarial Nets with Prediction Methods. In International Conference on Learning Representations."},{"key":"e_1_3_3_2_54_2","first-page":"471","volume-title":"Global Theory of Dynamical Systems: Proceedings of an International Conference Held at Northwestern University","author":"Zeeman E\u00a0Christopher","year":"2006","unstructured":"E\u00a0Christopher Zeeman. 2006. Population dynamics from game theory. In Global Theory of Dynamical Systems: Proceedings of an International Conference Held at Northwestern University. Springer, 471\u2013497."},{"key":"e_1_3_3_2_55_2","first-page":"1641","volume-title":"Advances in Neural Information Processing Systems","author":"Zinkevich Martin","year":"2005","unstructured":"Martin Zinkevich, Amy Greenwald, and Michael\u00a0L. Littman. 2005. Cyclic Equilibria in Markov Games. In Advances in Neural Information Processing Systems. 1641\u20131648."},{"key":"e_1_3_3_2_56_2","first-page":"1729","volume-title":"Advances in Neural Information Processing Systems","author":"Zinkevich Martin","year":"2007","unstructured":"Martin Zinkevich, Michael Johanson, Michael\u00a0H. Bowling, and Carmelo Piccione. 2007. Regret Minimization in Games with Incomplete Information. In Advances in Neural Information Processing Systems. 1729\u20131736."}],"event":{"name":"DAI '24: 6th International Conference on Distributed Artificial Intelligences","acronym":"DAI '24","location":"Singapore Singapore"},"container-title":["Proceedings of the 2024 Sixth International Conference on Distributed Artificial Intelligences"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3719545.3719556","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T13:12:17Z","timestamp":1758114737000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719545.3719556"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,18]]},"references-count":55,"alternative-id":["10.1145\/3719545.3719556","10.1145\/3719545"],"URL":"https:\/\/doi.org\/10.1145\/3719545.3719556","relation":{},"subject":[],"published":{"date-parts":[[2024,12,18]]},"assertion":[{"value":"2025-09-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}