{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T07:39:08Z","timestamp":1723016348394},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,8]]},"abstract":"<jats:p>The behaviour of multi-agent learning in competitive settings is often considered under the restrictive assumption of a zero-sum game. Only under this strict requirement is the behaviour of learning well understood; beyond this, learning dynamics can often display non-convergent behaviours which prevent fixed-point analysis. Nonetheless, many relevant competitive games do not satisfy the zero-sum assumption.\n\nMotivated by this, we study a smooth variant of Q-Learning, a popular reinforcement learning dynamics which balances the agents' tendency to maximise their payoffs with their propensity to explore the state space. We examine this dynamic in games which are `close' to network zero-sum games and find that Q-Learning converges to a neighbourhood around a unique equilibrium. The size of the neighbourhood is determined by the `distance' to the zero-sum game, as well as the exploration rates of the agents. We complement these results by providing a method whereby, given an arbitrary network game, the `nearest' network zero-sum game can be found efficiently. Importantly, our theoretical guarantees are widely applicable in different game settings, regardless of whether the dynamics ultimately reach an equilibrium, or remain non convergent.<\/jats:p>","DOI":"10.24963\/ijcai.2023\/16","type":"proceedings-article","created":{"date-parts":[[2023,8,11]],"date-time":"2023-08-11T08:31:30Z","timestamp":1691742690000},"page":"135-143","source":"Crossref","is-referenced-by-count":0,"title":["Beyond Strict Competition: Approximate Convergence of Multi-agent Q-Learning Dynamics"],"prefix":"10.24963","author":[{"given":"Aamal","family":"Hussain","sequence":"first","affiliation":[{"name":"Imperial College London"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Francesco","family":"Belardinelli","sequence":"additional","affiliation":[{"name":"Imperial College London"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Georgios","family":"Piliouras","sequence":"additional","affiliation":[{"name":"Singapore University of Technology and Design"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"number":"32","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2023","name":"Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}","start":{"date-parts":[[2023,8,19]]},"theme":"Artificial Intelligence","location":"Macau, SAR China","end":{"date-parts":[[2023,8,25]]}},"container-title":["Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2023,8,11]],"date-time":"2023-08-11T08:32:09Z","timestamp":1691742729000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2023\/16"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2023,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2023\/16","relation":{},"subject":[],"published":{"date-parts":[[2023,8]]}}}