{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T21:40:19Z","timestamp":1771623619979,"version":"3.50.1"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,8]]},"abstract":"<jats:p>The Markov Decision Problem (MDP) plays a central role in AI as an abstraction of sequential decision making. We contribute to the theoretical analysis of MDP PLANNING, which is the problem of computing an optimal policy for a given MDP. Specifically, we furnish improved STRONG WORST-CASE upper bounds on the running time of MDP planning. Strong bounds are those that depend only on the number of states n and the number of actions k in the specified MDP; they have no dependence on affiliated variables such as the discount factor and the number of bits needed to represent the MDP. Worst-case bounds apply to EVERY run of an algorithm; randomised algorithms can typically yield faster EXPECTED running times. While the special case of 2-action MDPs (that is, k = 2) has recently received some attention, bounds for general k have remained to be improved for several decades. Our contributions are to this general case. For k &gt;= 3, the tightest strong upper bound shown to date for MDP planning belongs to a family of algorithms called Policy Iteration. This bound is only a polynomial improvement over a trivial bound of poly(n, k) k^{n} [Mansour and Singh, 1999]. In this paper, we generalise a contrasting algorithm called the Fibonacci Seesaw, and derive a bound of poly(n, k) k^{0.6834n}. The key construct we use is a template to map algorithms for the 2-action setting to the general setting. Interestingly, this idea can also be used to design Policy Iteration algorithms with a running time upper bound of poly(n, k) k^{0.7207n}. Both our results improve upon bounds that have stood for several decades.<\/jats:p>","DOI":"10.24963\/ijcai.2017\/248","type":"proceedings-article","created":{"date-parts":[[2017,7,28]],"date-time":"2017-07-28T05:14:07Z","timestamp":1501218847000},"page":"1788-1794","source":"Crossref","is-referenced-by-count":2,"title":["Improved Strong Worst-case Upper Bounds for MDP Planning"],"prefix":"10.24963","author":[{"given":"Anchit","family":"Gupta","sequence":"first","affiliation":[{"name":"Indian Institute of Technology Bombay"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shivaram","family":"Kalyanakrishnan","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Bombay"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"name":"Twenty-Sixth International Joint Conference on Artificial Intelligence","theme":"Artificial Intelligence","location":"Melbourne, Australia","acronym":"IJCAI-2017","number":"26","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)","University of Technology Sydney (UTS)","Australian Computer Society (ACS)"],"start":{"date-parts":[[2017,8,19]]},"end":{"date-parts":[[2017,8,26]]}},"container-title":["Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2017,7,28]],"date-time":"2017-07-28T07:53:00Z","timestamp":1501228380000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2017\/248"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2017,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2017\/248","relation":{},"subject":[],"published":{"date-parts":[[2017,8]]}}}