{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,30]],"date-time":"2025-12-30T15:33:29Z","timestamp":1767108809570,"version":"3.41.0"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2012,11,1]],"date-time":"2012-11-01T00:00:00Z","timestamp":1351728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Theory"],"published-print":{"date-parts":[[2012,11]]},"abstract":"<jats:p>\n            We show that the problem of finding an optimal stochastic blind controller in a Markov decision process is an NP-hard problem. The corresponding decision problem is NP-hard in PSPACE and\n            <jats:sc>sqrt-sum<\/jats:sc>\n            -hard, hence placing it in NP would imply breakthroughs in long-standing open problems in computer science. Our result establishes that the more general problem of stochastic controller optimization in POMDPs is also NP-hard. Nonetheless, we outline a special case that is convex and admits efficient global solutions.\n          <\/jats:p>","DOI":"10.1145\/2382559.2382563","type":"journal-article","created":{"date-parts":[[2012,12,4]],"date-time":"2012-12-04T20:10:57Z","timestamp":1354651857000},"page":"1-8","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["On the Computational Complexity of Stochastic Controller Optimization in POMDPs"],"prefix":"10.1145","volume":"4","author":[{"given":"Nikos","family":"Vlassis","sequence":"first","affiliation":[{"name":"University of Luxembourg"}]},{"given":"Michael L.","family":"Littman","sequence":"additional","affiliation":[{"name":"Brown University"}]},{"given":"David","family":"Barber","sequence":"additional","affiliation":[{"name":"University College London"}]}],"member":"320","published-online":{"date-parts":[[2012,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.8.2.273"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1137\/070697926"},{"volume-title":"Proceedings of the 20th International Joint Conference on Artificial Intelligence.","author":"Amato C.","key":"e_1_2_1_3_1","unstructured":"Amato , C. , Bernstein , D. S. , and Zilberstein , S . 2007. Solving POMDPs using quadratically constrained linear programs . In Proceedings of the 20th International Joint Conference on Artificial Intelligence. Amato, C., Bernstein, D. S., and Zilberstein, S. 2007. Solving POMDPs using quadratically constrained linear programs. In Proceedings of the 20th International Joint Conference on Artificial Intelligence."},{"volume-title":"Proceedings of the 19th International Joint Conference on Artificial Intelligence.","author":"Bernstein D. S.","key":"e_1_2_1_4_1","unstructured":"Bernstein , D. S. , Hansen , E. A. , and Zilberstein , S . 2005. Bounded policy iteration for decentralized POMDPs . In Proceedings of the 19th International Joint Conference on Artificial Intelligence. Bernstein, D. S., Hansen, E. A., and Zilberstein, S. 2005. Bounded policy iteration for decentralized POMDPs. In Proceedings of the 19th International Joint Conference on Artificial Intelligence."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Boyd S. and Vandenberghe L. 2004. Convex Optimization. Cambridge University Press Cambridge UK. Boyd S. and Vandenberghe L. 2004. Convex Optimization . Cambridge University Press Cambridge UK.","DOI":"10.1017\/CBO9780511804441"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/62212.62257"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 10th National Conference on Artificial Intelligence.","author":"Chrisman L.","year":"1992","unstructured":"Chrisman , L. 1992 . Reinforcement learning with perceptual aliasing: The perceptual distinctions approach . In Proceedings of the 10th National Conference on Artificial Intelligence. Chrisman, L. 1992. Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Proceedings of the 10th National Conference on Artificial Intelligence."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1137\/080720826"},{"key":"e_1_2_1_9_1","unstructured":"Garey M. R. and Johnson D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co. New York NY. Garey M. R. and Johnson D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness . W. H. Freeman & Co. New York NY."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/800113.803626"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 14th International Conference on Uncertainty in Artificial Intelligence.","author":"Hansen E.","year":"1998","unstructured":"Hansen , E. 1998 . Solving POMDPs by searching in policy space . In Proceedings of the 14th International Conference on Uncertainty in Artificial Intelligence. Hansen, E. 1998. Solving POMDPs by searching in policy space. In Proceedings of the 14th International Conference on Uncertainty in Artificial Intelligence."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1016\/0377-2217(79)90146-2"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(98)00023-X"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 3rd International Conference on Simulation of Adaptive Behavior.","author":"Littman M. L.","year":"1994","unstructured":"Littman , M. L. 1994 . Memoryless policies: Theoretical limitations and practical results . In Proceedings of the 3rd International Conference on Simulation of Adaptive Behavior. Littman, M. L. 1994. Memoryless policies: Theoretical limitations and practical results. In Proceedings of the 3rd International Conference on Simulation of Adaptive Behavior."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622797.1622798"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622394.1622398"},{"volume-title":"Proceedings of the 16th National Conference on Artificial Intelligence.","author":"Madani O.","key":"e_1_2_1_17_1","unstructured":"Madani , O. , Hanks , S. , and Condon , A . 1999. On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems . In Proceedings of the 16th National Conference on Artificial Intelligence. Madani, O., Hanks, S., and Condon, A. 1999. On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems. In Proceedings of the 16th National Conference on Artificial Intelligence."},{"volume-title":"Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence.","author":"Meuleau N.","key":"e_1_2_1_18_1","unstructured":"Meuleau , N. , Kim , K. , Kaelbling , L. , and Cassandra , A . 1999. Solving POMDPs by searching the space of finite policies . In Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence. Meuleau, N., Kim, K., Kaelbling, L., and Cassandra, A. 1999. Solving POMDPs by searching the space of finite policies. In Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.4153\/CJM-1965-053-6"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/347476.347480"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.12.3.441"},{"volume-title":"School of Industrial and Systems Engineering","author":"Platzman L. K.","key":"e_1_2_1_22_1","unstructured":"Platzman , L. K. 1981. A feasible computational approach to infinite-horizon partially-observed Markov decision problems. Tech. rep ., School of Industrial and Systems Engineering , Georgia Institute of Technology . J-81-2. Platzman, L. K. 1981. A feasible computational approach to infinite-horizon partially-observed Markov decision problems. Tech. rep., School of Industrial and Systems Engineering, Georgia Institute of Technology. J-81-2."},{"key":"e_1_2_1_23_1","unstructured":"Poupart P. and Boutilier C. 2004. Bounded finite state controllers. In Advances in Neural Information Processing Systems 16 S. Thrun L. Saul and B. Sch\u00f6lkopf Eds. MIT Press Cambridge MA. Poupart P. and Boutilier C. 2004. Bounded finite state controllers. In Advances in Neural Information Processing Systems 16 S. Thrun L. Saul and B. Sch\u00f6lkopf Eds. MIT Press Cambridge MA."},{"volume-title":"Markov Decision Processes : Discrete Stochastic Dynamic Programming","author":"Puterman M.","key":"e_1_2_1_24_1","unstructured":"Puterman , M. 1994. Markov Decision Processes : Discrete Stochastic Dynamic Programming . John Wiley & Sons , New York . Puterman, M. 1994. Markov Decision Processes : Discrete Stochastic Dynamic Programming. John Wiley & Sons, New York."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s001860400402"},{"volume-title":"Proceedings of the 11th International Conference on Machine Learning.","author":"Singh S. P.","key":"e_1_2_1_26_1","unstructured":"Singh , S. P. , Jaakkola , T. , and Jordan , M. I . 1994. Learning without state-estimation in partially observable Markovian decision processes . In Proceedings of the 11th International Conference on Machine Learning. Singh, S. P., Jaakkola, T., and Jordan, M. I. 1994. Learning without state-estimation in partially observable Markovian decision processes. In Proceedings of the 11th International Conference on Machine Learning."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.1971.5408604"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Toussaint M. Storkey A. and Harmeling S. 2011. Expectation-Maximization methods for solving (PO)MDPs and optimal control problems. In Bayesian Time Series Models D. Barber A. T. Cemgil and S. Chiappa Eds. Cambridge University Press. Toussaint M. Storkey A. and Harmeling S. 2011. Expectation-Maximization methods for solving (PO)MDPs and optimal control problems. In Bayesian Time Series Models D. Barber A. T. Cemgil and S. Chiappa Eds. Cambridge University Press.","DOI":"10.1017\/CBO9780511984679.019"}],"container-title":["ACM Transactions on Computation Theory"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2382559.2382563","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2382559.2382563","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:34:38Z","timestamp":1750239278000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2382559.2382563"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,11]]},"references-count":28,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,11]]}},"alternative-id":["10.1145\/2382559.2382563"],"URL":"https:\/\/doi.org\/10.1145\/2382559.2382563","relation":{},"ISSN":["1942-3454","1942-3462"],"issn-type":[{"type":"print","value":"1942-3454"},{"type":"electronic","value":"1942-3462"}],"subject":[],"published":{"date-parts":[[2012,11]]},"assertion":[{"value":"2011-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-11-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}