{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T04:16:12Z","timestamp":1777522572664,"version":"3.51.4"},"reference-count":31,"publisher":"SAGE Publications","issue":"3-4","license":[{"start":{"date-parts":[[2000,6,1]],"date-time":"2000-06-01T00:00:00Z","timestamp":959817600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Adaptive Behavior"],"published-print":{"date-parts":[[2000,6]]},"abstract":"<jats:p>This paper is concerned with the autonomous learning of plans in probabilistic domains without a priori domain-specific knowledge. In contrast to existing reinforcement learning algorithms that generate only reactive plans, and existing probabilistic planning algorithms that require a substantial amount of a priori knowledge in order to plan, a two-stage bottom-up process is devised in which first reinforcement learning\/dynamic programming is applied, without the use of a priori domain-specific knowledge, to acquire a reactive plan, and then explicit plans are extracted from the reactive plan. Several options for plan extraction are examined, each of which is based on a beam search that performs temporal projection in a restricted fashion, guided by the value functions resulting from reinforcement learning\/dynamic programming. Some completeness and soundness results are given. 
Examples in several domains are discussed that together demonstrate the working of the proposed model.<\/jats:p>","DOI":"10.1177\/105971230000800302","type":"journal-article","created":{"date-parts":[[2007,3,11]],"date-time":"2007-03-11T01:45:25Z","timestamp":1173577525000},"page":"225-253","source":"Crossref","is-referenced-by-count":11,"title":["Learning Plans without a priori Knowledge"],"prefix":"10.1177","volume":"8","author":[{"given":"Ron","family":"Sun","sequence":"first","affiliation":[{"name":"CECS Department, University of Missouri"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chad","family":"Sessions","sequence":"additional","affiliation":[{"name":"University of Alabama"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2000,6,1]]},"reference":[{"key":"atypb1","doi-asserted-by":"publisher","DOI":"10.1016\/S0921-8890(05)80026-0"},{"key":"atypb2","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.89.4.369"},{"key":"atypb3","doi-asserted-by":"publisher","DOI":"10.1109\/21.229449"},{"key":"atypb4","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(94)00011-O"},{"key":"atypb5","volume-title":"Dynamic Programming","author":"Bellman, R.","year":"1957"},{"key":"atypb6","doi-asserted-by":"publisher","DOI":"10.1109\/9.24227"},{"key":"atypb7","volume-title":"Neuro-Dynamic Programming","author":"Bertsekas, D.","year":"1996"},{"key":"atypb8","volume-title":"Prioritized goal decomposition of Markov decision processes. 
Proceedings of International Joint Conference on Artificial Intelligence 1997","author":"Boutilier, C.","year":"1997"},{"key":"atypb9","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(87)90092-0"},{"key":"atypb10","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8640.1989.tb00324.x"},{"key":"atypb11","volume-title":"Proceedings of the National Conference on Artificial Intelligence 1993","author":"Dean, T."},{"key":"atypb12","doi-asserted-by":"publisher","DOI":"10.21236\/ADA254568"},{"key":"atypb13","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(96)00023-9"},{"key":"atypb14","volume-title":"A cognitive-developmental model of planning","author":"De Lisi, R.","year":"1998"},{"key":"atypb15","volume-title":"Probabilistic planning with information gathering and contingent execution. Proceedings of Artificial Intelligence Planning and Scheduling 1994","author":"Draper, D.","year":"1994"},{"key":"atypb16","volume-title":"Proceedings of the National Conference on Artificial Intelligence","author":"Drummond, M."},{"issue":"3","key":"atypb17","doi-asserted-by":"crossref","DOI":"10.1016\/0004-3702(81)90004-7","volume":"2","author":"Fikes, R.","year":"1971","journal-title":"Artificial Intelligence"},{"key":"atypb18","volume-title":"Proceedings of the National Conference on Artificial Intelligence","author":"Gat, E."},{"key":"atypb19","volume-title":"Proceedings of International Joint Conference on Artificial Intelligence","author":"Gelfand, J."},{"key":"atypb20","volume-title":"NRL task: navigation and collision avoidance","author":"Gordon, D.","year":"1994"},{"key":"atypb21","volume-title":"Stable function approximation in dynamic programming. 
Proceedings of International Conference on Machine Learning","author":"Gordon, D.","year":"1995"},{"key":"atypb22","volume-title":"Utilities models for goal directed decision theoretical planners","author":"Haddawy, P.","year":"1993"},{"key":"atypb23","doi-asserted-by":"publisher","DOI":"10.1613\/jair.301"},{"key":"atypb24","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(98)00023-X"},{"key":"atypb25","volume-title":"Proceedings of the National Conference on Artificial Intelligence 1994","author":"Knoblock, C."},{"key":"atypb26","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(94)00087-H"},{"key":"atypb27","author":"Lin, S.","year":"1995","journal-title":"European Workshop on Planning"},{"key":"atypb28","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence"},{"key":"atypb29","volume-title":"Proceedings of the AISB Summer Conference","author":"Warren, D."},{"key":"atypb30","volume-title":"Learning with Delayed Rewards","author":"Watkins, C.","year":"1989"},{"key":"atypb31","volume-title":"Proceedings of International Joint Conference on Artificial Intelligence","author":"Zhang, J."}],"container-title":["Adaptive 
Behavior"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/105971230000800302","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/105971230000800302","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T16:18:04Z","timestamp":1777393084000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/105971230000800302"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2000,6]]},"references-count":31,"journal-issue":{"issue":"3-4","published-print":{"date-parts":[[2000,6]]}},"alternative-id":["10.1177\/105971230000800302"],"URL":"https:\/\/doi.org\/10.1177\/105971230000800302","relation":{},"ISSN":["1059-7123","1741-2633"],"issn-type":[{"value":"1059-7123","type":"print"},{"value":"1741-2633","type":"electronic"}],"subject":[],"published":{"date-parts":[[2000,6]]}}}