{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,15]],"date-time":"2025-04-15T06:12:23Z","timestamp":1744697543486,"version":"3.38.0"},"reference-count":19,"publisher":"SAGE Publications","issue":"3-4","license":[{"start":{"date-parts":[[1997,1,1]],"date-time":"1997-01-01T00:00:00Z","timestamp":852076800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Adaptive Behavior"],"published-print":{"date-parts":[[1997,1]]},"abstract":"<jats:p> We explore the use of behavior-based architectures within the context of reinforcement learning and examine the effects of using different behavior-based architectures on the ability to learn correctly and efficiently the task at hand. In particular, we study the task of learning to push boxes in a simulated two-dimensional environment originally proposed by Mahadevan and Connell (1992). We examine issues such as effectiveness of learning, flexibility of the learning method to adapt to new environments, and effect of the behavior architecture on the ability to learn, and we report results obtained on a large number of simulation runs. 
<\/jats:p>","DOI":"10.1177\/105971239700500307","type":"journal-article","created":{"date-parts":[[2007,3,18]],"date-time":"2007-03-18T01:21:19Z","timestamp":1174180879000},"page":"365-390","source":"Crossref","is-referenced-by-count":6,"title":["Measuring the Effectiveness of Reinforcement Learning for Behavior-Based Robots"],"prefix":"10.1177","volume":"5","author":[{"given":"John","family":"Shackleton","sequence":"first","affiliation":[{"name":"University of Minnesota 200 Union Street SE, Room 4-192, Minneapolis"}]},{"given":"Maria","family":"Gini","sequence":"additional","affiliation":[{"name":"University of Minnesota 200 Union Street SE, Room 4-192, Minneapolis"}]}],"member":"179","published-online":{"date-parts":[[1997,1,1]]},"reference":[{"key":"atypb1","doi-asserted-by":"publisher","DOI":"10.1109\/JRA.1986.1087032"},{"key":"atypb2","doi-asserted-by":"publisher","DOI":"10.1145\/356770.356776"},{"key":"atypb3","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-3184-5_5"},{"key":"atypb4","doi-asserted-by":"publisher","DOI":"10.1007\/BF00996270"},{"key":"atypb5","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(94)90047-7"},{"key":"atypb6","volume-title":"Theories of learning","author":"Hilgard, E.R.","year":"1975","edition":"4"},{"volume-title":"Proceedings of the IEEE International Conference on Robotics and Automation, Minneapolis, MN","author":"Hougen, D.E.","key":"atypb7"},{"key":"atypb8","doi-asserted-by":"publisher","DOI":"10.1613\/jair.301"},{"volume-title":"Proceedings of the National Conference on Artificial Intelligence","author":"Koza, J.R.","key":"atypb9"},{"volume-title":"Proceedings of the Eleventh International Conference on Machine Learning","author":"Mahadevan, S.","key":"atypb10"},{"key":"atypb11","doi-asserted-by":"publisher","DOI":"10.1007\/BF00114727"},{"key":"atypb12","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(92)90058-6"},{"volume-title":"Proceedings of the Eleventh International Conference on Machine 
Learning","author":"Mataric, M.","key":"atypb13"},{"volume-title":"A reinforcement learning method for maximizing undiscounted rewards. In Proceedings of the Tenth International Conference on Machine Learning","year":"1993","author":"Schwartz, A.","key":"atypb14"},{"key":"atypb15","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992700"},{"volume-title":"Models of delayed reinforcement learning. Unpublished doctoral thesis","year":"1989","author":"Watkins, C.J.","key":"atypb16"},{"key":"atypb17","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992698"},{"volume-title":"Reinforcement learning for the adaptive control of perception and action. Unpublished doctoral thesis","year":"1992","author":"Whitehead, S.D.","key":"atypb18"},{"key":"atypb19","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-3184-5_3"}],"container-title":["Adaptive Behavior"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/105971239700500307","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/105971239700500307","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T14:08:27Z","timestamp":1741010907000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/105971239700500307"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1997,1]]},"references-count":19,"journal-issue":{"issue":"3-4","published-print":{"date-parts":[[1997,1]]}},"alternative-id":["10.1177\/105971239700500307"],"URL":"https:\/\/doi.org\/10.1177\/105971239700500307","relation":{},"ISSN":["1059-7123","1741-2633"],"issn-type":[{"type":"print","value":"1059-7123"},{"type":"electronic","value":"1741-2633"}],"subject":[],"published":{"date-parts":[[1997,1]]}}}