{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T16:23:15Z","timestamp":1773246195173,"version":"3.50.1"},"reference-count":57,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[1991,7,1]],"date-time":"1991-07-01T00:00:00Z","timestamp":678326400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[1991,7]]},"DOI":"10.1007\/bf00058926","type":"journal-article","created":{"date-parts":[[2004,10,31]],"date-time":"2004-10-31T01:27:59Z","timestamp":1099186079000},"page":"45-83","source":"Crossref","is-referenced-by-count":119,"title":["Learning to perceive and act by trial and error"],"prefix":"10.1007","volume":"7","author":[{"given":"Steven D.","family":"Whitehead","sequence":"first","affiliation":[]},{"given":"Dana H.","family":"Ballard","sequence":"additional","affiliation":[]}],"member":"297","reference":[{"key":"CR1","unstructured":"Agre, P. E. (1988). The dynamic structure of everyday life. PhD thesis, MIT Artificial Intelligence Laboratory, Cambridge, MA."},{"key":"CR2","first-page":"268","volume-title":"Proceedings of the Sixth National Conference on Artificial Intelligence","author":"P. E. Agre","year":"1987","unstructured":"Agre, P. E., & Chapman, D. (1987). Pengi: An implementation of a theory of activity. Proceedings of the Sixth National Conference on Artificial Intelligence (pp. 268-272). Los Altos, CA: Morgan Kaufmann."},{"key":"CR3","first-page":"25","volume":"10","author":"J. S. Albus","year":"1975","unstructured":"Albus, J. S. (1975). A new approach to manipulator control: Cerebellar model articulation controller (CMAC). 
Transactions of the ASME: Journal of Dynamic Systems, Measurement and Control, 10, 25-61.","journal-title":"Transactions of the ASME: Journal of Dynamic Systems, Measurement and Control"},{"key":"CR4","volume-title":"Brains, behavior, and robotics","author":"J. S. Albus","year":"1981","unstructured":"Albus, J. S. (1981). Brains, behavior, and robotics. Peterborough, NH: BYTE Books."},{"key":"CR5","unstructured":"Anderson, C. W. (1986). Learning and problem solving with multilayer connectionist systems. PhD thesis, University of Massachusetts, Amherst, MA."},{"key":"CR6","first-page":"345","volume-title":"Proceedings of the Sixth International Conference on Machine Learning","author":"C. W. Anderson","year":"1989","unstructured":"Anderson, C. W. (1989). Towers of hanoi with connectionist networks: Learning new features. Proceedings of the Sixth International Conference on Machine Learning (pp. 345-350). San Mateo, CA: Morgan Kaufmann."},{"key":"CR7","first-page":"1635","volume-title":"Proceedings of the Eleventh International Joint conference on Artificial Intelligence","author":"D. H. Ballard","year":"1989","unstructured":"Ballard, D. H. (1989). Reference frames for animate vision. Proceedings of the Eleventh International Joint conference on Artificial Intelligence (pp. 1635-1641). Los Altos, CA: Morgan Kaufmann."},{"key":"CR8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/BF00335152","volume":"42","author":"A. B. Barto","year":"1981","unstructured":"Barto, A. B., & Sutton, R. S. (1981). Landmark learning: An illustration of associative search. Biological Cybernetics, 42, 1-8.","journal-title":"Biological Cybernetics"},{"key":"CR9","volume-title":"Advances in neural information processing systems 2","author":"A. B. Barto","year":"1990","unstructured":"Barto, A. B., Sutton, R. S., & Watkins, C. (1990a). Sequential decision problems and neural networks. In D. S. Touretzky (Ed.), Advances in neural information processing systems 2. 
San Mateo, CA: Morgan Kaufmann."},{"key":"CR10","volume-title":"Learning and computational neuroscience","author":"A. B. Barto","year":"1990","unstructured":"Barto, A. B., Sutton, R. S., & Watkins, C. J. C. (1990b). Learning and sequential decision making. In M. Gabrial & J. W. Moore (Eds.), Learning and computational neuroscience. Cambridge, MA: MIT Press. (Also COINS Tech Report 89-95, Dept. of Computer and Information Sciences, University of Massachusetts, Amherst, MA 01003)."},{"key":"CR11","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TSMC.1983.6313077","volume":"13","author":"A. G. Barto","year":"1983","unstructured":"Barto, A. G., Sutton, R. S., & Anderson, C. W. (1983). Neuron-like elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13, 834-846.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics"},{"key":"CR12","volume-title":"Dynamic programming","author":"R. E. Bellman","year":"1957","unstructured":"Bellman, R. E. (1957). Dynamic programming. Princeton, NJ: Princeton University Press."},{"key":"CR13","volume-title":"Dynamic programming: Deterministic and stochastic models","author":"D. P. Bertsekas","year":"1987","unstructured":"Bertsekas, D. P. (1987) Dynamic programming: Deterministic and stochastic models. Englewood Cliffs, NJ: Prentice-Hall."},{"key":"CR14","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1016\/S0019-9958(75)90261-2","volume":"28","author":"L. Blum","year":"1975","unstructured":"Blum, L., & Blum, M. (1975). Toward a mathematical theory of inductive inference. Information and Control, 28, 125-155.","journal-title":"Information and Control"},{"key":"CR15","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1016\/B978-1-55860-036-2.50073-4","volume-title":"Proceedings of the Sixth International Conference on Machine Learning","author":"J. Blythe","year":"1989","unstructured":"Blythe, J., & Mitchell, T. M. (1989). 
On becoming reactive. Proceedings of the Sixth International Conference on Machine Learning (pp. 255-259). San Mateo, CA: Morgan Kaufmann."},{"key":"CR16","unstructured":"Booker, L. B. (1982). Intelligent behavior as an adaptation to the task environment. PhD thesis, University of Michigan."},{"key":"CR17","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1109\/JRA.1986.1087032","volume":"2","author":"R. A. Brooks","year":"1986","unstructured":"Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2, 14-22.","journal-title":"IEEE Journal of Robotics and Automation"},{"key":"CR18","first-page":"45","volume":"10","author":"D. Chapman","year":"1989","unstructured":"Chapman, D. (1989). Penguins can make cake. AI Magazine, 10, 45-50.","journal-title":"AI Magazine"},{"key":"CR19","unstructured":"Clocksin, W. F., & Moore, A. W. (1988). Some experiments in adaptive state-space robotics. (Technical report). University of Cambridge, Computer Laboratory."},{"key":"CR20","unstructured":"Drummond, M. (1989). Situated control rules. Proceedings of the Rochester Planning Workshop (pp. 18-34). (Technical Report 284). University of Rochester, Department of Computer Science."},{"key":"CR21","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1016\/0004-3702(72)90051-3","volume":"3","author":"R. E. Fikes","year":"1972","unstructured":"Fikes, R. E., Hart, P. E., & Nilsson, N. J. (1972). Learning and executing generalized robot plans. Artificial Intelligence, 3, 251-288.","journal-title":"Artificial Intelligence"},{"key":"CR22","first-page":"202","volume-title":"Proceedings of the Sixth National Conference on Artificial Intelligence","author":"R. J. Firby","year":"1987","unstructured":"Firby, R. J. (1987). An investigation into reactive planning in complex domains. Proceedings of the Sixth National Conference on Artificial Intelligence (pp. 202-206). 
Los Altos, CA: Morgan Kaufmann."},{"key":"CR23","doi-asserted-by":"crossref","unstructured":"Franklin, J. A. (1988). Refinement of robot motor skills through reinforcement learning. Proceedings of the 27th IEEE Conference on Decision and Control. Austin, TX.","DOI":"10.1109\/CDC.1988.194487"},{"key":"CR24","first-page":"677","volume-title":"Proceedings of the Sixth National Conference on Artificial Intelligence","author":"M. P. Georgeff","year":"1987","unstructured":"Georgeff, M. P., & Lansky, A. L. (1987). Reactive reasoning and planning. Proceedings of the Sixth National Conference on Artificial Intelligence (pp. 677-682). Los Altos, CA: Morgan Kaufmann."},{"key":"CR25","volume-title":"The ecological approach to visual perception","author":"J. J. Gibson","year":"1979","unstructured":"Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin."},{"key":"CR26","first-page":"41","volume":"10","author":"M. L. Ginsberg","year":"1989","unstructured":"Ginsberg, M. L. (1989). Universal planning: An (almost) universally bad idea. AI Magazine, 10, 41-44.","journal-title":"AI Magazine"},{"key":"CR27","unstructured":"Girosi, F., & Poggio, T. (1989). Networks and the best approximation property (AI Memo No. 1164). Massachusetts Institute of Technology, Artificial Intelligence Laboratory."},{"key":"CR28","first-page":"198","volume-title":"Proceedings of the Seventh International Conference on Machine Learning","author":"D. G. Gordon","year":"1990","unstructured":"Gordon, D. G., & Grefenstette, J. J. (1990). Explanations of empirically derived reactive plans. Proceedings of the Seventh International Conference on Machine Learning (pp. 198-203). San Mateo, CA: Morgan Kaufmann."},{"key":"CR29","first-page":"355","volume":"5","author":"J. J. Grefenstette","year":"1990","unstructured":"Grefenstette, J. J., Ramsey, C., & Schultz, A. (1990). Learning sequential decision rules using simulation and competition. 
Machine Learning, 5, 355-382.","journal-title":"Machine Learning"},{"key":"CR30","first-page":"225","volume":"3","author":"J. J. Grefenstette","year":"1988","unstructured":"Grefenstette, J. J. (1988). Credit assignment in rule discovery systems based on genetic algorithms. Machine Learning, 3, 225-245.","journal-title":"Machine Learning"},{"key":"CR31","doi-asserted-by":"crossref","first-page":"340","DOI":"10.1016\/B978-1-55860-036-2.50087-4","volume-title":"Proceedings of the Sixth International Workshop on Machine Learning","author":"J. J. Grefenstette","year":"1989","unstructured":"Grefenstette, J. J. (1989). Incremental learning of control strategies with genetic algorithms. Proceedings of the Sixth International Workshop on Machine Learning (pp. 340-344). San Mateo, CA: Morgan Kaufmann."},{"key":"CR32","volume-title":"Adaptation in natural and artificial systems","author":"J. H. Holland","year":"1975","unstructured":"Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press."},{"key":"CR33","volume-title":"Machine learning: An artificial intelligence approach (Volume II)","author":"J. H. Holland","year":"1986","unstructured":"Holland, J. H. (1986). Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.), Machine learning: An artificial intelligence approach (Volume II). San Mateo, CA: Morgan Kaufmann."},{"key":"CR34","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/3729.001.0001","volume-title":"Induction: processes of inference, learning, and discovery","author":"J. H. Holland","year":"1986","unstructured":"Holland, J. H., Holyoak, K. F., Nisbett, R. E., & Thagard, P. R. (1986). Induction: processes of inference, learning, and discovery. Cambridge, MA: MIT Press."},{"key":"CR35","volume-title":"Advances in neural information processing systems 1","author":"M. 
Hormel","year":"1989","unstructured":"Hormel, M. (1989). A self-organizing associative memory system for control applications. In D. S. Touretzky (Ed.), Advances in neural information processing systems 1. San Mateo, CA: Morgan Kaufmann."},{"key":"CR36","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1016\/B978-1-55860-036-2.50089-8","volume-title":"Proceedings of the Sixth International Workshop on Machine Learning","author":"L. P. Kaelbling","year":"1989","unstructured":"Kaelbling, L. P. (1989). A formal framework for learning in embedded systems. Proceedings of the Sixth International Workshop on Machine Learning (pp. 350-353). San Mateo, CA: Morgan Kaufmann."},{"key":"CR37","first-page":"11","volume":"1","author":"J. E. Laird","year":"1986","unstructured":"Laird, J. E., Rosenbloom, P. S., & Newell, A. (1986). Chunking in soar: The anatomy of a general learning mechanism. Machine learning, 1, 11-46","journal-title":"Machine learning"},{"key":"CR38","unstructured":"Mahadevan, S., & Connell, J. (1990). Automatic programming of behavior-based robots using reinforcement learning (Research Report RC 16359). IBM T. J. Watson Research Center."},{"key":"CR39","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-97239-3","volume-title":"Neural networks for control","author":"W. T. Miller","year":"1990","unstructured":"Miller, W. T., Sutton, R. S., & Werbos, P. J. (1990). Neural networks for control. Cambridge, MA: MIT Press."},{"key":"CR40","unstructured":"Nilsson, N. J. (1989). Action networks. Proceedings of the Rochester Planning Workshop (Technical Report 284) (pp. 36-68). University of Rochester, Department of Computer Science."},{"key":"CR41","first-page":"211","volume-title":"Proceedings of the Seventh International Conference on Machine Learning","author":"C. Ramsey","year":"1990","unstructured":"Ramsey, C., Schultz, A., & Grefenstette, J. (1990). 
Simulation-assisted learning by competition: Effects of noise differences between training model and target environment. Proceedings of the Seventh International Conference on Machine Learning (pp. 211-215). San Mateo, CA: Morgan Kaufmann."},{"key":"CR42","volume-title":"Introduction to stochastic dynamic programming","author":"S. Ross","year":"1983","unstructured":"Ross, S. (1983). Introduction to stochastic dynamic programming. New York, NY: Academic Press."},{"key":"CR43","first-page":"1039","volume-title":"Proceedings of Ninth International Joint Conference on Artificial Intelligence","author":"M. J. Schoppers","year":"1987","unstructured":"Schoppers, M. J. (1987). Universal plans for reactive robots in unpredictable domains. Proceedings of Ninth International Joint Conference on Artificial Intelligence (pp. 1039-1046). Los Altos, CA: Morgan Kaufmann."},{"key":"CR44","unstructured":"Schoppers, M. J. (1989). Representation and automatic synthesis of reaction plans. PhD thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign."},{"key":"CR45","unstructured":"Sutton, R. S. (1984). Temporal credit assignment in reinforcement learning. PhD thesis, University of Massachusetts at Amherst."},{"key":"CR46","first-page":"9","volume":"3","author":"R. S. Sutton","year":"1988","unstructured":"Sutton, R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning, 3, 9-44.","journal-title":"Machine Learning"},{"key":"CR47","doi-asserted-by":"crossref","unstructured":"Sutton, R. S. (1990a). First results with DYNA, an integrated architecture for learning, planning, and reacting. Proceedings of the AAAI Spring Symposium on Planning in Uncertain, Unpredictable, or Changing Environments.","DOI":"10.7551\/mitpress\/4939.003.0012"},{"key":"CR48","first-page":"216","volume-title":"Proceedings of the Seventh International Conference on Machine Learning","author":"R. S. Sutton","year":"1990","unstructured":"Sutton, R. S. (1990b). 
Integrating architectures for learning, planning and reacting based on approximating dynamic programming. Proceedings of the Seventh International Conference on Machine Learning (pp. 216-224). San Mateo, CA: Morgan Kaufmann."},{"key":"CR49","doi-asserted-by":"crossref","unstructured":"Ullman, S. (1984). Visual routines. Cognition, 18, 97-159. (Also in: Visual cognition, S. Pinker (Ed.), 1985).","DOI":"10.1016\/0010-0277(84)90023-4"},{"key":"CR50","unstructured":"Watkins, C. (1989). Learning from delayed rewards. PhD thesis, Cambridge University."},{"key":"CR51","unstructured":"Whitehead, S. D. (1989). Scaling in reinforcement learning (Technical Report TR 304). University of Rochester, Department of Computer Science."},{"key":"CR52","first-page":"333","volume-title":"Proceedings of the NASA Conference on Space Telerobotics","author":"S. D. Whitehead","year":"1989","unstructured":"Whitehead, S. D., & Ballard, D. H. (1989a). Reactive behavior, learning, and anticipation. Proceedings of the NASA Conference on Space Telerobotics (pp. 333-344). Pasadena, CA: Jet Propulsions Laboratory."},{"key":"CR53","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1016\/B978-1-55860-036-2.50090-4","volume-title":"Proceedings of the Sixth International Workshop on Machine Learning","author":"S. D. Whitehead","year":"1989","unstructured":"Whitehead, S. D., & Ballard, D. H. (1989b). A role for anticipation in reactive systems that learn. Proceedings of the Sixth International Workshop on Machine Learning (pp. 354-357). San Mateo, CA: Morgan Kaufmann."},{"key":"CR54","series-title":"Technical Report TR","volume-title":"A study of cooperative mechanisms for faster reinforcement learning","author":"S. D. Whitehead","year":"1991","unstructured":"Whitehead, S. D., & Ballard, D. H. (1991). A study of cooperative mechanisms for faster reinforcement learning (Technical Report TR 365). 
Rochester, NY: University of Rochester, Department of Computer Science."},{"key":"CR55","series-title":"Technical Report NU-CCS-87-3","volume-title":"Reinforcement-learning connectionist systems","author":"R. J. Williams","year":"1987","unstructured":"Williams, R. J. (1987). Reinforcement-learning connectionist systems (Technical Report NU-CCS-87-3). Boston, MA: Northeastern University, College of Computer Science."},{"key":"CR56","first-page":"199","volume":"2","author":"S. W. Wilson","year":"1987","unstructured":"Wilson, S. W. (1987). Classifier systems and the animate problem. Machine Learning, 2, 199-228.","journal-title":"Machine Learning"},{"key":"CR57","first-page":"882","volume-title":"Proceedings of Ninth National Conference on Artificial Intelligence","author":"R. C. Yee","year":"1990","unstructured":"Yee, R. C., Saxena, S., Utgoff, P. E., & Barto, A. G. (1990). Explaining temporal-differences to create useful concepts for evaluating states. Proceedings of Ninth National Conference on Artificial Intelligence (pp. 882-888). 
Cambridge, MA: MIT Press."}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/BF00058926.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/BF00058926\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/BF00058926","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,19]],"date-time":"2024-12-19T09:25:01Z","timestamp":1734600301000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/BF00058926"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1991,7]]},"references-count":57,"journal-issue":{"issue":"1","published-print":{"date-parts":[[1991,7]]}},"alternative-id":["BF00058926"],"URL":"https:\/\/doi.org\/10.1007\/bf00058926","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[1991,7]]}}}