{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T05:02:14Z","timestamp":1766466134458},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"3-4","license":[{"start":{"date-parts":[[1992,5,1]],"date-time":"1992-05-01T00:00:00Z","timestamp":704678400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[1992,5]]},"DOI":"10.1007\/bf00992702","type":"journal-article","created":{"date-parts":[[2005,1,9]],"date-time":"2005-01-09T16:35:16Z","timestamp":1105288516000},"page":"363-395","source":"Crossref","is-referenced-by-count":26,"title":["A reinforcement connectionist approach to robot path finding in non-maze-like environments"],"prefix":"10.1007","volume":"8","author":[{"given":"Jos\u00e9 Del R.","family":"Mill\u00e1n","sequence":"first","affiliation":[]},{"given":"Carme","family":"Torras","sequence":"additional","affiliation":[]}],"member":"297","reference":[{"key":"CR1","unstructured":"Agre, P.E., & Chapman, D. (1987). Pengi: An implementation of a theory of activity. Proceedings of the Seventh AAAI Conference (pp. 268\u2013272)."},{"key":"CR2","volume-title":"Learning and problem solving with multilayer connectionist systems","author":"C.W. Anderson","year":"1986","unstructured":"Anderson, C.W. (1986). Learning and problem solving with multilayer connectionist systems. Ph.D. Thesis, Dept. of Computer and Information Science, University of Massachusetts, Amherst."},{"key":"CR3","doi-asserted-by":"crossref","unstructured":"Anderson, C.W. (1987). Strategy learning with multilayer connectionist representations. Proceedings of the Fourth International Workshop on Machine Learning (pp. 
103\u2013114).","DOI":"10.1016\/B978-0-934613-41-5.50014-3"},{"key":"CR4","doi-asserted-by":"crossref","unstructured":"Arkin, R.C. (1987). Motor schema based navigation for a mobile robot: An approach to programming by behavior. Proceedings of the IEEE International Conference on Robotics and Automation (pp. 264\u2013271).","DOI":"10.1109\/ROBOT.1987.1088037"},{"key":"CR5","first-page":"229","volume":"4","author":"A.G. Barto","year":"1985","unstructured":"Barto, A.G. (1985). Learning by statistical cooperation of self-interested neuron-like computing elements. Human Neurobiology, 4, 229\u2013256.","journal-title":"Human Neurobiology"},{"key":"CR6","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1109\/TSMC.1985.6313371","volume":"15","author":"A.G. Barto","year":"1985","unstructured":"Barto, A.G., & Anandan, P. (1985). Pattern-recognizing stochastic learning automata. IEEE Transactions on Systems, Man, and Cybernetics, 15, 360\u2013374.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics"},{"key":"CR7","first-page":"835","volume":"13","author":"A.G. Barto","year":"1983","unstructured":"Barto, A.G., Sutton, R.S., & Anderson, C.W. (1983). Neuronlike elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, 13, 835\u2013846.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics"},{"key":"CR8","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1007\/BF00453370","volume":"40","author":"A.G. Barto","year":"1981","unstructured":"Barto, A.G., Sutton, R.S., & Brouwer, P.S. (1981). Associative search network: A reinforcement learning associative memory. Biological Cybernetics, 40, 201\u2013211.","journal-title":"Biological Cybernetics"},{"key":"CR9","series-title":"Technical Report","volume-title":"Learning and sequential decision making","author":"A.G. Barto","year":"1989","unstructured":"Barto, A.G., Sutton, R.S. & Watkins, C.J.C.H. 
(1989). Learning and sequential decision making (Technical Report COINS-89-95). University of Massachusetts, Amherst, MA: Dept. of Computer and Information Science."},{"key":"CR10","doi-asserted-by":"crossref","unstructured":"Blythe, J., & Mitchell, T.M. (1989). On becoming reactive. Proceedings of the Sixth International Workshop on Machine Learning (pp. 255\u2013259).","DOI":"10.1016\/B978-1-55860-036-2.50073-4"},{"key":"CR11","volume-title":"Robot motion: Planning and control","year":"1982","unstructured":"Brady, M., Hollerbach, J.M., Johnson, T.L., Lozano-P\u00e9rez, T., & Mason, M.T., (Eds.) (1982). Robot motion: Planning and control. Cambridge, MA: MIT Press."},{"key":"CR12","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1109\/JRA.1986.1087032","volume":"2","author":"R.A. Brooks","year":"1986","unstructured":"Brooks, R.A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2, 14\u201323.","journal-title":"IEEE Journal of Robotics and Automation"},{"key":"CR13","volume-title":"The complexity of robot motion planning","author":"J.F. Canny","year":"1988","unstructured":"Canny, J.F. (1988). The complexity of robot motion planning. Cambridge, MA: MIT Press."},{"key":"CR14","series-title":"Technical Report","volume-title":"Learning from delayed reinforcement in a complex domain","author":"D. Chapman","year":"1990","unstructured":"Chapman, D., & Kaelbling, L.P. (1990). Learning from delayed reinforcement in a complex domain (Technical Report 90-11). Palo Alto, CA: Teleos Research."},{"key":"CR15","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/0004-3702(87)90069-5","volume":"31","author":"B.R. Donald","year":"1987","unstructured":"Donald, B.R. (1987). A search algorithm for robot motion planning with six degrees of freedom. Artificial Intelligence, 31, 295\u2013353.","journal-title":"Artificial Intelligence"},{"key":"CR16","doi-asserted-by":"crossref","unstructured":"Graf, D.H., & LaLonde, W.R. (1988). 
A neural controller for collision-free movement of general robot manipulators. Proceedings of the IEEE Second International Conference on Neural Networks, Vol 1 (pp. 77\u201384).","DOI":"10.1109\/ICNN.1988.23831"},{"key":"CR17","series-title":"Technical Report","volume-title":"A stochastic algorithm for learning real-valued functions via reinforcement feedback","author":"V. Gullapalli","year":"1988","unstructured":"Gullapalli, V. (1988). A stochastic algorithm for learning real-valued functions via reinforcement feedback (Technical Report COINS-88-91). University of Massachusetts, Amherst, MA: Dept. of Computer and Information Science."},{"key":"CR18","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1177\/027836499000900103","volume":"9","author":"J. Ilari","year":"1990","unstructured":"Ilari, J., & Torras, C. (1990). 2D path planning: A configuration space heuristic approach. The International Journal of Robotics Research, 9, 75\u201391.","journal-title":"The International Journal of Robotics Research"},{"key":"CR19","first-page":"324","volume-title":"Advances in neural information processing systems 2","author":"M.I. Jordan","year":"1990","unstructured":"Jordan, M.I., & Jacobs, R.A. (1990). Learning to control an unstable system with forward modeling. In D.S. Touretzky (Ed.), Advances in neural information processing systems 2, 324\u2013331. San Mateo, CA: Morgan Kaufmann."},{"key":"CR20","unstructured":"Jorgensen, C.C. (1987). Neural network representation of sensor graphs in autonomous robot navigation. Proceedings of the IEEE First International Conference on Neural Networks, Vol IV (pp. 507\u2013515)."},{"key":"CR21","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1177\/027836498600500106","volume":"5","author":"O. Khatib","year":"1986","unstructured":"Khatib, O. (1986). 
Real time obstacle avoidance for manipulators and mobile robots. The International Journal of Robotics Research, 5, 90\u201398.","journal-title":"The International Journal of Robotics Research"},{"key":"CR22","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1207\/s15516709cog0902_2","volume":"9","author":"P. Langley","year":"1985","unstructured":"Langley, P. (1985). Learning to search: From weak methods to domain-specific heuristics. Cognitive Science, 9, 217\u2013260.","journal-title":"Cognitive Science"},{"key":"CR23","unstructured":"Lin, L.-J. (1990). Self-improving reactive agents: Case studies of reinforcement learning frameworks. Proceedings of the First International Conference on the Simulation of Adaptive Behavior: From Animals to Animats (pp. 297\u2013305)."},{"key":"CR24","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1109\/TC.1983.1676196","volume":"32","author":"T. Lozano-P\u00e9rez","year":"1983","unstructured":"Lozano-P\u00e9rez, T. (1983). Spatial planning: A configuration space approach. IEEE Transactions on Computers, 32, 108\u2013120.","journal-title":"IEEE Transactions on Computers"},{"key":"CR25","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1145\/359156.359164","volume":"22","author":"T. Lozano-P\u00e9rez","year":"1979","unstructured":"Lozano-P\u00e9rez, T., & Wesley, M. (1979). An algorithm for planning collision-free paths among polyhedral obstacles. Communications of the ACM, 22, 560\u2013570.","journal-title":"Communications of the ACM"},{"key":"CR26","series-title":"Technical Report","volume-title":"Automatic programming of behavior-based robots using reinforcement learning","author":"S. Mahadevan","year":"1990","unstructured":"Mahadevan, S., & Connell, J. (1990). Automatic programming of behavior-based robots using reinforcement learning (Technical Report RC 16359). Yorktown Heights, NY: IBM, T.J. 
Watson Research Center."},{"key":"CR27","volume-title":"MURPHY: A neurally-inspired connectionist approach to learning and performance in vision-based robot motion planning","author":"B.W. Mel","year":"1989","unstructured":"Mel, B.W. (1989). MURPHY: A neurally-inspired connectionist approach to learning and performance in vision-based robot motion planning. Ph.D. Thesis, Graduate College, University of Illinois, Urbana-Champaign."},{"key":"CR28","unstructured":"Mill\u00e1n, J. del R., & Torras, C. (1990). Reinforcement learning: Discovering stable solutions in the robot path finding domain. Proceedings of the Ninth European Conference on Artificial Intelligence (pp. 219\u2013221)."},{"key":"CR29","volume-title":"Progress in neural networks series, Vol 3","author":"J. del R. Mill\u00e1n","year":"1991","unstructured":"Mill\u00e1n, J. del R., & Torras, C. (1991a). Connectionist approaches to robot path finding. In O.M. Omidvar (Ed.), Progress in neural networks series, Vol 3. Norwood, NJ: Ablex."},{"key":"CR30","first-page":"298","volume-title":"Machine learning: Proceedings of the Eighth International Workshop","author":"J. del R. Mill\u00e1n","year":"1991","unstructured":"Mill\u00e1n, J. del R., & Torras, C. (1991b). Learning to avoid obstacles through reinforcement. In L. Birnbaum & G. Collins (Eds.) Machine learning: Proceedings of the Eighth International Workshop, 298\u2013302. San Mateo, CA: Morgan Kaufmann."},{"key":"CR31","series-title":"Technical Report","doi-asserted-by":"crossref","DOI":"10.21236\/ADA452998","volume-title":"Discovering the structure of a reactive environment by exploration","author":"M.C. Mozer","year":"1989","unstructured":"Mozer, M.C., & Bachrach, J. (1989). Discovering the structure of a reactive environment by exploration (Technical Report CU-CS-451-89). Boulder, CO: University of Colorado, Dept. of Computer Science."},{"key":"CR32","unstructured":"Munro, P. (1987). 
A dual back-propagation scheme for scalar reward learning. Proceedings of the Ninth Annual Conference of the Cognitive Science Society (pp. 165\u2013176)."},{"key":"CR33","doi-asserted-by":"crossref","unstructured":"Rivest, R.L., & Schapire, R.E. (1987). A new approach to unsupervised learning in deterministic environments. Proceedings of the Fourth International Workshop on Machine Learning (pp. 364\u2013375).","DOI":"10.1016\/B978-0-934613-41-5.50039-8"},{"key":"CR34","volume-title":"Dynamic error propagation networks","author":"A.J. Robinson","year":"1989","unstructured":"Robinson, A.J. (1989). Dynamic error propagation networks. Ph.D. Thesis, Engineering Department, Cambridge University, Cambridge, England."},{"key":"CR35","unstructured":"Saerens, M., & Soquet, A. (1989). A neural controller. Proceedings of the First IEE International Conference on Artificial Neural Networks (pp. 211\u2013215)."},{"key":"CR36","unstructured":"Schoppers, M.J. (1987). Universal plans for reactive robots in unpredictable environments. Proceedings of the Tenth International Joint Conference on Artificial Intelligence (pp. 1039\u20131046)."},{"key":"CR37","first-page":"348","volume-title":"Machine learning: Proceedings of the Eighth International Workshop","author":"S.P. Singh","year":"1991","unstructured":"Singh, S.P. (1991). Transfer of learning across compositions of sequential tasks. In L. Birnbaum & G. Collins (Eds.) Machine learning: Proceedings of the Eighth International Workshop, 348\u2013352. San Mateo, CA: Morgan Kaufmann."},{"key":"CR38","unstructured":"Steels, L. (1988). Steps towards common sense. Proceedings of the Eighth European Conference on Artificial Intelligence (pp. 49\u201354)."},{"key":"CR39","volume-title":"Temporal credit assignment in reinforcement learning","author":"R.S. Sutton","year":"1984","unstructured":"Sutton, R.S. (1984). Temporal credit assignment in reinforcement learning. Ph.D. Thesis, Dept. 
of Computer and Information Science, University of Massachusetts, Amherst."},{"key":"CR40","first-page":"9","volume":"3","author":"R.S. Sutton","year":"1988","unstructured":"Sutton, R.S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9\u201344.","journal-title":"Machine Learning"},{"key":"CR41","doi-asserted-by":"crossref","unstructured":"Sutton, R.S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proceedings of the Seventh International Conference on Machine Learning (pp. 216\u2013224).","DOI":"10.1016\/B978-1-55860-141-3.50030-4"},{"key":"CR42","unstructured":"Torras, C. (1990). Motion planning and control: Symbolic and neural levels of computation. Proceedings of the Third COGNITIVA Conference (pp. 207\u2013218)."},{"key":"CR43","volume-title":"Learning with delayed rewards","author":"C.J.C.H. Watkins","year":"1989","unstructured":"Watkins, C.J.C.H. (1989). Learning with delayed rewards. Ph.D. Thesis, Psychology Department, Cambridge University, Cambridge, England."},{"key":"CR44","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1109\/TSMC.1987.289329","volume":"17","author":"P.J. Werbos","year":"1987","unstructured":"Werbos, P.J. (1987). Building and understanding adaptive systems: A statistical\/numerical approach to factory automation and brain research. IEEE Transactions on Systems, Man, and Cybernetics, 17, 7\u201320.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics"},{"key":"CR45","volume-title":"Computational geometry","author":"S.H. Whitesides","year":"1985","unstructured":"Whitesides, S.H. (1985). Computational geometry and motion planning. In G. Toussaint (Ed.), Computational geometry. Amsterdam, New York, Oxford: North-Holland."},{"key":"CR46","series-title":"Technical Report","volume-title":"Reinforcement learning in connectionist networks: A mathematical analysis","author":"R.J. Williams","year":"1986","unstructured":"Williams, R.J. 
(1986). Reinforcement learning in connectionist networks: A mathematical analysis (Technical Report ICS-8605). San Diego, CA: University of California, Institute for Cognitive Science."},{"key":"CR47","series-title":"Technical Report","volume-title":"Reinforcement-learning connectionist systems","author":"R.J. Williams","year":"1987","unstructured":"Williams, R.J. (1987). Reinforcement-learning connectionist systems (Technical Report NU-CCS-87-3). Northeastern University, Boston, MA: College of Computer Science."},{"key":"CR48","first-page":"95","volume-title":"Advances in robotics, Vol. I. Algorithmic and geometric aspects of robotics","author":"C.-K. Yap","year":"1987","unstructured":"Yap, C.-K. (1987). Algorithmic motion planning. In J.T. Schwartz & C.-K. Yap (Eds.), Advances in robotics, Vol. I. Algorithmic and geometric aspects of robotics, 95\u2013143. Hillsdale, NJ: Lawrence Erlbaum."}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/BF00992702.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/BF00992702\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/BF00992702","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,4,29]],"date-time":"2019-04-29T22:58:34Z","timestamp":1556578714000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/BF00992702"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1992,5]]},"references-count":48,"journal-issue":{"issue":"3-4","published-print":{"date-parts":[[1992,5]]}},"alternative-id":["BF00992702"],"URL":"https:\/\/doi.org\/10.1007\/bf00992702","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":
"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[1992,5]]}}}