{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,2]],"date-time":"2026-07-02T06:13:50Z","timestamp":1782972830781,"version":"3.54.5"},"reference-count":29,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[1997,3,1]],"date-time":"1997-03-01T00:00:00Z","timestamp":857174400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[1997,3,1]],"date-time":"1997-03-01T00:00:00Z","timestamp":857174400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Autonomous Robots"],"published-print":{"date-parts":[[1997,3]]},"DOI":"10.1023\/a:1008819414322","type":"journal-article","created":{"date-parts":[[2002,12,22]],"date-time":"2002-12-22T16:51:56Z","timestamp":1040575916000},"page":"73-83","source":"Crossref","is-referenced-by-count":247,"title":["Reinforcement Learning in the Multi-Robot Domain"],"prefix":"10.1007","volume":"4","author":[{"given":"Maja J.","family":"Matari\u0107","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","reference":[{"key":"125281_CR1","unstructured":"Asada, M., Uchibe, E., Noda, S., Tawaratsumida, S., and Hosoda, K. 1994. Coordination of multiple behaviors acquired by a vision-based reinforcement learning. In Proceedings IEEE\/RSJ\/GI International Conference on Intelligent Robots and Systems, Munich, Germany."},{"key":"125281_CR2","unstructured":"Atkeson, C.G. 1989. Using local models to control movement. In Proceedings, Neural Information Processing Systems Conference."},{"key":"125281_CR3","unstructured":"Atkeson, C.G., Aboaf, E.W., McIntyre, J., and Reinkensmeyer, D.J. 1988. Model-based robot learning. 
Technical Report AIM-1024, MIT."},{"key":"125281_CR4","unstructured":"Barto, A.G., Bradtke, S.J., and Singh, S.P. 1993. Learning to act using real-time dynamic programming. AI Journal."},{"key":"125281_CR5","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1109\/JRA.1986.1087032","volume":"RA-2","author":"R.A. Brooks","year":"1986","unstructured":"Brooks, R.A. 1986. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2:14\u201323.","journal-title":"IEEE Journal of Robotics and Automation"},{"key":"125281_CR6","doi-asserted-by":"crossref","unstructured":"Brooks, R.A. 1990. The behavior language: user's guide. Technical Report AIM-1227, MIT Artificial Intelligence Lab.","DOI":"10.21236\/ADA225808"},{"key":"125281_CR7","unstructured":"Brooks, R.A. 1991. Intelligence without reason. In Proceedings, IJCAI-91."},{"key":"125281_CR8","unstructured":"Kaelbling, L.P. 1990. Learning in embedded systems, Ph.D. Thesis, Stanford University."},{"key":"125281_CR9","unstructured":"Lin, L.-J. 1991a. Programming robots using reinforcement learning and teaching. In Proceedings, AAAI-91, Pittsburgh, PA, pp. 781\u2013786."},{"key":"125281_CR10","doi-asserted-by":"crossref","unstructured":"Lin, L.-J. 1991b. Self-improving reactive agents: Case studies of reinforcement learning frameworks. In From Animals to Animats: International Conference on Simulation of Adaptive Behavior, The MIT Press.","DOI":"10.7551\/mitpress\/3115.003.0041"},{"key":"125281_CR11","unstructured":"Maes, P. and Brooks, R.A. 1990. Learning to coordinate behaviors. In Proceedings, AAAI-90, Boston, MA, pp. 796\u2013802."},{"key":"125281_CR12","unstructured":"Mahadevan, S. and Connell, J. 1990. Automatic programming of behavior-based robots using reinforcement learning. Technical report, IBM T.J. Watson Research Center Research Report."},{"key":"125281_CR13","unstructured":"Mahadevan, S. and Connell, J. 1991a. 
Automatic programming of behavior-based robots using reinforcement learning. In Proceedings, AAAI-91, Pittsburgh, PA, pp. 8\u201314."},{"key":"125281_CR14","doi-asserted-by":"crossref","unstructured":"Mahadevan, S. and Connell, J. 1991b. Scaling reinforcement learning to robotics by exploiting the subsumption architecture. In Eighth International Workshop on Machine Learning, Morgan Kaufmann, pp. 328\u2013337.","DOI":"10.1016\/B978-1-55860-200-7.50068-4"},{"key":"125281_CR15","unstructured":"Matari\u0107, M.J. 1992a. Behavior-based systems: Key properties and implications. In IEEE International Conference on Robotics and Automation, Workshop on Architectures for Intelligent Control Systems, Nice, France, pp. 46\u201354."},{"key":"125281_CR16","doi-asserted-by":"crossref","unstructured":"Matari\u0107, M.J. 1992b. Designing emergent behaviors: From local interactions to collective intelligence. In From Animals to Animats: International Conference on Simulation of Adaptive Behavior, J.-A. Meyer, H. Roitblat, and S. Wilson (Eds.).","DOI":"10.7551\/mitpress\/3116.003.0059"},{"key":"125281_CR17","unstructured":"Matari\u0107, M.J. 1993. Kin recognition, similarity, and group behavior. In Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society, Boulder, Colorado, pp. 705\u2013710."},{"key":"125281_CR18","unstructured":"Matari\u0107, M.J. 1994a. Interaction and intelligent behavior, Technical Report AI-TR-1495, MIT Artificial Intelligence Lab."},{"key":"125281_CR19","doi-asserted-by":"crossref","unstructured":"Matari\u0107, M.J. 1994b. Learning to behave socially. In From Animals to Animats: International Conference on Simulation of Adaptive Behavior, D. Cliff, P. Husbands, J.-A. Meyer, and S. Wilson (Eds.), pp. 453\u2013462.","DOI":"10.7551\/mitpress\/3117.003.0065"},{"key":"125281_CR20","first-page":"266","volume-title":"Proceedings, Simulation of Adaptive Behavior SAB-94","author":"J.D.R. 
Mill\u00e1n","year":"1994","unstructured":"Mill\u00e1n, J.D.R. 1994. Learning reactive sequences from basic reflexes. In Proceedings, Simulation of Adaptive Behavior SAB-94, The MIT Press: Brighton, England, pp. 266\u2013274."},{"key":"125281_CR21","first-page":"571","volume":"4","author":"A.W. Moore","year":"1992","unstructured":"Moore, A.W. 1992. Fast, robust adaptive control by learning only forward models. Advances in Neural Information Processing, 4:571\u2013579.","journal-title":"Advances in Neural Information Processing"},{"key":"125281_CR22","unstructured":"Parker, L.E. 1994. Heterogeneous multi-robot cooperation, Ph.D. thesis, MIT."},{"key":"125281_CR23","doi-asserted-by":"crossref","unstructured":"Pomerleau, D.A. 1992. Neural network perception for mobile robotic guidance, Ph.D. thesis, Carnegie Mellon University, School of Computer Science.","DOI":"10.1007\/978-1-4615-3192-0"},{"key":"125281_CR24","first-page":"57","volume":"14","author":"S. Schaal","year":"1994","unstructured":"Schaal, S. and Atkeson, C.G. 1994. Robot juggling: An implementation of memory-based learning. Control Systems Magazine, 14:57\u201371.","journal-title":"Control Systems Magazine"},{"issue":"1","key":"125281_CR25","first-page":"9","volume":"3","author":"R. Sutton","year":"1988","unstructured":"Sutton, R. 1988. Learning to predict by method of temporal differences. Machine Learning, 3(1):9\u201344.","journal-title":"Machine Learning"},{"key":"125281_CR26","doi-asserted-by":"crossref","unstructured":"Tan, M. 1993. Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings, Tenth International Conference on Machine Learning, Amherst, MA, pp. 330\u2013337.","DOI":"10.1016\/B978-1-55860-307-3.50049-6"},{"key":"125281_CR27","unstructured":"Thrun, S.B. and Mitchell, T.M. 1993. Integrating inductive neural network learning and explanation-based learning. 
In Proceedings, IJCAI-93, Chambery, France."},{"key":"125281_CR28","first-page":"279","volume":"8","author":"C.J.C.H. Watkins","year":"1992","unstructured":"Watkins, C.J.C.H. and Dayan, P. 1992. Q-learning. Machine Learning, 8:279\u2013292.","journal-title":"Machine Learning"},{"key":"125281_CR29","doi-asserted-by":"crossref","unstructured":"Whitehead, S.D., Karlsson, J., and Tenenberg, J. 1993. Learning multiple goal behavior via task decomposition and dynamic policy merging. In Robot Learning, J.H. Connell and S. Mahadevan (Eds.), Kluwer Academic Publishers, pp. 45\u201378.","DOI":"10.1007\/978-1-4615-3184-5_3"}],"container-title":["Autonomous Robots"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1023\/A:1008819414322.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1023\/A:1008819414322\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1023\/A:1008819414322.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,24]],"date-time":"2025-05-24T07:15:26Z","timestamp":1748070926000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1023\/A:1008819414322"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1997,3]]},"references-count":29,"journal-issue":{"issue":"1","published-print":{"date-parts":[[1997,3]]}},"alternative-id":["125281"],"URL":"https:\/\/doi.org\/10.1023\/a:1008819414322","relation":{},"ISSN":["0929-5593","1573-7527"],"issn-type":[{"value":"0929-5593","type":"print"},{"value":"1573-7527","type":"electronic"}],"subject":[],"published":{"date-parts":[[1997,3]]}}}