{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T10:00:48Z","timestamp":1768557648864,"version":"3.49.0"},"publisher-location":"Berlin, Heidelberg","reference-count":20,"publisher":"Springer Berlin Heidelberg","isbn-type":[{"value":"9783540252603","type":"print"},{"value":"9783540322740","type":"electronic"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005]]},"DOI":"10.1007\/978-3-540-32274-0_18","type":"book-chapter","created":{"date-parts":[[2010,7,5]],"date-time":"2010-07-05T20:21:44Z","timestamp":1278361304000},"page":"275-294","source":"Crossref","is-referenced-by-count":10,"title":["Multi-agent Reinforcement Learning in Stochastic Single and Multi-stage Games"],"prefix":"10.1007","author":[{"given":"Katja","family":"Verbeeck","sequence":"first","affiliation":[]},{"given":"Ann","family":"Now\u00e9","sequence":"additional","affiliation":[]},{"given":"Maarten","family":"Peeters","sequence":"additional","affiliation":[]},{"given":"Karl","family":"Tuyls","sequence":"additional","affiliation":[]}],"member":"297","reference":[{"key":"18_CR1","unstructured":"Boutilier, C.: Sequential optimality and coordination in multiagent systems. In: Proceedings of the International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 478\u2013485 (1999)"},{"key":"18_CR2","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1007\/978-3-540-32274-0_4","volume-title":"Adaptive Agents and Multi-Agent Systems II","author":"M. Carpenter","year":"2005","unstructured":"Carpenter, M., Kudenko, D.: Baselines for joint-action reinforcement learning of coordination in cooperative multi-agent systems. In: Kudenko, D., Kazakov, D., Alonso, E. (eds.) AAMAS 2004. LNCS, vol.\u00a03394, pp. 55\u201372. 
Springer, Heidelberg (2005)"},{"key":"18_CR3","unstructured":"Claus, C., Boutilier, C.: The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the Fifteenth National Conference on Artificial Intelligence, pp. 746\u2013752 (1998)"},{"key":"18_CR4","doi-asserted-by":"publisher","first-page":"1039","DOI":"10.1162\/jmlr.2003.4.6.1039","volume":"4","author":"J. Hu","year":"2003","unstructured":"Hu, J., Wellman, M.P.: Nash q-learning for general-sum stochastic games. Journal of Machine Learning Research\u00a04, 1039\u20131069 (2003)","journal-title":"Journal of Machine Learning Research"},{"key":"18_CR5","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1007\/978-3-540-32274-0_7","volume-title":"Adaptive Agents and Multi-Agent Systems II","author":"S. Kapetanakis","year":"2005","unstructured":"Kapetanakis, S., Kudenko, D., Strens, M.: Learning to coordinate using commitment sequences in cooperative multi-agent systems. In: Kudenko, D., Kazakov, D., Alonso, E. (eds.) AAMAS 2004. LNCS, vol.\u00a03394, pp. 106\u2013118. Springer, Heidelberg (2005)"},{"key":"18_CR6","unstructured":"Lauer, M., Riedmiller, M.: An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In: Proceedings of the Seventeenth International Conference on Machine Learning, pp. 535\u2013542 (2000)"},{"key":"18_CR7","unstructured":"Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 322\u2013328 (2001)"},{"key":"18_CR8","unstructured":"Narendra, K.S., Parthasarathy, K.: Learning automata approach to hierarchical multiobjective analysis. Technical Report No. 8811, Electrical Engineering. Yale University., New Haven, Connecticut (1988)"},{"key":"18_CR9","volume-title":"Learning Automata: An Introduction","author":"K.S. 
Narendra","year":"1989","unstructured":"Narendra, K.S., Thathachar, M.A.L.: Learning Automata: An Introduction. Prentice-Hall International, Inc., Englewood Cliffs (1989)"},{"key":"18_CR10","series-title":"LNAI","first-page":"382","volume-title":"Proceedings of the 12th European Conference on Machine Learning","author":"A. Now\u00e9","year":"2001","unstructured":"Now\u00e9, A., Parent, J., Verbeeck, K.: Social agents playing a periodical policy. In: Proceedings of the 12th European Conference on Machine Learning, Freiburg, Germany. LNCS (LNAI), vol.\u00a02168, pp. 382\u2013393. Springer, Heidelberg (2001)"},{"key":"18_CR11","volume-title":"A course in game theory","author":"J.O. Osborne","year":"1994","unstructured":"Osborne, J.O., Rubinstein, A.: A course in game theory. MIT Press, Cambridge (1994)"},{"key":"18_CR12","doi-asserted-by":"crossref","unstructured":"Parent, J., Verbeeck, K., Nowe, A., Steenhaut, K., Lemeire, J., Dirkx, E.: Adaptive load balancing of parallel applications with social reinforcement learning on heterogeneous systems. Scientific Programming (2004) (to appear)","DOI":"10.1155\/2004\/987356"},{"key":"18_CR13","unstructured":"Peeters, M., Verbeeck, K., Now\u00e9, A.: Multi-agent learning in conflicting multi-level games with incomplete information. In: Proceedings of the 2004 American Association for Artificial Intelligence (AAAI) Fall Symposium on Artificial Multi-Agent Learning (2004)"},{"key":"18_CR14","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"168","DOI":"10.1007\/978-3-540-30115-8_18","volume-title":"Machine Learning: ECML 2004","author":"P.J. Hoen","year":"2004","unstructured":"Hoen, P.J.\u2019t., Tuyls, K.: Analyzing multi-agent reinforcement learning using evolutionary dynamics. In: Boulicaut, J.-F., et al. (eds.) ECML 2004. LNCS, vol.\u00a03201, pp. 168\u2013179. 
Springer, Heidelberg (2004)"},{"issue":"6","key":"18_CR15","doi-asserted-by":"publisher","first-page":"711","DOI":"10.1109\/TSMCB.2002.1049606","volume":"32","author":"M.A.L. Thathachar","year":"2002","unstructured":"Thathachar, M.A.L., Sastry, P.S.: Varieties of learning automata: An overview. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics\u00a032(6), 711\u2013722 (2002)","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics"},{"issue":"2","key":"18_CR16","first-page":"297","volume":"139","author":"K. Tuyls","year":"2004","unstructured":"Tuyls, K., Nowe, A., Lenaerts, T., Manderick, B.: An evolutionary game theoretic perspective on learning in multi-agent systems. Synthese, section Knowledge, Rationality and Action\u00a0139(2), 297\u2013330 (2004)","journal-title":"Synthese, section Knowledge, Rationality and Action"},{"key":"18_CR17","unstructured":"Verbeeck, K., Now\u00e9, A., Parent, J., Tuyls, K.: Exploring selfish reinforcement learning in non-zero sum games (2004) (submitted)"},{"key":"18_CR18","unstructured":"Verbeeck, K., Now\u00e9, A., Peeters, M.: Multi-agent coordination in tree structured multi-stage games. In: Proceedings of the Fourth Symposium on Adaptive Agents and Multi-agent Systems (AISB 2004) Society for the study of Artificial Intelligence and Simulation of Behaviour, pp. 63\u201374 (2004)"},{"key":"18_CR19","unstructured":"Verbeeck, K., Now\u00e9, A., Tuyls, K.: Coordinated exploration in stochastic common interest games. In: Proceedings of the Third Symposium on Adaptive Agents and Multi-agent Systems (AISB 2003) Society for the study of Artificial Intelligence and Simulation of Behaviour (2003)"},{"key":"18_CR20","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1145\/301136.301167","volume-title":"Proceedings of the Third International Conference on Autonomous Agents (Agents 1999)","author":"D.H. 
Wolpert","year":"1999","unstructured":"Wolpert, D.H., Wheeler, K.R., Tumer, K.: General principles of learning-based multi-agent systems. In: Etzioni, O., M\u00fcller, J.P., Bradshaw, J.M. (eds.) Proceedings of the Third International Conference on Autonomous Agents (Agents 1999), Seattle, WA, USA, pp. 77\u201383. ACM Press, New York (1999)"}],"container-title":["Lecture Notes in Computer Science","Adaptive Agents and Multi-Agent Systems II"],"original-title":[],"link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-540-32274-0_18.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,3]],"date-time":"2021-05-03T03:46:49Z","timestamp":1620013609000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/978-3-540-32274-0_18"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005]]},"ISBN":["9783540252603","9783540322740"],"references-count":20,"URL":"https:\/\/doi.org\/10.1007\/978-3-540-32274-0_18","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005]]}}}