{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T08:33:28Z","timestamp":1768552408419,"version":"3.49.0"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2007,4,30]],"date-time":"2007-04-30T00:00:00Z","timestamp":1177891200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Auton Agent Multi-Agent Syst"],"published-print":{"date-parts":[[2007,8]]},"DOI":"10.1007\/s10458-007-0020-8","type":"journal-article","created":{"date-parts":[[2007,4,30]],"date-time":"2007-04-30T12:52:13Z","timestamp":1177937533000},"page":"91-108","source":"Crossref","is-referenced-by-count":33,"title":["Reaching pareto-optimality in prisoner\u2019s dilemma using conditional joint action learning"],"prefix":"10.1007","volume":"15","author":[{"given":"Dipyaman","family":"Banerjee","sequence":"first","affiliation":[]},{"given":"Sandip","family":"Sen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2007,4,30]]},"reference":[{"issue":"2","key":"20_CR1","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/S0004-3702(02)00121-2","volume":"136","author":"M.H. Bowling","year":"2002","unstructured":"Bowling M.H. and Veloso M.M. (2002). Multiagent learning using a variable learning rate. Artificial Intelligence 136(2): 215\u2013250","journal-title":"Artificial Intelligence"},{"key":"20_CR2","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1613\/jair.1332","volume":"22","author":"M.H. Bowling","year":"2004","unstructured":"Bowling M.H. and Veloso M.M. (2004). Existence of multiagent equilibria with limited agents. Journal of Artificial Intelligence Res. (JAIR) 22: 353\u2013384","journal-title":"Journal of Artificial Intelligence Res. 
(JAIR)"},{"key":"20_CR3","volume-title":"Theory of moves","author":"S.J. Brams","year":"1994","unstructured":"Brams S.J. (1994). Theory of moves. Cambridge University Press, Cambridge, UK"},{"key":"20_CR4","volume-title":"Iterative solution of games by fictitious play. In activity analysis of production and allocation","author":"G.W. Brown","year":"1951","unstructured":"Brown G.W. (1951). Iterative solution of games by fictitious play. In activity analysis of production and allocation. Wiley, New York"},{"key":"20_CR5","unstructured":"Claus, C., & Boutilier, C. (1997). The dynamics of reinforcement learning in cooperative multiagent systems. In Collected papers from AAAI-97 workshop on Multiagent Learning, (pp. 13\u201318). AAAI."},{"key":"20_CR6","unstructured":"Conitzer, V., & Sandholm, T. (2003). Awesome: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In ICML, (pp. 83\u201390)."},{"key":"20_CR7","doi-asserted-by":"crossref","unstructured":"Crandall, J. W., & Goodrich, M. A. (2005). Learning to compete, compromise, and cooperate in repeated general-sum games. In Proceedings of the twenty-second international conference on machine learning, pp. 161\u2013168.","DOI":"10.1145\/1102351.1102372"},{"key":"20_CR8","unstructured":"de Farias, D. P., & Megiddo, N. (2003). How to combine expert (and novice) advice when actions impact the environment? In NIPS."},{"key":"20_CR9","volume-title":"The theory of learning in games","author":"D. Fudenberg","year":"1998","unstructured":"Fudenberg D. and Levine D.K. (1998). The theory of learning in games. MIT Press, Cambridge, MA"},{"key":"20_CR10","unstructured":"Greenwald, A. R., & Hall, K. (2003). Correlated q-learning. In ICML, pp. 242\u2013249."},{"key":"20_CR11","doi-asserted-by":"crossref","unstructured":"Greenwald, A. R., & Jafari, A. (2003). A general class of no-regret learning algorithms and game-theoretic equilibria. In COLT, pp. 
2\u201312.","DOI":"10.1007\/978-3-540-45167-9_2"},{"key":"20_CR12","first-page":"1039","volume":"4","author":"J. Hu","year":"2003","unstructured":"Hu J. and Wellman M.P. (2003). Nash q-learning for general-sum stochastic games. Journal of Machine Learning Research 4: 1039\u20131069","journal-title":"Journal of Machine Learning Research"},{"key":"20_CR13","unstructured":"Kalai, A., & Vempala, S. (2002). Geometric algorithms for online optimization. Technical Report MIT-LCS-TR-861, MIT Laboratory for Computer Science."},{"key":"20_CR14","unstructured":"Kapetanakis, S., Kudenko, D., & Strens, M. (2004). Learning of coordination in cooperative multi-agent systems using commitment sequences. Artificial Intelligence and the Simulation of Behavior, 1(5)."},{"key":"20_CR15","doi-asserted-by":"crossref","unstructured":"Littlestone, N., & Warmuth, M. K. (1989). The weighted majority algorithm. In IEEE symposium on foundations of computer science, pp. 256\u2013261.","DOI":"10.1109\/SFCS.1989.63487"},{"key":"20_CR16","doi-asserted-by":"crossref","unstructured":"Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the eleventh international conference on machine learning, (pp. 157\u2013163). San Mateo, CA: Morgan Kaufmann.","DOI":"10.1016\/B978-1-55860-335-6.50027-1"},{"key":"20_CR17","unstructured":"Littman, M. L. (2001). Friend-or-foe q-learning in general-sum games. In Proceedings of the eighteenth international conference on machine learning, (pp. 322\u2013328). San Francisco, CA: Morgan Kaufmann."},{"key":"20_CR18","unstructured":"Littman, M. L., & Stone, P. (2001). Implicit negotiation in repeated games. In Intelligent agents VIII: Agent theories, architectures, and languages, pp. 393\u2013404."},{"key":"20_CR19","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1016\/j.dss.2004.08.007","volume":"39","author":"M.L. Littman","year":"2005","unstructured":"Littman M.L. and Stone P. (2005). 
A polynomial-time nash equilibrium algorithm for repeated games. Decision Support Systems 39: 55\u201366","journal-title":"Decision Support Systems"},{"key":"20_CR20","unstructured":"Mundhe, M., & Sen, S. (1999). Evaluating concurrent reinforcement learners. IJCAI-99 workshop on agents that learn about, from and with other agents."},{"issue":"3","key":"20_CR21","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1007\/s10458-005-2631-2","volume":"11","author":"L. Panait","year":"2005","unstructured":"Panait L. and Luke S. (2005). Cooperative multi-agent learning: The state of the art. Autonomous Agents and Multi-Agent Systems 11(3): 387\u2013434","journal-title":"Autonomous Agents and Multi-Agent Systems"},{"key":"20_CR22","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1016\/0303-2647(95)01551-5","volume":"37","author":"T.W. Sandholm","year":"1995","unstructured":"Sandholm T.W. and Crites R.H. (1995). Multiagent reinforcement learning and iterated prisoner\u2019s dilemma. Biosystems Journal 37: 147\u2013166","journal-title":"Biosystems Journal"},{"key":"20_CR23","unstructured":"Sekaran, M., & Sen, S. (1994). Learning with friends and foes. In Sixteenth annual conference of the cognitive science society, (pp. 800\u2013805). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers."},{"key":"20_CR24","doi-asserted-by":"crossref","unstructured":"Sen, S., Mukherjee, R., & Airiau, S. (2003). Towards a pareto-optimal solution in general-sum games. In Proceedings of the second international joint conference on autonomous agents and multiagent systems (pp. 153\u2013160). New York, NY: ACM Press.","DOI":"10.1145\/860575.860600"},{"issue":"1","key":"20_CR25","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1006\/jeth.2000.2746","volume":"98","author":"A. Mas-Colell","year":"2001","unstructured":"Mas-Colell A. and Hart S. (2001). A general class of adaptive strategies. 
Journal of Economic Theory 98(1): 26\u201354","journal-title":"Journal of Economic Theory"},{"key":"20_CR26","unstructured":"Singh, S. P., Kearns, M. J., & Mansour, Y. (2000). Nash convergence of gradient dynamics in general-sum games. In UAI, pp. 541\u2013548."},{"key":"20_CR27","unstructured":"Stimpson, J. L., Goodrich, M. A., & Walters, L. C. (2001). Satisficing and learning cooperation in the prisoner\u2019s dilemma. In Proceedings of the seventeenth international joint conference on artificial intelligence, pp. 535\u2013540."},{"issue":"1","key":"20_CR28","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1017\/S026988890500041X","volume":"20","author":"K. Tuyls","year":"2006","unstructured":"Tuyls K. and Now\u00e9 A. (2006). Evolutionary game theory and multi-agent reinforcement learning. The Knowledge Engineering Review 20(1): 63\u201390","journal-title":"The Knowledge Engineering Review"},{"key":"20_CR29","doi-asserted-by":"crossref","unstructured":"Verbeeck, K., Now\u00e9, A., Lenaerts, T., & Parent, J. (2002). Learning to reach the pareto optimal nash equilibrium as a team. In LNAI 2557: Proceedings of the fifteenth Australian joint conference on artificial intelligence (pp. 407\u2013418). Springer-Verlag.","DOI":"10.1007\/3-540-36187-1_36"},{"issue":"1","key":"20_CR30","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1023\/A:1021765422660","volume":"6","author":"J.M. Vidal","year":"2003","unstructured":"Vidal J.M. and Durfee E.H. (2003). Predicting the expected behavior of agents that learn about agents: the CLRI framework. Autonomous Agents and Multi-Agent Systems 6(1): 77\u2013107","journal-title":"Autonomous Agents and Multi-Agent Systems"},{"key":"20_CR31","unstructured":"Wei\u00df, G. Learning to coordinate actions in multi-agent systems. In Proceedings of the international joint conference on artificial intelligence, pp. 
311\u2013316, August 1993."}],"container-title":["Autonomous Agents and Multi-Agent Systems"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10458-007-0020-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10458-007-0020-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10458-007-0020-8","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,5,29]],"date-time":"2019-05-29T17:28:23Z","timestamp":1559150903000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10458-007-0020-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,4,30]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2007,8]]}},"alternative-id":["20"],"URL":"https:\/\/doi.org\/10.1007\/s10458-007-0020-8","relation":{},"ISSN":["1387-2532","1573-7454"],"issn-type":[{"value":"1387-2532","type":"print"},{"value":"1573-7454","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,4,30]]}}}