{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T13:38:56Z","timestamp":1760708336003},"reference-count":61,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2011,10,7]],"date-time":"2011-10-07T00:00:00Z","timestamp":1317945600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Auton Agent Multi-Agent Syst"],"published-print":{"date-parts":[[2013,1]]},"DOI":"10.1007\/s10458-011-9183-4","type":"journal-article","created":{"date-parts":[[2011,10,6]],"date-time":"2011-10-06T01:31:22Z","timestamp":1317864682000},"page":"86-119","source":"Crossref","is-referenced-by-count":10,"title":["Cooperative reinforcement learning in topology-based multi-agent systems"],"prefix":"10.1007","volume":"26","author":[{"given":"Dan","family":"Xiao","sequence":"first","affiliation":[]},{"given":"Ah-Hwee","family":"Tan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,10,7]]},"reference":[{"key":"9183_CR1","unstructured":"Panait, L., & Luke, S. (2003). Cooperative multi-agent learning: The state of the art. Tech. Rep., George Mason University, Technical Report GMU-CS-TR-2003-1."},{"key":"9183_CR2","doi-asserted-by":"crossref","unstructured":"Busoniu, L., Babuska, R., & De Schutter, B. (2006). Multi-agent reinforcement learning: A survey. In Proceedings of 9th international conference on control, automation, robotics and vision (ICARCV) (pp. 1\u20136).","DOI":"10.1109\/ICARCV.2006.345353"},{"key":"9183_CR3","unstructured":"Lesser, V. R., Corkill, D. D., & Durfee, E. H. (1987). An update on the distributed vehicle monitoring testbed. Tech. Rep., Computer and Information Science Department, Amherst, MA, USA."},{"key":"9183_CR4","unstructured":"Nunes, L., & Oliveira, E. (2004). Learning from multiple sources. In Proceedings of third international joint conference on autonomous agents and multi agent systems (AAMAS-2004)."},{"key":"9183_CR5","first-page":"671","volume-title":"Advances in neural information processing systems","author":"J. A. Boyan","year":"1994","unstructured":"Boyan J. A., Littman M. L. (1994) Packet routing in dynamically changing networks: A reinforcement learning approach. In: Cowan J. D., Tesauro G., Alspector J. (eds) Advances in neural information processing systems. Morgan Kaufmann Publishers Inc, San Francisco, CA, pp 671\u2013678"},{"key":"9183_CR6","doi-asserted-by":"crossref","unstructured":"Chang Y. H., Ho, T., & Kaelbling L. P. (2004). Mobilized ad-hoc networks: A reinforcement learning approach. In Proceedings of 2004 international conference on autonomic computing (pp. 240\u2013247).","DOI":"10.1109\/ICAC.2004.1301369"},{"key":"9183_CR7","unstructured":"Schneider, J., Wong, W. K., Moore, A., & Riedmiller, M. (1999). Distributed value functions. In Proceedings of 16th international conference on machine learning(pp. 371\u2013378). San Francisco, CA: Morgan Kaufmann."},{"issue":"4","key":"9183_CR8","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1016\/0957-4174(94)90080-9","volume":"7","author":"L. Z. Varga","year":"1994","unstructured":"Varga L. Z., Jennings N. R., Cockburn D. (1994) Integrating intelligent systems into a cooperating community for electricity distribution management. Expert Systems with Applications 7(4): 563\u2013579","journal-title":"Expert Systems with Applications"},{"key":"9183_CR9","doi-asserted-by":"crossref","unstructured":"Tan, A.-H. (2006). Self-organizing neural architecture for reinforcement learning. In Proceedings of international symposium on neural networks (ISNN\u201906), LNCS 3971, Chengdu, China (pp. 470\u2013475).","DOI":"10.1007\/11759966_70"},{"issue":"2","key":"9183_CR10","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1109\/TNN.2007.905839","volume":"9","author":"A.-H. Tan","year":"2008","unstructured":"Tan A.-H., Lu N., Xiao D. (2008) Integrating temporal difference methods and self-organizing neural networks for reinforcement learning with delayed evaluative feedback. IEEE Transactions on Neural Networks 9(2): 230\u2013244","journal-title":"IEEE Transactions on Neural Networks"},{"key":"9183_CR11","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/S0734-189X(87)80014-2","volume":"37","author":"G. A. Carpenter","year":"1987","unstructured":"Carpenter G. A., Grossberg S. (1987) A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing 37: 54\u2013115","journal-title":"Computer Vision, Graphics, and Image Processing"},{"key":"9183_CR12","doi-asserted-by":"crossref","first-page":"4919","DOI":"10.1364\/AO.26.004919","volume":"26","author":"G. A. Carpenter","year":"1987","unstructured":"Carpenter G. A., Grossberg S. (1987) ART 2: Self-organization of stable category recognition codes for analog input patterns. Applied Optics 26: 4919\u20134930","journal-title":"Applied Optics"},{"key":"9183_CR13","unstructured":"Xiao, D., & Tan, A. H. (2005). Cooperative cognitive agents and reinforcement learning in pursuit game. In Proceedings of third international conference on computational intelligence, robotics and autonomous systems (CIRAS\u201905), Singapore."},{"issue":"6","key":"9183_CR14","doi-asserted-by":"crossref","first-page":"1567","DOI":"10.1109\/TSMCB.2007.907040","volume":"37","author":"D. Xiao","year":"2007","unstructured":"Xiao D., Tan A.-H. (2007) Self-organizing neural architectures and cooperative learning in multi-agent environment. IEEE Transactions on Systems, Man, and Cybernetics-Part B 37(6): 1567\u20131580","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics-Part B"},{"key":"9183_CR15","doi-asserted-by":"crossref","unstructured":"Xiao, D., & Tan, A.-H. (2008). Scaling up multi-agent reinforcement learning in complex domains. In Proceedings of 2008 IEEE\/WIC\/ACM international conference on intelligent agent technology, Sydney (pp. 326\u2013329).","DOI":"10.1109\/WIIAT.2008.259"},{"issue":"2","key":"9183_CR16","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1137\/1033048","volume":"33","author":"R. K. Ahuja","year":"1991","unstructured":"Ahuja R. K., Magnanti T. L., Orlin J. B. (1991) Some recent advances in network flows. SIAM Review 33(2): 175\u2013219","journal-title":"SIAM Review"},{"key":"9183_CR17","unstructured":"Weihmayer, R., & Velthuijsen, H. (1994). Application of distributed ai and cooperative problem solving to telecommunications. AI Approaches to Telecommunications and Network Management."},{"key":"9183_CR18","doi-asserted-by":"crossref","unstructured":"Brauer, W., & Wei\u00df, G. (1998). Multi-machine scheduling\u2014a multi-agent learning approach. In Proceedings of the third international conference on multi-agent systems (pp. 42\u201348).","DOI":"10.1109\/ICMAS.1998.699030"},{"key":"9183_CR19","unstructured":"Kitano, H., Asada, M., Kuniyoshi, Y., Noda, I., & Osawa, E. (1997). Robocup: The robot world cup initiative. In Proceedings of the first international conference on autonomous agents (Agents97), New York, 5\u20138 Feb 1997 (pp. 340\u2013347). New York: ACM Press."},{"key":"9183_CR20","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1007\/978-4-431-67919-6_35","volume":"4","author":"P. Riley","year":"2000","unstructured":"Riley P., Veloso M. (2000) On behavior classification in adversarial environments. Distributed Autonomous Robotic Systems 4: 371\u2013380","journal-title":"Distributed Autonomous Robotic Systems"},{"issue":"3","key":"9183_CR21","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1016\/B978-0-934613-63-7.50011-5","volume":"101","author":"R. Steeb","year":"1988","unstructured":"Steeb R., Cammarata S., Hayes-Roth F., Thorndyke P., Wesson R. (1988) Distributed intelligence for air fleet control. Readings in Distributed Artificial Intelligence, 101(3): 90\u2013101","journal-title":"Readings in Distributed Artificial Intelligence,"},{"key":"9183_CR22","unstructured":"Huang, J., Jennings, N. R., & Fox, J. (1995). An agent architecture for distributed medical care. Intelligent Agents: Theories, Architectures, and Languages, LNAI 890."},{"key":"9183_CR23","doi-asserted-by":"crossref","unstructured":"Chalupsky, H., Gil, Y., Knoblock, C. A., Lerman, K., Oh, J., Pynadath, D., Russ, T., & Tambe, M. (2002). Electric elves: Agent technology for supporting human organizations. AI Magazine, 23.","DOI":"10.21236\/ADA459956"},{"key":"9183_CR24","unstructured":"Crawford, E., & Veloso, M. (2004). Opportunities for learning in multi-agent meeting scheduling. In Proceedings of artificial multiagent learning, Carnegie Mellon University, Pittsburgh, PA, USA, Technical Report FS-04-02."},{"key":"9183_CR25","unstructured":"Wooldridge, M., Bussmann, S., & Klosterberg, M. (1996). Production sequencing as negotiation. In Proceedings of first international conference on the practical application of intelligent agents and multi-agent technology (PAAM-96) (pp. 709\u2013726)."},{"key":"9183_CR26","doi-asserted-by":"crossref","unstructured":"Vannelli, A. (1989). An interior point method for solving the global routing problem. In Custom integrated circuits conference, 1989, Proceedings of the IEEE 1989, San Diego, CA, USA, 15\u201318 May 1989 (pp. 3.4\/1\u20133.4\/4).","DOI":"10.1109\/CICC.1989.56681"},{"key":"9183_CR27","doi-asserted-by":"crossref","unstructured":"Roling, P. C., & Visser, H. G. (2008). Optimal airport surface traffic planning using mixed-integer linear programming. International Journal of Aerospace Engineering, 2008, 1\u201311.","DOI":"10.1155\/2008\/732828"},{"key":"9183_CR28","volume-title":"Introduction to operations research","year":"2001","unstructured":"Hillier, F. S., Lieberman, G. J. (eds) (2001) Introduction to operations research. McGraw-Hill, Oakland, CA"},{"issue":"2","key":"9183_CR29","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1287\/inte.1030.0051","volume":"34","author":"L.J. LeBlanc","year":"2003","unstructured":"LeBlanc L.J., Hill J.A., Greenwell G.W., Czesnat A.O., Galbreth M.R. (2003) Optimizing nu-kote\u2019s supply chain with linear programming. Interfaces, 34(2): 139\u2013146","journal-title":"Interfaces,"},{"key":"9183_CR30","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1109\/21.398680","volume":"25","author":"C. Goutis","year":"1995","unstructured":"Goutis C. (1995) A graphical method for solving a decision analysis problem. IEEE Transactions on Systems, Man and Cybernetics 25: 1181\u20131193","journal-title":"IEEE Transactions on Systems, Man and Cybernetics"},{"issue":"1","key":"9183_CR31","doi-asserted-by":"crossref","first-page":"29","DOI":"10.3329\/diujst.v5i1.4379","volume":"5","author":"B. C. Das","year":"2010","unstructured":"Das B. C. (2010) Effect of graphical method for solving mathematical programming problem. Daffodil International University Journal of Science and Technology 5(1): 29\u201336","journal-title":"Daffodil International University Journal of Science and Technology"},{"key":"9183_CR32","first-page":"163","volume":"1","author":"N. Szozda","year":"2008","unstructured":"Szozda N., \u015awierczek A. (2008) The success factors for supply chains of a short life cycle product. Total Logistic Management 1: 163\u2013173","journal-title":"Total Logistic Management"},{"key":"9183_CR33","unstructured":"Zhu, K. Q., Tan, K. C., & Lee, L. H. (2000). Heuristics for vehicle routing problem with time windows. In Proceedings of 6th international symposium on artificial intelligence and mathematics, AMAI 2000."},{"key":"9183_CR34","doi-asserted-by":"crossref","unstructured":"Sun, L.-J., Hu, X.-P., Li, Y.-X., Lu, J., & Yang, D.-L. (2008). A heuristic algorithm and a system for vehicle routing with multiple destinations in embedded equipment. In Proceedings of 7th international conference on mobile business, 2008, ICMB \u201908, Barcelona, 7\u20138 July 2008 (pp. 1\u20138).","DOI":"10.1109\/ICMB.2008.47"},{"issue":"4","key":"9183_CR35","doi-asserted-by":"crossref","first-page":"951","DOI":"10.2307\/2669334","volume":"45","author":"R. R. Lau","year":"2001","unstructured":"Lau R. R., Redlawsk D. P. (2001) Advantages and disadvantages of cognitive heuristics in political decision making. American Journal of Political Science 45(4): 951\u2013971","journal-title":"American Journal of Political Science"},{"key":"9183_CR36","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1016\/0888-613X(95)00091-T","volume":"14","author":"P. Wang","year":"1994","unstructured":"Wang P. (1994) Heuristics and normative models of judgment under uncertainty. International Journal of Approximate Reasoning 14: 221\u2013235","journal-title":"International Journal of Approximate Reasoning"},{"key":"9183_CR37","unstructured":"Rathnasabapathy, B., & Gmytrasiewicz, P. (2002) Formalizing multi-agent pomdps in the context of network routing. AAAI Technical Report WS-02-12, Department of Computer Science, University of Illinois at Chicago."},{"key":"9183_CR38","unstructured":"Schurr, N. (2007). Toward human-multiagent teams, Ph.D. thesis, Faculty of the Graduate School, University of Southern California, Los Angeles, CA, USA."},{"key":"9183_CR39","doi-asserted-by":"crossref","unstructured":"Han, J. (2006). Network-adaptive qos routing using local information. In Proceedings of 9th Asia-Pacific network operations and management symposium, APNOMS 2006, Pusan, Korea, 27\u201329 September 2006 (pp. 190\u2013199).","DOI":"10.1007\/11876601_20"},{"key":"9183_CR40","unstructured":"Munetomo, M., Takai, Y., & Sato, Y. (1997). An intelligent network routing algorithm by a genetic algorithm. In Proceedings of fourth international conference on neural information processing (pp. 547\u2013550)."},{"issue":"1","key":"9183_CR41","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/j.cor.2007.09.005","volume":"36","author":"X.-B. Hu","year":"2009","unstructured":"Hu X.-B., Paolo E. D. (2009) An efficient genetic algorithm with uniform crossover for air traffic control. Computers and Operations Research 36(1): 245\u2013259","journal-title":"Computers and Operations Research"},{"issue":"7\u20138","key":"9183_CR42","doi-asserted-by":"crossref","first-page":"803","DOI":"10.1007\/s00170-005-2556-6","volume":"29","author":"W.-C. Yeh","year":"2006","unstructured":"Yeh W.-C. (2006) An efficient memetic algorithm for the multi-stage supply chain network problem. International Journal of Advanced Manufacturing Technology 29(7\u20138): 803\u2013813","journal-title":"International Journal of Advanced Manufacturing Technology"},{"issue":"6","key":"9183_CR43","doi-asserted-by":"crossref","first-page":"677","DOI":"10.20965\/jaciii.2007.p0677","volume":"11","author":"C.-W. Han","year":"2007","unstructured":"Han C.-W., Nobuhara H. (2007) Advanced genetic algorithms based on adaptive partitioning method. Journal of Advanced Computational Intelligence and Intelligent Informatics 11(6): 677\u2013680","journal-title":"Journal of Advanced Computational Intelligence and Intelligent Informatics"},{"key":"9183_CR44","unstructured":"Littman, M., & Boyan, J. (1993). A distributed reinforcement learning scheme for network routing. Tech. Rep., Robotics Institute, Pittsburgh, PA, USA, CMU-CS-93-165."},{"key":"9183_CR45","first-page":"895","volume":"4099\/2006","author":"J.-G. Baek","year":"2006","unstructured":"Baek J.-G., Kim C. O., Kwon I.-H. (2006) An adaptive inventory control model for a supply chain with nonstationary customer demands. Computers and Operations Research 4099\/2006: 895\u2013900","journal-title":"Computers and Operations Research"},{"key":"9183_CR46","unstructured":"Wan, A. D. M., & Braspenning, P. J. (1995). The bifurcation of DAI and adaptivism as synthesis. In Proceedings of the 1995 Dutch conference on AI (NAIC) (pp. 253\u2013262)."},{"key":"9183_CR47","doi-asserted-by":"crossref","unstructured":"Leopold, T., Kern-Isberner, G., & Peters, G. (2008). Combining reinforcement learning and belief revision\u2014a learning system for active vision. In Proceedings of the 19th British machine vision conference, Leeds, UK, 1\u20134 Sep 2008.","DOI":"10.5244\/C.22.48"},{"key":"9183_CR48","unstructured":"Stephan, V., Debes, K., Gross, H.-M., Wintrich, F., & Wintrich, H. (2000). A reinforcement learning based neural multiagent system for controlof a combustion process. In Proceedings of IEEE-INNS-ENNS international joint conference on IJCNN 2000, Como, Italy (Vol. 6, pp. 217\u2013222)."},{"key":"9183_CR49","unstructured":"Bradley, J., & Hayes, G. (2005). Adapting reinforcement learning for computer games: Using group utility functions. In Proceedings of IEEE symposium on computational intelligence and games, Colchester, Essex, UK."},{"key":"9183_CR50","unstructured":"Tan, A.-H., & Xiao, D. (2005). Self-organizing cognitive agents and reinforcement learning in a multi-agent environment. In Proceedings of IEEE\/ACM\/WIC international conference on intelligent agent technologies (pp. 351\u2013357)."},{"key":"9183_CR51","unstructured":"Wolpert, D. H., Tumer, K., & Frank, J. (1999). Using collective intelligence to route internet traffic. In Proceedings of the 1998 conference on advances in neural information processing systems II, Cambridge, MA, USA (pp. 952\u2013958). Cambridge, MA: MIT Press."},{"key":"9183_CR52","unstructured":"Subramanian, D., Druschel, P., & Chen, J. (1997). Ants and reinforcement learning: A case study in routing in dynamic networks. In Proceedings of fifteenth international joint conference on artificial intelligence (IJCAI-97) (pp. 832\u2013838). San Francisco, CA: Morgan Kaufmann."},{"key":"9183_CR53","unstructured":"Moore, B. (1988). ART 1 and pattern clustering. In Proceedings of 1988 connectionist models summer school (pp. 174\u2013185)."},{"key":"9183_CR54","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1016\/0893-6080(91)90056-B","volume":"4","author":"G. A. Carpenter","year":"1991","unstructured":"Carpenter G. A., Grossberg S., Rosen D. B. (1991) Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system. Neural Networks 4: 759\u2013771","journal-title":"Neural Networks"},{"key":"9183_CR55","unstructured":"Tan, A.-H. (2007). Direct code access in self-organizing neural architectures for reinforcement learning. In Proceedings, international joint conference on artificial intelligence (IJCAI07), Hyderabad, India (pp. 1071\u20131076)."},{"key":"9183_CR56","unstructured":"P\u00e9rez-Uribe, A. (2002). Structure-adaptable digital neural networks, Ph.D. thesis, Swiss Federal Institute of Technology-Lausanne."},{"key":"9183_CR57","unstructured":"Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of 10th international conference on machine learning (pp. 330\u2013337)."},{"key":"9183_CR58","unstructured":"Onat, A. (1998). Q-learning with recurrent neural networks as a controller for the inverted pendulum problem. In Proceedings of the fifth international conference on neural information processing, Japan, 21\u201323 Oct 1998 (pp. 837\u2013840)."},{"key":"9183_CR59","unstructured":"Sandholm, T., & Crites, R. H. (1995). On multiagent q-learning in a semi-competitive domain. In Proceedings of the workshop on adaption and learning in multi-agent systems (IJCAI\u201995), London, UK (pp. 191\u2013205). London: Springer."},{"issue":"11\u201312","key":"9183_CR60","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1016\/S0965-9978(02)00037-6","volume":"33","author":"M. Benhamadou","year":"2002","unstructured":"Benhamadou M. (2002) On the simplex algorithm \u2019revised form. Advances in Engineering Software 33(11\u201312): 769\u2013777","journal-title":"Advances in Engineering Software"},{"issue":"2","key":"9183_CR61","doi-asserted-by":"crossref","first-page":"183","DOI":"10.2140\/pjm.1955.5.183","volume":"5","author":"G. B. Dantzig","year":"1955","unstructured":"Dantzig G. B., Orden A., Wolfe P. (1955) The generalized simplex method for minimizing a linear form under linear inequality restraints. Pacific Journal of Mathematics 5(2): 183\u2013195","journal-title":"Pacific Journal of Mathematics"}],"container-title":["Autonomous Agents and Multi-Agent Systems"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10458-011-9183-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10458-011-9183-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10458-011-9183-4","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,6,17]],"date-time":"2019-06-17T03:40:54Z","timestamp":1560742854000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10458-011-9183-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,10,7]]},"references-count":61,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,1]]}},"alternative-id":["9183"],"URL":"https:\/\/doi.org\/10.1007\/s10458-011-9183-4","relation":{},"ISSN":["1387-2532","1573-7454"],"issn-type":[{"value":"1387-2532","type":"print"},{"value":"1573-7454","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,10,7]]}}}