{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,19]],"date-time":"2025-10-19T05:54:16Z","timestamp":1760853256788},"reference-count":61,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2007,1,4]],"date-time":"2007-01-04T00:00:00Z","timestamp":1167868800000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Auton Agent Multi-Agent Syst"],"published-print":{"date-parts":[[2007,8,16]]},"DOI":"10.1007\/s10458-006-9008-z","type":"journal-article","created":{"date-parts":[[2007,1,4]],"date-time":"2007-01-04T13:45:17Z","timestamp":1167918317000},"page":"147-196","source":"Crossref","is-referenced-by-count":26,"title":["A framework for meta-level control in multi-agent systems"],"prefix":"10.1007","volume":"15","author":[{"given":"Anita","family":"Raja","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Victor","family":"Lesser","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2007,1,4]]},"reference":[{"key":"9008_CR1","doi-asserted-by":"crossref","unstructured":"Barto, A., Sutton, R., & Anderson, C. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13, 834\u2013846.","DOI":"10.1109\/TSMC.1983.6313077"},{"key":"9008_CR2","volume-title":"Neuro-dynamic programming","author":"D. Bertsekas","year":"1996","unstructured":"Bertsekas D., Tsitsiklis J. (1996). Neuro-dynamic programming. Athena Scientific, Belmont, MA"},{"issue":"2","key":"9008_CR3","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/0004-3702(94)90054-X","volume":"67","author":"M. Boddy","year":"1994","unstructured":"Boddy M., Dean T. (1994). Decision-theoretic deliberation scheduling for problem solving in time-constrained environments. Artificial Intelligence 67(2): 245\u2013286","journal-title":"Artificial Intelligence"},{"key":"9008_CR4","unstructured":"Boutlier, C. (1999). Sequential optimality and coordination in multiagent systems. In Proceedings of the sixteenth international joint conference on artificial intelligence."},{"key":"9008_CR5","unstructured":"Crites R., Barto A. (1996). Improving elevator performance using reinforcement learning, Multi-ag In Advances in Neural Information Processing Systems, pages 8: 1017\u20131023"},{"key":"9008_CR6","unstructured":"Dean, T., & Boddy, M. (1988). An analysis of time-dependent planning. In Proceedings of the seventh national conference on artificial intelligence (AAAI-88) (pp. 49\u201354). Saint Paul, Minnesota, USA: AAAI Press\/MIT Press."},{"key":"9008_CR7","unstructured":"Decker, K. (1996). Taems: a framework for environment centered analysis and design of coordination mechanisms. In G. O\u2019Hare & N. Jennings, (Eds.), Foundations of Distributed Artificial Intelligence, Chapter 16 (pp. 429\u2013448). Wiley Inter-Science."},{"key":"9008_CR8","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1613\/jair.639","volume":"13","author":"T. Dietterich","year":"2000","unstructured":"Dietterich T. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research 13: 227\u2013303","journal-title":"Journal of Artificial Intelligence Research"},{"key":"9008_CR9","unstructured":"Doyle J. (1983). What is rational psychology? toward a modern mental philosophy. AI Magazine 4(3): 50\u201353"},{"key":"9008_CR10","unstructured":"Garvey, A. & Lesser, V. (1996). Issues in design-to-time real-time scheduling. In AAAI Fall 1996 symposium on flexible computation, November."},{"key":"9008_CR11","unstructured":"Georgeff, M. & Lansky, A. (1987). Reactive reasoning and planning. In Proceedings of the sixth national conference on artificial intelligence, pp. 677\u2013682 Seattle, WA."},{"key":"9008_CR12","doi-asserted-by":"crossref","unstructured":"Goldman, R., Musliner, D. & Krebsbach, K. (2003). Managing online self-adaptation in real-time environments. In LNCS, vol. 2614, SV, pp. 6\u201323.","DOI":"10.1007\/3-540-36554-0_2"},{"key":"9008_CR13","unstructured":"Good, I. J. (1971). Twenty-seven principles of rationality. In V. P. Godambe & D. A. Sprott, (Eds.), Foundations of statistical inference (pp. 108\u2013141). Toronto: Holt Rinehart Wilson."},{"issue":"2","key":"9008_CR14","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1145\/242587.242593","volume":"7","author":"E. Hansen","year":"1996","unstructured":"Hansen E., Zilberstein S. (1996). Monitoring anytime algorithms. SIGART Bulletin 7(2): 28\u201333","journal-title":"SIGART Bulletin"},{"key":"9008_CR15","unstructured":"Harada, D. & Russell, S. (1999). Extended abstract: Learning search strategies. In Proceedings AAAI spring symposium on search techniques for problem solving under uncertainty and incomplete information, Stanford, CA, 1999."},{"key":"9008_CR16","doi-asserted-by":"crossref","unstructured":"Hayes-Roth, B. (1993). Opportunistic control of action in intelligent agents. In Proceedings of IEEE transactions on systems, man and cybernetics, pp. SMC\u201323(6), 1575\u20131587.","DOI":"10.1109\/21.257755"},{"key":"9008_CR17","unstructured":"Hayes-Roth, B., Uckun, S., Larsson, J. E., Gaba, D., Barr, J. & Chien, J. (1994). Guardian: A prototype intelligent agent for intensive-care monitoring. In Proceedings of the national conference on artificial intelligence, pp. 1503\u20131511."},{"key":"9008_CR18","unstructured":"Horling, B., Lesser, V. & Vincent, R. (2000). Multi-agent system simulation framework. In sixteenth IMACS World Congress 2000 on scientific computation, applied mathematics and simulation. Switzerland: EPFL, Lausanne."},{"issue":"1","key":"9008_CR19","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1007\/s10458-005-3998-9","volume":"12","author":"B. Horling","year":"2006","unstructured":"Horling B., Lesser V., Vincent R., Wagner T. (2006). The soft real-time agent control architecture. Autonomous Agents and Multi-Agent Systems 12(1): 35\u201392","journal-title":"Autonomous Agents and Multi-Agent Systems"},{"key":"9008_CR20","unstructured":"Horvitz, E. (1988). Reasoning under varying and uncertain resource constraints. In National conference on artificial intelligence of the american association for AI (AAAI-88), pp. 111\u2013116."},{"key":"9008_CR21","unstructured":"Kaelbling, L. (1990). Learning in embedded systems. PhD thesis, Stanford University."},{"key":"9008_CR22","doi-asserted-by":"crossref","unstructured":"Kinney, M. & Tsatsoulis, C. (1998). Learning communication strategies in multiagent systems. Applied intelligence, 9(1), 71\u201391.","DOI":"10.1023\/A:1008251315338"},{"key":"9008_CR23","unstructured":"Kuwabara, K. (1996). Meta-level control of coordination protocols. In Proceedings of the third international conference on multi-agent systems (ICMAS96). pp. 104\u2013111."},{"key":"9008_CR24","unstructured":"Lagoudakis, M. & Littman, M. (2000). Reinforcement learning for algorithm selection. In Proceedings of the seventeenth national conference on artificial intelligence (AAAI-2000), pp. 1081."},{"key":"9008_CR25","unstructured":"Littman, M. & Boyan, J. (1993). A distributed reinforcement learning scheme for network routing. Technical Report CS-93-165."},{"key":"9008_CR26","unstructured":"Littman, M. (1994). Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the eleventh international conference on machine learning (ML-94) (pp. 157\u2013163). Morgan Kaufmann: New Brunswick, NJ."},{"key":"9008_CR27","doi-asserted-by":"crossref","unstructured":"Mataric, M. (1997). Reinforcement learning in the multi-robot domain. Autonomous Robots, 4(1), 73\u201383.","DOI":"10.1023\/A:1008819414322"},{"key":"9008_CR28","doi-asserted-by":"crossref","unstructured":"Musliner, D. J., Hendler, J. A., Agrawala, A. K., Durfee, E. H., Strosnider, J. K. & Paul, C. J. (1995). The Challenges of real-time AI. IEEE Computer, 28(1), 58\u201366.","DOI":"10.1109\/2.362628"},{"key":"9008_CR29","unstructured":"Musliner, D. (1996). Plan execution in mission-critical domains. In Working notes of the AAAI fall symposium on plan execution\u2013problems and issues."},{"key":"9008_CR30","unstructured":"Nakakuki, Y. & Sadeh, N. (1994). Increasing the efficiency of simulated annealing search by learning to recognize (un)promising runs. In Proceedings of the twelfth national conference on artificial intelligence (AAAI-94), pp. 1316\u20131322."},{"key":"9008_CR31","unstructured":"Parr, R. & Russell, S. (1997). Reinforcement learning with hierarchies of machines. In M. I. Jordan, M. J. Kearns, & S. A. Solla (Eds.), Advances in neural information processing systems, vol. 10, The MIT Press."},{"key":"9008_CR32","doi-asserted-by":"crossref","unstructured":"Puterman, M. L. (1994). Markov decision processes \u2013 discrete stochastic dynamic programming. Games as a Framework for Multi-Agent Reinforcement Learning. New York: John Wiley and Sons, Inc.","DOI":"10.1002\/9780470316887"},{"key":"9008_CR33","unstructured":"Raja, A. (2003). Meta-level control in multi-agent systems. PhD thesis, University of Massachusetts at Amherst, Amherst, Massachusetts."},{"key":"9008_CR34","unstructured":"Raja, A., Alexander, G. & Mappillai, V. (2006). Leveraging problem classification in online meta-cognition. In Proceedings of AAAI 2006 spring symposium on distributed plan and schedule management (pp. 97\u2013104) Stanford."},{"key":"9008_CR35","doi-asserted-by":"crossref","unstructured":"Raja, A., Lesser, V., & Wagner, T. (2000). Toward Robust Agent Control in Open Environments. In Proceedings of the fourth international conference on autonomous agents (pp. 84\u201391). Barcelona, Catalonia, Spain: ACM Press.","DOI":"10.1145\/336595.337054"},{"key":"9008_CR36","unstructured":"Russell, S. & Norvig, P. (1995). Artificial intelligence: A modern approach. Prentice Hall."},{"key":"9008_CR37","unstructured":"Russell, S. & Wefald, E. (1992). Do the right thing: studies in limited rationality. MIT press."},{"key":"9008_CR38","unstructured":"Russell, S. J., Subramanian, D. & Parr, R. (1993). Provably bounded optimal agents. In Proceedings of the thirteenth international joint conference on artificial intelligence (IJCAI-93), pp. 338\u2013344."},{"key":"9008_CR39","unstructured":"Russell, S. & Wefald, E. (1989). Principles of metareasoning. In Proceedings of the first international conference on principles of knowledge representation and reasoning. pp. 400\u2013411."},{"key":"9008_CR40","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1016\/0303-2647(95)01551-5","volume":"37","author":"T. Sandholm","year":"1995","unstructured":"Sandholm T., Crites R. (1995). Multiagent reinforcement learning in the iterated prisoner\u2019s dilemma. Biosystems Journal 37: 147\u2013166","journal-title":"Biosystems Journal"},{"key":"9008_CR41","doi-asserted-by":"crossref","unstructured":"Schut, M. & Wooldridge, M. (2001). The control of reasoning in resource-bounded agents. Knowledge Engineering Review, 16(3), 215\u2013240.","DOI":"10.1017\/S0269888901000157"},{"key":"9008_CR42","unstructured":"Sen, S., Sekaran, M. & Hale, J. (1994). Learning to coordinate without sharing information. In Proceedings of the twelfth national conference on artificial intelligence, (pp. 426\u2013431), Seattle, WA."},{"key":"9008_CR43","doi-asserted-by":"crossref","unstructured":"Simon, H., Latsis, S. J. (Ed.) (1976). From substantive to procedural rationality. In Method and Appraisal in Economic. Cambridge University Press, pp. 129\u2013148.","DOI":"10.1017\/CBO9780511572203.006"},{"key":"9008_CR44","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/0004-3702(75)90002-8","volume":"6","author":"H. Simon","year":"1974","unstructured":"Simon H., Kadane J. (1974). Optimal problem solving search: All-or-none solutions. Artificial Intelligence 6: 235\u2013247","journal-title":"Artificial Intelligence"},{"key":"9008_CR45","unstructured":"Simon, H. (1982). Models of bounded rationality. vol. 1. Cambridge, MA: The MIT Press."},{"key":"9008_CR46","unstructured":"Singh, S., Kearns, M., Litman, D. & Walker, M. (2000). Empirical evaluation of a reinforcement learning spoken dialogue system. In Proceedings of the seventeenth national conference on artificial intelligence, pp. 645\u2013651."},{"key":"9008_CR47","unstructured":"Sugawara, T. & Lesser, V. (1993). On-line learning of coordination plans. In Proceedings of the twelth international workshop on distributed artificial intelligence, pp. 335\u2013345,371\u2013377."},{"key":"9008_CR48","unstructured":"Sutton, R. & Barto, A. (1998). Reinforcement learning. MIT Press."},{"key":"9008_CR49","unstructured":"Sutton, R. (1984). Temporal credit assignment in reinforcement learning. PhD thesis, University of Massachusetts Amherst."},{"issue":"1","key":"9008_CR50","first-page":"9","volume":"3","author":"R. Sutton","year":"1988","unstructured":"Sutton R. (1988). Learning to predict by the method of temporal differences. Machine Learning 3(1): 9\u201344","journal-title":"Machine Learning"},{"key":"9008_CR51","doi-asserted-by":"crossref","unstructured":"Sutton, R., Precup, D. & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1\u20132), 181\u2013211.","DOI":"10.1016\/S0004-3702(99)00052-1"},{"key":"9008_CR52","unstructured":"Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the tenth international conference on machine learning, pp. 330\u2013337."},{"key":"9008_CR53","unstructured":"Vincent, R., Horling, B. & Lesser, V. (2001). An agent infrastructure to build and evaluate multi-agent systems: The java agent framework and multi-agent system simulator. In Wagner and Rana (Eds.), Lecture notes in artificial intelligence: infrastructure for agents, multi-agent systems, and scalable multi-agent systems, vol. 1887. Springer."},{"key":"9008_CR54","doi-asserted-by":"crossref","unstructured":"Wagner, T., Garvey, A. & Lesser, V. (1998). Criteria-directed heuristic task scheduling. International Journal of Approximate Reasoning, Special Issue on Scheduling, 19(1\u20132), 91\u2013118. A version also available as UMASS CS TR-97-59.","DOI":"10.1016\/S0888-613X(98)10006-3"},{"key":"9008_CR55","unstructured":"Watkins, C. (1989). Learning from delayed rewards. PhD thesis, Cambridge, England."},{"issue":"1","key":"9008_CR56","first-page":"45","volume":"7","author":"S.D. Whitehead","year":"1991","unstructured":"Whitehead S.D., Ballard D.H. (1991). Learning to perceive and act by trial and error. Machine Learning 7(1): 45\u201383","journal-title":"Machine Learning"},{"key":"9008_CR57","doi-asserted-by":"crossref","unstructured":"Zhang, X. & Lesser, V. (2002). Multi-linked negotiation in multi-agent system. In Proceedings of the first international joint conference on autonomous agents and multiagent systems (AAMAS 2002), pp. 1207\u20131214.","DOI":"10.1145\/545056.545101"},{"key":"9008_CR58","unstructured":"Zilberstein S., Mouaddib A. (1999). Reactive control of dynamic progressive processing. IJCAI, 1268\u20131273"},{"key":"9008_CR59","unstructured":"Zilberstein, S. & Russell, S. J. (1992). Efficient resource-bounded reasoning in AT-RALPH. In James Hendler, (Edn.), Proceedings of the first international conference of artificial intelligence planning systems (AIPS 92) (pp. 260\u2013268) Morgan Kaufmann: College Park, Maryland, USA."},{"issue":"1\u20132","key":"9008_CR60","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/0004-3702(94)00074-3","volume":"82","author":"S. Zilberstein","year":"1996","unstructured":"Zilberstein S., Russell S.J. (1996). Optimal composition of real-time systems. Artificial Intelligence 82(1\u20132):181\u2013213","journal-title":"Artificial Intelligence"},{"key":"9008_CR61","unstructured":"Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. International Conference in Machine Learning, pp. 929\u2013936."}],"container-title":["Autonomous Agents and Multi-Agent Systems"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10458-006-9008-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10458-006-9008-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10458-006-9008-z","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,7]],"date-time":"2021-08-07T06:43:53Z","timestamp":1628318633000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10458-006-9008-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,1,4]]},"references-count":61,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2007,8,16]]}},"alternative-id":["9008"],"URL":"https:\/\/doi.org\/10.1007\/s10458-006-9008-z","relation":{},"ISSN":["1387-2532","1573-7454"],"issn-type":[{"value":"1387-2532","type":"print"},{"value":"1573-7454","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,1,4]]}}}