{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:14:11Z","timestamp":1759133651184},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2007,1,20]],"date-time":"2007-01-20T00:00:00Z","timestamp":1169251200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2007,10,1]]},"DOI":"10.1007\/s10489-006-0034-y","type":"journal-article","created":{"date-parts":[[2007,1,19]],"date-time":"2007-01-19T21:19:29Z","timestamp":1169241569000},"page":"249-267","source":"Crossref","is-referenced-by-count":14,"title":["A layered approach to learning coordination knowledge in multiagent environments"],"prefix":"10.1007","volume":"27","author":[{"given":"Guray","family":"Erus","sequence":"first","affiliation":[]},{"given":"Faruk","family":"Polat","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2007,1,20]]},"reference":[{"key":"34_CR1","doi-asserted-by":"crossref","unstructured":"Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press","DOI":"10.1016\/S1474-6670(17)38315-5"},{"key":"34_CR2","doi-asserted-by":"crossref","unstructured":"Stone P, Veloso M (1997) Multiagent systems: a survey from a machine learning perspective. Tech Rep, Mellon University","DOI":"10.21236\/ADA333248"},{"key":"34_CR3","unstructured":"Sen S, Sekaran M, Hale J (1994) Learning to coordinate without sharing information. In: Proceedings of the 12th national conference on artificial intelligence, pp 426\u2013431"},{"key":"34_CR4","doi-asserted-by":"crossref","unstructured":"Tan M (1993) Multi-agent reinforcement learning: independent vs. cooperative agents. 
In: Proceedings of the tenth international conference on machine learning, pp 330\u2013337","DOI":"10.1016\/B978-1-55860-307-3.50049-6"},{"key":"34_CR5","unstructured":"Kuter U, Polat F (2000) Learning better in dynamic, partially observable environment. In: Lindemann G (eds) Proc. of European conf. on artificial intelligence (ECAI) workshop on modeling artificial societies and hybrid organization, Berlin, pp 50\u201368"},{"issue":"1","key":"34_CR6","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1023\/A:1015009422110","volume":"17","author":"F Polat","year":"2002","unstructured":"Polat F, Abul O (2002) Learning sequences of compatible actions among agents. Artif Intell Rev 17(1): 21\u201337","journal-title":"Artif Intell Rev"},{"key":"34_CR7","unstructured":"Weiss G (1993) Learning to coordinate actions in multiagent systems. In: Proceedings of the 13th international joint conference on artificial intelligence, pp 311\u2013316"},{"key":"34_CR8","unstructured":"Korf RE (1992) A simple solution to pursuit games. In: Working papers of the 11th international workshop on distributed artificial intelligence, pp 183\u2013194"},{"key":"34_CR9","unstructured":"Durfee E, Vidal MJ (1995) Recursive agent modeling using limited rationality. In: Proceedings of the first international conference on multiagent systems (ICMAS\u201995), pp 376\u2013383"},{"key":"34_CR10","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1007\/3-540-60923-7_22","volume-title":"Adaptation and learning in multiagent systems","author":"T Haynes","year":"1996","unstructured":"Haynes T, Sen S (1996) Evolving behavioral strategies in predators and prey. In: Weiss G, Sen S (eds) Adaptation and learning in multiagent systems, Springer Verlag, Berlin, pp 113\u2013126"},{"key":"34_CR11","unstructured":"Tusscher KHWJ, Hagen SHG, Wiering MA (2000) The influence of communication on the choice to behave cooperatively. In: Proc. 
of 10th Belgian-Dutch conference on machine learning (BENELEARN\u20192000)"},{"issue":"2","key":"34_CR12","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1023\/A:1019935502139","volume":"18","author":"S Senkul","year":"2002","unstructured":"Senkul S, Polat F (2002) Learning intelligent behavior in a non-stationary and partially observable environment. Artif Intell Rev 18(2):97\u2013115","journal-title":"Artif Intell Rev"},{"issue":"4","key":"34_CR13","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1109\/5326.897075","volume":"30","author":"O Abul","year":"2000","unstructured":"Abul O, Polat F (2000) Multi-agent reinforcement learning using function approximation. IEEE Trans Syst, Man and Cybern, Part C, 30(4):485\u2013497","journal-title":"IEEE Trans Syst, Man and Cybern, Part C"},{"key":"34_CR14","unstructured":"Park M, Choi J (2002) New reinforcement learning method using multiple q-tables. In: world multiconference on systemics, cybernetics and informatics, pp 88\u201392"},{"key":"34_CR15","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1613\/jair.301","volume":"4","author":"LP Kaelbling","year":"1996","unstructured":"Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: a survey. J Artif Intell Res 4:237\u2013285","journal-title":"J Artif Intell Res"},{"key":"34_CR16","volume-title":"Dynamic programming: deterministic and stochastic models","author":"DP Bertsekas","year":"1987","unstructured":"Bertsekas DP (1987) Dynamic programming: deterministic and stochastic models. Prentice-Hall, Englewood Cliffs, NJ"},{"key":"34_CR17","volume-title":"Dynamic programming and optimal control","author":"DP Bertsekas","year":"1995","unstructured":"Bertsekas DP (1995) Dynamic programming and optimal control. Athena scientific, Belmont, MA"},{"key":"34_CR18","first-page":"175","volume-title":"Learning in graphical models","author":"DJC MacKay","year":"1999","unstructured":"MacKay DJC (1999) Introduction to monte carlo methods. 
In: Jordan M (eds) Learning in graphical models. MIT Press, Cambridge, MA: pp 175\u2013204"},{"key":"34_CR19","unstructured":"Watkins CJCH (1989) Learning from delayed rewards. Ph.D. dissertation, Cambridge University"},{"key":"34_CR20","first-page":"25","volume-title":"Distributed artificial intelligence meets machine learning\u2013learning in multiagent systems, vol. 1221","author":"N Ono","year":"1997","unstructured":"Ono N, Fukomoto K (1997) A modular approach to multiagent reinforcement learning. In: Weiss G (eds) Distributed artificial intelligence meets machine learning\u2013learning in multiagent systems, vol. 1221, Springer-Verlag, Berlin, Germany, pp 25\u201339"},{"key":"34_CR21","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/S0921-8890(01)00114-2","volume":"35","author":"K Park","year":"2001","unstructured":"Park K, Kim YJ, Kim JH (2001) Modular Q-learning based multiagent cooperation for robot soccer. Robot Auton Syst 35:109\u2013122","journal-title":"Robot Auton Syst"},{"key":"34_CR22","doi-asserted-by":"crossref","unstructured":"Kaya M, Alhajj R (2004) Modular fuzzy-reinforcement learning approach with internal model capabilities for multiagent systems. IEEE Trans. on Syst, Man, and Cybern, Part B 34(2): 1210\u20131223","DOI":"10.1109\/TSMCB.2003.821869"},{"key":"34_CR23","volume-title":"The theory of learning in games","author":"D Fudenberg","year":"1998","unstructured":"Fudenberg D, Levine D (1998) The theory of learning in games. MIT Press, Cambridge, MA"},{"key":"34_CR24","unstructured":"Littman ML (1994) Markov games as a framework for multiagent learning. In: Proceedings of the international conference on machine learning. 
San francisco, CA, pp 157\u2013163"},{"key":"34_CR25","doi-asserted-by":"crossref","first-page":"2017","DOI":"10.1162\/089976699300016070","volume":"8","author":"C Szepesvari","year":"1999","unstructured":"Szepesvari C, Littman ML (1999) A unified analysis of value-function-based reinforcement learning algorithms. Neur Comput 8:2017\u20132059","journal-title":"Neur Comput"},{"key":"34_CR26","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1016\/S1389-0417(01)00016-X","volume":"2","author":"J Hu","year":"2001","unstructured":"Hu J, Wellman MP (2001) Learning about other agents in a dynamic multiagent system. J Cognit Syst Res 2:67\u201379","journal-title":"J Cognit Syst Res"},{"key":"34_CR27","unstructured":"Claus C, Boutilier C (1998) The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the national conference on artificial intelligence, pp 746\u2013752"},{"key":"34_CR28","first-page":"1039","volume":"4","author":"J Hu","year":"2003","unstructured":"Hu J, Wellman M (2003) Nash Q-learning for general-sum stochastic games. J Mach Learn Res 4:1039\u20131069","journal-title":"J Mach Learn Res"},{"key":"34_CR29","unstructured":"Littman ML (2001) Friend-or-foe: Q-Learning in general-sum games. In: Proceedings of the international conference on machine learning, pp 322\u2013328"},{"key":"34_CR30","unstructured":"Shoham Y, Powers R, Grenager T (2003) Multiagent reinforcement learning: a critical survey. Computer Science Department, Tech Rep Stanford University"},{"key":"34_CR31","unstructured":"Hu J (2003) Best-response algorithm for multiagent reinforcement learning. In: Proceedings of the international conference on machine learning"},{"key":"34_CR32","unstructured":"Weinberg M, Rosenschein JS (2004) Best-response multiagent learning in non-stationary environments. 
In: Proceedings of international joint conference on autonomous agents and multiagent systems, pp 506\u2013513"},{"key":"34_CR33","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/S0004-3702(02)00121-2","volume":"136","author":"M Bowling","year":"2001","unstructured":"Bowling M, Veloso M (2001) Multiagent learning using a variable learning rate. Artif Intell 136:215\u2013250","journal-title":"Artif Intell"},{"key":"34_CR34","unstructured":"Strens M (2000) A bayesian framework for reinforcement learning. In: Proceedings of international conference on machine learning, Stanford University, CA"},{"key":"34_CR35","doi-asserted-by":"crossref","unstructured":"Chalkiadakis G, Boutilier C (2003) Coordination in multiagent reinforcement learning: a bayesian approach. In: Proceedings of international joint conference on autonomous agents and multiagent systems, Melbourne, Australia, pp 709\u2013716","DOI":"10.1145\/860575.860689"},{"issue":"4","key":"34_CR36","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1023\/A:1025696116075","volume":"13","author":"A Barto","year":"2003","unstructured":"Barto A, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Discr Event Dynam Syst 13(4):341\u2013379","journal-title":"Discr Event Dynam Syst"},{"key":"34_CR37","unstructured":"Marthi B, Russell S, Latham D, Guestrin C (2005) Concurrent hierarchical reinforcement learning. In: The twentieth international joint conference on artificial intelligence, IJ CAI, (accepted for presentation), Edinburgh, Scotland"},{"issue":"1\u20132","key":"34_CR38","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/S0004-3702(99)00052-1","volume":"112","author":"R Sutton","year":"1999","unstructured":"Sutton R, Precup D, Singh S (1999) Between mdps and semi-mdps: a framework for temporal abstraction in reinforcement learning. 
Artif intell 112(1\u20132):181\u2013211","journal-title":"Artif intell"},{"key":"34_CR39","unstructured":"Parr R, Russell S (1998) Reinforcement learning with hierarchies of machines. In: Advances in neural information processing systems: Proc. of the 1997 conference. MIT Press, Cambridge, MA"},{"key":"34_CR40","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1613\/jair.639","volume":"9","author":"T Dietterich","year":"2000","unstructured":"Dietterich T (2000) Hierarchical reinforcement learning with the maxq value function decomposition. J Artif Intell Res 9:227\u2013303","journal-title":"J Artif Intell Res"},{"key":"34_CR41","doi-asserted-by":"crossref","unstructured":"Menache I, Mannor S, Shimkin N (2002) Q-cut dynamic discovery of sub-goals in reinforcement learning. In: Proc. of European conference on machine learning, ECML\u201902, Springer-Verlag, London, UK, pp 295\u2013306","DOI":"10.1007\/3-540-36755-1_25"},{"key":"34_CR42","doi-asserted-by":"crossref","unstructured":"Stolle M, Precup D (2002) Learning options in reinforcement learning. In: Proc. of the int\u2019l symposium on abstraction, reformulation and approximation. Springer-Verlag, London, UK, pp 212\u2013223","DOI":"10.1007\/3-540-45622-8_16"},{"key":"34_CR43","doi-asserted-by":"crossref","unstructured":"Simsek O, Barto A (2004) Using relative novelty to identify useful temporal abstractions in reinforcement learning. In: Proc. of int\u2019l conference on machine learning, ICML\u201904, Banff, Canada","DOI":"10.1145\/1015330.1015353"},{"issue":"3\u20134","key":"34_CR44","first-page":"293","volume":"8","author":"L Lin","year":"1992","unstructured":"Lin L (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach Learn 8(3\u20134):293\u2013321","journal-title":"Mach Learn"},{"key":"34_CR45","unstructured":"Pickett M, Barto A (2002) An algorithm for creating useful macro-actions in reinforcement learning. In: Proc. 
of int\u2019l conference on machine learning, ICML\u201902"},{"key":"34_CR46","unstructured":"Girgin S, Polat F, Alhajj R (2006) Learning by automatic option discovery from conditionally terminating sequences. In: Proc. of the 17th European conference on artificial intelligence (ECAI), Riva del garda, Italy"},{"key":"34_CR47","doi-asserted-by":"crossref","unstructured":"Girgin S, Polat F (2005) Option discovery in reinforcement learning using frequent subsequences of actions. In: Proc. of international conference on intelligent agents web technologies and internet commerce, IAWTIC. IEEE, Vienna, Austria","DOI":"10.1109\/CIMCA.2005.1631294"},{"key":"34_CR48","unstructured":"Girgin S, Polat F, Alhajj R (2007) State similarity based approach for improving performance in rl. In: The twentieth international joint conference on artificial intelligence IJCAI, (Accepted for presentation), Hyderabad, India"},{"key":"34_CR49","doi-asserted-by":"crossref","unstructured":"Ghavamzadeh M, Mahadevan S, Makar R (2006) Hierarchical multiagent reinforcement learning. J Auton Agents Multiagent Syst 13(2):197\u2013229, DOI: 10.1007\/s10458-006-7035-4","DOI":"10.1007\/s10458-006-7035-4"},{"key":"34_CR50","unstructured":"Rovatsos M, Fischer F, Weiss G (2004) Hierarchical reinforcement learning for communicating agents. 
In: Hoe WVD (eds) Proceedings of the 2nd European Workshop on Multiagent Systems (EUMAS), pp 593\u2013604"}],"container-title":["Applied Intelligence"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-006-0034-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10489-006-0034-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-006-0034-y","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,5,29]],"date-time":"2019-05-29T18:25:38Z","timestamp":1559154338000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10489-006-0034-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,1,20]]},"references-count":50,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2007,10,1]]}},"alternative-id":["34"],"URL":"https:\/\/doi.org\/10.1007\/s10489-006-0034-y","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"value":"0924-669X","type":"print"},{"value":"1573-7497","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,1,20]]}}}