{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,27]],"date-time":"2025-03-27T20:33:08Z","timestamp":1743107588195,"version":"3.40.3"},"publisher-location":"Berlin, Heidelberg","reference-count":100,"publisher":"Springer Berlin Heidelberg","isbn-type":[{"type":"print","value":"9783642398742"},{"type":"electronic","value":"9783642398759"}],"license":[{"start":{"date-parts":[[2013,1,1]],"date-time":"2013-01-01T00:00:00Z","timestamp":1356998400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2013,1,1]],"date-time":"2013-01-01T00:00:00Z","timestamp":1356998400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013]]},"DOI":"10.1007\/978-3-642-39875-9_2","type":"book-chapter","created":{"date-parts":[[2013,11,13]],"date-time":"2013-11-13T14:29:24Z","timestamp":1384352964000},"page":"13-46","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Behavioral Hierarchy: Exploration and Representation"],"prefix":"10.1007","author":[{"given":"Andrew G.","family":"Barto","sequence":"first","affiliation":[]},{"given":"George","family":"Konidaris","sequence":"additional","affiliation":[]},{"given":"Christopher","family":"Vigorito","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2013,9,28]]},"reference":[{"key":"2_CR1","doi-asserted-by":"publisher","first-page":"338","DOI":"10.1007\/3-540-45657-0_25","volume-title":"Computer aided verification: 14th international conference, proceedings (Lecture notes in computer science)","author":"R. 
Alur","year":"2002","unstructured":"Alur, R., McDougall, M., Yang, Z. (2002). Exploiting behavioral hierarchy for efficient model checking. In E. Brinksma & K. G. Larsen (Eds.), Computer aided verification: 14th international conference, proceedings (Lecture notes in computer science) (pp. 338\u2013342). Berlin: Springer."},{"unstructured":"Amarel, S. (1981). Problems of representation in heuristic problem solving: related issues in the development of expert systems. Technical Report CBM-TR-118, Laboratory for Computer Science, Rutgers University, New Brunswick NJ.","key":"2_CR2"},{"key":"2_CR3","doi-asserted-by":"publisher","first-page":"1036","DOI":"10.1037\/0033-295X.111.4.1036","volume":"111","author":"J. R. Anderson","year":"2004","unstructured":"Anderson, J.\u00a0R. (2004). An integrated theory of mind. Psychological Review, 111, 1036\u20131060.","journal-title":"Psychological Review"},{"volume-title":"An introduction to intelligent and autonomous control","year":"1993","unstructured":"Antsaklis, P.\u00a0J., & Passino, K.\u00a0M. (Eds.), (1993). An introduction to intelligent and autonomous control. Norwell MA: Kluwer.","key":"2_CR4"},{"key":"2_CR5","first-page":"438","volume-title":"Proceedings of the 8-th conference on intelligent autonomous systems, IAS-8","author":"B. Bakker","year":"2004","unstructured":"Bakker, B., & Schmidhuber, J. (2004). Hierarchical reinforcement learning based on subgoal discovery and subpolicy specialization. In F. Groen, N. Amato, A. Bonarini, E. Yoshida, B.\u00a0Kr\u00f6se (Eds.), Proceedings of the 8-th conference on intelligent autonomous systems, IAS-8 (pp. 438\u2013445). Amsterdam, The Netherlands: IOS."},{"volume-title":"Intrinsically motivated learning in natural and artificial systems","year":"2012","unstructured":"Baldassarre, G., & Mirolli, M. (Eds.), (2012). Intrinsically motivated learning in natural and artificial systems. 
Berlin: Springer.","key":"2_CR6"},{"doi-asserted-by":"crossref","unstructured":"Barto, A., Singh, S., Chentanez, N. (2004). Intrinsically motivated learning of hierarchical collections of skills. In J. Triesch & T. Jebara (Eds.), Proceedings of the 2004 international conference on development and learning (pp. 112\u2013119). UCSD Institute for Neural Computation.","key":"2_CR7","DOI":"10.21236\/ADA440280"},{"key":"2_CR8","volume-title":"Intrinsically motivated learning in natural and artificial systems","author":"A. G. Barto","year":"2012","unstructured":"Barto, A.\u00a0G. (2012). Intrinsic motivation and reinforcement learning. In G. Baldassarre & M.\u00a0Mirolli (Eds.), Intrinsically motivated learning in natural and artificial systems. Berlin: Springer."},{"key":"2_CR9","doi-asserted-by":"publisher","first-page":"341","DOI":"10.1023\/A:1025696116075","volume":"13","author":"A. G. Barto","year":"2003","unstructured":"Barto, A.\u00a0G., & Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dynamical Systems: Theory and Applications, 13, 341\u2013379.","journal-title":"Discrete Event Dynamical Systems: Theory and Applications"},{"key":"2_CR10","volume-title":"Dynamic programming","author":"R. E. Bellman","year":"1957","unstructured":"Bellman, R.\u00a0E. (1957). Dynamic programming. Princeton: Princeton University Press."},{"unstructured":"Bernstein, D.\u00a0S. (1999). Reusing old policies to accelerate learning on new MDPs. Technical Report UM-CS-1999-026, Department of Computer Science, University of Massachusetts Amherst.","key":"2_CR11"},{"key":"2_CR12","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1037\/0033-295X.111.2.395","volume":"111","author":"M. M. Botvinick","year":"2004","unstructured":"Botvinick, M.\u00a0M., & Plaut, D.\u00a0C. (2004). Doing without schema hierarchies: a recurrent connectionist approach to normal and impaired routine sequential action. 
Psychological Review, 111, 395\u2013429.","journal-title":"Psychological Review"},{"key":"2_CR13","doi-asserted-by":"publisher","first-page":"262","DOI":"10.1016\/j.cognition.2008.08.011","volume":"113","author":"M. M. Botvinick","year":"2009","unstructured":"Botvinick, M.\u00a0M., Niv, Y., Barto, A.\u00a0G. (2009). Hierarchically organized behavior and its neural foundations: a reinforcement-learning perspective. Cognition, 113, 262\u2013280.","journal-title":"Cognition"},{"key":"2_CR14","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1016\/S0004-3702(00)00033-3","volume":"121","author":"C. Boutilier","year":"2000","unstructured":"Boutilier, C., Dearden, R., Goldszmidt, M. (2000). Stochastic dynamic programming with factored representations. Artificial Intelligence, 121, 49\u2013107.","journal-title":"Artificial Intelligence"},{"key":"2_CR15","first-page":"52","volume-title":"UAI \u201991: proceedings of the seventh annual conference on uncertainty in artificial intelligence","author":"W. Buntine","year":"1991","unstructured":"Buntine, W. (1991). Theory refinement on Bayesian networks. In B. D\u2019Ambrosio & P. Smets (Eds.), UAI \u201991: proceedings of the seventh annual conference on uncertainty in artificial intelligence (pp. 52\u201360). San Francisco: Morgan Kaufmann."},{"key":"2_CR16","doi-asserted-by":"publisher","first-page":"534","DOI":"10.1177\/02783649922066385","volume":"18","author":"R. R. Burridge","year":"1999","unstructured":"Burridge, R.\u00a0R., Rizzi, A.\u00a0A., Koditschek, D.\u00a0E. (1999). Sequential composition of dynamically dextrous robot behaviors. International Journal of Robotics Research, 18, 534\u2013555.","journal-title":"International Journal of Robotics Research"},{"key":"2_CR17","doi-asserted-by":"crossref","first-page":"3","DOI":"10.7551\/mitpress\/4734.003.0006","volume-title":"Modularity: understanding the development and evolution of natural complex systems","author":"W. 
Callebaut","year":"2005","unstructured":"Callebaut, W. (2005). The ubiquity of modularity. In W. Callebaut & D. Rasskin-Gutman (Eds.), Modularity: understanding the development and evolution of natural complex systems (pp. 3\u201328). Cambridge: MIT."},{"volume-title":"Modularity: understanding the development and evolution of natural complex systems","year":"2005","unstructured":"Callebaut, W., & Rasskin-Gutman, D. (Eds.) (2005). Modularity: understanding the development and evolution of natural complex systems. Cambridge: MIT.","key":"2_CR18"},{"key":"2_CR19","first-page":"1679","volume-title":"Machine learning, proceedings of the 29th international conference (ICML 2012)","author":"B. C. da Silva","year":"2012","unstructured":"da Silva, B.\u00a0C., Konidaris, G., & Barto, A.\u00a0G. (2012). Learning parameterized skills. In J. Langford\u00a0& J. Pineau (Eds.), Machine learning, proceedings of the 29th international conference (ICML 2012) (pp. 1679\u20131686). Omnipress: Edinburgh."},{"key":"2_CR20","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1111\/j.1467-8640.1989.tb00324.x","volume":"5","author":"T. L. Dean","year":"1989","unstructured":"Dean, T.\u00a0L., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5, 142\u2013150.","journal-title":"Computational Intelligence"},{"doi-asserted-by":"crossref","unstructured":"Degris, T., Sigaud, O., Wuillemin, P.\u00a0H. (2006). Learning the structure of factored Markov decision processes in reinforcement learning problems. In W. W. Cohen & A. Moore (Eds.), Machine learning, proceedings of the twenty-third international conference (ICML 2006). ACM international conference proceeding series (vol. 148, pp. 257\u2013264). New York: ACM.","key":"2_CR21","DOI":"10.1145\/1143844.1143877"},{"key":"2_CR22","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1613\/jair.639","volume":"13","author":"T. G. 
Dietterich","year":"2000","unstructured":"Dietterich, T.\u00a0G. (2000a). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227\u2013303.","journal-title":"Journal of Artificial Intelligence Research"},{"key":"2_CR23","first-page":"994","volume-title":"Advances in neural information processing systems 12","author":"T. G. Dietterich","year":"2000","unstructured":"Dietterich, T.\u00a0G. (2000b). State abstraction in MAXQ hierarchical reinforcement learning. In S.\u00a0A.\u00a0Solla, T.\u00a0K. Leen, K.-R. M\u00fcller (Eds.), Advances in neural information processing systems 12 (pp. 994\u20131000). Cambridge: MIT."},{"key":"2_CR24","doi-asserted-by":"crossref","first-page":"363","DOI":"10.7551\/mitpress\/3118.003.0044","volume-title":"From animals to animats 4: proceedings of the fourth international conference on simulation of adaptive behavior","author":"B. Digney","year":"1996","unstructured":"Digney, B. (1996). Emergent hierarchical control structures: learning reactive\/hierarchical relationships in reinforcement environments. In P. Maes, M. Mataric, J.-A. Meyer, J. Pollack, S.\u00a0W. Wilson (Eds.), From animals to animats 4: proceedings of the fourth international conference on simulation of adaptive behavior (pp. 363\u2013372). Cambridge: MIT."},{"doi-asserted-by":"crossref","unstructured":"Diuk, C., Li, L., Leffler, B. (2009). The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning. In A.\u00a0P. Danyluk, L. Bottou, M.\u00a0L. Littman (Eds.), Proceedings of the 26th annual international conference on machine learning, ICML 2009. ACM international conference proceeding series (vol. 382, pp. 249\u2013256). New York: ACM.","key":"2_CR25","DOI":"10.1145\/1553374.1553406"},{"key":"2_CR26","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1016\/0004-3702(72)90051-3","volume":"3","author":"R. E. 
Fikes","year":"1972","unstructured":"Fikes, R.\u00a0E., Hart, P.\u00a0E., Nilsson, N.\u00a0J. (1972). Learning and executing generalized robot plans. Artificial Intelligence, 3, 251\u2013288.","journal-title":"Artificial Intelligence"},{"key":"2_CR27","first-page":"139","volume-title":"UAI \u201998: proceedings of the fourteenth conference on uncertainty in artificial intelligence","author":"N. Friedman","year":"1998","unstructured":"Friedman, N., Murphy, K., Russell, S. (1998). Learning the structure of dynamic probabilistic networks. In G.\u00a0F. Cooper & S. Moral (Eds.), UAI \u201998: proceedings of the fourteenth conference on uncertainty in artificial intelligence (pp. 139\u2013147). San Francisco: Morgan Kaufmann."},{"key":"2_CR28","first-page":"1003","volume-title":"IJCAI-03, Proceedings of the eighteenth international joint conference on artificial intelligence","author":"C. Guestrin","year":"2003","unstructured":"Guestrin, C., Koller, D., Gearhart, C., Kanodia, N. (2003). Generalizing plans to new environments in relational MDPs. In IJCAI-03, Proceedings of the eighteenth international joint conference on artificial intelligence (pp. 1003\u20131010). San Francisco: Morgan Kaufmann."},{"doi-asserted-by":"crossref","unstructured":"Hart, S., & Grupen, R. (2011). Learning generalizable control programs. IEEE Transactions on Autonomous Mental Development, 3, 216\u2013231. Special Issue on Representations and Architectures for Cognitive Systems.","key":"2_CR29","DOI":"10.1109\/TAMD.2010.2103311"},{"key":"2_CR30","volume-title":"Intrinsically motivated learning in natural and artificial systems","author":"S. Hart","year":"2012","unstructured":"Hart, S., & Grupen, R. (2012). Intrinsically motivated affordance discovery and modeling. In G. Baldassarre & M. Mirolli (Eds.), Intrinsically motivated learning in natural and artificial systems. Berlin: Springer."},{"key":"2_CR31","first-page":"197","volume":"20","author":"D. 
Heckerman","year":"1995","unstructured":"Heckerman, D., Geiger, D., Chickering, D. (1995). Learning Bayesian networks: the combination of knowledge and statistical data. Machine Learning, 20, 197\u2013243.","journal-title":"Machine Learning"},{"key":"2_CR32","first-page":"243","volume-title":"Machine learning, proceedings of the nineteenth international conference (ICML 2002)","author":"B. Hengst","year":"2002","unstructured":"Hengst, B. (2002). Discovering hierarchy in reinforcement learning with HEXQ. In C. Sammut & A. G. Hoffmann (Eds.), Machine learning, proceedings of the nineteenth international conference (ICML 2002) (pp. 243\u2013250). San Francisco: Morgan Kaufmann."},{"key":"2_CR33","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1016\/S0921-8890(97)00044-4","volume":"22","author":"M. Huber","year":"1997","unstructured":"Huber, M., & Grupen, R.\u00a0A. (1997). A feedback control structure for on-line learning tasks. Robotics and Autonomous Systems, 22, 303\u2013315.","journal-title":"Robotics and Autonomous Systems"},{"key":"2_CR34","first-page":"67","volume-title":"Perceiving, acting, and knowing: toward an ecological psychology","author":"J. Gibson","year":"1977","unstructured":"Gibson, J. (1977). The theory of affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, acting, and knowing: toward an ecological psychology (pp. 67\u201382). Hillsdale: Lawrence Erlbaum."},{"key":"2_CR35","first-page":"1054","volume-title":"Advances in neural information processing systems 14: proceedings of the 2001 neural information processing systems (NIPS) conference","author":"A. Jonsson","year":"2002","unstructured":"Jonsson, A., & Barto, A.\u00a0G. (2002). Automated state abstraction for options using the U-tree algorithm. In T.\u00a0G. Dietterich, S. Becker, Z. Ghahramani (Eds.), Advances in neural information processing systems 14: proceedings of the 2001 neural information processing systems (NIPS) conference (pp. 1054\u20131060). 
Cambridge: MIT."},{"key":"2_CR36","first-page":"2259","volume":"7","author":"A. Jonsson","year":"2006","unstructured":"Jonsson, A., & Barto, A.\u00a0G. (2006). Causal graph based decomposition of factored MDPs. Journal of Machine Learning Research, 7, 2259\u20132301.","journal-title":"Journal of Machine Learning Research"},{"doi-asserted-by":"crossref","unstructured":"Jonsson, A., & Barto, A.\u00a0G. (2007). Active learning of dynamic Bayesian networks in Markov decision processes. In I. Miguel & W. Ruml (Eds.), Proceedings of Abstraction, reformulation, and approximation, 7th international symposium, SARA 2007, Whistler, Canada, July 18\u201321, 2007. Lecture notes in computer science: abstraction, reformulation, and approximation (vol. 4612, pp. 273\u2013284). Berlin: Springer.","key":"2_CR37","DOI":"10.1007\/978-3-540-73580-9_22"},{"key":"2_CR38","first-page":"895","volume-title":"IJCAI 2007, proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, India, 6\u201312 January 2007","author":"G. Konidaris","year":"2007","unstructured":"Konidaris, G., & Barto, A. (2007). Building portable options: Skill transfer in reinforcement learning. In M. Veloso (Ed.), IJCAI 2007, proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, India, 6\u201312 January 2007 (pp. 895\u2013900). Menlo Park: AAAI Press."},{"key":"2_CR39","first-page":"1107","volume-title":"IJCAI 2009, Proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, 11\u201317 July 2009","author":"G. Konidaris","year":"2009","unstructured":"Konidaris, G., & Barto, A. (2009a). Efficient skill learning using abstraction selection. In C.\u00a0Boutilier (Ed.), IJCAI 2009, Proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, 11\u201317 July 2009 (pp. 1107\u20131112). Menlo Park: AAAI Press."},{"unstructured":"Konidaris, G., & Barto, A. 
(2009b). Skill discovery in continuous reinforcement learning domains using skill chaining. In Y. Bengio, D. Schuurmans, J. Lafferty, C. Williams, A. Culotta (Eds.), Proceedings of the 2009 conference of Advances in neural information processing systems 22 (pp. 1015\u20131023). NIPS Foundation.","key":"2_CR40"},{"key":"2_CR41","first-page":"1333","volume":"13","author":"G. Konidaris","year":"2012","unstructured":"Konidaris, G., Barto, A., Scheidwasser, I. (2012a). Transfer in reinforcement learning via shared features. Journal of Machine Learning Research, 13, 1333\u20131371.","journal-title":"Journal of Machine Learning Research"},{"key":"2_CR42","first-page":"1468","volume-title":"Proceedings of the twenty-fifth AAAI conference on artificial intelligence, AAAI 2011","author":"G. Konidaris","year":"2011","unstructured":"Konidaris, G., Kuindersma, S., Grupen, R., Barto, A. (2011a). Autonomous skill acquisition on a mobile manipulator. In W. Burgard & D. Roth (Eds.), Proceedings of the twenty-fifth AAAI conference on artificial intelligence, AAAI 2011 (pp. 1468\u20131473). San Francisco: AAAI."},{"key":"2_CR43","doi-asserted-by":"publisher","first-page":"360","DOI":"10.1177\/0278364911428653","volume":"31","author":"G. Konidaris","year":"2012","unstructured":"Konidaris, G., Kuindersma, S., Grupen, R., Barto, A. (2012b). Robot learning from demonstration by constructing skill trees. The International Journal of Robotics Research, 31, 360\u2013375.","journal-title":"The International Journal of Robotics Research"},{"key":"2_CR44","first-page":"380","volume-title":"Proceedings of the twenty-fifth AAAI conference on artificial intelligence, AAAI 2011","author":"G. Konidaris","year":"2011","unstructured":"Konidaris, G., Osentoski, S., Thomas, P. (2011b). Value function approximation in reinforcement learning using the Fourier basis. In W. Burgard & D. Roth (Eds.), Proceedings of the twenty-fifth AAAI conference on artificial intelligence, AAAI 2011 (pp. 380\u2013385). 
San Francisco: AAAI."},{"unstructured":"Konidaris, G.\u00a0D. (2011). Autonomous robot skill acquisition. PhD thesis, Computer Science, University of Massachusetts Amherst.","key":"2_CR45"},{"key":"2_CR46","volume-title":"Learning to solve problems by searching for macro-operators","author":"R. E. Korf","year":"1985","unstructured":"Korf, R.\u00a0E. (1985). Learning to solve problems by searching for macro-operators. Boston: Pitman."},{"key":"2_CR47","doi-asserted-by":"publisher","first-page":"316","DOI":"10.1016\/j.cogsys.2008.07.003","volume":"10","author":"P. Langley","year":"2009","unstructured":"Langley, P., Choi, D., Rogers, S. (2009). Acquisition of hierarchical reactive skills in a unified cognitive architecture. Cognitive Systems Research, 10, 316\u2013332.","journal-title":"Cognitive Systems Research"},{"unstructured":"Langley, P., & Rogers, S. (2004). Cumulative learning of hierarchical skills. In J. Triesch & T.\u00a0Jebara (Eds.), Proceedings of the 2004 international conference on development and learning (pp. 1\u20138). UCSD Institute for Neural Computation.","key":"2_CR48"},{"key":"2_CR49","first-page":"112","volume-title":"Cerebral mechanisms in behavior: the Hixon symposium","author":"K. S. Lashley","year":"1951","unstructured":"Lashley, K.\u00a0S. (1951). The problem of serial order in behavior. In L.\u00a0A. Jeffress (Ed.), Cerebral mechanisms in behavior: the Hixon symposium (pp. 112\u2013136). New York: Wiley."},{"doi-asserted-by":"crossref","unstructured":"Lewis, F.\u00a0L., & Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control. In IEEE circuits and systems magazine (vol.\u00a09, pp. 32\u201350). IEEE Circuits and Systems Society.","key":"2_CR50","DOI":"10.1109\/MCAS.2009.933854"},{"unstructured":"Li, L., Walsh, T., Littman, M. (2006). Towards a unified theory of state abstraction for MDPs. 
In International symposium on artificial intelligence and mathematics (ISAIM 2006), Fort Lauderdale, Florida, USA, 4\u20136 January 2006.","key":"2_CR51"},{"key":"2_CR52","first-page":"415","volume-title":"Proceedings, the twenty-first national conference on artificial intelligence and the eighteenth innovative applications of artificial intelligence conference","author":"Y. Liu","year":"2006","unstructured":"Liu, Y., & Stone, P. (2006). Value-function-based transfer for reinforcement learning using structure mapping. In Proceedings, the twenty-first national conference on artificial intelligence and the eighteenth innovative applications of artificial intelligence conference (pp. 415\u2013420). San Francisco: AAAI."},{"key":"2_CR53","volume-title":"Learning representation and control in Markov decision processes: new frontiers. Foundations and trends in machine learning (vol. 1)","author":"S. Mahadevan","year":"2009","unstructured":"Mahadevan, S. (2009). Learning representation and control in Markov decision processes: new frontiers. Foundations and trends in machine learning (vol. 1). Hanover: Now Publishers Inc."},{"key":"2_CR54","volume-title":"Dynamic abstraction in reinforcement learning via clustering. In C. E. Brodley (Ed.), Machine learning, proceedings of the twenty-first international conference (ICML 2004). ACM international conference proceeding series (vol. 69, pp. 560\u2013567)","author":"S. Mannor","year":"2004","unstructured":"Mannor, S., Menache, I., Hoze, A., Klein, U. (2004). Dynamic abstraction in reinforcement learning via clustering. In C.\u00a0E. Brodley (Ed.), Machine learning, proceedings of the twenty-first international conference (ICML 2004). ACM international conference proceeding series (vol. 69, pp. 560\u2013567). New York: ACM."},{"unstructured":"McCallum, A.\u00a0K. (1996). Reinforcement learning with selective perception and hidden state. 
PhD thesis, University of Rochester.","key":"2_CR55"},{"key":"2_CR56","first-page":"361","volume-title":"Proceedings of the eighteenth international conference on machine learning (ICML 2001)","author":"A. McGovern","year":"2001","unstructured":"McGovern, A., & Barto, A. (2001). Automatic discovery of subgoals in reinforcement learning using diverse density. In C.\u00a0E. Brodley & A.\u00a0P. Danyluk (Eds.), Proceedings of the eighteenth international conference on machine learning (ICML 2001) (pp. 361\u2013368). San Francisco: Morgan Kaufmann."},{"key":"2_CR57","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1007\/s10994-008-5061-y","volume":"73","author":"N. Mehta","year":"2008","unstructured":"Mehta, N., Natarajan, S., Tadepalli, P. (2008). Transfer in variable-reward hierarchical reinforcement learning. Machine Learning, 73, 289\u2013312.","journal-title":"Machine Learning"},{"key":"2_CR58","volume-title":"Q-Cut \u2013 Dynamic discovery of sub-goals in reinforcement learning. In Machine learning: ECML 2002, 13th European conference on machine learning. Lecture notes in computer science (vol. 2430, pp. 295\u2013306)","author":"I. Menache","year":"2002","unstructured":"Menache, I., Mannor, S., Shimkin, N. (2002). Q-Cut\u00a0\u2013 Dynamic discovery of sub-goals in reinforcement learning. In Machine learning: ECML 2002, 13th European conference on machine learning. Lecture notes in computer science (vol. 2430, pp. 295\u2013306). Berlin: Springer."},{"key":"2_CR59","doi-asserted-by":"publisher","DOI":"10.1037\/10039-000","volume-title":"Plans and the structure of behavior","author":"G. A. Miller","year":"1960","unstructured":"Miller, G.\u00a0A., Galanter, E., Pribram, K.\u00a0H. (1960). Plans and the structure of behavior. 
New York: Holt, Rinehart & Winston."},{"key":"2_CR60","first-page":"1175","volume-title":"IJCAI 2009, Proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, 11\u201317 July 2009","author":"J. Mugan","year":"2009","unstructured":"Mugan, J., & Kuipers, B. (2009). Autonomously learning an action hierarchy using a learned qualitative state representation. In C. Boutilier (Ed.), IJCAI 2009, Proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, 11\u201317 July 2009 (pp. 1175\u20131180). Menlo Park: AAAI Press."},{"key":"2_CR61","volume-title":"Active learning of causal Bayes net structure","author":"K. Murphy","year":"2001","unstructured":"Murphy, K. (2001). Active learning of causal Bayes net structure. Technical report, Computer Science Division, University of California, Berkeley CA."},{"doi-asserted-by":"crossref","unstructured":"Neumann, G., Maass, W., Peters, J. (2009). Learning complex motions by sequencing simpler motion templates. In A.\u00a0P. Danyluk, L. Bottou, M.\u00a0L. Littman (Eds.), Proceedings of the 26th annual international conference on machine learning, ICML 2009. ACM international conference proceeding series (vol. 382, pp. 753\u2013760). New York: ACM.","key":"2_CR62","DOI":"10.1145\/1553374.1553471"},{"key":"2_CR63","first-page":"279","volume-title":"Computers and thought","author":"A. Newell","year":"1963","unstructured":"Newell, A., Shaw, J.\u00a0C., Simon, H.\u00a0A. (1963). GPS, a program that simulates human thought. In J. Feldman (Ed.), Computers and thought (pp. 279\u2013293). New York: McGraw-Hill."},{"unstructured":"Niekum, S., & Barto, A.\u00a0G. (2011). Clustering via Dirichlet process mixture models for portable skill discovery. In J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, K. Weinberger (Eds.), Advances in neural information processing systems 24 (NIPS) (pp. 1818\u20131826). 
Curran Associates.","key":"2_CR64"},{"unstructured":"Osentoski, S., & Mahadevan, S. (2010). Basis function construction for hierarchical reinforcement learning. In W. van der Hoek, G.\u00a0A. Kaminka, Y. Lesp\u00e9rance, M. Luck, S. Sen (Eds.), 9th international conference on autonomous agents and multiagent systems (AAMAS 2010) (pp. 747\u2013754). International Foundation for Autonomous Agents and MultiAgent Systems (IFAAMAS).","key":"2_CR65"},{"unstructured":"Parr, R. (1998). Hierarchical control and learning for Markov decision processes. PhD thesis, University of California, Berkeley CA.","key":"2_CR66"},{"key":"2_CR67","first-page":"1043","volume-title":"Advances in neural information processing systems 10: proceedings of the 1997 conference","author":"R. Parr","year":"1998","unstructured":"Parr, R., & Russell, S. (1998). Reinforcement learning with hierarchies of machines. In M.\u00a0I.\u00a0Jordan, M.\u00a0J. Kearns, S.\u00a0A. Solla (Eds.), Advances in neural information processing systems 10: proceedings of the 1997 conference (pp. 1043\u20131049). Cambridge: MIT."},{"key":"2_CR68","volume-title":"Causality: models, reasoning, and inference","author":"J. Pearl","year":"2000","unstructured":"Pearl, J. (2000). Causality: models, reasoning, and inference. Cambridge: Cambridge University Press."},{"unstructured":"Perkins, T.\u00a0J., & Precup, D. (1999). Using options for knowledge transfer in reinforcement learning. Technical Report UM-CS-1999-034, University of Massachusetts Amherst.","key":"2_CR69"},{"key":"2_CR70","first-page":"506","volume-title":"Machine learning, proceedings of the nineteenth international conference (ICML 2002)","author":"M. Pickett","year":"2002","unstructured":"Pickett, M., & Barto, A.\u00a0G. (2002). PolicyBlocks: an algorithm for creating useful macro-actions in reinforcement learning. In C. Sammut & A. Hoffmann (Eds.), Machine learning, proceedings of the nineteenth international conference (ICML 2002) (pp. 506\u2013513). 
San Francisco: Morgan Kaufmann."},{"unstructured":"Ravindran, B., & Barto, A.\u00a0G. (2002). Model minimization in hierarchical reinforcement learning. In S. Koenig & R.\u00a0C. Holte (Eds.), Abstraction, reformulation and approximation, 5th international symposium, SARA 2002, Kananaskis, Alberta, Canada, 2\u20134 August 2002, proceedings. Lecture notes in computer science (vol. 2371, pp. 196\u2013211). Berlin: Springer.","key":"2_CR71"},{"key":"2_CR72","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1006\/ceps.1999.1020","volume":"25","author":"R. M. Ryan","year":"2000","unstructured":"Ryan, R.\u00a0M., & Deci, E.\u00a0L. (2000). Intrinsic and extrinsic motivations: classic definitions and new directions. Contemporary Educational Psychology, 25, 54\u201367.","journal-title":"Contemporary Educational Psychology"},{"key":"2_CR73","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1016\/0004-3702(74)90026-5","volume":"5","author":"E. D. Sacerdoti","year":"1974","unstructured":"Sacerdoti, E.\u00a0D. (1974). Planning in a hierarchy of abstraction spaces. Artificial Intelligence, 5, 115\u2013135.","journal-title":"Artificial Intelligence"},{"unstructured":"Schmidhuber, J. (1991a). Adaptive confidence and adaptive curiosity. Technical Report FKI-149-91, Institut f\u00fcr Informatik, Technische Universit\u00e4t M\u00fcnchen, Arcisstr. 21, 800 M\u00fcnchen 2, Germany.","key":"2_CR74"},{"key":"2_CR75","doi-asserted-by":"crossref","first-page":"222","DOI":"10.7551\/mitpress\/3115.003.0030","volume-title":"From animals to animats: proceedings of the first international conference on simulation of adaptive behavior (complex adaptive systems)","author":"J. Schmidhuber","year":"1991","unstructured":"Schmidhuber, J. (1991b). A possibility for implementing curiosity and boredom in model-building neural controllers. In J.-A. Meyer & S.\u00a0W. 
Wilson (Eds.), From animals to animats: proceedings of the first international conference on simulation of adaptive behavior (complex adaptive systems) (pp. 222\u2013227). Cambridge: MIT."},{"key":"2_CR76","doi-asserted-by":"publisher","first-page":"623","DOI":"10.1037\/0096-3445.135.4.623","volume":"135","author":"D. W. Schneider","year":"2006","unstructured":"Schneider, D.\u00a0W., & Logan, G.\u00a0D. (2006). Hierarchical control of cognitive processes: switching tasks in sequences. Journal of Experimental Psychology: General, 135, 623\u2013640.","journal-title":"Journal of Experimental Psychology: General"},{"key":"2_CR77","volume-title":"The sciences of the artificial","author":"H. A. Simon","year":"1996","unstructured":"Simon, H.\u00a0A. (1996). The sciences of the artificial, 3rd edn. Cambridge: MIT.","edition":"3"},{"key":"2_CR78","volume-title":"The structure of complexity in an evolving world: the role of near decomposability. In W. Callebaut & D. Rasskin-Gutman (Eds.), Modularity: understanding the development and evolution of natural complex systems (pp. ix\u2013xiii)","author":"H. A. Simon","year":"2005","unstructured":"Simon, H.\u00a0A. (2005). The structure of complexity in an evolving world: the role of near decomposability. In W. Callebaut & D. Rasskin-Gutman (Eds.), Modularity: understanding the development and evolution of natural complex systems (pp. ix\u2013xiii). Cambridge: MIT."},{"key":"2_CR79","first-page":"751","volume-title":"Machine learning, proceedings of the twenty-first international conference (ICML 2004) ACM international conference proceeding series","author":"\u00d6. \u015eim\u015fek","year":"2004","unstructured":"\u015eim\u015fek, \u00d6., & Barto, A. (2004). Using relative novelty to identify useful temporal abstractions in reinforcement learning. In C.\u00a0E. Brodley (Ed.), Machine learning, proceedings of the twenty-first international conference (ICML 2004) ACM international conference proceeding series (vol. 69, pp. 
751\u2013758). New York: ACM."},{"key":"2_CR80","first-page":"1497","volume-title":"Advances in neural information processing systems 21, Proceedings of the twenty-second annual conference on neural information processing systems","author":"\u00d6. \u015eim\u015fek","year":"2009","unstructured":"\u015eim\u015fek, \u00d6., & Barto, A. (2009). Skill characterization based on betweenness. In D. Koller, D.\u00a0Schuurmans, Y. Bengio, L. Bottou (Eds.), Advances in neural information processing systems 21, Proceedings of the twenty-second annual conference on neural information processing systems (pp. 1497\u20131504). Red Hook: Curran Associates, Inc."},{"key":"2_CR81","first-page":"816","volume-title":"Machine learning, proceedings of the twenty-second international conference (ICML 2005) ACM international conference proceeding series","author":"\u00d6. \u015eim\u015fek","year":"2005","unstructured":"\u015eim\u015fek, \u00d6., Wolfe, A.\u00a0P., Barto, A. (2005). Identifying useful subgoals in reinforcement learning by local graph partitioning. In L.\u00a0D. Raedt & S. Wrobel (Eds.), Machine learning, proceedings of the twenty-second international conference (ICML 2005) ACM international conference proceeding series (vol. 119, pp. 816\u2013823). New York: ACM."},{"key":"2_CR82","first-page":"1281","volume-title":"Advances in neural information processing systems 17: proceedings of the 2004 conference","author":"S. Singh","year":"2005","unstructured":"Singh, S., Barto, A.\u00a0G., Chentanez, N. (2005). Intrinsically motivated reinforcement learning. In L.\u00a0K. Saul, Y. Weiss, L. Bottou (Eds.), Advances in neural information processing systems 17: proceedings of the 2004 conference (pp. 1281\u20131288). Cambridge: MIT."},{"doi-asserted-by":"crossref","unstructured":"Singh, S., Lewis, R.\u00a0L., Barto, A.\u00a0G., Sorg, J. (2010). Intrinsically motivated reinforcement learning: An evolutionary perspective. 
IEEE Transactions on Autonomous Mental Development, 2, 70\u201382. Special issue on Active Learning and Intrinsically Motivated Exploration in Robots: Advances and Challenges.","key":"2_CR83","DOI":"10.1109\/TAMD.2010.2051031"},{"unstructured":"Soni, V., & Singh, S. (2006). Reinforcement learning of hierarchical skills on the Sony Aibo robot. In L. Smith, O. Sporns, C. Yu, M. Gasser, C. Breazeal, G. Deak, J. Weng (Eds.), Fifth international conference on development and learning (ICDL). Bloomington IN.","key":"2_CR84"},{"key":"2_CR85","first-page":"469","volume-title":"UAI \u201902, Proceedings of the 18th conference in uncertainty in artificial intelligence","author":"H. Steck","year":"2002","unstructured":"Steck, H., & Jaakkola, T. (2002). Unsupervised active learning in large domains. In A. Darwiche & N. Friedman (Eds.), UAI \u201902, Proceedings of the 18th conference in uncertainty in artificial intelligence (pp. 469\u2013476). San Francisco: Morgan Kaufmann."},{"unstructured":"Strehl, A.\u00a0L., Diuk, C., Littman, M.\u00a0L. (2007). Efficient structure learning in factored-state MDPs. In Proceedings of the twenty-second AAAI conference on artificial intelligence. San Francisco: AAAI.","key":"2_CR86"},{"key":"2_CR87","volume-title":"Reinforcement learning: an introduction","author":"R. S. Sutton","year":"1998","unstructured":"Sutton, R.\u00a0S., & Barto, A.\u00a0G. (1998). Reinforcement learning: an introduction. Cambridge: MIT."},{"key":"2_CR88","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1016\/S0004-3702(99)00052-1","volume":"112","author":"R. S. Sutton","year":"1999","unstructured":"Sutton, R.\u00a0S., Precup, D., Singh, S. (1999). Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112, 181\u2013211.","journal-title":"Artificial Intelligence"},{"key":"2_CR89","first-page":"1633","volume":"10","author":"M. E. Taylor","year":"2009","unstructured":"Taylor, M.\u00a0E., & Stone, P. 
(2009). Transfer learning for reinforcement learning domains: a survey. Journal of Machine Learning Research, 10, 1633\u20131685.","journal-title":"Journal of Machine Learning Research"},{"key":"2_CR90","first-page":"2125","volume":"8","author":"M. E. Taylor","year":"2007","unstructured":"Taylor, M.\u00a0E., Stone, P., Liu, Y. (2007). Transfer learning via inter-task mappings for temporal difference learning. Journal of Machine Learning Research, 8, 2125\u20132167.","journal-title":"Journal of Machine Learning Research"},{"key":"2_CR91","doi-asserted-by":"crossref","first-page":"17","DOI":"10.7551\/mitpress\/8727.003.0004","volume-title":"Robotics: science and systems V: proceedings of the fifth annual robotics: science and systems conference","author":"R. Tedrake","year":"2010","unstructured":"Tedrake, R. (2010). LQR-Trees: feedback motion planning on sparse randomized trees. In J.\u00a0Trinkle, Y. Matsuoka, J.\u00a0A. Castellanos (Eds.), Robotics: science and systems V: proceedings of the fifth annual robotics: science and systems conference (pp. 17\u201324). Cambridge: MIT."},{"key":"2_CR92","volume-title":"Stochastic policy gradient reinforcement learning on a simple 3D biped. In Proceedings of the IEEE international conference on intelligent robots and systems (IROS) (vol. 3, pp. 2849\u20132854)","author":"R. Tedrake","year":"2004","unstructured":"Tedrake, R., Zhang, T.\u00a0W., Seung, H.\u00a0S. (2004). Stochastic policy gradient reinforcement learning on a simple 3D biped. In Proceedings of the IEEE international conference on intelligent robots and systems (IROS) (vol.\u00a03, pp. 2849\u20132854). Japan: Sendai."},{"key":"2_CR93","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1162\/neco.1994.6.2.215","volume":"6","author":"G. J. Tesauro","year":"1994","unstructured":"Tesauro, G.\u00a0J. (1994). TD\u2013gammon, a self-teaching backgammon program, achieves master-level play. 
Neural Computation, 6, 215\u2013219.","journal-title":"Neural Computation"},{"key":"2_CR94","first-page":"385","volume-title":"Advances in neural information processing systems 7: proceedings of the 1994 conference","author":"S. B. Thrun","year":"1995","unstructured":"Thrun, S.\u00a0B., & Schwartz, A. (1995). Finding structure in reinforcement learning. In G.\u00a0Tesauro, D.\u00a0S. Touretzky, T. Leen (Eds.), Advances in neural information processing systems 7: proceedings of the 1994 conference (pp. 385\u2013392). Cambridge: MIT."},{"key":"2_CR95","first-page":"863","volume-title":"Proceedings of the seventeenth international joint conference on artificial intelligence, IJCAI 2001","author":"S. Tong","year":"2001","unstructured":"Tong, S., & Koller, D. (2001). Active learning for structure in Bayesian networks. In B. Nebel (Ed.), Proceedings of the seventeenth international joint conference on artificial intelligence, IJCAI 2001 (pp. 863\u2013869). San Francisco: Morgan Kaufmann."},{"key":"2_CR96","volume-title":"Relational macros for transfer in reinforcement learning. In H. Blockeel, J. Ramon, J. Shavlik, P. Tadepalli (Eds.), Inductive logic programming 17th international conference, ILP 2007. Lecture notes in computer science (vol. 4894, pp. 254\u2013268)","author":"L. Torrey","year":"2008","unstructured":"Torrey, L., Shavlik, J., Walker, J., Maclin, R. (2008). Relational macros for transfer in reinforcement learning. In H. Blockeel, J. Ramon, J. Shavlik, P. Tadepalli (Eds.), Inductive logic programming 17th international conference, ILP 2007. Lecture notes in computer science (vol. 4894, pp. 254\u2013268). Berlin: Springer."},{"key":"2_CR97","volume-title":"Switching between representations in reinforcement learning. In R. Babuska & F. C. A. Groen (Eds.), Interactive collaborative information systems. Studies in computational intelligence (vol. 281, pp. 65\u201384)","author":"H. 
van Seijen","year":"2007","unstructured":"van Seijen, H., Whiteson, S., Kester, L. (2007). Switching between representations in reinforcement learning. In R. Babuska & F. C.\u00a0A. Groen (Eds.), Interactive collaborative information systems. Studies in computational intelligence (vol. 281, pp. 65\u201384). Berlin: Springer."},{"doi-asserted-by":"crossref","unstructured":"Vigorito, C., & Barto, A.\u00a0G. (2010). Intrinsically motivated hierarchical skill learning in structured environments. IEEE Transactions on Autonomous Mental Development, 2, 83\u201390. Special issue on Active Learning and Intrinsically Motivated Exploration in Robots: Advances and Challenges.","key":"2_CR98","DOI":"10.1109\/TAMD.2010.2050205"},{"key":"2_CR99","volume-title":"Pattern-directed inference systems","author":"D. A. Waterman","year":"1978","unstructured":"Waterman, D.\u00a0A., & Hayes-Roth, F. (1978). Pattern-directed inference systems. New York: Academic."},{"key":"2_CR100","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1037\/h0040934","volume":"66","author":"R. W. White","year":"1959","unstructured":"White, R.\u00a0W. (1959). Motivation reconsidered: the concept of competence. 
Psychological Review, 66, 297\u2013333.","journal-title":"Psychological Review"}],"container-title":["Computational and Robotic Models of the Hierarchical Organization of Behavior"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-642-39875-9_2","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,19]],"date-time":"2024-05-19T11:57:25Z","timestamp":1716119845000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-642-39875-9_2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013]]},"ISBN":["9783642398742","9783642398759"],"references-count":100,"URL":"https:\/\/doi.org\/10.1007\/978-3-642-39875-9_2","relation":{},"subject":[],"published":{"date-parts":[[2013]]},"assertion":[{"value":"28 September 2013","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}}]}}