{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,7]],"date-time":"2026-05-07T23:22:56Z","timestamp":1778196176429,"version":"3.51.4"},"reference-count":68,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2019,2,13]],"date-time":"2019-02-13T00:00:00Z","timestamp":1550016000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"INCT-CiD","award":["465.560\/2014-8"],"award-info":[{"award-number":["465.560\/2014-8"]}]},{"DOI":"10.13039\/501100003593","name":"CNPq","doi-asserted-by":"crossref","award":["141.761\/2016-4, 308.729\/2015-3 and 461.739\/2014-3"],"award-info":[{"award-number":["141.761\/2016-4, 308.729\/2015-3 and 461.739\/2014-3"]}],"id":[{"id":"10.13039\/501100003593","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100017149","name":"DeepMind","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100017149","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2020,1,31]]},"abstract":"<jats:p>In machine learning, Reinforcement Learning (RL) is an important tool for creating intelligent agents that learn solely through experience. One particular subarea within the RL domain that has received great attention is how to define macro-actions, which are temporal abstractions composed of a sequence of primitive actions. This subarea, loosely called skill acquisition, has been under development for several years and has led to better results in a diversity of RL problems. Among the many skill acquisition approaches, graph-based methods have received considerable attention. This survey presents an overview of graph-based skill acquisition methods for RL. 
We cover a diversity of these approaches and discuss how they evolved throughout the years. Finally, we also discuss the current challenges and open issues in the area of graph-based skill acquisition for RL.<\/jats:p>","DOI":"10.1145\/3291045","type":"journal-article","created":{"date-parts":[[2019,2,14]],"date-time":"2019-02-14T19:36:17Z","timestamp":1550172977000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Graph-Based Skill Acquisition For Reinforcement Learning"],"prefix":"10.1145","volume":"52","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5541-7207","authenticated-orcid":false,"given":"Matheus R. F.","family":"Mendon\u00e7a","sequence":"first","affiliation":[{"name":"National Laboratory for Scientific Computing (LNCC), RJ, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Artur","family":"Ziviani","sequence":"additional","affiliation":[{"name":"National Laboratory for Scientific Computing (LNCC), RJ, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andr\u00e9 M. S.","family":"Barreto","sequence":"additional","affiliation":[{"name":"National Laboratory for Scientific Computing (LNCC), RJ, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,2,13]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence. 1726--1734","author":"Bacon Pierre-Luc","year":"2017"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the Workshop on Multiagent Interaction Networks. 1--7.","author":"Bacon Pierre-Luc","year":"2013"},{"key":"e_1_2_1_3_1","volume-title":"Network Science","author":"Barab\u00e1si Albert-L\u00e1szl\u00f3"},{"key":"e_1_2_1_4_1","volume-title":"van Hasselt","author":"Barreto Andre","year":"2017"},{"key":"e_1_2_1_5_1","volume-title":"DeepMind lab. 
arXiv preprint arXiv:1612.03801","author":"Beattie Charles","year":"2016"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/2566972.2566979"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/1658843.1658862"},{"key":"e_1_2_1_8_1","volume-title":"OpenAI gym. arXiv preprint arXiv:1606.01540","author":"Brockman Greg","year":"2016"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICNC.2007.312"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5772\/13214"},{"key":"e_1_2_1_12_1","unstructured":"Ozg\u00fcr \u015eim\u015fek and A. Barto. 2007. Betweenness centrality as a basis for forming skills. Technical Report University of Massachusetts Department of Computer Science."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015353"},{"key":"e_1_2_1_14_1","volume-title":"Barto","author":"\u015eim\u015fek Ozg\u00fcr","year":"2008"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102454"},{"key":"e_1_2_1_16_1","volume-title":"Indian International Conference on Artificial Intelligence (IICAI\u201911)","author":"Davoodabadi Marzieh","year":"2011"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1993.5.4.613"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622262.1622268"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Bruce L. Digney. 1998. Learning hierarchical control structures for multiple tasks and changing environments. 
From Animals to Animats 5: Proceedings of the International Conference on the Simulation of Adaptive Behavior. 321--330.","DOI":"10.7551\/mitpress\/3119.003.0050"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201916)","volume":"48","author":"Duan Yan","year":"2016"},{"key":"e_1_2_1_21_1","volume-title":"Mohammad Ebrahim Shiri, and Parham Moradi","author":"Entezari Negin","year":"2010"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the International Florida Artificial Intelligence Research Society Conference, Ingrid Russell and Susan M. Haller (Eds.). AAAI Press, 346--350","author":"Goel Sandeep","year":"2003"},{"key":"e_1_2_1_23_1","unstructured":"Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems (NIPS\u201914). 
Curran Associates 2672--2680."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the International Conference on Machine Learning (Proceedings of Machine Learning Research), Francis Bach and David Blei (Eds.)","volume":"37","author":"Gregor Karol","year":"2015"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201902)","author":"Hengst Bernhard","year":"2002"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30115-8_16"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/3007263.3007270"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2598339"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-01507-6_89"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2009.07.002"},{"key":"e_1_2_1_32_1","unstructured":"George Konidaris and Andrew Barto. 2009. Skill discovery in continuous reinforcement learning domains using skill chaining. In Advances in Neural Information Processing Systems (NIPS\u201909) Y. Bengio D. Schuurmans J. Lafferty C. Williams and A. Culotta (Eds.). 
1015--1023."},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Krishnamurthy Ramnandan","year":"2016"},{"key":"e_1_2_1_34_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992699"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.14778\/2735508.2735517"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201917)","author":"Machado Marlos C."},{"key":"e_1_2_1_38_1","volume-title":"Eigenoption discovery through the deep successor representation. CoRR abs\/1710.11089","author":"Machado Marlos C.","year":"2017"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102421"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807184"},{"key":"e_1_2_1_41_1","volume-title":"Advances in Neural Information Processing Systems (NIPS\u201916)","author":"Mankowitz Daniel J."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015355"},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS\u201998)","author":"Maron Oded","year":"1998"},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the European Workshop on Reinforcement Learning (EWRL\u201912)","author":"Mathew Vimal","year":"2012"},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the International Conference on Machine Learning, 361--368","author":"McGovern Amy"},{"key":"e_1_2_1_47_1","volume-title":"Fagg","author":"McGovern Amy","year":"1997"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.5555\/645329.650060"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2011.5947611"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the International Conference on Machine Learning (Proceedings of Machine Learning 
Research), Maria Florina Balcan and Kilian Q. Weinberger (Eds.)","volume":"48","author":"Mnih Volodymyr","year":"2016"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_2_1_52_1","volume-title":"Some Applications of Laplace Eigenvalues of Graphs","author":"Mohar Bojan"},{"key":"e_1_2_1_53_1","volume-title":"Mohammad Ebrahim Shiri, and Negin Entezari","author":"Moradi Parham","year":"2010"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.69.026113"},{"key":"e_1_2_1_55_1","volume-title":"Advances in Neural Information Processing Systems (NIPS\u201915)","author":"Oh Junhyuk"},{"key":"e_1_2_1_56_1","volume-title":"Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS\u201910)","author":"Osentoski Sarah","year":"2010"},{"key":"e_1_2_1_58_1","volume-title":"AAAI Spring Symposium on Knowledge Representation and Ontology for Autonomous Systems. 1--8.","author":"Potts Duncan","year":"2004"},{"key":"e_1_2_1_60_1","volume-title":"Markov Decision Processes: Discrete Stochastic Dynamic Programming","author":"Puterman Martin L."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2010.5537485"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.76.036106"},{"key":"e_1_2_1_63_1","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems (NIPS\u201915). 
Curran Associates 91--99."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.868688"},{"key":"e_1_2_1_65_1","volume-title":"Gershman","author":"Stachenfeld Kimberly L.","year":"2014"},{"key":"e_1_2_1_66_1","volume-title":"Barto","author":"Sutton Richard S.","year":"1998"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(99)00052-1"},{"key":"e_1_2_1_68_1","doi-asserted-by":"crossref","unstructured":"Christian Szegedy Wei Liu Yangqing Jia Pierre Sermanet Scott Reed Dragomir Anguelov Dumitru Erhan Vincent Vanhoucke and Andrew Rabinovich. 2015. Going deeper with convolutions. In Computer Vision and Pattern Recognition (CVPR\u201915). 1--12.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2013.04.010"},{"key":"e_1_2_1_70_1","volume-title":"Advances in Neural Information Processing Systems (NIPS\u201916)","author":"Vezhnevets Alexander"},{"key":"e_1_2_1_71_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201917)","author":"Vezhnevets Alexander Sasha","year":"2017"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2008.02.014"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/LANOMS.2011.6102259"}],"container-title":["ACM Computing 
Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3291045","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3291045","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:01:52Z","timestamp":1750208512000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3291045"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,2,13]]},"references-count":68,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1,31]]}},"alternative-id":["10.1145\/3291045"],"URL":"https:\/\/doi.org\/10.1145\/3291045","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,2,13]]},"assertion":[{"value":"2018-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-02-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}