{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T13:10:01Z","timestamp":1753881001796,"version":"3.41.2"},"reference-count":33,"publisher":"World Scientific Pub Co Pte Ltd","issue":"05","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Model. Simul. Sci. Comput."],"published-print":{"date-parts":[[2024,10]]},"abstract":"<jats:p> Several approaches have been proposed to remedy the dimensionality problem that reinforcement learning (RL) suffers from. Among these solutions, hierarchical reinforcement learning (HRL) divides an RL problem into sub-problems called options or abstract actions. Discovering abstract actions or options for HRL is challenging, and multiple approaches have been proposed. In this paper, we present a new approach: an agent with a sense of direction for automatic option discovery. Our agent uses its sense of direction to discover shortcuts and shortest paths between states that it has already visited, and it detects bottlenecks to build termination conditions and initiation states for options. Thus, at the learning step, the agent uses its previous exploration experience in parallel with intrinsically motivated learning. The options discovered are task-independent and can be reused for new tasks. Experiments on maze problems and the game of tic-tac-toe show better results than flat RL and another RL approach, in both general and special cases. 
<\/jats:p>","DOI":"10.1142\/s1793962324500442","type":"journal-article","created":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T14:46:41Z","timestamp":1721227601000},"source":"Crossref","is-referenced-by-count":0,"title":["An agent with a sense of direction for option discovery in hierarchical reinforcement learning"],"prefix":"10.1142","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7731-3644","authenticated-orcid":false,"given":"Zoulikha","family":"Koudad","sequence":"first","affiliation":[{"name":"Abu Bekr Belkaid University of Tlemcen, Tlemcen 13000, Algeria"},{"name":"Higher School in Applied Sciences of Tlemcen, BP 165 RP. Bel Horizon, Tlemcen 13000, Algeria"},{"name":"Laboratory of Research in Computer Science of Tlemcen LRIT, Tlemcen 13000, Algeria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-9117-047X","authenticated-orcid":false,"given":"Mohamed","family":"Merzoug","sequence":"additional","affiliation":[{"name":"Abu Bekr Belkaid University of Tlemcen, Tlemcen 13000, Algeria"},{"name":"Laboratory of Research in Computer Science of Tlemcen LRIT, Tlemcen 13000, Algeria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5828-2245","authenticated-orcid":false,"given":"Abdelkrim","family":"Benamar","sequence":"additional","affiliation":[{"name":"Abu Bekr Belkaid University of Tlemcen, Tlemcen 13000, Algeria"},{"name":"Laboratory of Research in Computer Science of Tlemcen LRIT, Tlemcen 13000, 
Algeria"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2024,8,27]]},"reference":[{"key":"S1793962324500442BIB001","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(99)00052-1"},{"key":"S1793962324500442BIB002","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(72)90051-3"},{"key":"S1793962324500442BIB003","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(85)90012-8"},{"key":"S1793962324500442BIB004","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(74)90026-5"},{"key":"S1793962324500442BIB005","first-page":"923","volume-title":"Proc. AAAI-90","author":"Knoblock C.","year":"2000"},{"key":"S1793962324500442BIB006","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(91)90024-E"},{"journal-title":"ML Workshop","year":"1992","author":"Singh S.","key":"S1793962324500442BIB007"},{"key":"S1793962324500442BIB008","first-page":"271","volume-title":"Advances in Neural Information Processing Systems","author":"Dayan P.","year":"1992"},{"key":"S1793962324500442BIB009","first-page":"167","volume-title":"Proc. 10th Int. Conf. Machine Learning (ICML-93)","author":"Kaelbling L. P.","year":"1993"},{"key":"S1793962324500442BIB010","doi-asserted-by":"publisher","DOI":"10.1613\/jair.639"},{"key":"S1793962324500442BIB011","first-page":"1497","volume-title":"Advances in Neural Information Processing Systems","author":"\u015eim\u015fek \u00d6.","year":"2009"},{"key":"S1793962324500442BIB012","doi-asserted-by":"publisher","DOI":"10.1109\/ADPRL.2011.5967384"},{"key":"S1793962324500442BIB013","first-page":"316","volume-title":"Int. Conf. Machine Learning","author":"Brunskill E.","year":"2014"},{"key":"S1793962324500442BIB015","first-page":"2295","volume-title":"34th Int. Conf. 
Machine Learning ICML\u201917","author":"Marlos C.","year":"2017"},{"key":"S1793962324500442BIB016","doi-asserted-by":"publisher","DOI":"10.1155\/2018\/2085721"},{"key":"S1793962324500442BIB017","doi-asserted-by":"publisher","DOI":"10.1142\/S0218213021500068"},{"key":"S1793962324500442BIB018","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-39875-9_12"},{"key":"S1793962324500442BIB019","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-39875-9_2"},{"key":"S1793962324500442BIB020","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1003779"},{"key":"S1793962324500442BIB021","doi-asserted-by":"publisher","DOI":"10.1016\/j.cognition.2008.08.011"},{"key":"S1793962324500442BIB022","first-page":"243","volume-title":"Int. Conf. Machine Learning (ICML)","author":"Hengst B.","year":"2002"},{"key":"S1793962324500442BIB023","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102421"},{"key":"S1793962324500442BIB024","doi-asserted-by":"publisher","DOI":"10.1023\/A:1017984413808"},{"key":"S1793962324500442BIB025","first-page":"1","volume-title":"Int. Conf. Learning Representations","author":"Bagaria A","year":"2020"},{"key":"S1793962324500442BIB027","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"S1793962324500442BIB028","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01818-4_45"},{"key":"S1793962324500442BIB029","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10916"},{"key":"S1793962324500442BIB030","volume-title":"Reinforcement Learning \u2014 An Introduction","author":"Sutton R. S.","year":"2018","edition":"2"},{"key":"S1793962324500442BIB031","doi-asserted-by":"publisher","DOI":"10.1109\/TAMD.2010.2050205"},{"key":"S1793962324500442BIB032","first-page":"19","volume-title":"Proc. 3rd Int. Conf. Development and Learning","author":"Barto A.","year":"2004"},{"key":"S1793962324500442BIB033","first-page":"1281","volume-title":"Proc. 17th Int. Conf. 
Neural Information Processing Systems","author":"Singh S.","year":"2004"},{"key":"S1793962324500442BIB034","volume-title":"Artificial Intelligence: A Modern Approach","author":"Russell S. J.","year":"2010","edition":"3"},{"key":"S1793962324500442BIB035","doi-asserted-by":"publisher","DOI":"10.1016\/S0893-6080(99)00060-X"}],"container-title":["International Journal of Modeling, Simulation, and Scientific Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S1793962324500442","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,7]],"date-time":"2024-11-07T08:00:19Z","timestamp":1730966419000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S1793962324500442"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,27]]},"references-count":33,"journal-issue":{"issue":"05","published-print":{"date-parts":[[2024,10]]}},"alternative-id":["10.1142\/S1793962324500442"],"URL":"https:\/\/doi.org\/10.1142\/s1793962324500442","relation":{},"ISSN":["1793-9623","1793-9615"],"issn-type":[{"type":"print","value":"1793-9623"},{"type":"electronic","value":"1793-9615"}],"subject":[],"published":{"date-parts":[[2024,8,27]]},"article-number":"2450044"}}