{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,14]],"date-time":"2025-05-14T02:30:00Z","timestamp":1747189800384,"version":"3.40.5"},"reference-count":33,"publisher":"World Scientific Pub Co Pte Ltd","issue":"04","funder":[{"name":"NSTC in Taiwan","award":["NSTC112-2221-E-001-008"],"award-info":[{"award-number":["NSTC112-2221-E-001-008"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Semantic Computing"],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:p> Over the last decades, there has been growing interest in research in multiple and interdisciplinary fields of human-AI computing. In particular, approaches integrating human\u2019s perspective and design with reinforcement learning (RL) have received more attention. However, the current research on RL may need to consider its enhancement from human-inspired approaches further. In this work, we focus on enabling a meta-reinforcement learning (meta-RL) agent to achieve adaptation and generalization, according to modeling Markov Decision Processes (MDP) using Bayesian knowledge and analysis. By introducing a novel framework called human-inspired meta-RL (HMRL), we incorporate the agent performing resilient actions to leverage the dynamic dense reward based on the knowledge and prediction of a Bayesian analysis. The proposed framework can make the agent learn generalization and prevent the agent from failing catastrophically. The experimental results show that our approach helps the agent reduce computational costs with learning adaptation. In addition to the system design, we have also extended further algorithmic improvement based on learning within a deep Q-network (DQN) implementations for more complicated future tasks, which compared replay buffers to possibly enhance the optimization process. Finally, we conclude and anticipate that integrating human-inspired meta-RL can enable learning more formulations relating to robustness and scalability, leading to promising directions and more complex AI goals in the future. <\/jats:p>","DOI":"10.1142\/s1793351x2444001x","type":"journal-article","created":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T15:20:39Z","timestamp":1720797639000},"page":"547-569","source":"Crossref","is-referenced-by-count":0,"title":["Human-Inspired Meta-Reinforcement Learning Using Bayesian Knowledge and Enhanced Deep Q-Network"],"prefix":"10.1142","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7165-1893","authenticated-orcid":false,"given":"Joshua","family":"Ho","sequence":"first","affiliation":[{"name":"TIGP SNHCC, Academia Sinica, Taiwan"},{"name":"Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan"},{"name":"Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2992-9898","authenticated-orcid":false,"given":"Chien-Min","family":"Wang","sequence":"additional","affiliation":[{"name":"TIGP SNHCC, Academia Sinica, Taiwan"},{"name":"Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5041-5795","authenticated-orcid":false,"given":"Chung-Ta","family":"King","sequence":"additional","affiliation":[{"name":"TIGP SNHCC, Academia Sinica, Taiwan"},{"name":"Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan"},{"name":"Department of Computer Science, National Tsing Hua University, 101, Section 2, Kuang-Fu Road, Hsinchu 300044, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yi-Hsin","family":"You","sequence":"additional","affiliation":[{"name":"Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan"},{"name":"Department of Computer Science, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei, 106319, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chi-Wei","family":"Feng","sequence":"additional","affiliation":[{"name":"Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan"},{"name":"Department of Computer Science, National Tsing Hua University, 101, Section 2, Kuang-Fu Road, Hsinchu 300044, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2024,8,20]]},"reference":[{"key":"S1793351X2444001XBIB001","doi-asserted-by":"publisher","DOI":"10.1007\/s12369-022-00942-6"},{"key":"S1793351X2444001XBIB002","doi-asserted-by":"publisher","DOI":"10.1109\/AIKE48582.2020.00031"},{"key":"S1793351X2444001XBIB003","doi-asserted-by":"publisher","DOI":"10.1145\/3411763.3445016"},{"key":"S1793351X2444001XBIB004","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-52152-3_11"},{"key":"S1793351X2444001XBIB005","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2021.103535"},{"key":"S1793351X2444001XBIB006","doi-asserted-by":"publisher","DOI":"10.1109\/ICHMS53169.2021.9582667"},{"key":"S1793351X2444001XBIB007","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300233"},{"volume-title":"Human-centered AI: Foundations, approaches, and applications","year":"2021","author":"Zhang J.","key":"S1793351X2444001XBIB008"},{"key":"S1793351X2444001XBIB009","first-page":"1126","volume-title":"Int. Conf. Machine Learning","author":"Finn C.","year":"2017"},{"key":"S1793351X2444001XBIB011","doi-asserted-by":"publisher","DOI":"10.1109\/THMS.2019.2912447"},{"volume-title":"Usability Engineering: Scenario-Based Development of Human-Computer Interaction","year":"2003","author":"Rosson M. B.","key":"S1793351X2444001XBIB012"},{"key":"S1793351X2444001XBIB013","first-page":"577","volume-title":"Companion of the 2021 ACM\/IEEE Int. Conf. Human-Robot Interaction","author":"Faulkner T. A. K.","year":"2021"},{"key":"S1793351X2444001XBIB014","first-page":"1007","volume-title":"2022 31st IEEE Int. Conf. Robot and Human Interactive Communication","author":"Gray C.","year":"2022"},{"key":"S1793351X2444001XBIB016","doi-asserted-by":"publisher","DOI":"10.1609\/aiide.v15i1.5237"},{"key":"S1793351X2444001XBIB017","doi-asserted-by":"publisher","DOI":"10.1109\/ICHMS56717.2022.9980765"},{"key":"S1793351X2444001XBIB018","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-39077-2_13"},{"key":"S1793351X2444001XBIB019","doi-asserted-by":"publisher","DOI":"10.3390\/s21072514"},{"key":"S1793351X2444001XBIB020","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300831"},{"key":"S1793351X2444001XBIB022","doi-asserted-by":"publisher","DOI":"10.1109\/IROS51168.2021.9636463"},{"key":"S1793351X2444001XBIB024","first-page":"2328","volume-title":"Proc. 2023 Int. Conf. Autonomous Agents and Multiagent Systems","author":"Poletti S.","year":"2023"},{"key":"S1793351X2444001XBIB026","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5955"},{"key":"S1793351X2444001XBIB028","doi-asserted-by":"publisher","DOI":"10.1016\/j.cobeha.2015.07.007"},{"key":"S1793351X2444001XBIB029","doi-asserted-by":"publisher","DOI":"10.1214\/09-SS057"},{"key":"S1793351X2444001XBIB030","doi-asserted-by":"publisher","DOI":"10.1561\/2200000049"},{"key":"S1793351X2444001XBIB033","first-page":"7780","volume-title":"Int. Conf. Machine Learning","author":"Mitchell E.","year":"2021"},{"key":"S1793351X2444001XBIB036","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"S1793351X2444001XBIB040","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11796"},{"key":"S1793351X2444001XBIB041","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511803161"},{"key":"S1793351X2444001XBIB042","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2021.3065870"},{"key":"S1793351X2444001XBIB043","first-page":"1090","volume-title":"Proc. Third Int. Joint Conf. Autonomous Agents and Multiagent Systems-Volume 3","author":"Chalkiadakis G.","year":"2004"},{"key":"S1793351X2444001XBIB044","first-page":"1995","volume-title":"Int. Conf. Machine Learning","author":"Wang Z.","year":"2016"},{"key":"S1793351X2444001XBIB045","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"S1793351X2444001XBIB046","first-page":"1","volume-title":"Advances in Neural Information Processing Systems","volume":"30","author":"Andrychowicz M.","year":"2017"}],"container-title":["International Journal of Semantic Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S1793351X2444001X","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,6]],"date-time":"2024-11-06T03:11:57Z","timestamp":1730862717000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S1793351X2444001X"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,20]]},"references-count":33,"journal-issue":{"issue":"04","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["10.1142\/S1793351X2444001X"],"URL":"https:\/\/doi.org\/10.1142\/s1793351x2444001x","relation":{},"ISSN":["1793-351X","1793-7108"],"issn-type":[{"type":"print","value":"1793-351X"},{"type":"electronic","value":"1793-7108"}],"subject":[],"published":{"date-parts":[[2024,8,20]]}}}