{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T14:06:57Z","timestamp":1753884417647,"version":"3.41.2"},"reference-count":36,"publisher":"World Scientific Pub Co Pte Ltd","issue":"06","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Patt. Recogn. Artif. Intell."],"published-print":{"date-parts":[[2025,5]]},"abstract":"<jats:p> Offline reinforcement learning (offline RL) endeavors to learn effective policies from a large batch of pre-collected datasets without any costly or dangerous online exploration. Nevertheless, offline RL always suffers from substantial algorithmic extrapolation errors and may fail when bootstrapping from out-of-distribution (OOD) actions or states. In this work, we introduce a practical and effective Bayesian uncertainty weighted optimization (BUWO) to leverage the Bayesian uncertainty to account for the epistemic uncertainty associated with each training sample and penalize the state-action pairs with high uncertainty. We compare BUWO with other prevailing offline RL algorithms on D4RL benchmarks. The experimental results demonstrate that the algorithm can enhance the average reward score by almost 15% without additional computational costs compared to the current state-of-the-art algorithm. <\/jats:p>","DOI":"10.1142\/s0218001425510012","type":"journal-article","created":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T03:52:26Z","timestamp":1740801146000},"source":"Crossref","is-referenced-by-count":0,"title":["Bayesian Uncertainty Weighted Optimization for Offline Reinforcement Learning"],"prefix":"10.1142","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9423-6369","authenticated-orcid":false,"given":"Tianyi","family":"Li","sequence":"first","affiliation":[{"name":"Ningbo Artificial Intelligence Institute, Shanghai Jiao Tong University, Shanghai 200030, P.\u00a0R.\u00a0China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3492-0211","authenticated-orcid":false,"given":"Genke","family":"Yang","sequence":"additional","affiliation":[{"name":"Ningbo Artificial Intelligence Institute, Shanghai Jiao Tong University, Shanghai 200030, P.\u00a0R.\u00a0China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8311-3419","authenticated-orcid":false,"given":"Jian","family":"Chu","sequence":"additional","affiliation":[{"name":"Ningbo Artificial Intelligence Institute, Shanghai Jiao Tong University, Shanghai 200030, P.\u00a0R.\u00a0China"}]}],"member":"219","published-online":{"date-parts":[[2025,5,9]]},"reference":[{"key":"S0218001425510012BIB001","doi-asserted-by":"publisher","DOI":"10.1007\/s11053-022-10051-w"},{"key":"S0218001425510012BIB002","doi-asserted-by":"publisher","DOI":"10.1007\/s00366-023-01852-5"},{"key":"S0218001425510012BIB003","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-023-43317-9"},{"key":"S0218001425510012BIB004","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2021.3084198"},{"key":"S0218001425510012BIB005","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-05738-9"},{"key":"S0218001425510012BIB006","doi-asserted-by":"publisher","DOI":"10.1109\/TSUSC.2023.3251302"},{"key":"S0218001425510012BIB007","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-020-03157-9"},{"key":"S0218001425510012BIB008","doi-asserted-by":"publisher","DOI":"10.1109\/TG.2022.3224088"},{"key":"S0218001425510012BIB009","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2023.3250269"},{"key":"S0218001425510012BIB010","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2021.10.031"},{"key":"S0218001425510012BIB011","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-023-05007-3"},{"key":"S0218001425510012BIB012","doi-asserted-by":"publisher","DOI":"10.1038\/nature24270"},{"key":"S0218001425510012BIB013","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2022.3213026"},{"key":"S0218001425510012BIB014","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2024.3385910"},{"key":"S0218001425510012BIB015","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3054625"},{"key":"S0218001425510012BIB016","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-022-10348-5"},{"key":"S0218001425510012BIB017","doi-asserted-by":"publisher","DOI":"10.1142\/S0218126622502589"},{"key":"S0218001425510012BIB018","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"S0218001425510012BIB019","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-022-33259-z"},{"key":"S0218001425510012BIB020","first-page":"1","volume":"61","author":"Qi C.","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"S0218001425510012BIB021","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2020.3041265"},{"key":"S0218001425510012BIB022","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2022.3185139"},{"key":"S0218001425510012BIB023","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-024-11593-1"},{"key":"S0218001425510012BIB024","doi-asserted-by":"publisher","DOI":"10.1038\/nature16961"},{"key":"S0218001425510012BIB025","doi-asserted-by":"publisher","DOI":"10.1109\/TETCI.2022.3140380"},{"key":"S0218001425510012BIB026","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2020.3023127"},{"key":"S0218001425510012BIB027","doi-asserted-by":"publisher","DOI":"10.1007\/s13042-023-01989-1"},{"key":"S0218001425510012BIB028","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2023.03.054"},{"key":"S0218001425510012BIB029","doi-asserted-by":"publisher","DOI":"10.1142\/S0218001421590357"},{"key":"S0218001425510012BIB030","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-023-06318-9"},{"key":"S0218001425510012BIB031","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.2023.3297711"},{"key":"S0218001425510012BIB032","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-022-07408-w"},{"key":"S0218001425510012BIB033","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-020-05663-3"},{"key":"S0218001425510012BIB034","doi-asserted-by":"publisher","DOI":"10.1007\/s40684-023-00547-y"},{"key":"S0218001425510012BIB035","doi-asserted-by":"publisher","DOI":"10.1142\/S0218001422520140"},{"key":"S0218001425510012BIB036","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2023.3292075"}],"container-title":["International Journal of Pattern Recognition and Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218001425510012","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,15]],"date-time":"2025-05-15T02:31:25Z","timestamp":1747276285000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S0218001425510012"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5]]},"references-count":36,"journal-issue":{"issue":"06","published-print":{"date-parts":[[2025,5]]}},"alternative-id":["10.1142\/S0218001425510012"],"URL":"https:\/\/doi.org\/10.1142\/s0218001425510012","relation":{},"ISSN":["0218-0014","1793-6381"],"issn-type":[{"type":"print","value":"0218-0014"},{"type":"electronic","value":"1793-6381"}],"subject":[],"published":{"date-parts":[[2025,5]]},"article-number":"2551001"}}