{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,24]],"date-time":"2025-09-24T00:15:11Z","timestamp":1758672911573,"version":"3.44.0"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:p>Black-Box Optimization (BBO) has found successful applications in many fields of science and engineering. Recently, there has been a growing interest in meta-learning particular components of BBO algorithms to speed up optimization and get rid of tedious hand-crafted heuristics. As an extension, learning the entire algorithm from data requires the least labor from experts and can provide the most flexibility. In this paper, we propose RIBBO, a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion. RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks, leveraging the in-context learning ability of large models to extract task information and make decisions accordingly. Central to our method is to augment the optimization histories with regret-to-go tokens, which are designed to represent the performance of an algorithm based on cumulative regret over the future part of the histories. The integration of regret-to-go tokens enables RIBBO to automatically generate sequences of query points that are positively correlated to the user-desired regret, verified by its universally good empirical performance on diverse problems, including BBO benchmark, hyper-parameter optimization, and robot control problems.<\/jats:p>","DOI":"10.24963\/ijcai.2025\/994","type":"proceedings-article","created":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:10:40Z","timestamp":1758269440000},"page":"8939-8947","source":"Crossref","is-referenced-by-count":0,"title":["Reinforced In-Context Black-Box Optimization"],"prefix":"10.24963","author":[{"given":"Lei","family":"Song","sequence":"first","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, Nanjing University, China"},{"name":"School of Artificial Intelligence, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chen-Xiao","family":"Gao","sequence":"additional","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, Nanjing University, China"},{"name":"School of Artificial Intelligence, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ke","family":"Xue","sequence":"additional","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, Nanjing University, China"},{"name":"School of Artificial Intelligence, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chenyang","family":"Wu","sequence":"additional","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, Nanjing University, China"},{"name":"School of Artificial Intelligence, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dong","family":"Li","sequence":"additional","affiliation":[{"name":"Huawei Noah\u2019s Ark Lab, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianye","family":"Hao","sequence":"additional","affiliation":[{"name":"Huawei Noah\u2019s Ark Lab, China"},{"name":"College of Intelligence and Computing, Tianjin University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zongzhang","family":"Zhang","sequence":"additional","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, Nanjing University, China"},{"name":"School of Artificial Intelligence, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chao","family":"Qian","sequence":"additional","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, Nanjing University, China"},{"name":"School of Artificial Intelligence, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"number":"34","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2025","name":"Thirty-Fourth International Joint Conference on Artificial Intelligence {IJCAI-25}","start":{"date-parts":[[2025,8,16]]},"theme":"Artificial Intelligence","location":"Montreal, Canada","end":{"date-parts":[[2025,8,22]]}},"container-title":["Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T11:35:47Z","timestamp":1758627347000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2025\/994"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2025,9]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2025\/994","relation":{},"subject":[],"published":{"date-parts":[[2025,9]]}}}