{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,24]],"date-time":"2025-09-24T00:14:47Z","timestamp":1758672887678,"version":"3.44.0"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:p>Graph Transformers (GTs) have emerged as powerful tools for handling graph-structured data through global attention mechanisms. While GTs can effectively capture long-range dependencies, they introduce difficulties in optimization due to their complex, non-differentiable operators, which cannot be directly handled by standard gradient-based optimizers (such as Adam or AdamW). To investigate the above issues, this work adopts the line of Zeroth-Order Optimization (ZOO) technique. However, direct integration of ZOO incurs considerable challenges due to the sharp loss landscape and steep gradients within the GT parameter space. Under the above observations, we propose a Sharpness-aware Zeroth-order Optimizer (SZO) that combines Sharpness-Aware Minimization (SAM) technique facilitating convergence within a flatter neighborhood, and leverages parallel computing for efficient gradient estimation. Theoretically, we provide a comprehensive analysis of the optimizer from both convergence and generalization perspectives. \nEmpirically, we conduct extensive experiments on various classical GTs across a wide range of benchmark datasets, which underscore the superior performance of SZO over the state-of-the-art optimizers.<\/jats:p>","DOI":"10.24963\/ijcai.2025\/348","type":"proceedings-article","created":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:10:40Z","timestamp":1758269440000},"page":"3126-3134","source":"Crossref","is-referenced-by-count":0,"title":["Sharpness-aware Zeroth-order Optimization for Graph Transformers"],"prefix":"10.24963","author":[{"given":"Yang","family":"Liu","sequence":"first","affiliation":[{"name":"Academy of Mathematics and Systems Science"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chuan","family":"Zhou","sequence":"additional","affiliation":[{"name":"Academy of Mathematics and Systems Science"},{"name":"University of Chinese Academy of Science"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuhan","family":"Lin","sequence":"additional","affiliation":[{"name":"Fudan University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuai","family":"Zhang","sequence":"additional","affiliation":[{"name":"Academy of Mathematics and Systems Science"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yang","family":"Gao","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhao","family":"Li","sequence":"additional","affiliation":[{"name":"Hangzhou Yugu Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shirui","family":"Pan","sequence":"additional","affiliation":[{"name":"Griffith University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"number":"34","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2025","name":"Thirty-Fourth International Joint Conference on Artificial Intelligence \n{IJCAI-25}","start":{"date-parts":[[2025,8,16]]},"theme":"Artificial Intelligence","location":"Montreal, Canada","end":{"date-parts":[[2025,8,22]]}},"container-title":["Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T11:33:47Z","timestamp":1758627227000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2025\/348"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2025,9]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2025\/348","relation":{},"subject":[],"published":{"date-parts":[[2025,9]]}}}