{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T23:30:17Z","timestamp":1773876617718,"version":"3.50.1"},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"16","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"<jats:p>3D object detection is critical for autonomous driving, yet it remains fundamentally challenging to simultaneously maximize computational efficiency and capture long-range spatial dependencies. We observed that Mamba-based models, with their linear state-space design, capture long-range dependencies at lower cost, offering a promising balance between efficiency and accuracy. However, existing methods rely on axis-aligned scanning within a fixed window, inevitably discarding spatial information. To address this problem, we propose WinMamba, a novel Mamba-based 3D feature-encoding backbone composed of stacked WinMamba blocks. To enhance the backbone with robust multi-scale representation, the WinMamba block incorporates a window-scale-adaptive module that compensates voxel features across varying resolutions during sampling. Meanwhile, to obtain rich contextual cues within the linear state space, we equip the WinMamba layer with a learnable positional encoding and a window-shift strategy. Extensive experiments on the KITTI and Waymo datasets demonstrate that WinMamba significantly outperforms the baseline. Ablation studies further validate the individual contributions of the WSF and AWF modules in improving detection accuracy. The code will be made publicly available.<\/jats:p>","DOI":"10.1609\/aaai.v40i16.38347","type":"journal-article","created":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:25:42Z","timestamp":1773793542000},"page":"13432-13440","source":"Crossref","is-referenced-by-count":1,"title":["WinMamba: Multi-Scale Shifted Windows in State Space Model for 3D Object Detection"],"prefix":"10.1609","volume":"40","author":[{"given":"Longhui","family":"Zheng","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qiming","family":"Xia","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaolu","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhaoliang","family":"Liu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chenglu","family":"Wen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"9382","published-online":{"date-parts":[[2026,3,14]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/38347\/42309","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/38347\/42309","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:25:42Z","timestamp":1773793542000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/38347"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,14]]},"references-count":0,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2026,3,17]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v40i16.38347","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2026,3,14]]}}}