{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:26:41Z","timestamp":1740122801292,"version":"3.37.3"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T00:00:00Z","timestamp":1707436800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T00:00:00Z","timestamp":1707436800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62106086"],"award-info":[{"award-number":["62106086"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100003819","name":"Natural Science Foundation of Hubei Province","doi-asserted-by":"crossref","award":["2021CFB564"],"award-info":[{"award-number":["2021CFB564"]}],"id":[{"id":"10.13039\/501100003819","id-type":"DOI","asserted-by":"crossref"}]},{"name":"the Research Fund of Jianghan University","award":["2021yb060"],"award-info":[{"award-number":["2021yb060"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Process Lett"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Recently, 3D object detection technology based on point clouds has developed rapidly. However, too few points of distant and occluded objects are scanned by the sensor, and thus these objects suffer from too insufficient features to be detected. This case damages the detection accuracy. Therefore, we constitute a novel 3D object detection with Context-aware and dimensional Interaction Attention Network (CIANet) to explore vital geometric cues for enriching the feature representation of the object, thus boosting the overall detection performance. Specifically, in the first stage, we employ the 3D sparse convolution to extract voxel features, and then construct a Channel-Spatial Hybrid Attention (CSHA) module and a Contextual Self-Attention (CSA) module to enhance voxel features for generating proposals. The CSHA module aims to enhance the key information of the channel and spatial domains of 2D Bird\u2019s Eye View (BEV) features, and the CSA module is applied to supplement contextual information to the enhanced BEV features, thus generating accurate proposals. In the second stage, we construct a Dimensional Interaction Attention (DIA) module to refine Region of Interest (RoI) features within the proposals. It enhances the interactions among the channel and spatial dimensions of RoI features to learn accurate boundaries of objects for proposal refinement. Extensive experiments on the KITTI and Waymo benchmarks show the superior detection performance of CIANet compared to recent methods, especially for objects such as pedestrians and cyclists.<\/jats:p>","DOI":"10.1007\/s11063-024-11447-w","type":"journal-article","created":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T09:02:32Z","timestamp":1707469352000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Improving 3D Object Detection with Context-Aware and Dimensional Interaction Attention"],"prefix":"10.1007","volume":"56","author":[{"given":"Jing","family":"Zhou","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zixin","family":"Gong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junchi","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,2,9]]},"reference":[{"issue":"6","key":"11447_CR1","doi-asserted-by":"publisher","first-page":"5063","DOI":"10.1007\/s11063-022-10848-z","volume":"54","author":"B Liu","year":"2022","unstructured":"Liu B, Tian B, Wang H, Qiao J, Wang Z (2022) Fusenet: 3d object detection network with fused information for lidar point clouds. Neural Process Lett 54(6):5063\u20135078","journal-title":"Neural Process Lett"},{"key":"11447_CR2","doi-asserted-by":"crossref","unstructured":"Chen X, Ma H, Wan J, Li B, Xia T (2017) Multi-view 3d object detection network for autonomous driving. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1907\u20131915","DOI":"10.1109\/CVPR.2017.691"},{"key":"11447_CR3","doi-asserted-by":"crossref","unstructured":"Yang B, Luo W, Urtasun R (2018) Pixor: real-time 3d object detection from point clouds. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7652\u20137660","DOI":"10.1109\/CVPR.2018.00798"},{"key":"11447_CR4","unstructured":"Zhou Y, Sun P, Zhang Y, Anguelov D, Gao J, Ouyang T, Guo J, Ngiam J, Vasudevan V (2020) End-to-end multi-view fusion for 3d object detection in lidar point clouds. In: Conference on robot learning. PMLR, pp 923\u2013932"},{"key":"11447_CR5","doi-asserted-by":"crossref","unstructured":"Shi S, Wang X, Li H (2019) Pointrcnn: 3d object proposal generation and detection from point cloud. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 770\u2013779","DOI":"10.1109\/CVPR.2019.00086"},{"key":"11447_CR6","doi-asserted-by":"crossref","unstructured":"Yang Z, Sun Y, Liu S, Jia J (2020) 3dssd: point-based 3d single stage object detector. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 11040\u201311048","DOI":"10.1109\/CVPR42600.2020.01105"},{"key":"11447_CR7","doi-asserted-by":"crossref","unstructured":"Shi W, Rajkumar R (2020) Point-gnn: graph neural network for 3d object detection in a point cloud. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 1711\u20131719","DOI":"10.1109\/CVPR42600.2020.00178"},{"key":"11447_CR8","unstructured":"Li J, Luo S, Zhu Z, Dai H, Krylov AS, Ding Y, Shao L (2020) 3d iou-net: Iou guided 3d object detector for point clouds. arXiv:2004.04962"},{"issue":"10","key":"11447_CR9","doi-asserted-by":"publisher","first-page":"3337","DOI":"10.3390\/s18103337","volume":"18","author":"Y Yan","year":"2018","unstructured":"Yan Y, Mao Y, Li B (2018) Second: sparsely embedded convolutional detection. Sensors 18(10):3337","journal-title":"Sensors"},{"key":"11447_CR10","doi-asserted-by":"crossref","unstructured":"Lang AH, Vora S, Caesar H, Zhou L, Yang J, Beijbom O (2019) Pointpillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 12697\u201312705","DOI":"10.1109\/CVPR.2019.01298"},{"issue":"8","key":"11447_CR11","first-page":"2647","volume":"43","author":"S Shi","year":"2020","unstructured":"Shi S, Wang Z, Shi J, Wang X, Li H (2020) From points to parts: 3d object detection from point cloud with part-aware and part-aggregation network. IEEE Trans Pattern Anal Mach Intell 43(8):2647\u20132664","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"2","key":"11447_CR12","first-page":"1201","volume":"35","author":"J Deng","year":"2021","unstructured":"Deng J, Shi S, Li P, Zhou W, Zhang Y, Li H (2021) Voxel r-cnn: towards high performance voxel-based 3d object detection. Proc AAAI Conf Artif Intell 35(2):1201\u20131209","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"11447_CR13","unstructured":"Qi CR, Su H, Mo K, Guibas LJ (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 652\u2013660"},{"key":"11447_CR14","unstructured":"Qi CR, Yi L, Su H, Guibas LJ (2017) \u201cPointnet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in neural information processing systems, vol 30"},{"key":"11447_CR15","doi-asserted-by":"crossref","unstructured":"Shi S, Guo C, Jiang L, Wang Z, Shi J, Wang X, Li H (2020) Pv-rcnn: point-voxel feature set abstraction for 3d object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 10529\u201310538","DOI":"10.1109\/CVPR42600.2020.01054"},{"key":"11447_CR16","doi-asserted-by":"crossref","unstructured":"Yang Z, Sun Y, Liu S, Shen X, Jia J (2019) Std: sparse-to-dense 3d object detector for point cloud. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 1951\u20131960","DOI":"10.1109\/ICCV.2019.00204"},{"key":"11447_CR17","doi-asserted-by":"crossref","unstructured":"Noh J, Lee S, Ham (2021) Hvpr: hybrid voxel-point representation for single-stage 3d object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 14605\u201314614","DOI":"10.1109\/CVPR46437.2021.01437"},{"key":"11447_CR18","doi-asserted-by":"crossref","unstructured":"Sheng H, Cai S, Liu Y, Deng B, Huang J, Hua X-S, Zhao M-J (2021) Improving 3d object detection with channel-wise transformer. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 2743\u20132752","DOI":"10.1109\/ICCV48922.2021.00274"},{"key":"11447_CR19","doi-asserted-by":"crossref","unstructured":"Rouhafzay G, Cretu A-M, Payeur P (2023) A deep model of visual attention for saliency detection on 3d objects. In: Neural processing letters, pp 1\u201321","DOI":"10.1007\/s11063-023-11180-w"},{"key":"11447_CR20","doi-asserted-by":"crossref","unstructured":"Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132\u20137141","DOI":"10.1109\/CVPR.2018.00745"},{"key":"11447_CR21","doi-asserted-by":"crossref","unstructured":"Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3\u201319","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"11447_CR22","doi-asserted-by":"crossref","unstructured":"Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) Eca-net: efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 11534\u201311542","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"11447_CR23","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, vol 30"},{"issue":"07","key":"11447_CR24","first-page":"11677","volume":"34","author":"Z Liu","year":"2020","unstructured":"Liu Z, Zhao X, Huang T, Hu R, Zhou Y, Bai X (2020) Tanet: robust 3d object detection from point clouds with triple attention. Proc AAAI Conf Artif Intell 34(07):11677\u201311684","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"11447_CR25","doi-asserted-by":"crossref","unstructured":"Bhattacharyya P, Huang C, Czarnecki K (2021) Sa-det3d: self-attention based context-aware 3d object detection. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 3022\u20133031","DOI":"10.1109\/ICCVW54120.2021.00337"},{"key":"11447_CR26","doi-asserted-by":"crossref","unstructured":"Pan X, Xia Z, Song S, Li Le, Huang G (2021) 3d object detection with pointformer. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 7463\u20137472","DOI":"10.1109\/CVPR46437.2021.00738"},{"key":"11447_CR27","doi-asserted-by":"crossref","unstructured":"Mao J, Xue Y, Niu M, Bai H, Feng J, Liang X, Xu H, Xu C (2021) Voxel transformer for 3d object detection. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 3164\u20133173","DOI":"10.1109\/ICCV48922.2021.00315"},{"key":"11447_CR28","unstructured":"Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, vol 28"},{"key":"11447_CR29","doi-asserted-by":"crossref","unstructured":"Lin T-Y, Goyal P, Girshick R, He K, Doll\u00e1r P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp 2980\u20132988","DOI":"10.1109\/ICCV.2017.324"},{"key":"11447_CR30","doi-asserted-by":"crossref","unstructured":"Sun P, Kretzschmar H, Dotiwalla X, Chouard A, Patnaik V, Tsui P, Guo J, Zhou Y, Chai Y, Caine B, et al. (2020) Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 2446\u20132454","DOI":"10.1109\/CVPR42600.2020.00252"},{"key":"11447_CR31","doi-asserted-by":"crossref","unstructured":"Sheng H, Cai S, Zhao N, Deng B, Huang J, Hua X-S, Zhao M-J, Lee GH (2022) Rethinking iou-based optimization for single-stage 3d object detection. In: European conference on computer vision. Springer, pp 544\u2013561","DOI":"10.1007\/978-3-031-20077-9_32"},{"key":"11447_CR32","doi-asserted-by":"crossref","unstructured":"He C, Li R, Li S, Zhang L (2022) Voxel set transformer: a set-to-set approach to 3d object detection from point clouds. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 8417\u20138427","DOI":"10.1109\/CVPR52688.2022.00823"},{"key":"11447_CR33","doi-asserted-by":"crossref","unstructured":"Zhou C, Zhang Y, Chen J, Huang D (2023) Octr: octree-based transformer for 3d object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 5166\u20135175","DOI":"10.1109\/CVPR52729.2023.00500"},{"key":"11447_CR34","doi-asserted-by":"crossref","unstructured":"Jiang T, Song N, Liu H, Yin R, Gong Y, Yao J (2021) Vic-net: voxelization information compensation network for point cloud 3d object detection. In: 2021 IEEE international conference on robotics and automation (ICRA). IEEE, pp 13408\u201313414","DOI":"10.1109\/ICRA48506.2021.9561597"},{"key":"11447_CR35","doi-asserted-by":"crossref","unstructured":"Zheng W, Tang W, Jiang L, Fu C-W (2021) Se-ssd: self-ensembling single-stage object detector from point cloud. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 14494\u201314503","DOI":"10.1109\/CVPR46437.2021.01426"},{"key":"11447_CR36","doi-asserted-by":"crossref","unstructured":"Zhang Y, Hu Q, Xu G, Ma Y, Wan J, Guo Y (2022) Not all points are equal: Learning highly efficient point-based detectors for 3d lidar point clouds. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 18953\u201318962","DOI":"10.1109\/CVPR52688.2022.01838"},{"key":"11447_CR37","doi-asserted-by":"crossref","unstructured":"Guan T, Wang J, Lan S, Chandra R, Wu Z, Davis L, Manocha D (2022) M3detr: multi-representation, multi-scale, mutual-relation 3d object detection with transformers. In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision, pp 772\u2013782","DOI":"10.1109\/WACV51458.2022.00235"},{"key":"11447_CR38","doi-asserted-by":"crossref","unstructured":"Liu Z, Huang T, Li B, Chen X, Wang X, Bai X (2022) Epnet++: cascade bi-directional fusion for multi-modal 3d object detection. In: IEEE transactions on pattern analysis and machine intelligence (2022)","DOI":"10.1109\/TPAMI.2022.3228806"},{"key":"11447_CR39","doi-asserted-by":"crossref","unstructured":"Li Z, Wang F, Wang N (2021) Lidar r-cnn: an efficient and universal 3d object detector. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 7546\u20137555","DOI":"10.1109\/CVPR46437.2021.00746"},{"key":"11447_CR40","doi-asserted-by":"crossref","unstructured":"Yang Z, Zhou Y, Chen Z, Ngiam J (2021) 3d-man: 3d multi-frame attention network for object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 1863\u20131872","DOI":"10.1109\/CVPR46437.2021.00190"}],"container-title":["Neural Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11447-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11063-024-11447-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11447-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,29]],"date-time":"2024-02-29T20:11:42Z","timestamp":1709237502000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11063-024-11447-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,9]]},"references-count":40,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,2]]}},"alternative-id":["11447"],"URL":"https:\/\/doi.org\/10.1007\/s11063-024-11447-w","relation":{},"ISSN":["1573-773X"],"issn-type":[{"type":"electronic","value":"1573-773X"}],"subject":[],"published":{"date-parts":[[2024,2,9]]},"assertion":[{"value":"25 December 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 February 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"All data generated or analyzed during this study are included in this article. The datasets generated during and\/or analyzed during the current study are available from the corresponding author on reasonable request.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Data Availability"}}],"article-number":"23"}}