{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,27]],"date-time":"2025-09-27T00:07:30Z","timestamp":1758931650632,"version":"3.44.0"},"reference-count":54,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T00:00:00Z","timestamp":1758844800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Henan Science and Technology R&D Program Joint Fund Project","award":["225200810098"],"award-info":[{"award-number":["225200810098"]}]},{"name":"Key R&D and Promotion Projects of Henan Province","award":["242102211008"],"award-info":[{"award-number":["242102211008"]}]}],"content-domain":{"domain":["www.mdpi.com"],"crossmark-restriction":true},"short-container-title":["Information"],"abstract":"<jats:p>To address the limitation of receptive fields caused by the use of local convolutions in current point cloud object detection methods, this paper proposes a LiDAR point cloud object detection algorithm that integrates global features. The proposed method employs a Voxel Mapping Block (VMB) and a Global Feature Extraction Block (GFEB) to convert the point cloud data into a one-dimensional long sequence. It then utilizes non-local convolutions to model the entire voxelized point cloud and incorporate global contextual information, thereby enhancing the network\u2019s receptive field and its capability to extract and learn global features. Furthermore, a Voxel Channel Feature Extraction (VCFE) module is designed to capture local spatial information by associating features across different channels, effectively mitigating the spatial information loss introduced during the one-dimensional transformation. 
The experimental results demonstrate that, compared with state-of-the-art methods, the proposed approach improves the average precision of vehicle, pedestrian, and cyclist targets on the Waymo subset by 0.64%, 0.71%, and 0.66%, respectively. On the nuScenes dataset, the detection accuracy for car targets increased by 0.7%, with NDS and mAP improving by 0.3% and 0.5%, respectively. In particular, the method exhibits outstanding performance in small object detection, significantly enhancing the overall accuracy of point cloud object detection.<\/jats:p>","DOI":"10.3390\/info16100832","type":"journal-article","created":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T11:28:32Z","timestamp":1758886112000},"page":"832","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Boosting LiDAR Point Cloud Object Detection via Global Feature Fusion"],"prefix":"10.3390","volume":"16","author":[{"given":"Xu","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Zhengzhou University of Light Industry, Zhengzhou 450000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fengchang","family":"Tian","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Zhengzhou University of Light Industry, Zhengzhou 450000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiaxing","family":"Sun","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Zhengzhou University of Light Industry, Zhengzhou 450000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yan","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Zhengzhou University of Light Industry, Zhengzhou 450000, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"127608","DOI":"10.1016\/j.eswa.2025.127608","article-title":"PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection","volume":"281","author":"Li","year":"2025","journal-title":"Expert Syst. Appl."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"104155","DOI":"10.1016\/j.ipm.2025.104155","article-title":"VoxT-GNN: A 3D Object Detection Approach from Point Cloud Based on Voxel-Level Transformer and Graph Neural Network","volume":"62","author":"Zheng","year":"2025","journal-title":"Inf. Process. Manag."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"10652","DOI":"10.1109\/TITS.2024.3420432","article-title":"TransFusion: Transformer-Based Multi-Modal Fusion for 3D Object Detection in Foggy Weather Based on Spatial Vision Transformer","volume":"25","author":"Zhang","year":"2024","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_4","unstructured":"Gu, A., and Dao, T. (2023). Mamba: Linear-time sequence modeling with selective state spaces. arXiv."},{"key":"ref_5","unstructured":"Qi, C.R., Yi, L., Su, H., and Guibas, L.J. (2017, January 4\u20139). PointNet++: Deep hierarchical feature learning on point sets in a metric space. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Choy, C., Gwak, J.Y., and Savarese, S. (2019, January 15\u201320). 4D spatio-temporal convnets: Minkowski convolutional neural networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00319"},{"key":"ref_7","unstructured":"Deng, J., Zhang, S., Dayoub, F., Ouyang, W., Zhang, Y., and Reid, I. (2024). 
PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Wang, Y., Cui, Y., and Chau, L.-P. (2025). 3DGeoDet: General-purpose geometry-aware image-based 3D object detection. arXiv.","DOI":"10.1109\/TMM.2025.3581780"},{"key":"ref_9","first-page":"3039","article-title":"State-space models","volume":"4","author":"Hamilton","year":"1994","journal-title":"Handb. Econom."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Yang, Z., Sun, Y., Liu, S., and Jia, J. (2020, January 13\u201319). 3dssd: Point-based 3d single stage object detector. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01105"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Hu, Q., Xu, G., Ma, Y., Wan, J., and Guo, Y. (2022, January 18\u201324). Not all points are equal: Learning highly efficient point-based detectors for 3d lidar point clouds. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01838"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Shi, S., Wang, X., and Li, H. (2019, January 15\u201320). Pointrcnn: 3d object proposal generation and detection from point cloud. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00086"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Graham, B., Engelcke, M., and Van Der Maaten, L. (2018, January 18\u201323). 3D semantic segmentation with submanifold sparse convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00961"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Graham, B., and Van der Maaten, L. (2017). 
Submanifold sparse convolutional networks. arXiv.","DOI":"10.1109\/CVPR.2018.00961"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., and Beijbom, O. (2019, January 15\u201320). Pointpillars: Fast encoders for object detection from point clouds. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01298"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Yan, Y., Mao, Y., and Li, B. (2018). Second: Sparsely embedded convolutional detection. Sensors, 18.","DOI":"10.3390\/s18103337"},{"key":"ref_17","first-page":"3555","article-title":"Cia-ssd: Confident iou-aware single-stage object detector from point cloud","volume":"35","author":"Zheng","year":"2021","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TIM.2022.3216413","article-title":"Voxel-RCNN-Complex: An effective 3D point cloud object detector for complex traffic conditions","volume":"71","author":"Wang","year":"2022","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_19","first-page":"221","article-title":"3D Object Detection in LiDAR Point Clouds Fusing Point Attention Mechanism","volume":"52","author":"Liu","year":"2023","journal-title":"Acta Photonica Sin."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Chen, Y., Liu, J., Zhang, X., Qi, X., and Jia, J. (2023, January 17\u201324). Voxelnext: Fully sparse voxelnet for 3d object detection and tracking. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02076"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2795","DOI":"10.1109\/TMM.2023.3304054","article-title":"SP-Det: Leveraging Saliency Prediction for Voxel-based 3D Object Detection in Sparse Point Cloud","volume":"26","author":"An","year":"2024","journal-title":"IEEE Trans. Multimed."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yin, T., Zhou, X., and Krahenbuhl, P. (2021, January 20\u201325). Center-based 3d object detection and tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01161"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wang, H., Shi, C., Shi, S., Lei, M., Wang, S., He, D., Schiele, B., and Wang, L. (2023, January 17\u201324). Dsvt: Dynamic sparse voxel transformer with rotated sets. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01299"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Shi, S., Guo, C., Jiang, L., Wang, Z., Shi, J., Wang, X., and Li, H. (2020, January 13\u201319). Pv-rcnn: Point-voxel feature set abstraction for 3d object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01054"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1007\/s11263-022-01710-9","article-title":"PV-RCNN++: Point-voxel feature set abstraction with local vector representation for 3D object detection","volume":"131","author":"Shi","year":"2023","journal-title":"Int. J. Comput. Vis."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Sheng, H., Cai, S., Liu, Y., Deng, B., Huang, J., Hua, X.S., and Zhao, M.J. (2021, January 10\u201317). 
Improving 3d object detection with channel-wise transformer. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00274"},{"key":"ref_27","first-page":"2771","article-title":"Object Detection Network Fusing Point Cloud and Voxel Information","volume":"45","author":"Liu","year":"2024","journal-title":"Comput. Eng. Des."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yang, H., Wang, W., Chen, M., Lin, B., He, T., Chen, H., He, X., and Ouyang, W. (2023, January 17\u201324). Pvt-ssd: Single-stage 3d object detector with point-voxel transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01295"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"14894","DOI":"10.1109\/JSEN.2024.3380898","article-title":"PVC-SSD: Point-Voxel Dual-Channel Fusion with Cascade Point Estimation for Anchor-Free Single-Stage 3D Object Detection","volume":"24","author":"Deng","year":"2024","journal-title":"IEEE Sens. 
J."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2165256","DOI":"10.1080\/15481603.2023.2165256","article-title":"A shape-attention Pivot-Net for identifying central pivot irrigation systems from satellite images using a cloud computing platform: An application in the contiguous US","volume":"60","author":"Tian","year":"2023","journal-title":"GIScience Remote Sens."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3627","DOI":"10.1007\/s11760-024-03028-0","article-title":"PVA-GCN: Point-voxel absorbing graph convolutional network for 3D human pose estimation from monocular video","volume":"18","author":"Liu","year":"2024","journal-title":"Signal Image Video Process."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"112741","DOI":"10.1016\/j.knosys.2024.112741","article-title":"Mutual information-driven self-supervised point cloud pre-training","volume":"307","author":"Xu","year":"2025","journal-title":"Knowl.-Based Syst."},{"key":"ref_33","unstructured":"Spatial Sparse Convolution Library. Available online: https:\/\/github.com\/traveller59\/spconv."},{"key":"ref_34","unstructured":"Gu, A., Goel, K., and R\u00e9, C. (2021). Efficiently modeling long sequences with structured state spaces. arXiv."},{"key":"ref_35","first-page":"572","article-title":"Combining recurrent, convolutional, and continuous-time models with linear state space layers","volume":"34","author":"Gu","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_36","unstructured":"Smith, J.T.H., Warrington, A., and Linderman, S.W. (2022). Simplified state space layers for sequence modeling. arXiv."},{"key":"ref_37","unstructured":"Zhu, L., Liao, B., Zhang, Q., Wang, X., Liu, W., and Wang, X. (2024). Vision Mamba: Efficient visual representation learning with bidirectional state space model. 
arXiv."},{"key":"ref_38","first-page":"103031","article-title":"Vmamba: Visual state space model","volume":"37","author":"Liu","year":"2024","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Yu, W., and Wang, X. (2024). Mambaout: Do we really need mamba for vision?. arXiv.","DOI":"10.1109\/CVPR52734.2025.00423"},{"key":"ref_40","unstructured":"Zhou, Y., Sun, P., Zhang, Y., Anguelov, D., Gao, J., Guo, J., Ngiam, J., and Vasudevan, V. (2020, January 16\u201318). End-to-end multi-view fusion for 3d object detection in lidar point clouds. Proceedings of the Conference on Robot Learning, Virtual."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., Krishnan, A., Pan, Y., Baldan, G., and Beijbom, O. (2020, January 13\u201319). nuscenes: A multimodal dataset for autonomous driving. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01164"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Sun, P., Kretzschmar, H., Dotiwalla, X., Chouard, A., Patnaik, V., Tsui, P., Guo, J., Zhou, Y., Chai, Y., and Caine, B. (2020, January 13\u201319). Scalability in perception for autonomous driving: Waymo open dataset. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00252"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Liu, Y.X., Pan, X., and Zhu, J. (2024, January 21\u201323). 3D Pedestrian Detection Based on Pointpillar: SelfAttention-pointpillar. 
Proceedings of the 2024 9th International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), Okinawa, Japan.","DOI":"10.1109\/ICIIBMS62405.2024.10792799"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Sheng, H., Cai, S., Zhao, N., Deng, B., Huang, J., Hua, X.-S., Zhao, M.-J., and Lee, G.H. (2022). Rethinking IoU-based optimization for single-stage 3D object detection. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-031-20077-9_32"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Ming, Q., Miao, L., Ma, Z., Zhao, L., Zhou, Z., Huang, X., Chen, Y., and Guo, Y. (2023, January 17\u201324). Deep dive into gradients: Better optimization for 3D object detection with gradient-corrected IoU supervision. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00497"},{"key":"ref_46","first-page":"2647","article-title":"From points to parts: 3d object detection from point cloud with part-aware and part-aggregation network","volume":"43","author":"Shi","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Guan, T., Wang, J., Lan, S., Chandra, R., Wu, Z., Davis, L., and Manocha, D. (2022, January 3\u20138). M3detr: Multi-representation, multi-scale, mutual-relation 3d object detection with transformers. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV51458.2022.00235"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Kim, Y., Shin, J., Kim, S., Lee, I.-J., Choi, J.W., and Kum, D. (2023, January 2\u20136). CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV 2023), Paris, France.","DOI":"10.1109\/ICCV51070.2023.01615"},{"key":"ref_49","first-page":"1160","article-title":"Craft: Camera-radar 3d object detection with spatio-contextual fusion transformer","volume":"37","author":"Kim","year":"2023","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Lin, Z., Liu, Z., Xia, Z., Wang, X., Wang, Y., Qi, S., Dong, Y., Dong, N., Zhang, L., and Zhu, C. (2024, January 16\u201322). Rcbevdet: Radar-camera fusion in bird\u2019s eye view for 3d object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.01414"},{"key":"ref_51","unstructured":"Lin, Z., Liu, Z., Wang, Y., Zhang, L., and Zhu, C. (2024). RCBEVDet++: Toward high-accuracy radar-camera fusion 3D perception network. arXiv."},{"key":"ref_52","first-page":"221","article-title":"Sasas: Semantics-augmented set abstraction for point-based 3d object detection","volume":"36","author":"Chen","year":"2022","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Bai, X., Hu, Z., Zhu, X., Huang, Q., Chen, Y., Fu, H., and Tai, C.-L. (2022, January 18\u201324). Transfusion: Robust lidar-camera fusion for 3d object detection with transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00116"},{"key":"ref_54","first-page":"34899","article-title":"Fully convolutional one-stage 3d object detection on lidar range images","volume":"35","author":"Tian","year":"2022","journal-title":"Adv. Neural Inf. Process. 
Syst."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/10\/832\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T11:35:30Z","timestamp":1758886530000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/10\/832"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,26]]},"references-count":54,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["info16100832"],"URL":"https:\/\/doi.org\/10.3390\/info16100832","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,26]]}}}