{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:53:59Z","timestamp":1760144039632,"version":"build-2065373602"},"reference-count":40,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2024,3,15]],"date-time":"2024-03-15T00:00:00Z","timestamp":1710460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China under Grants","award":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"],"award-info":[{"award-number":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"]}]},{"name":"Research Project of SongShan Laboratory","award":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"],"award-info":[{"award-number":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"]}]},{"name":"Natural Science Basic Research Program of Shaanxi","award":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"],"award-info":[{"award-number":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"]}]},{"name":"Fund for Foreign Scholars in University Research and Teaching Programs","award":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"],"award-info":[{"award-number":["62101517","62176200","YYJC052022004","2022JC-45","111 Project"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>With the continuous emergence and development of 3D sensors in recent years, it has become increasingly convenient to collect point cloud data for 3D object detection tasks, such as the field of autonomous driving. 
However, existing methods face two problems that cannot be ignored: (1) The bird\u2019s eye view (BEV) is a widely used representation in 3D object detection; however, the BEV usually compresses dimensions by combining height, dimension, and channels, which makes feature extraction during feature fusion more difficult. (2) Light detection and ranging (LiDAR) has a much larger effective scanning depth, which causes the scanned sector to become sparse at greater depths and the point cloud data to be unevenly distributed. As a result, few features are available from the neighboring points around the key points of interest. This paper proposes the following solutions: (1) Multi-scale feature fusion for the BEV, composed of feature maps at different levels produced by Deep Layer Aggregation (DLA) and a feature fusion module. (2) A point completion network that improves the prediction results by completing the feature points inside the candidate boxes in the second stage, thereby strengthening their position features. Supervised contrastive learning is also applied to enhance the segmentation results, improving discrimination between the foreground and background. Experiments show that these additions achieve improvements of 2.7%, 2.4%, and 2.5% on the KITTI easy, moderate, and hard tasks, respectively. 
Further ablation experiments show that each addition has promising improvement over the baseline.<\/jats:p>","DOI":"10.3390\/rs16061045","type":"journal-article","created":{"date-parts":[[2024,3,15]],"date-time":"2024-03-15T12:02:39Z","timestamp":1710504159000},"page":"1045","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Three-Dimensional Point Cloud Object Detection Based on Feature Fusion and Enhancement"],"prefix":"10.3390","volume":"16","author":[{"given":"Yangyang","family":"Li","sequence":"first","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Joint International Research Laboratory of Intelligent Perception and Computation, International Research Center for Intelligent Perception and Computation, Collaborative Innovation Center of Quantum Information of Shaanxi Province, School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Zejun","family":"Ou","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Joint International Research Laboratory of Intelligent Perception and Computation, International Research Center for Intelligent Perception and Computation, Collaborative Innovation Center of Quantum Information of Shaanxi Province, School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2725-0918","authenticated-orcid":false,"given":"Guangyuan","family":"Liu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Joint International Research Laboratory of Intelligent Perception and Computation, International Research Center for Intelligent Perception and Computation, Collaborative Innovation Center of Quantum Information of Shaanxi Province, School of 
Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Zichen","family":"Yang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Joint International Research Laboratory of Intelligent Perception and Computation, International Research Center for Intelligent Perception and Computation, Collaborative Innovation Center of Quantum Information of Shaanxi Province, School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Yanqiao","family":"Chen","sequence":"additional","affiliation":[{"name":"The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang 050081, China"}]},{"given":"Ronghua","family":"Shang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Joint International Research Laboratory of Intelligent Perception and Computation, International Research Center for Intelligent Perception and Computation, Collaborative Innovation Center of Quantum Information of Shaanxi Province, School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Licheng","family":"Jiao","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Joint International Research Laboratory of Intelligent Perception and Computation, International Research Center for Intelligent Perception and Computation, Collaborative Innovation Center of Quantum Information of Shaanxi Province, School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,3,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3782","DOI":"10.1109\/TITS.2019.2892405","article-title":"A Survey on 3D Object Detection Methods for Autonomous Driving 
Applications","volume":"20","author":"Arnold","year":"2019","journal-title":"IEEE Trans. Intell. Transport. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Ku, J., Mozifian, M., Lee, J., Harakeh, A., and Waslander, S.L. (2018, January 1\u20135). Joint 3D Proposal Generation and Object Detection from View Aggregation. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8594049"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3434","DOI":"10.1109\/LRA.2018.2852843","article-title":"RT3D: Real-Time 3-D Vehicle Detection in LiDAR Point Cloud for Autonomous Driving","volume":"3","author":"Zeng","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Khanh, T.T., Hoang Hai, T., Nguyen, V., Nguyen, T.D.T., Thien Thu, N., and Huh, E.-N. (2020, January 3\u20135). The Practice of Cloud-Based Navigation System for Indoor Robot. Proceedings of the 2020 14th International Conference on Ubiquitous Information Management and Communication (IMCOM), Taichung, Taiwan.","DOI":"10.1109\/IMCOM48794.2020.9001709"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Yu, S.-L., Westfechtel, T., Hamada, R., Ohno, K., and Tadokoro, S. (2017, January 11\u201313). Vehicle Detection and Localization on Bird\u2019s Eye View Elevation Images Using Convolutional Neural Network. Proceedings of the 2017 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR), Shanghai, China.","DOI":"10.1109\/SSRR.2017.8088147"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Engelcke, M., Rao, D., Wang, D.Z., Tong, C.H., and Posner, I. (June, January 29). Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks. 
Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989161"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1007\/s11263-007-0095-3","article-title":"Robust Object Detection with Interleaved Categorization and Segmentation","volume":"77","author":"Leibe","year":"2008","journal-title":"Int. J. Comput. Vis."},{"key":"ref_8","first-page":"234","article-title":"U-Net: Convolutional Networks for Biomedical Image Segmentation","volume":"Volume 9351","author":"Navab","year":"2015","journal-title":"Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1007\/978-3-319-46448-0_2","article-title":"SSD: Single Shot MultiBox Detector","volume":"Volume 9905","author":"Leibe","year":"2016","journal-title":"Computer Vision\u2014ECCV 2016"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhou, Y., and Tuzel, O. (2018, January 18\u201323). VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00472"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, Faster, Stronger. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. 
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Beltran, J., Guindel, C., Moreno, F.M., Cruzado, D., Garcia, F., and De La Escalera, A. (2018, January 4\u20137). BirdNet: A 3D Object Detection Framework from LiDAR Information. Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA.","DOI":"10.1109\/ITSC.2018.8569311"},{"key":"ref_14","first-page":"197","article-title":"Complex-YOLO: An Euler-Region-Proposal for Real-Time 3D Object Detection on Point Clouds","volume":"Volume 11129","author":"Roth","year":"2019","journal-title":"Computer Vision\u2014ECCV 2018 Workshops"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Qi, C.R., Liu, W., Wu, C., Su, H., and Guibas, L.J. (2018, January 18\u201323). Frustum pointnets for 3d object detection from rgb-d data. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00102"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1007\/978-3-030-01270-0_39","article-title":"Deep Continuous Fusion for Multi-Sensor 3D Object Detection","volume":"Volume 11220","author":"Ferrari","year":"2018","journal-title":"Computer Vision\u2014ECCV 2018"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Mohapatra, S., Yogamani, S., Gotzig, H., Milz, S., and Mader, P. (2021, January 19\u201322). BEVDetNet: Bird\u2019s Eye View LiDAR Point Cloud Based Real-Time 3D Object Detection for Autonomous Driving. Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA.","DOI":"10.1109\/ITSC48978.2021.9564490"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Dollar, P., and Girshick, R. (2017, January 22\u201329). 
Mask R-CNN. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Graham, B., Engelcke, M., and Maaten, L.V.D. (2018, January 18\u201323). 3D Semantic Segmentation with Submanifold Sparse Convolutional Networks. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00961"},{"key":"ref_20","unstructured":"Dai, X., Li, M., Zhai, P., Tong, S., Gao, X., Huang, S., Zhu, Z., You, C., and Ma, Y. (2022). Revisiting Sparse Convolutional Model for Visual Recognition. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Chen, X., Ma, H., Wan, J., Li, B., and Xia, T. (2017, January 21\u201326). Multi-View 3D Object Detection Network for Autonomous Driving. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.691"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Charles, R.Q., Su, H., Kaichun, M., and Guibas, L.J. (2017, January 21\u201326). PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.16"},{"key":"ref_23","unstructured":"Qi, C.R., Yi, L., Su, H., and Guibas, L.J. (2017, January 4\u20139). Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Vora, S., Lang, A.H., Helou, B., and Beijbom, O. (2020, January 13\u201319). PointPainting: Sequential Fusion for 3D Object Detection. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00466"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Shi, S., Wang, X., and Li, H. (2019, January 15\u201320). PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00086"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Yang, B., Luo, W., and Urtasun, R. (2018, January 18\u201323). PIXOR: Real-Time 3D Object Detection from Point Clouds. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00798"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yan, Y., Mao, Y., and Li, B. (2018). SECOND: Sparsely Embedded Convolutional Detection. Sensors, 18.","DOI":"10.3390\/s18103337"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Yang, Z., Sun, Y., Liu, S., and Jia, J. (2020, January 13\u201319). 3DSSD: Point-Based 3D Single Stage Object Detector. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01105"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Qi, C.R., Litany, O., He, K., and Guibas, L. (November, January 27). Deep Hough Voting for 3D Object Detection in Point Clouds. 
Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00937"},{"key":"ref_31","first-page":"2647","article-title":"From Points to Parts: 3D Object Detection from Point Cloud with Part-Aware and Part-Aggregation Network","volume":"43","author":"Shi","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Qi, C.R., Chen, X., Litany, O., and Guibas, L.J. (2020, January 13\u201319). ImVoteNet: Boosting 3D Object Detection in Point Clouds With Image Votes. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00446"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Dettmers, T., Minervini, P., Stenetorp, P., and Riedel, S. (2018, January 2\u20137). Convolutional 2D Knowledge Graph Embeddings. Proceedings of the AAAI\u201918: AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.11573"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"3555","DOI":"10.1609\/aaai.v35i4.16470","article-title":"CIA-SSD: Confident IoU-Aware Single-Stage Object Detector From Point Cloud","volume":"35","author":"Zheng","year":"2021","journal-title":"AAAI"},{"key":"ref_35","unstructured":"Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. (2020, January 13\u201318). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, Virtual."},{"key":"ref_36","first-page":"18661","article-title":"Supervised contrastive learning","volume":"33","author":"Khosla","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Yuan, W., Khot, T., Held, D., Mertz, C., and Hebert, M. (2018, January 5\u20138). Pcn: Point completion network. 
Proceedings of the 2018 International Conference on 3D Vision (3DV), Verona, Italy.","DOI":"10.1109\/3DV.2018.00088"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1007\/s11263-014-0733-5","article-title":"The Pascal Visual Object Classes Challenge: A Retrospective","volume":"111","author":"Everingham","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., and Beijbom, O. (2019, January 15\u201320). PointPillars: Fast Encoders for Object Detection From Point Clouds. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01298"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/6\/1045\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:14:22Z","timestamp":1760105662000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/6\/1045"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,15]]},"references-count":40,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2024,3]]}},"alternative-id":["rs16061045"],"URL":"https:\/\/doi.org\/10.3390\/rs16061045","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2024,3,15]]}}}