{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,24]],"date-time":"2026-07-24T15:17:20Z","timestamp":1784906240020,"version":"3.55.0"},"reference-count":40,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2019,9,22]],"date-time":"2019-09-22T00:00:00Z","timestamp":1569110400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Three-dimensional (3D) object detection is an important research in 3D computer vision with significant applications in many fields, such as automatic driving, robotics, and human\u2013computer interaction. However, the low precision is an urgent problem in the field of 3D object detection. To solve it, we present a framework for 3D object detection in point cloud. To be specific, a designed Backbone Network is used to make fusion of low-level features and high-level features, which makes full use of various information advantages. Moreover, the two-dimensional (2D) Generalized Intersection over Union is extended to 3D use as part of the loss function in our framework. Empirical experiments of Car, Cyclist, and Pedestrian detection have been conducted respectively on the KITTI benchmark. Experimental results with average precision (AP) have shown the effectiveness of the proposed network.<\/jats:p>","DOI":"10.3390\/s19194093","type":"journal-article","created":{"date-parts":[[2019,9,23]],"date-time":"2019-09-23T03:26:32Z","timestamp":1569209192000},"page":"4093","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":45,"title":["3D-GIoU: 3D Generalized Intersection over Union for Object Detection in Point Cloud"],"prefix":"10.3390","volume":"19","author":[{"given":"Jun","family":"Xu","sequence":"first","affiliation":[{"name":"College of Information Science and Engineering, Hunan University, Changsha 410082, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4752-3370","authenticated-orcid":false,"given":"Yanxin","family":"Ma","sequence":"additional","affiliation":[{"name":"College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Songhua","family":"He","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Hunan University, Changsha 410082, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6296-2307","authenticated-orcid":false,"given":"Jiahua","family":"Zhu","sequence":"additional","affiliation":[{"name":"College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2019,9,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1109\/TIM.2018.2840598","article-title":"Binary Volumetric Convolutional Neural Networks for 3-D Object Recognition","volume":"68","author":"Chao","year":"2019","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1169","DOI":"10.1109\/TMM.2018.2875512","article-title":"Learning Multi-view Representation with LSTM for 3D Shape Recognition and Retrieval","volume":"21","author":"Chao","year":"2019","journal-title":"IEEE Trans. Multimed."},{"key":"ref_3","unstructured":"Ankit, K., Ozan, I., Peter, O., Mohit, I., James, B., Ishaan, G., Victor, Z., Romain, P., and Richard, S. (2015). Ask Me Anything Dynamic Memory Networks for Natural Language Processing. arXiv."},{"key":"ref_4","unstructured":"Alexis, C., Holger, S., Yann Le, C., and Lo\u00efc, B. (2018). Deep Convolutional Networks for Natural Language Processing. arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chen, X., Ma, H., Wan, J., Li, B., and Xia, T. (2017, January 21\u201326). Multi-View 3D Object Detection Network for Autonomous Driving. Proceedings of the IEEE Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.691"},{"key":"ref_6","unstructured":"Chen, X., Kundu, K., Zhang, Z., Ma, H., Fidler, S., and Urtasun, R. (July, January 26). Monocular 3d Object Detection for Autonomous Driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Li, B., Ouyang, W., Sheng, L., Zeng, X., and Wang, X. (2019, January 15\u201321). GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00111"},{"key":"ref_8","unstructured":"Guan, P., and Ulrich, N. (2016, January 4\u20138). 3D Point Cloud Object Detection with Multi-View Convolutional Neural Network. Proceedings of the IEEE Conference on International Conference on Pattern Recognition (ICPR), Cancun, Mexico."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3434","DOI":"10.1109\/LRA.2018.2852843","article-title":"RT3D: Real-Time 3-D Vehicle Detection in LiDAR Point Cloud for Autonomous Driving","volume":"3","author":"Zeng","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_10","unstructured":"Fran\u00e7ois, P., Francis, C., and Roland, S. (2015). A Review of Point Cloud Registration Algorithms for Mobile Robotics, Foundations and Trends\u00ae in Robotics, Mike Casey."},{"key":"ref_11","unstructured":"Boyoon, J., and Sukhatme, G.S. (2004, January 10\u201313). Detecting Moving Objects Using a Single Camera on a Mobile Robot in an Outdoor Environment. Proceedings of the 8th Conference on Intelligent Autonomous Systems, Amsterdam, The Netherlands."},{"key":"ref_12","first-page":"313","article-title":"A Study of Challenging Issues on Video Surveillance System for Object Detection","volume":"4","author":"Lavanya","year":"2017","journal-title":"J. Basic Appl. Eng. Res."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1419","DOI":"10.1109\/TSMC.2018.2830099","article-title":"Efficient Deep CNN-Based Fire Detection and Localization in Video Surveillance Applications","volume":"49","author":"Khan","year":"2019","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_14","unstructured":"Cheng-bin, J., Shengzhe, L., Trung, D.D., and Hakil, K. (2015). Real-Time Human Action Recognition Using CNN Over Temporal Images for Static Video Surveillance Cameras. Advances in Multimedia Information Processing\u2014PCM 2015, Springer."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Qi, C.R., Liu, W., Wu, C., Su, H., and Guibas, L.J. (2018, January 18\u201322). Frustum PointNets for 3D Object Detection from RGB-D Data. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00102"},{"key":"ref_16","unstructured":"(2018, April 28). Kitti 3D Object Detection Benchmark Leader Board. Available online: http:\/\/www.cvlibs.net\/datasets\/kitti\/eval_object.php?obj_benchmark=3d."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhou, Y., and Tuzel, O. (2018, January 18\u201322). VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00472"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yan, Y., Mao, Y., and Li, B. (2018). SECOND: Sparsely Embedded Convolutional Detection. Sensors, 18.","DOI":"10.3390\/s18103337"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Simon, M., Milz, S., Amende, K., and Gross, H.M. (2018, January 8\u201314). Complex-YOLO: An Euler-Region-Proposal for Real-Time 3D Object Detection on Point Clouds. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-11009-3_11"},{"key":"ref_20","unstructured":"Hamid, R., Nathan, T., JunYoung, G., Amir, S., Ian, R., and Silvio, S. (2019, January 16\u201320). Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yang, B., Luo, W., and Urtasun, R. (2018, January 18\u201322). PIXOR: Real-time 3D Object Detection from Point Clouds. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00798"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Li, B., Zhang, T., and Xia, T. (2016). Vehicle detection from 3D lidar using fully convolutional network. arXiv.","DOI":"10.15607\/RSS.2016.XII.042"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Engelcke, M., Rao, D., Wang, D.Z., Tong, C.H., and Posner, I. (June, January 29). Vote3deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989161"},{"key":"ref_24","unstructured":"Qi, C.R., Su, H., Mo, K., and Guibas, L.J. (2017, January 21\u201326). PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 24\u201327). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_26","unstructured":"Kiwoo, S., Youngwook Paul, K., and Masayoshi, T. (2018). RoarNet: A Robust 3D Object Detection based on RegiOn Approximation Refinement. arXiv."},{"key":"ref_27","unstructured":"Liu, W., Ji, R., and Li, S. (2015, January 7\u201312). Towards 3D Object Detection with Bimodal Deep Boltzmann Machines over RGBD Imagery. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA."},{"key":"ref_28","unstructured":"Zhuo, D., and Londin, J.L. (2017, January 21\u201326). Amodal Detection of 3D Objects: Inferring 3D Bounding Boxes from 2D Ones in RGB-Depth Images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA."},{"key":"ref_29","unstructured":"Qianhui, L., Huifang, M., Yue, W., Li, T., and Rong, X. (2015). 3D-SSD: Learning Hierarchical Features from RGB-D Images for Amodal 3D Object Detection. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Song, S., and Xiao, J. (2016, January 27\u201330). Deep Sliding Shapes for Amodal 3D Object Detection in Rgb-d Images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.94"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Ling, M., Yang, B., Wang, S., and Raquel, U. (2018, January 8\u201314). Deep Continuous Fusion for Multi-Sensor 3D Object Detection. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01270-0_39"},{"key":"ref_32","unstructured":"Huitl, R., Schroth, G., Hilsenbeck, S., Schweiger, F., and Steinbach, E. (October, January 30). TUMindoor: An Extensive Image and Point Cloud Dataset for Visual Indoor Localization and Mapping. Proceedings of the IEEE International Conference on Image Processing, Orlando, FL, USA."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Ku, J., Mozifian, M., Lee, J., Harakeh, A., and Waslander, S. (2018, January 1\u20135). Joint 3D Proposal Generation and Object Detection from View Aggregation. Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Madrid, Spain.","DOI":"10.1109\/IROS.2018.8594049"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Li, M., Hu, Y., Zhao, N., and Qian, Q. (2019). One-Stage Multi-Sensor Data Fusion Convolutional Neural Network for 3D Object Detection. Sensors, 19.","DOI":"10.3390\/s19061434"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Xu, J., Ma, Y., He, S., Zhu, J., Xiao, Y., and Zhang, J. (2019, January 11\u201313). PVFE: Point-Voxel Feature Encoders for 3D Object Detection. Proceedings of the IEEE International Conference on Signal, Information and Data Processing, Chongqing, China. accepted.","DOI":"10.1109\/ICSIDP47821.2019.9173478"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollar, P. (2017, January 22\u201329). Focal Loss for Dense Object Detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The Pascal Visual Object Classes (VOC) Challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_39","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_40","unstructured":"Alex, H., Sourabh, V., Holger, C., Zhou, L., Jiong, Y., and Oscar, B. (2019, January 16\u201320). PointPillars: Fast Encoders for Object Detection from Point Clouds. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/19\/4093\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:22:54Z","timestamp":1760188974000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/19\/4093"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,9,22]]},"references-count":40,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2019,10]]}},"alternative-id":["s19194093"],"URL":"https:\/\/doi.org\/10.3390\/s19194093","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,9,22]]}}}