{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:14:45Z","timestamp":1760148885714,"version":"build-2065373602"},"reference-count":33,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T00:00:00Z","timestamp":1687219200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In recent years, point cloud-based 3D object detection has seen tremendous success. Previous point-based methods use Set Abstraction (SA) to sample the key points and abstract their features, which did not fully take density variation into consideration in point sampling and feature extraction. The SA module can be split into three parts: point sampling, grouping and feature extraction. Previous sampling methods focus more on distances among points in Euclidean space or feature space, ignoring the point density, thus making it more likely to sample points in Ground Truth (GT) containing dense points. Furthermore, the feature extraction module takes the relative coordinates and point features as input, while raw point coordinates can represent more informative attributes, i.e., point density and direction angle. So, this paper proposes Density-aware Semantics-Augmented Set Abstraction (DSASA) for solving the above two issues, which takes a deep look at the point density in the sampling process and enhances point features using onefold raw point coordinates. We conduct the experiments on the KITTI dataset and verify the superiority of DSASA.<\/jats:p>","DOI":"10.3390\/s23125757","type":"journal-article","created":{"date-parts":[[2023,6,21]],"date-time":"2023-06-21T02:30:51Z","timestamp":1687314651000},"page":"5757","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Boosting 3D Object Detection with Density-Aware Semantics-Augmented Set Abstraction"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2277-8871","authenticated-orcid":false,"given":"Tingyu","family":"Zhang","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"},{"name":"Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7701-8511","authenticated-orcid":false,"given":"Jian","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"},{"name":"Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2285-3851","authenticated-orcid":false,"given":"Xinyu","family":"Yang","sequence":"additional","affiliation":[{"name":"China Automotive Innovation Corporation, Nanjing 210000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,6,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Graham, B., and Van der Maaten, L. (2017). Submanifold sparse convolutional networks. arXiv.","DOI":"10.1109\/CVPR.2018.00961"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Yan, Y., Mao, Y., and Li, B. (2018). Second: Sparsely embedded convolutional detection. Sensors, 18.","DOI":"10.3390\/s18103337"},{"key":"ref_3","unstructured":"Qi, C.R., Su, H., Mo, K., and Guibas, L.J. (2017, January 21\u201326). Pointnet: Deep learning on point sets for 3d classification and segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA."},{"key":"ref_4","first-page":"5099","article-title":"Pointnet++: Deep hierarchical feature learning on point sets in a metric space","volume":"30","author":"Qi","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_5","first-page":"23192","article-title":"Pointnext: Revisiting pointnet++ with improved training and scaling strategies","volume":"35","author":"Qian","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Shi, S., Guo, C., Jiang, L., Wang, Z., Shi, J., Wang, X., and Li, H. (2020, January 13\u201319). Pv-rcnn: Point-voxel feature set abstraction for 3d object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01054"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1007\/s11263-022-01710-9","article-title":"PV-RCNN++: Point-voxel feature set abstraction with local vector representation for 3D object detection","volume":"131","author":"Shi","year":"2023","journal-title":"Int. J. Comput. Vis."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Shi, S., Wang, X., and Li, H. (2019, January 15\u201320). Pointrcnn: 3d object proposal generation and detection from point cloud. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00086"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Yang, Z., Sun, Y., Liu, S., and Jia, J. (2020, January 13\u201319). 3dssd: Point-based 3d single stage object detector. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01105"},{"key":"ref_10","unstructured":"Chen, C., Chen, Z., Zhang, J., and Tao, D. (March, January 22). Sasa: Semantics-augmented set abstraction for point-based 3d object detection. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Hu, Q., Xu, G., Ma, Y., Wan, J., and Guo, Y. (2022, January 18\u201324). Not all points are equal: Learning highly efficient point-based detectors for 3d lidar point clouds. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01838"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for autonomous driving? The kitti vision benchmark suite. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhou, Y., and Tuzel, O. (2018, January 18\u201323). Voxelnet: End-to-end learning for point cloud based 3d object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00472"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., and Beijbom, O. (2019, January 15\u201320). Pointpillars: Fast encoders for object detection from point clouds. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01298"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Mao, J., Xue, Y., Niu, M., Bai, H., Feng, J., Liang, X., Xu, H., and Xu, C. (2021, January 11\u201317). Voxel transformer for 3d object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00315"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"He, C., Li, R., Li, S., and Zhang, L. (2022, January 18\u201324). Voxel set transformer: A set-to-set approach to 3d object detection from point clouds. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00823"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Sheng, H., Cai, S., Liu, Y., Deng, B., Huang, J., Hua, X.S., and Zhao, M.J. (2021, January 11\u201317). Improving 3d object detection with channel-wise transformer. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00274"},{"key":"ref_18","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Fan, L., Pang, Z., Zhang, T., Wang, Y.X., Zhao, H., Wang, F., Wang, N., and Zhang, Z. (2022, January 18\u201324). Embracing single stride 3d object detector with sparse transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00827"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Shi, W., and Rajkumar, R. (2020, January 13\u201319). Point-gnn: Graph neural network for 3d object detection in a point cloud. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00178"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"108524","DOI":"10.1016\/j.patcog.2022.108524","article-title":"BADet: Boundary-aware 3D object detection from point clouds","volume":"125","author":"Qian","year":"2022","journal-title":"Pattern Recognit."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Guan, T., Wang, J., Lan, S., Chandra, R., Wu, Z., Davis, L., and Manocha, D. (2022, January 3\u20138). M3detr: Multi-representation, multi-scale, mutual-relation 3d object detection with transformers. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV51458.2022.00235"},{"key":"ref_23","unstructured":"Hu, J.S., Kuai, T., and Waslander, S.L. (2022, January 18\u201324). Point density-aware voxels for lidar 3d object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1214\/aoms\/1177704472","article-title":"On estimation of a probability density function and mode","volume":"33","author":"Parzen","year":"1962","journal-title":"Ann. Math. Stat."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1214\/aoms\/1177728190","article-title":"Remarks on some nonparametric estimates of a density function","volume":"27","author":"Rosenblatt","year":"1956","journal-title":"Ann. Math. Stat."},{"key":"ref_26","unstructured":"Contributors, M. (2022, September 01). MMDetection3D: OpenMMLab Next-Generation Platform for General 3D Object Detection. 2020. Available online: https:\/\/github.com\/open-mmlab\/mmdetection3d."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Li, Z., Wang, F., and Wang, N. (2021, January 11\u201317). Lidar r-cnn: An efficient and universal 3d object detector. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/CVPR46437.2021.00746"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Chen, X., Ma, H., Wan, J., Li, B., and Xia, T. (2017, January 21\u201326). Multi-view 3d object detection network for autonomous driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.691"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Qi, C.R., Liu, W., Wu, C., Su, H., and Guibas, L.J. (2018, January 18\u201323). Frustum pointnets for 3d object detection from rgb-d data. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00102"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Chen, Y., Li, Y., Zhang, X., Sun, J., and Jia, J. (2022, January 18\u201324). Focal sparse convolutional networks for 3d object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00535"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wu, H., Wen, C., Shi, S., Li, X., and Wang, C. (2023). Virtual Sparse Convolution for Multimodal 3D Object Detection. arXiv.","DOI":"10.1109\/CVPR52729.2023.02074"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"He, C., Zeng, H., Huang, J., Hua, X.S., and Zhang, L. (2020, January 13\u201319). Structure aware single-stage 3d object detection from point cloud. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01189"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Deng, J., Shi, S., Li, P., Zhou, W., Zhang, Y., and Li, H. (2021, January 4\u20137). Voxel r-cnn: Towards high performance voxel-based 3d object detection. Proceedings of the AAAI Conference on Artificial Intelligence, Online.","DOI":"10.1609\/aaai.v35i2.16207"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/12\/5757\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:57:15Z","timestamp":1760126235000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/12\/5757"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,20]]},"references-count":33,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2023,6]]}},"alternative-id":["s23125757"],"URL":"https:\/\/doi.org\/10.3390\/s23125757","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,6,20]]}}}