{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,23]],"date-time":"2025-10-23T17:07:00Z","timestamp":1761239220756,"version":"3.41.2"},"reference-count":54,"publisher":"Emerald","issue":"1","license":[{"start":{"date-parts":[[2024,12,5]],"date-time":"2024-12-05T00:00:00Z","timestamp":1733356800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJICC"],"published-print":{"date-parts":[[2025,3,7]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>In autonomous driving, the inherent sparsity of point clouds often limits the performance of object detection, while existing multimodal architectures struggle to meet the real-time requirements for 3D object detection. Therefore, the main purpose of this paper is to significantly enhance the detection performance of objects, especially the recognition capability for small-sized objects and to address the issue of slow inference speed. This will improve the safety of autonomous driving systems and provide feasibility for devices with limited computing power to achieve autonomous driving.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>BRTPillar first adopts an element-based method to fuse image and point cloud features. Secondly, a local-global feature interaction method based on an efficient additive attention mechanism was designed to extract multi-scale contextual information. Finally, an enhanced multi-scale feature fusion method was proposed by introducing adaptive spatial and channel interaction attention mechanisms, thereby improving the learning of fine-grained features.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>Extensive experiments were conducted on the KITTI dataset. The results showed that compared with the benchmark model, the accuracy of cars, pedestrians and cyclists on the 3D object box improved by 3.05, 9.01 and 22.65%, respectively; the accuracy in the bird\u2019s-eye view has increased by 2.98, 10.77 and 21.14%, respectively. Meanwhile, the running speed of BRTPillar can reach 40.27\u00a0Hz, meeting the real-time detection needs of autonomous driving.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>This paper proposes a boosting multimodal real-time 3D object detection method called BRTPillar, which achieves accurate location in many scenarios, especially for complex scenes with many small objects, while also achieving real-time inference speed.<\/jats:p><\/jats:sec>","DOI":"10.1108\/ijicc-07-2024-0328","type":"journal-article","created":{"date-parts":[[2024,12,4]],"date-time":"2024-12-04T21:55:41Z","timestamp":1733349341000},"page":"217-235","source":"Crossref","is-referenced-by-count":1,"title":["BRTPillar: boosting real-time 3D\u00a0object detection based point cloud\u00a0and\u00a0RGB image fusion in\u00a0autonomous driving"],"prefix":"10.1108","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1251-0943","authenticated-orcid":false,"given":"Zhitian","family":"Zhang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongdong","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yazhou","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dan","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ke","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanqi","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","published-online":{"date-parts":[[2024,12,5]]},"reference":[{"key":"key2025030511415893700_ref048","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2024.108669","article-title":"MSTSENet: multiscale spectral\u2013spatial transformer with squeeze and excitation network for hyperspectral image classification","volume":"134","year":"2024","journal-title":"Engineering Applications of Artificial Intelligence"},{"issue":"1","key":"key2025030511415893700_ref001","doi-asserted-by":"publisher","first-page":"20","DOI":"10.3390\/wevj15010020","article-title":"Emerging trends in autonomous vehicle perception: multimodal fusion for 3D object detection","volume":"15","year":"2024","journal-title":"World Electric Vehicle Journal"},{"issue":"2","key":"key2025030511415893700_ref049","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1109\/TLA.2024.10412035","article-title":"Dyfusion: cross-attention 3d object detection with dynamic fusion","volume":"22","year":"2024","journal-title":"IEEE Latin America Transactions"},{"key":"key2025030511415893700_ref002","doi-asserted-by":"publisher","first-page":"1907","DOI":"10.1109\/CVPR.2017.691","article-title":"Multi-view 3d object detection network for autonomous driving","year":"2017"},{"key":"key2025030511415893700_ref003","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.120519","article-title":"Consistency-and dependence-guided knowledge distillation for object detection in remote sensing images","volume":"229","year":"2023","journal-title":"Expert Systems with Applications"},{"key":"key2025030511415893700_ref004","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2024.3424303","article-title":"FIBNet: privacy-enhancing approach for face biometrics based on the information bottleneck principle","volume":"2024","year":"2024","journal-title":"IEEE Transactions on Information Forensics and Security"},{"issue":"4","key":"key2025030511415893700_ref005","doi-asserted-by":"publisher","first-page":"265","DOI":"10.1109\/MNET.010.2300029","article-title":"Multimodal cooperative 3D object detection over connected vehicles for autonomous driving","volume":"37","year":"2023","journal-title":"IEEE Network"},{"issue":"2","key":"key2025030511415893700_ref006","doi-asserted-by":"publisher","first-page":"722","DOI":"10.1109\/TITS.2020.3023541","article-title":"Deep learning for image and point cloud fusion in autonomous driving: a review","volume":"23","year":"2021","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"issue":"11","key":"key2025030511415893700_ref007","doi-asserted-by":"publisher","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets robotics: the kitti dataset","volume":"32","year":"2013","journal-title":"The International Journal of Robotics Research"},{"issue":"4","key":"key2025030511415893700_ref008","doi-asserted-by":"publisher","first-page":"1348","DOI":"10.3390\/app14041348","article-title":"Multi-layer fusion 3D object detection via lidar point cloud and camera image","volume":"14","year":"2024","journal-title":"Applied Sciences"},{"key":"key2025030511415893700_ref009","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1016\/j.patrec.2021.08.028","article-title":"Deep multi-scale and multi-modal fusion for 3D object detection","volume":"151","year":"2021","journal-title":"Pattern Recognition Letters"},{"issue":"4","key":"key2025030511415893700_ref010","doi-asserted-by":"publisher","DOI":"10.1117\/1.JEI.32.4.043013","article-title":"DBCR-YOLO: improved YOLOv5 based on double-sampling and broad-feature coordinate-attention residual module for water surface object detection","volume":"32","year":"2023","journal-title":"Journal of Electronic Imaging"},{"issue":"7","key":"key2025030511415893700_ref050","doi-asserted-by":"publisher","first-page":"2969","DOI":"10.1108\/EC-08-2020-0428","article-title":"M2R-Net: deep network for arbitrary oriented vehicle detection in MiniSAR images","volume":"38","year":"2021","journal-title":"Engineering Computations"},{"key":"key2025030511415893700_ref011","doi-asserted-by":"publisher","first-page":"1093","DOI":"10.1016\/j.ins.2022.06.091","article-title":"Deconv-transformer (DecT): a histopathological image classification model for breast cancer based on color deconvolution and transformer architecture","volume":"608","year":"2022","journal-title":"Information Sciences"},{"issue":"1","key":"key2025030511415893700_ref051","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1108\/IJICC-03-2023-0053","article-title":"BFFNet: a bidirectional feature fusion network for semantic segmentation of remote sensing objects","volume":"17","year":"2024","journal-title":"International Journal of Intelligent Computing and Cybernetics"},{"key":"key2025030511415893700_ref012","doi-asserted-by":"publisher","DOI":"10.3389\/fpls.2024.1368697","article-title":"LFMNet: a lightweight model for identifying leaf diseases of maize with high similarity","volume":"15","year":"2024","journal-title":"Frontiers in Plant Science"},{"issue":"5","key":"key2025030511415893700_ref013","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1108\/SR-01-2022-0022","article-title":"Overview of LiDAR point cloud target detection methods based on deep learning","volume":"42","year":"2022","journal-title":"Sensor Review"},{"issue":"2","key":"key2025030511415893700_ref014","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-022-1250-2","article-title":"Teachers cooperation: team-knowledge distillation for multiple cross-domain few-shot learning","volume":"17","year":"2023","journal-title":"Frontiers of Computer Science"},{"key":"key2025030511415893700_ref015","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/IROS.2018.8594049","article-title":"Joint 3d proposal generation and object detection from view aggregation","year":"2018"},{"issue":"3","key":"key2025030511415893700_ref053","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1108\/IJICC-02-2024-0077","article-title":"EYE-YOLO: a multi-spatial pyramid pooling and Focal-EIOU loss inspired tiny YOLOv7 for fundus eye disease detection","volume":"17","year":"2024","journal-title":"International Journal of Intelligent Computing and Cybernetics"},{"key":"key2025030511415893700_ref016","doi-asserted-by":"publisher","first-page":"12697","DOI":"10.1109\/CVPR.2019.01298","article-title":"Pointpillars: fast encoders for object detection from point clouds","year":"2019"},{"key":"key2025030511415893700_ref017","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2023.104895","article-title":"SGF3D: similarity-guided fusion network for 3D object detection","volume":"142","year":"2024","journal-title":"Image and Vision Computing"},{"key":"key2025030511415893700_ref018","doi-asserted-by":"publisher","first-page":"641","DOI":"10.1007\/978-3-030-01270-0_39","article-title":"Deep continuous fusion for multi-sensor 3d object detection","year":"2018"},{"key":"key2025030511415893700_ref019","doi-asserted-by":"publisher","first-page":"11677","DOI":"10.1609\/aaai.v34i07.6837","article-title":"Tanet: robust 3d object detection from point clouds with triple attention","year":"2020"},{"key":"key2025030511415893700_ref020","doi-asserted-by":"publisher","first-page":"647","DOI":"10.1109\/ISMAR55827.2022.00082","article-title":"MFF-PR: point cloud and image multi-modal feature fusion for place recognition","year":"2022"},{"issue":"4","key":"key2025030511415893700_ref021","doi-asserted-by":"publisher","first-page":"1242","DOI":"10.1109\/TETCI.2023.3259441","article-title":"3D object detection and tracking based on lidar-camera fusion and IMM-UKF algorithm towards highway driving","volume":"7","year":"2023","journal-title":"IEEE Transactions on Emerging Topics in Computational Intelligence"},{"key":"key2025030511415893700_ref022","doi-asserted-by":"publisher","first-page":"10386","DOI":"10.1109\/IROS45743.2020.9341791","article-title":"CLOCs: camera-LiDAR object candidates fusion for 3D object detection","year":"2020"},{"key":"key2025030511415893700_ref023","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1109\/WACV49574.2022.00187","article-title":"Fast-CLOCs: fast camera-LiDAR object candidates fusion for 3D object detection","year":"2022"},{"key":"key2025030511415893700_ref025","doi-asserted-by":"publisher","first-page":"918","DOI":"10.1109\/CVPR.2018.00102","article-title":"Frustum pointnets for 3d object detection from rgb-d data","year":"2018"},{"key":"key2025030511415893700_ref024","doi-asserted-by":"publisher","first-page":"652","DOI":"10.1109\/CVPR.2017.16","article-title":"Pointnet: deep learning on point sets for 3d classification and segmentation","year":"2017","journal-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition"},{"key":"key2025030511415893700_ref026","doi-asserted-by":"publisher","first-page":"6153657","DOI":"10.1155\/2020\/6153657","article-title":"Fine-grained lung cancer classification from PET and CT images based on multidimensional attention mechanism","volume":"1","year":"2020","journal-title":"Complexity"},{"key":"key2025030511415893700_ref027","doi-asserted-by":"publisher","first-page":"17425","DOI":"10.1109\/CVPR.2019.00086","article-title":"Swiftformer: efficient additive attention for transformer-based real-time mobile vision applications","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","year":"2023"},{"key":"key2025030511415893700_ref028","doi-asserted-by":"publisher","first-page":"1711","DOI":"10.1109\/CVPR42600.2020.00178","article-title":"Point-gnn: graph neural network for 3d object detection in a point cloud","year":"2020"},{"key":"key2025030511415893700_ref029","doi-asserted-by":"publisher","first-page":"770","DOI":"10.1109\/CVPR.2019.00086","article-title":"Pointrcnn: 3d object proposal generation and detection from point cloud","year":"2019"},{"key":"key2025030511415893700_ref030","doi-asserted-by":"publisher","first-page":"7276","DOI":"10.1109\/ICRA.2019.8794195","article-title":"Mvx-net: multimodal voxelnet for 3d object detection","year":"2019"},{"issue":"1","key":"key2025030511415893700_ref056","doi-asserted-by":"publisher","DOI":"10.1117\/1.JRS.18.017501","article-title":"EPAWFusion: multimodal fusion for 3D object detection based on enhanced points and adaptive weights","volume":"18","year":"2024","journal-title":"Journal of Applied Remote Sensing"},{"key":"key2025030511415893700_ref031","doi-asserted-by":"publisher","first-page":"4604","DOI":"10.1109\/CVPR42600.2020.00466","article-title":"Pointpainting: sequential fusion for 3d object detection","year":"2020"},{"key":"key2025030511415893700_ref032","doi-asserted-by":"publisher","first-page":"1742","DOI":"10.1109\/IROS40897.2019.8968513","article-title":"Frustum convnet: sliding frustums to aggregate local point-wise features for amodal 3d object detection","year":"2019"},{"issue":"2","key":"key2025030511415893700_ref033","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1108\/IJICC-05-2022-0161","article-title":"Research on pedestrian detection based on multi-level fine-grained YOLOX algorithm","volume":"16","year":"2023","journal-title":"International Journal of Intelligent Computing and Cybernetics"},{"issue":"11","key":"key2025030511415893700_ref034","doi-asserted-by":"publisher","first-page":"11981","DOI":"10.1109\/TITS.2023.3285651","article-title":"Camo-mot: combined appearance-motion optimization for 3d multi-object tracking with camera-lidar fusion","volume":"24","year":"2023","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"issue":"7","key":"key2025030511415893700_ref035","doi-asserted-by":"publisher","first-page":"2542","DOI":"10.3390\/s22072542","article-title":"Mmwave radar and vision fusion for object detection in autonomous driving: a review","volume":"22","year":"2022","journal-title":"Sensors"},{"key":"key2025030511415893700_ref036","doi-asserted-by":"publisher","first-page":"12460","DOI":"10.1609\/aaai.v34i07.6933","article-title":"PI-RCNN: an efficient multi-sensor 3D object detector with point-based attentive cont-conv fusion module","year":"2020"},{"key":"key2025030511415893700_ref059","doi-asserted-by":"publisher","first-page":"244","DOI":"10.1109\/CVPR.2018.00033","article-title":"Pointfusion: deep sensor fusion for 3d bounding box estimation","year":"2018"},{"key":"key2025030511415893700_ref037","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/ISPCE-ASIA57917.2022.9970914","article-title":"3D object detection for point cloud in virtual driving environment","year":"2022"},{"issue":"10","key":"key2025030511415893700_ref038","doi-asserted-by":"publisher","first-page":"3337","DOI":"10.3390\/s18103337","article-title":"Second: sparsely embedded convolutional detection","volume":"18","year":"2018","journal-title":"Sensors"},{"issue":"12","key":"key2025030511415893700_ref039","doi-asserted-by":"publisher","DOI":"10.1016\/j.heliyon.2024.e32678","article-title":"Enhanced object detection in pediatric bronchoscopy images using YOLO-based algorithms with CBAM attention mechanism","volume":"10","year":"2024","journal-title":"Heliyon"},{"key":"key2025030511415893700_ref040","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2023.107079","article-title":"MCA: multidimensional collaborative attention in deep convolutional neural networks for image recognition","volume":"126","year":"2023","journal-title":"Engineering Applications of Artificial Intelligence"},{"issue":"17","key":"key2025030511415893700_ref041","doi-asserted-by":"publisher","first-page":"3427","DOI":"10.3390\/rs13173427","article-title":"Integrating normal vector features into an atrous convolution residual network for LiDAR point cloud classification","volume":"13","year":"2021","journal-title":"Remote Sensing"},{"issue":"10","key":"key2025030511415893700_ref042","doi-asserted-by":"publisher","first-page":"2692","DOI":"10.3390\/rs15102692","article-title":"FusionPillars: a 3D object detection network with cross-fusion and self-fusion","volume":"15","year":"2023","journal-title":"Remote Sensing"},{"key":"key2025030511415893700_ref043","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.122716","article-title":"MMAF-Net: multi-view multi-stage adaptive fusion for multi-sensor 3D object detection","volume":"242","year":"2024","journal-title":"Expert Systems with Applications"},{"issue":"7","key":"key2025030511415893700_ref044","doi-asserted-by":"publisher","first-page":"9786","DOI":"10.1109\/TITS.2021.3114199","article-title":"Intelligent content caching strategy in autonomous driving toward 6G","volume":"23","year":"2021","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"issue":"7","key":"key2025030511415893700_ref045","doi-asserted-by":"publisher","first-page":"9466","DOI":"10.1109\/TITS.2021.3122438","article-title":"SPIDER: a social computing inspired predictive routing scheme for softwarized vehicular networks","volume":"23","year":"2021","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"issue":"5","key":"key2025030511415893700_ref046","doi-asserted-by":"publisher","first-page":"210","DOI":"10.3390\/wevj15050210","article-title":"A multi-sensor 3D detection method for small objects","volume":"15","year":"2024","journal-title":"World Electric Vehicle Journal"},{"key":"key2025030511415893700_ref047","doi-asserted-by":"publisher","first-page":"4490","DOI":"10.1109\/CVPR.2018.00472","article-title":"Voxelnet: end-to-end learning for point cloud based 3d object detection","year":"2018"}],"container-title":["International Journal of Intelligent Computing and Cybernetics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJICC-07-2024-0328\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJICC-07-2024-0328\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:54:40Z","timestamp":1753397680000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/ijicc\/article\/18\/1\/217-235\/1245320"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,5]]},"references-count":54,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12,5]]},"published-print":{"date-parts":[[2025,3,7]]}},"alternative-id":["10.1108\/IJICC-07-2024-0328"],"URL":"https:\/\/doi.org\/10.1108\/ijicc-07-2024-0328","relation":{},"ISSN":["1756-378X","1756-3798"],"issn-type":[{"type":"print","value":"1756-378X"},{"type":"electronic","value":"1756-3798"}],"subject":[],"published":{"date-parts":[[2024,12,5]]}}}