{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T16:44:17Z","timestamp":1778604257517,"version":"3.51.4"},"reference-count":42,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2023,4,7]],"date-time":"2023-04-07T00:00:00Z","timestamp":1680825600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51975428"],"award-info":[{"award-number":["51975428"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In self-driving cars, object detection algorithms are becoming increasingly important, and the accurate and fast recognition of objects is critical to realize autonomous driving. The existing detection algorithms are not ideal for the detection of small objects. This paper proposes a YOLOX-based network model for multi-scale object detection tasks in complex scenes. This method adds a CBAM-G module to the backbone of the original network, which performs grouping operations on CBAM. It changes the height and width of the convolution kernel of the spatial attention module to 7 \u00d7 1 to improve the ability of the model to extract prominent features. We proposed an object-contextual feature fusion module, which can provide more semantic information and improve the perception of multi-scale objects. Finally, we considered the problem of fewer samples and less loss of small objects and introduced a scaling factor that could increase the loss of small objects to improve the detection ability of small objects. 
We validated the effectiveness of the proposed method on the KITTI dataset, and the mAP value was 2.46% higher than the original model. Experimental comparisons showed that our model achieved superior detection performance compared to other models.<\/jats:p>","DOI":"10.3390\/s23083794","type":"journal-article","created":{"date-parts":[[2023,4,7]],"date-time":"2023-04-07T04:13:12Z","timestamp":1680840792000},"page":"3794","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["CF-YOLOX: An Autonomous Driving Detection Model for Multi-Scale Object Detection"],"prefix":"10.3390","volume":"23","author":[{"given":"Shuiye","family":"Wu","sequence":"first","affiliation":[{"name":"School of Automobile and Traffic Engineering, Wuhan University of Science and Technology, Wuhan 430065, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9118-5095","authenticated-orcid":false,"given":"Yunbing","family":"Yan","sequence":"additional","affiliation":[{"name":"School of Automobile and Traffic Engineering, Wuhan University of Science and Technology, Wuhan 430065, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weiqiang","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Automobile and Traffic Engineering, Wuhan University of Science and Technology, Wuhan 430065, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,4,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1016\/j.neucom.2019.06.084","article-title":"Unsupervised pre-trained filter learning approach for efficient convolution neural network","volume":"365","author":"Tu","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"9477","DOI":"10.1109\/TPAMI.2021.3127674","article-title":"Pharmacological, 
Non-Pharmacological Policies and Mutation: An Artificial Intelligence Based Multi-Dimensional Policy Making Algorithm for Controlling the Casualties of the Pandemic Diseases","volume":"44","author":"Tutsoy","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"67176","DOI":"10.1109\/ACCESS.2018.2878868","article-title":"A Benchmark Dataset and Learning High-Level Semantic Embeddings of Multimedia for Cross-Media Retrieval","volume":"6","author":"Tu","year":"2018","journal-title":"IEEE Access"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Lee, C.-H., Lin, C.-R., and Chen, M.-S. (2001, January 5\u201310). Sliding-window filtering: An efficient algorithm for incremental mining. Proceedings of the Tenth International Conference on Information and Knowledge Management, Atlanta, GA, USA.","DOI":"10.1145\/502585.502630"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_6","unstructured":"Dalal, N., and Triggs, B. Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA, 20\u201325 June 2005."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/PL00013831","article-title":"On a kernel-based method for pattern recognition, regression, approximation, and operator inversion","volume":"22","author":"Smola","year":"1998","journal-title":"Algorithmica"},{"key":"ref_8","first-page":"1612","article-title":"A short introduction to boosting","volume":"14","author":"Freund","year":"1999","journal-title":"J. Jpn. Soc. Artif. 
Intell."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"311","DOI":"10.3233\/AIC-170739","article-title":"CSFL: A novel unsupervised convolution neural network approach for visual pattern classification","volume":"30","author":"Tu","year":"2017","journal-title":"AI Commun."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"ur Rehman, S., Tu, S., ur Rehman, O., Huang, Y., Magurawalage, C.M.S., and Chang, C. (2018). Optimization of CNN through Novel Training Strategy for Visual Classification Problems. Entropy, 20.","DOI":"10.3390\/e20040290"},{"key":"ref_12","unstructured":"Bochkovskiy, A., Wang, C.-Y., and Liao, H.-Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"ur Rehman, S., Tu, S., Huang, Y., and Yang, Z. (2016, January 28\u201329). Face recognition: A novel un-supervised convolutional neural network method. Proceedings of the 2016 IEEE International Conference of Online Analysis and Computing Science (ICOACS), Chongqing, China.","DOI":"10.1109\/ICOACS.2016.7563066"},{"key":"ref_14","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2015: 18th International Conference, Munich, Germany. Proceedings, Part III 18."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Wojke, N., Bewley, A., and Paulus, D. (2017, January 17\u201320). Simple online and realtime tracking with a deep association metric. 
Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China.","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_18","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster r-cnn: Towards real-time object detection with region proposal networks. Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A.C. Ssd: Single shot multibox detector. Proceedings of the Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11\u201314 October 2016, Proceedings, Part I 14.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"9541","DOI":"10.1109\/LRA.2022.3192200","article-title":"An end-to-end cascaded image deraining and object detection neural network","volume":"7","author":"Wang","year":"2022","journal-title":"IEEE Robot. Autom. 
Lett."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 18\u201324). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wong, A., Famuori, M., Shafiee, M.J., Li, F., Chwyl, B., and Chung, J. YOLO nano: A highly compact you only look once convolutional neural network for object detection. Proceedings of the 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing-NeurIPS Edition (EMC2-NIPS), Vancouver, BC, Canada, 13 December 2019.","DOI":"10.1109\/EMC2-NIPS53020.2019.00013"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"13421","DOI":"10.1007\/s11227-021-03813-5","article-title":"The improvement in obstacle detection in autonomous vehicles using YOLO non-maximum suppression fuzzy algorithm","volume":"77","author":"Zaghari","year":"2021","journal-title":"J. Supercomput."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"698","DOI":"10.1109\/LRA.2021.3130976","article-title":"CertainNet: Sampling-free uncertainty estimation for object detection","volume":"7","author":"Gasperini","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Choi, J., Chun, D., Kim, H., and Lee, H.-J. (November, January 27). Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving. 
Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00059"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1109\/TIV.2022.3165353","article-title":"Cross-Domain Object Detection for Autonomous Driving: A Stepwise Domain Adaptative YOLO Approach","volume":"7","author":"Li","year":"2022","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Li, G., Zhang, Y., Ouyang, D., and Qu, X. (2022, January 23\u201327). An Improved Lightweight Network Based on YOLOv5s for Object Detection in Autonomous Driving. Proceedings of the Computer Vision\u2013ECCV Workshops, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-25056-9_37"},{"key":"ref_29","unstructured":"Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., and Nie, W. (2022). YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv."},{"key":"ref_30","unstructured":"Wang, C.-Y., Bochkovskiy, A., and Liao, H.-Y.M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv."},{"key":"ref_31","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). Yolox: Exceeding yolo series in 2021. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets robotics: The KITTI dataset","volume":"32","author":"Geiger","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yu, F., Chen, H., Wang, X., Xian, W., Chen, Y., Liu, F., Madhavan, V., and Darrell, T. (2020, January 13\u201319). BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning. 
Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00271"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, C.-Y., Liao, H.-Y.M., Wu, Y.-H., Chen, P.-Y., Hsieh, J.-W., and Yeh, I.-H. (2020, January 14\u201319). CSPNet: A new backbone that can enhance learning capability of CNN. Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"ref_35","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_37","unstructured":"Wang, K., Liew, J.H., Zou, Y., Zhou, D., and Feng, J. (November, January 27). Panet: Few-shot image semantic segmentation with prototype alignment. Proceedings of the IEEE\/CVF international conference on computer vision, Seoul, Republic of Korea."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.-Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Yuan, Y., Chen, X., and Wang, J. Segmentation Transformer: Object-Contextual Representations for Semantic Segmentation. 
Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, 23\u201328 August 2020, In Proceedings, Part VI 16.","DOI":"10.1007\/978-3-030-58539-6_11"},{"key":"ref_40","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. Microsoft coco: Common objects in context. Proceedings of the Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland, 6\u201312 September 2014, In Proceedings, Part V 13.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_42","unstructured":"Jocher, G. (2023, March 16). YOLOv5. Available online: https:\/\/github.com\/ultralytics\/yolov5."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/8\/3794\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:11:52Z","timestamp":1760123512000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/8\/3794"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,7]]},"references-count":42,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2023,4]]}},"alternative-id":["s23083794"],"URL":"https:\/\/doi.org\/10.3390\/s23083794","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,4,7]]}}}