{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,26]],"date-time":"2026-06-26T01:29:11Z","timestamp":1782437351080,"version":"3.54.5"},"reference-count":50,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2024,3,29]],"date-time":"2024-03-29T00:00:00Z","timestamp":1711670400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Science Basic Research Project of Shaanxi Provincial Department of Science and Technology","award":["2022JQ-677"],"award-info":[{"award-number":["2022JQ-677"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Due to the limited semantic information extraction with small objects and difficulty in distinguishing similar targets, it brings great challenges to target detection in remote sensing scenarios, which results in poor detection performance. This paper proposes an improved YOLOv5 remote sensing image target detection algorithm, SEB-YOLO (SPD-Conv + ECSPP + Bi-FPN + YOLOv5). Firstly, the space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer module (SPD-Conv) was used to reconstruct the backbone network, which retained the global features and reduced the feature loss. Meanwhile, the pooling module with the attention mechanism of the final layer of the backbone network was designed to help the network better identify and locate the target. Furthermore, a bidirectional feature pyramid network (Bi-FPN) with bilinear interpolation upsampling was added to improve bidirectional cross-scale connection and weighted feature fusion. Finally, the decoupled head is introduced to enhance the model convergence and solve the contradiction between the classification task and the regression task. 
Experimental results on NWPU VHR-10 and RSOD datasets show that the mAP of the proposed algorithm reaches 93.5% and 93.9%, respectively, which is 4.0% and 5.3% higher than that of the original YOLOv5l algorithm. The proposed algorithm achieves better detection results for complex remote sensing images.<\/jats:p>","DOI":"10.3390\/s24072193","type":"journal-article","created":{"date-parts":[[2024,3,29]],"date-time":"2024-03-29T06:33:16Z","timestamp":1711693996000},"page":"2193","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":32,"title":["SEB-YOLO: An Improved YOLOv5 Model for Remote Sensing Small Target Detection"],"prefix":"10.3390","volume":"24","author":[{"given":"Yan","family":"Hui","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, Xi\u2019an Technological University, Xi\u2019an 710021, China"},{"name":"State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi\u2019an 710021, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-7748-8275","authenticated-orcid":false,"given":"Shijie","family":"You","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Xi\u2019an Technological University, Xi\u2019an 710021, China"},{"name":"State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi\u2019an 710021, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiuhua","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Xi\u2019an Technological University, Xi\u2019an 710021, China"},{"name":"State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi\u2019an 710021, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-6849-7439","authenticated-orcid":false,"given":"Panpan","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Xi\u2019an Technological University, Xi\u2019an 710021, China"},{"name":"State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi\u2019an 710021, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jing","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Xi\u2019an Technological University, Xi\u2019an 710021, China"},{"name":"State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi\u2019an 710021, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2024,3,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1598","DOI":"10.1016\/j.patrec.2011.01.004","article-title":"Face recognition using Histograms of Oriented Gradients","volume":"32","author":"Bueno","year":"2011","journal-title":"Pattern Recognit. Lett."},{"key":"ref_2","first-page":"545","article-title":"Graph-based visual saliency","volume":"19","author":"Harel","year":"2006","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"5283","DOI":"10.1109\/TGRS.2015.2420659","article-title":"Remote sensing image matching based on adaptive binning SIFT descriptor","volume":"53","author":"Sedaghat","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yan, B., Wang, D., Lu, H., and Yang, X. (2020, January 14\u201319). Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00107"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Ji, L., and Yu-Xiao, N. (2023, January 12\u201315). Method of Insulator Detection Based on Improved Faster R-CNN. Proceedings of the 2023 6th International Conference on Electronics Technology (ICET), Chengdu, China.","DOI":"10.1109\/ICET58434.2023.10211953"},{"key":"ref_6","unstructured":"Zhaowei, C., and Vasconcelos, N. (2018, January 18\u201323). Cascade R-CNN: Delving Into High Quality Object Detection. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA."},{"key":"ref_7","unstructured":"Tsung-Yi, L., Goyal, P., Girshick, R., Kaiming, H., and Dollar, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"3442","DOI":"10.1109\/TIP.2019.2960869","article-title":"End-to-End Optimized ROI Image Compression","volume":"29","author":"Cai","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_9","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Proceedings of the 29th Annual Conference on Neural Information Processing Systems (NIPS), Montreal, BC, Canada."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1109\/TPAMI.2018.2844175","article-title":"Mask R-CNN","volume":"42","author":"He","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"15650","DOI":"10.1109\/TPAMI.2023.3292030","article-title":"Sparse R-CNN: An End-to-End Framework for Object Detection","volume":"45","author":"Sun","year":"2023","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands. Proceedings, Part I 14.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_13","unstructured":"Xu, X., Jiang, Y., Chen, W., Huang, Y., Zhang, Y., and Sun, X. (2022). DAMO-YOLO: A Report on Real-Time Object Detection Design. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Adarsh, P., Rathi, P., and Kumar, M. (2020, January 6\u20137). YOLO v3-Tiny: Object Detection and Recognition using one stage improved model. Proceedings of the 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India.","DOI":"10.1109\/ICACCS48705.2020.9074315"},{"key":"ref_15","unstructured":"Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., and Nie, W. (2022). YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Wang, C.-Y., Bochkovskiy, A., and Mark Liao, H.-Y. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv.","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"ref_17","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding YOLO Series in 2021. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, Q., Wang, Y., Yang, T., Zhang, X., Cheng, J., and Sun, J. (2021, January 19\u201325). 
You Only Look One-level Feature. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual.","DOI":"10.1109\/CVPR46437.2021.01284"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Gong, H., Mu, T., Li, Q., Dai, H., Li, C., He, Z., Wang, W., Han, F., Tuniyazi, A., and Li, H. (2022). Swin-Transformer-Enabled YOLOv5 with Attention Mechanism for Small Object Detection on Satellite Images. Remote Sens., 14.","DOI":"10.3390\/rs14122861"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Sumit, S.S., Awang Rambli, D.R., Mirjalili, S., Ejaz, M.M., and Miah, M.S.U. (2022). Restinet: On improving the performance of tiny-yolo-based cnn architecture for applications in human detection. Appl. Sci., 12.","DOI":"10.3390\/app12189331"},{"key":"ref_21","unstructured":"Glenn, J. (2022, February 22). YOLOv5-6.1\u2014TensorRT. TensorFlow Edge TPU and OpenVINO Export and Inference. Available online: https:\/\/github.com\/ultralytics\/YOLOv5\/releases\/tag\/v6.1."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Lu, X., Cao, G., Yang, Y., Jiao, L., and Liu, F. (2021, January 11\u201317). ViT-YOLO: Transformer-Based YOLO for Object Detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00314"},{"key":"ref_23","first-page":"264","article-title":"Improved Surface Defect Detection of YOLOV5 Aluminum Profiles based on CBAM and BiFPN","volume":"8","author":"Hua","year":"2022","journal-title":"Int. Core J. Eng."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. 
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_26","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv."},{"key":"ref_27","unstructured":"Bochkovskiy, A., Wang, C., and Liao, H. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"8135","DOI":"10.1007\/s12652-021-03584-0","article-title":"Robust detection method for improving small traffic sign recognition based on spatial pyramid pooling","volume":"14","author":"Dewi","year":"2023","journal-title":"J. Ambient. Intell. Humaniz. Comput."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path Aggregation Network for Instance Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_32","first-page":"227","article-title":"Improved FCOS Remote Sensing Image Detection Method Based on Distance Constrain","volume":"59","author":"Su","year":"2023","journal-title":"Comput. Eng. Appl."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, M., Wang, X., Zhou, A., Fu, X., Ma, Y., and Piao, C. (2020). UAV-YOLO: Small Object Detection on Unmanned Aerial Vehicle Perspective. Sensors, 20.","DOI":"10.3390\/s20082238"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Luo, X., Wu, Y., and Wang, F. (2022). Target detection method of UAV aerial imagery based on improved YOLOv5. Remote Sens., 14.","DOI":"10.3390\/rs14195063"},{"key":"ref_35","first-page":"2215","article-title":"Remote sensing image target detection combining multi-scale and attention mechanism","volume":"56","author":"Zhang","year":"2022","journal-title":"J. Zhejiang Univ. (Eng. Ed.)"},{"key":"ref_36","first-page":"70","article-title":"Remote Sensing Image Object Detection Based on Ghostnet and YOLOv5 Fusion","volume":"30","author":"Xie","year":"2023","journal-title":"J. Dongguan Univ. Technol."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1068","DOI":"10.1109\/JSTARS.2020.2975606","article-title":"An optimized deep neural network detecting small and narrow rectangular objects in Google Earth Images","volume":"13","author":"Jiang","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhou, J., Su, T., Li, K., and Dai, J. (2024). Small Target-YOLOv5: Enhancing the Algorithm for Small Object Detection in Drone Aerial Imagery Based on YOLOv5. Sensors, 24.","DOI":"10.3390\/s24010134"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Liu, Y., He, G., Wang, Z., Li, W., and Huang, H. (2022). 
NRT-YOLO: Improved YOLOv5 Based on Nested Residual Transformer for Tiny Remote Sensing Object Detection. Sensors, 22.","DOI":"10.3390\/s22134953"},{"key":"ref_40","first-page":"86","article-title":"A remote sensing image object detection algorithm with improved YOLOv5s","volume":"18","author":"Zhao","year":"2023","journal-title":"CAAI Trans. Intell. Syst."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., and Hu, Q. (2020, January 16\u201320). ECA-Net: Efficient channel attention for deep convolutional neural networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Hou, Q., Zhou, D., and Feng, J. (2021, January 20\u201325). Coordinate attention for efficient mobile network design. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01350"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Xu, X., Feng, Z., Cao, C., Li, M., Wu, J., Wu, Z., Shang, Y., and Ye, S. (2021). An improved swin transformer-based model for remote sensing object detection and instance segmentation. Remote Sens., 13.","DOI":"10.3390\/rs13234779"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Ye, Y., Ren, X., Zhu, B., Tang, T., Tan, X., Gui, Y., and Yao, Q. (2022). An adaptive attention fusion mechanism convolutional network for object detection in remote sensing images. 
Remote Sens., 14.","DOI":"10.3390\/rs14030516"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_47","first-page":"209","article-title":"Remote Sensing Image Aircraft Target Detection Combined with Multiple Channel Attention","volume":"58","author":"Li","year":"2022","journal-title":"Comput. Eng. Appl."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Sunkara, R., and Luo, T. (2022). No more strided convolutions or pooling: A new CNN building block for low-resolution images and small objects. arXiv.","DOI":"10.1007\/978-3-031-26409-2_27"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"7405","DOI":"10.1109\/TGRS.2016.2601622","article-title":"Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images","volume":"54","author":"Cheng","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"2486","DOI":"10.1109\/TGRS.2016.2645610","article-title":"Accurate object localization in remote sensing images based on convolutional neural networks","volume":"55","author":"Long","year":"2017","journal-title":"IEEE Trans. Geosci. 
Remote Sens."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/7\/2193\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:20:42Z","timestamp":1760106042000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/7\/2193"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,29]]},"references-count":50,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2024,4]]}},"alternative-id":["s24072193"],"URL":"https:\/\/doi.org\/10.3390\/s24072193","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,29]]}}}