{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,27]],"date-time":"2026-05-27T20:52:58Z","timestamp":1779915178028,"version":"3.53.1"},"reference-count":39,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T00:00:00Z","timestamp":1755734400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Small object detection in UAV aerial imagery presents significant challenges due to scale variations, sparse feature representation, and complex backgrounds. To address these issues, this paper focuses on practical engineering improvements to the existing YOLOv8s framework, rather than proposing a fundamentally new algorithm. We introduce MultiScaleConv-YOLO (MSConv-YOLO), an enhanced model that integrates well-established techniques to improve detection performance for small targets. Specifically, the proposed approach introduces three key improvements: (1) a MultiScaleConv (MSConv) module that combines depthwise separable and dilated convolutions with varying dilation rates, enhancing multi-scale feature extraction while maintaining efficiency; (2) the replacement of CIoU with WIoU v3 as the bounding box regression loss, which incorporates a dynamic non-monotonic focusing mechanism to improve localization for small targets; and (3) the addition of a high-resolution detection head in the neck\u2013head structure, leveraging FPN and PAN to preserve fine-grained features and ensure full-scale coverage. Experimental results on the VisDrone2019 dataset show that MSConv-YOLO outperforms the baseline YOLOv8s by achieving a 6.9% improvement in mAP@0.5 and a 6.3% gain in recall. Ablation studies further validate the complementary impact of each enhancement. 
This paper presents practical and effective engineering enhancements to small object detection in UAV scenarios, offering an improved solution without introducing entirely new theoretical constructs. Future work will focus on lightweight deployment and adaptation to more complex environments.<\/jats:p>","DOI":"10.3390\/jimaging11080285","type":"journal-article","created":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T15:19:02Z","timestamp":1755789542000},"page":"285","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["MSConv-YOLO: An Improved Small Target Detection Algorithm Based on YOLOv8"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-5100-8183","authenticated-orcid":false,"given":"Linli","family":"Yang","sequence":"first","affiliation":[{"name":"College of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China"},{"name":"Faculty of Engineering and Applied Sciences, Cranfield University, Cranfield, Bedford MK43 0AL, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1519-9596","authenticated-orcid":false,"given":"Barmak","family":"Honarvar Shakibaei Asli","sequence":"additional","affiliation":[{"name":"Faculty of Engineering and Applied Sciences, Cranfield University, Cranfield, Bedford MK43 0AL, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,8,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Wang, D., Zhang, Y., Zhang, K., and Wang, L. (2020, January 13\u201319). Focalmix: Semi-supervised learning for 3d medical image detection. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00401"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Liu, M., Wang, X., Zhou, A., Fu, X., Ma, Y., and Piao, C. (2020). Uav-yolo: Small object detection on unmanned aerial vehicle perspective. Sensors, 20.","DOI":"10.3390\/s20082238"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s00607-020-00869-8","article-title":"An improved YOLO-based road traffic monitoring system","volume":"103","author":"Abbasi","year":"2021","journal-title":"Computing"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wang, G., Chen, Y., An, P., Hong, H., Hu, J., and Huang, T. (2023). UAV-YOLOv8: A small-object-detection model based on improved YOLOv8 for UAV aerial photography scenarios. Sensors, 23.","DOI":"10.3390\/s23167190"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bai, T. (November, January 30). Analysis on two-stage object detection based on convolutional neural networks. Proceedings of the 2020 International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE), Bangkok, Thailand.","DOI":"10.1109\/ICBASE51474.2020.00074"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"012033","DOI":"10.1088\/1742-6596\/1544\/1\/012033","article-title":"Overview of two-stage object detection algorithms","volume":"1544","author":"Du","year":"2020","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Chen, C., Liu, M.Y., Tuzel, O., and Xiao, J. (2016, January 20\u201324). R-CNN for small object detection. Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan.","DOI":"10.1007\/978-3-319-54193-8_14"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. 
Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_9","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster r-cnn: Towards real-time object detection with region proposal networks. Proceedings of the 29th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Xu, L., Yan, W., and Ji, J. (2023). The research of a novel WOG-YOLO algorithm for autonomous driving object detection. Sci. Rep., 13.","DOI":"10.1038\/s41598-023-30409-1"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Liu, Y., He, M., and Hui, B. (2025). ESO-DETR: An Improved Real-Time Detection Transformer Model for Enhanced Small Object Detection in UAV Imagery. Drones, 9.","DOI":"10.3390\/drones9020143"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"9243","DOI":"10.1007\/s11042-022-13644-y","article-title":"Object detection using YOLO: Challenges, architectural successors, datasets and applications","volume":"82","author":"Diwan","year":"2023","journal-title":"Multimed. Tools Appl."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"24344","DOI":"10.1109\/ACCESS.2020.2971026","article-title":"DF-SSD: An improved SSD object detection algorithm based on DenseNet and feature fusion","volume":"8","author":"Zhai","year":"2020","journal-title":"IEEE Access"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Kong, Y., Shang, X., and Jia, S. (2024). Drone-DETR: Efficient small object detection for remote sensing image using enhanced RT-DETR model. Sensors, 24.","DOI":"10.3390\/s24175496"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sohan, M., Sai Ram, T., and Rami Reddy, C.V. (2024, January 18\u201320). A review on yolov8 and its advancements. 
Proceedings of the International Conference on Data Intelligence and Cognitive Informatics, Tirunelveli, India.","DOI":"10.1007\/978-981-99-7962-2_39"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Swathi, Y., and Challa, M. (2024, January 12\u201313). YOLOv8: Advancements and innovations in object detection. Proceedings of the International Conference on Smart Computing and Communication, Pune, India.","DOI":"10.1007\/978-981-97-1323-3_1"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"80479","DOI":"10.1109\/ACCESS.2023.3300372","article-title":"YOLOv5s_2E: Improved YOLOv5s for aerial small target detection","volume":"11","author":"Shi","year":"2023","journal-title":"IEEE Access"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yang, R., Li, W., Shang, X., Zhu, D., and Man, X. (2023). KPE-YOLOv5: An improved small target detection algorithm based on YOLOv5. Electronics, 12.","DOI":"10.3390\/electronics12040817"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Li, H., Li, Y., Xiao, L., Zhang, Y., Cao, L., and Wu, D. (2025). RLRD-YOLO: An Improved YOLOv8 Algorithm for Small Object Detection from an Unmanned Aerial Vehicle (UAV) Perspective. Drones, 9.","DOI":"10.3390\/drones9040293"},{"key":"ref_20","unstructured":"Tong, Z., Chen, Y., Xu, Z., and Yu, R. (2023). Wise-IoU: Bounding box regression loss with dynamic focusing mechanism. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Qin, Q., Qiu, C., and Zhang, Z. (2024, January 1\u20133). Localizing Drones from Monocular Images using Modified YOLOv8. 
Proceedings of the 2024 7th International Conference on Advanced Algorithms and Control Engineering (ICAACE), Shanghai, China.","DOI":"10.1109\/ICAACE61206.2024.10548864"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"128837","DOI":"10.1109\/ACCESS.2019.2939201","article-title":"A survey of deep learning-based object detection","volume":"7","author":"Jiao","year":"2019","journal-title":"IEEE Access"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Kang, S., Hu, Z., Liu, L., Zhang, K., and Cao, Z. (2025). Object detection YOLO algorithms and their industrial applications: Overview and comparative analysis. Electronics, 14.","DOI":"10.3390\/electronics14061104"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"8515510","DOI":"10.1155\/2022\/8515510","article-title":"OAB-YOLOv5: One-Anchor-Based YOLOv5 for Rotated Object Detection in Remote Sensing Images","volume":"2022","author":"Liu","year":"2022","journal-title":"J. Sens."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"122669","DOI":"10.1016\/j.eswa.2023.122669","article-title":"DsP-YOLO: An anchor-free network with DsPAN for small object detection of multiscale defects","volume":"241","author":"Zhang","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Guan, Z., Liu, B., Xie, M., and Yang, Z. (2024, January 19\u201321). YOLOv8 detection head improvements for FPGA deployments. Proceedings of the 2024 9th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi\u2019an, China.","DOI":"10.1109\/ICSP62122.2024.10743202"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Li, H., Wu, A., Jiang, Z., Liu, F., and Luo, M. (2024, January 24\u201326). Improving object detection in YOLOv8n with the C2f-f module and multi-scale fusion reconstruction. 
Proceedings of the 2024 IEEE 6th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Chongqing, China.","DOI":"10.1109\/IMCEC59810.2024.10575292"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Liao, H.Y.M., Wu, Y.H., Chen, P.Y., Hsieh, J.W., and Yeh, I.H. (2020, January 14\u201319). CSPNet: A new backbone that can enhance learning capability of CNN. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"44984","DOI":"10.1109\/ACCESS.2024.3380009","article-title":"An improved YOLOv8 algorithm for rail surface defect detection","volume":"12","author":"Wang","year":"2024","journal-title":"IEEE Access"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Sunkara, R., and Luo, T. (2022, January 19\u201323). No more strided convolutions or pooling: A new CNN building block for low-resolution images and small objects. 
Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Grenoble, France.","DOI":"10.1007\/978-3-031-26409-2_27"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1007\/s11554-024-01585-8","article-title":"Improved real-time object detection method based on YOLOv8: A refined approach","volume":"22","author":"Zhong","year":"2025","journal-title":"J. Real-Time Image Process."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, X., Gao, H., Jia, Z., and Li, Z. (2023). BL-YOLOv8: An improved road defect detection model based on YOLOv8. Sensors, 23.","DOI":"10.3390\/s23208361"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, J., Xie, X., Liu, G., and Wu, L. (2025). A Lightweight PCB Defect Detection Algorithm Based on Improved YOLOv8-PCB. Symmetry, 17.","DOI":"10.3390\/sym17020309"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"28260","DOI":"10.1109\/ACCESS.2024.3368161","article-title":"Safety helmet detection based on improved YOLOv8","volume":"12","author":"Lin","year":"2024","journal-title":"IEEE Access"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). Xception: Deep learning with depthwise separable convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_39","unstructured":"Yu, F., and Koltun, V. (2015). Multi-scale context aggregation by dilated convolutions. 
arXiv."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/11\/8\/285\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:32:56Z","timestamp":1760034776000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/11\/8\/285"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,21]]},"references-count":39,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2025,8]]}},"alternative-id":["jimaging11080285"],"URL":"https:\/\/doi.org\/10.3390\/jimaging11080285","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,21]]}}}