{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T19:42:21Z","timestamp":1776282141556,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T00:00:00Z","timestamp":1723507200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["51663001"],"award-info":[{"award-number":["51663001"]}]},{"name":"National Natural Science Foundation of China","award":["52063002"],"award-info":[{"award-number":["52063002"]}]},{"name":"National Natural Science Foundation of China","award":["42061067"],"award-info":[{"award-number":["42061067"]}]},{"name":"National Natural Science Foundation of China","award":["61741202"],"award-info":[{"award-number":["61741202"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Currently, existing deep learning methods exhibit many limitations in multi-target detection, such as low accuracy and high rates of false detection and missed detections. This paper proposes an improved Faster R-CNN algorithm, aiming to enhance the algorithm\u2019s capability in detecting multi-scale targets. This algorithm has three improvements based on Faster R-CNN. Firstly, the new algorithm uses the ResNet101 network for feature extraction of the detection image, which achieves stronger feature extraction capabilities. Secondly, the new algorithm integrates Online Hard Example Mining (OHEM), Soft non-maximum suppression (Soft-NMS), and Distance Intersection Over Union (DIOU) modules, which improves the positive and negative sample imbalance and the problem of small targets being easily missed during model training. 
Finally, the Region Proposal Network (RPN) is simplified to achieve a faster detection speed and a lower miss rate. The multi-scale training (MST) strategy is also used to train the improved Faster R-CNN to achieve a balance between detection accuracy and efficiency. Compared to the other detection models, the improved Faster R-CNN demonstrates significant advantages in terms of mAP@0.5, F1-score, and Log average miss rate (LAMR). The model proposed in this paper provides valuable insights and inspiration for many fields, such as smart agriculture, medical diagnosis, and face recognition.<\/jats:p>","DOI":"10.3390\/jimaging10080197","type":"journal-article","created":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T13:35:49Z","timestamp":1723556149000},"page":"197","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["A Multi-Scale Target Detection Method Using an Improved Faster Region Convolutional Neural Network Based on Enhanced Backbone and Optimized Mechanisms"],"prefix":"10.3390","volume":"10","author":[{"given":"Qianyong","family":"Chen","sequence":"first","affiliation":[{"name":"College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7832-4185","authenticated-orcid":false,"given":"Mengshan","family":"Li","sequence":"additional","affiliation":[{"name":"College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China"}]},{"given":"Zhenghui","family":"Lai","sequence":"additional","affiliation":[{"name":"College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China"}]},{"given":"Jihong","family":"Zhu","sequence":"additional","affiliation":[{"name":"College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China"}]},{"given":"Lixin","family":"Guan","sequence":"additional","affiliation":[{"name":"College of Physics 
and Electronic Information, Gannan Normal University, Ganzhou 341000, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,8,13]]},"reference":[{"key":"ref_1","first-page":"1","article-title":"A Small-Sized Object Detection Oriented Multi-Scale Feature Fusion Approach with Application to Defect Detection","volume":"71","author":"Zeng","year":"2022","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.patrec.2022.12.026","article-title":"Multi-scale self-attention-based feature enhancement for detection of targets with small image sizes","volume":"166","author":"Deng","year":"2023","journal-title":"Pattern Recogn. Lett."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3501","DOI":"10.1109\/TNSRE.2023.3309847","article-title":"Multi-Scale Dynamic Graph Learning for Brain Disorder Detection with Functional MRI","volume":"31","author":"Ma","year":"2023","journal-title":"IEEE Trans. Neur. Syst. Rehabil."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1016\/j.neunet.2023.01.041","article-title":"Continual Object Detection: A review of definitions, strategies, and challenges","volume":"161","author":"Menezes","year":"2023","journal-title":"Neural Netw."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1016\/j.neucom.2023.01.056","article-title":"A systematic review and analysis of deep learning-based underwater object detection","volume":"527","author":"Xu","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Goswami, P.K., and Goswami, G. (2022, January 16\u201317). A Comprehensive Review on Real Time Object Detection using Deep Learing Model. 
Proceedings of the 2022 11th International Conference on System Modeling & Advancement in Research Trends (SMART), Moradabad, India.","DOI":"10.1109\/SMART55829.2022.10046972"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201322). Cascade r-cnn: Delving into high quality object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"107036","DOI":"10.1016\/j.comnet.2019.107036","article-title":"Faster R-CNN for multi-class fruit detection using a robotic vision system","volume":"168","author":"Wan","year":"2020","journal-title":"Comput. Netw."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"104846","DOI":"10.1016\/j.compag.2019.06.001","article-title":"Fruit detection for strawberry harvesting robot in non-structural environment based on Mask-RCNN","volume":"163","author":"Yu","year":"2019","journal-title":"Comput. Electron. Agric."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A.C. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the 2016 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Liang, Q., Zhu, W., Long, J., Wang, Y., Sun, W., and Wu, W. (2018, January 9\u201311). A real-time detection framework for on-tree mango based on SSD network. Proceedings of the 2018 11th International Conference on Intelligent Robotics and Applications (ICIRA), Newcastle, NSW, Australia.","DOI":"10.1007\/978-3-319-97589-4_36"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"105998","DOI":"10.1016\/j.compag.2021.105998","article-title":"A deep learning approach for anthracnose infected trees classification in walnut orchards","volume":"182","author":"Anagnostis","year":"2021","journal-title":"Comput. Electron. Agric."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"4829","DOI":"10.1007\/s00371-024-03483-3","article-title":"EasyRP-R-CNN: A fast cyclone detection model","volume":"40","author":"Tian","year":"2024","journal-title":"Vis. Comput."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1007\/s00170-022-10335-8","article-title":"A new lightweight deep neural network for surface scratch detection","volume":"123","author":"Li","year":"2022","journal-title":"Int. J. Adv. Manuf. Tech."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"103830","DOI":"10.1016\/j.jvcir.2023.103830","article-title":"Rethinking PASCAL-VOC and MS-COCO dataset for small object detection","volume":"93","author":"Tong","year":"2023","journal-title":"J. Vis. Commun. 
Image R."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Demir, A., Yilmaz, F., and Kose, O. (2019, January 3\u20135). Early detection of skin cancer using deep learning architectures: Resnet-101 and inception-v3. Proceedings of the 2019 Medical Technologies Congress (TIPTEKNO 2019), Izmir, Turkey.","DOI":"10.1109\/TIPTEKNO47231.2019.8972045"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Shrivastava, A., Gupta, A., and Girshick, R. (2016, January 27\u201330). Training region-based object detectors with online hard example mining. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.89"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Bodla, N., Singh, B., Chellappa, R., and Davis, L.S. (2017, January 21\u201326). Soft-NMS--improving object detection with one line of code. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/ICCV.2017.593"},{"key":"ref_22","unstructured":"Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., and Ren, D. (2020, January 7\u201312). Distance-IoU loss: Faster and better learning for bounding box regression. Proceedings of the 2020 20th AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, USA."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"3511","DOI":"10.1007\/s10489-021-02534-9","article-title":"Multi-scale object detection for high-speed railway clearance intrusion","volume":"52","author":"Tian","year":"2022","journal-title":"Appl. Intell."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wang, H., and Xiao, N. (2023). Underwater object detection method based on improved Faster RCNN. Appl. Sci., 13.","DOI":"10.3390\/app13042746"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lu, X., Wang, H., Zhang, J.J., Zhang, Y.T., Zhong, J., and Zhuang, G.H. (2024). 
Research on J wave detection based on transfer learning and VGG16. Biomed. Signal. Process., 95.","DOI":"10.1016\/j.bspc.2024.106420"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"6400","DOI":"10.1007\/s10489-021-02293-7","article-title":"Deep learning in multi-object detection and tracking: State of the art","volume":"51","author":"Pal","year":"2021","journal-title":"Appl. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"3881","DOI":"10.1007\/s00202-023-01915-2","article-title":"Evaluation of visible contamination on power grid insulators using convolutional neural networks","volume":"105","author":"Corso","year":"2023","journal-title":"Electr. Eng."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1882","DOI":"10.1109\/TIP.2022.3148876","article-title":"Siamese Implicit Region Proposal Network with Compound Attention for Visual Tracking","volume":"31","author":"Chan","year":"2022","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"5823","DOI":"10.3233\/JIFS-212389","article-title":"The improved faster-RCNN for spinal fracture lesions detection","volume":"42","author":"Sha","year":"2022","journal-title":"J. Intell. Fuzzy Syst."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhou, D., Fang, J., Song, X., Guan, C., Yin, J., Dai, Y., and Yang, R. (2019, January 16\u201319). Iou loss for 2d\/3d object detection. 
Proceedings of the 2019 International Conference on 3D Vision (3DV), Qu\u00e9bec City, QC, Canada.","DOI":"10.1109\/3DV.2019.00019"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1016\/j.neucom.2022.05.052","article-title":"Manhattan-distance IOU loss for fast and accurate bounding box regression and object detection","volume":"500","author":"Shen","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"4774","DOI":"10.1109\/TMM.2023.3257564","article-title":"Deep Blind Image Quality Assessment Powered by Online Hard Example Mining","volume":"25","author":"Wang","year":"2023","journal-title":"IEEE Trans. Multimed."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"921","DOI":"10.32604\/iasc.2023.038257","article-title":"PF-YOLOv4-Tiny: Towards Infrared Target Detection on Embedded Platform","volume":"37","author":"Li","year":"2023","journal-title":"Intell. Autom. Soft Comput."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"7181","DOI":"10.1109\/JSEN.2020.2977366","article-title":"Surface defect detection using image pyramid","volume":"20","author":"Xiao","year":"2020","journal-title":"IEEE Sens. J."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Sun, P., Zhang, R., Jiang, Y., Kong, T., Xu, C., Zhan, W., Tomizuka, M., Li, L., Yuan, Z., and Wang, C. (2021, January 20\u201325). Sparse R-CNN: End-to-End Object Detection with Learnable Proposals. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01422"},{"key":"ref_37","first-page":"1","article-title":"Infrared Small UAV Target Detection Based on Depthwise Separable Residual Dense Network and Multiscale Feature Fusion","volume":"71","author":"Fang","year":"2022","journal-title":"IEEE Trans. Instrum. 
Meas."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2065","DOI":"10.32604\/csse.2023.028053","article-title":"An Ontology Based Multilayer Perceptron for Object Detection","volume":"44","author":"Smart","year":"2023","journal-title":"Comput. Syst. Sci. Eng."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1016\/j.neucom.2022.02.012","article-title":"Automatic learning for object detection","volume":"484","author":"Zhang","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1162\/tacl_a_00542","article-title":"An Empirical Survey of Data Augmentation for Limited Data Learning in NLP","volume":"11","author":"Chen","year":"2023","journal-title":"Trans. Assoc. Comput. Linguist"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1148","DOI":"10.1109\/TPAMI.2023.3330862","article-title":"Differentiable Image Data Augmentation and Its Applications: A Survey","volume":"46","author":"Shi","year":"2024","journal-title":"IEEE Trans. Pattern Anal."},{"key":"ref_42","unstructured":"Gower, R.M., Loizou, N., Qian, X., Sailanbayev, A., Shulgin, E., and Richt\u00e1rik, P. (2019, January 9\u201315). SGD: General analysis and improved rates. Proceedings of the International Conference on Machine Learning (ICML), Long Beach, CA, USA."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_45","unstructured":"Bochkovskiy, A., Wang, C.-Y., and Liao, H.-Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/10\/8\/197\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:36:11Z","timestamp":1760110571000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/10\/8\/197"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,13]]},"references-count":45,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2024,8]]}},"alternative-id":["jimaging10080197"],"URL":"https:\/\/doi.org\/10.3390\/jimaging10080197","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,13]]}}}