{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T18:36:36Z","timestamp":1774290996188,"version":"3.50.1"},"reference-count":46,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2023,11,30]],"date-time":"2023-11-30T00:00:00Z","timestamp":1701302400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62071474"],"award-info":[{"award-number":["62071474"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>A significant challenge in detecting objects in complex remote sensing (RS) datasets is from small objects. Existing detection methods achieve much lower accuracy on small objects than medium and large ones. These methods suffer from limited feature information, susceptibility to complex background interferences, and insufficient contextual information. To address these issues, a small object detection method with the enhanced receptive field, ERF-RTMDet, is proposed to achieve a more robust detection capability on small objects in RS images. Specifically, three modules are employed to enhance the receptive field of small objects\u2019 features. First, the Dilated Spatial Pyramid Pooling Fast Module is proposed to gather more contextual information on small objects and suppress the interference of background information. Second, the Content-Aware Reassembly of Features Module is employed for more efficient feature fusion instead of the nearest-neighbor upsampling operator. Finally, the Hybrid Dilated Attention Module is proposed to expand the receptive field of object features after the feature fusion network. Extensive experiments are conducted on the MAR20 and NWPU VHR-10 datasets. The experimental results show that our ERF-RTMDet attains higher detection precision on small objects while maintaining or slightly enhancing the detection precision on mid-scale and large-scale objects.<\/jats:p>","DOI":"10.3390\/rs15235575","type":"journal-article","created":{"date-parts":[[2023,11,30]],"date-time":"2023-11-30T07:44:42Z","timestamp":1701330282000},"page":"5575","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["ERF-RTMDet: An Improved Small Object Detection Method in Remote Sensing Images"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-6385-7086","authenticated-orcid":false,"given":"Shuo","family":"Liu","sequence":"first","affiliation":[{"name":"College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6182-4877","authenticated-orcid":false,"given":"Huanxin","family":"Zou","sequence":"additional","affiliation":[{"name":"College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yazhe","family":"Huang","sequence":"additional","affiliation":[{"name":"Tianjin Advanced Technology Research Institute, Tianjin 300457, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xu","family":"Cao","sequence":"additional","affiliation":[{"name":"College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shitian","family":"He","sequence":"additional","affiliation":[{"name":"College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3732-962X","authenticated-orcid":false,"given":"Meilin","family":"Li","sequence":"additional","affiliation":[{"name":"College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuqing","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,11,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Lai, J., Liu, C.L., Chen, X., Zhou, J., Tan, T., Zheng, N., and Zha, H. (2018, January 9\u201312). Pattern Recognition and Computer Vision. Proceedings of the Lecture Notes in Computer Science, Guildford, UK.","DOI":"10.1007\/978-3-030-03398-9"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S.K., Girshick, R.B., and Farhadi, A. (July, January 26). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, Faster, Stronger. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_5","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_6","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_7","unstructured":"Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., and Nie, W. (2022). YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2023, January 18\u201322). YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, QC, Canada.","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1109\/TPAMI.2018.2844175","article-title":"Mask R-CNN","volume":"42","author":"He","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Girshick, R.B., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/TPAMI.2018.2858826","article-title":"Focal Loss for Dense Object Detection","volume":"42","author":"Lin","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"642","DOI":"10.1007\/s11263-019-01204-1","article-title":"CornerNet: Detecting Objects as Paired Keypoints","volume":"128","author":"Law","year":"2018","journal-title":"Int. J. Comput. Vis."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhou, X., Zhuo, J., and Kr\u00e4henb\u00fchl, P. (2019, January 15\u201320). Bottom-Up Object Detection by Grouping Extreme and Center Points. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00094"},{"key":"ref_14","unstructured":"Zhou, X., Wang, D., and Kr\u00e4henb\u00fchl, P. (2019). Objects as Points. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). FCOS: Fully Convolutional One-Stage Object Detection. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00972"},{"key":"ref_16","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding YOLO Series in 2021. arXiv."},{"key":"ref_17","unstructured":"Lyu, C., Zhang, W., Huang, H., Zhou, Y., Wang, Y., Liu, Y., Zhang, S., and Chen, K. (2022). RTMDet: An Empirical Study of Designing Real-Time Object Detectors. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"103752","DOI":"10.1016\/j.jvcir.2023.103752","article-title":"FE-YOLOv5: Feature enhancement network based on YOLOv5 for small object detection","volume":"90","author":"Wang","year":"2023","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Yang, C., Huang, Z., and Wang, N. (2022, January 18\u201324). QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01330"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201322). Path Aggregation Network for Instance Segmentation. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"012027","DOI":"10.1088\/1742-6596\/2253\/1\/012027","article-title":"Multi-scene small object detection with modified YOLOv4","volume":"2253","author":"Ziming","year":"2022","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"e1145","DOI":"10.7717\/peerj-cs.1145","article-title":"Lightweight multi-scale network for small object detection","volume":"8","author":"Li","year":"2022","journal-title":"PeerJ Comput. Sci."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1968","DOI":"10.1109\/TMM.2021.3074273","article-title":"Extended Feature Pyramid Network for Small Object Detection","volume":"24","author":"Deng","year":"2022","journal-title":"IEEE Trans. Multimed."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1016\/j.patrec.2023.03.009","article-title":"Small-object detection based on YOLOv5 in autonomous driving systems","volume":"168","author":"Mahaur","year":"2023","journal-title":"Pattern Recognit. Lett."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1016\/j.neunet.2022.08.029","article-title":"Attentional feature pyramid network for small object detection","volume":"155","author":"Min","year":"2022","journal-title":"Neural Netw. Off. Int. Neural Netw. Soc."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"3280","DOI":"10.1080\/01431161.2022.2089539","article-title":"Small object detection in remote sensing images based on attention mechanism and multi-scale feature fusion","volume":"43","author":"Zhang","year":"2022","journal-title":"Int. J. Remote. Sens."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Shi, T., Gong, J., Hu, J., Zhi, X., Zhang, W., Zhang, Y., Zhang, P., and Bao, G. (2022). Feature-Enhanced CenterNet for Small Object Detection in Remote Sensing Images. Remote. Sens., 14.","DOI":"10.3390\/rs14215488"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S.J., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft COCO: Common Objects in Context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Wang, J., Chen, K., Xu, R., Liu, Z., Loy, C.C., and Lin, D. (November, January 27). CARAFE: Content-Aware ReAssembly of FEatures. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00310"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1109\/TLA.2022.9693567","article-title":"MSFYOLO: Feature fusion-based detection for small objects","volume":"20","author":"Song","year":"2022","journal-title":"IEEE Lat. Am. Trans."},{"key":"ref_32","unstructured":"Yu, F., and Koltun, V. (2015). Multi-Scale Context Aggregation by Dilated Convolutions. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Kim, M., Jeong, J.H., and Kim, S. (2021). ECAP-YOLO: Efficient Channel Attention Pyramid YOLO for Small Object Detection in Aerial Image. Remote. Sens., 13.","DOI":"10.3390\/rs13234851"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Shi, W., Caballero, J., Husz\u00e1r, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., and Wang, Z. (July, January 26). Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.207"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition","volume":"37","author":"He","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_36","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., and Cottrell, G. (2018, January 12\u201315). Understanding Convolution for Semantic Segmentation. Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00163"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2011","DOI":"10.1109\/TPAMI.2019.2913372","article-title":"Squeeze-and-Excitation Networks","volume":"42","author":"Hu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Wu, L., Chen, S., Hu, X., Li, J., Tang, J., and Yang, J. (2020). Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection. arXiv.","DOI":"10.1109\/CVPR46437.2021.01146"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Rezatofighi, S.H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I.D., and Savarese, S. (2019, January 15\u201320). Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"7405","DOI":"10.1109\/TGRS.2016.2601622","article-title":"Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images","volume":"54","author":"Cheng","year":"2016","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_42","unstructured":"Wenqi, Y., Gong, C., Meijun, W., Yanqing, Y., Xingxing, X., Xiwen, Y., and Junwei, H. (2022). MAR20: A Benchmark for Military Aircraft Recognition in Remote Sensing Images. Natl. Remote. Sens. Bull., 1\u201311."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Feng, C., Zhong, Y., Gao, Y., Scott, M.R., and Huang, W. (2021, January 11\u201317). TOOD: Task-aligned One-stage Object Detection. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Virtual.","DOI":"10.1109\/ICCV48922.2021.00349"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"57552","DOI":"10.1109\/ACCESS.2020.2982658","article-title":"Adaptive Anchor Networks for Multi-Scale Object Detection in Remote Sensing Images","volume":"8","author":"Zhang","year":"2020","journal-title":"IEEE Access"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional Block Attention Module. Proceedings of the 15th European Conference on Computer Vision (ECCV 2018), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P.F., Li, P., Zuo, W., and Hu, Q. (2020, January 13\u201319). ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01155"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/23\/5575\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:34:47Z","timestamp":1760132087000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/23\/5575"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,30]]},"references-count":46,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["rs15235575"],"URL":"https:\/\/doi.org\/10.3390\/rs15235575","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,30]]}}}