{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T08:32:50Z","timestamp":1773909170817,"version":"3.50.1"},"reference-count":47,"publisher":"MDPI AG","issue":"13","license":[{"start":{"date-parts":[[2022,7,2]],"date-time":"2022-07-02T00:00:00Z","timestamp":1656720000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 12003018"],"award-info":[{"award-number":["No. 12003018"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. XJS191305"],"award-info":[{"award-number":["No. XJS191305"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 2018M633471"],"award-info":[{"award-number":["No. 2018M633471"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["No. 12003018"],"award-info":[{"award-number":["No. 12003018"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["No. XJS191305"],"award-info":[{"award-number":["No. 
XJS191305"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["No. 2018M633471"],"award-info":[{"award-number":["No. 2018M633471"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["No. 12003018"],"award-info":[{"award-number":["No. 12003018"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["No. XJS191305"],"award-info":[{"award-number":["No. XJS191305"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["No. 2018M633471"],"award-info":[{"award-number":["No. 2018M633471"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Convolutional neural networks (CNNs) have achieved milestones in object detection of synthetic aperture radar (SAR) images. Recently, vision transformers and their variants have shown great promise in detection tasks. However, ship detection in SAR images remains a substantial challenge because of the characteristics of strong scattering, multi-scale, and complex backgrounds of ship objects in SAR images. This paper proposes an enhancement Swin transformer detection network, named ESTDNet, to complete the ship detection in SAR images to solve the above problems. We adopt the Swin transformer of Cascade-R-CNN (Cascade R-CNN Swin) as a benchmark model in ESTDNet. 
Based on this, we built two modules in ESTDNet: the feature enhancement Swin transformer (FESwin) module for improving feature extraction capability and the adjacent feature fusion (AFF) module for optimizing feature pyramids. Firstly, the FESwin module is employed as the backbone network, aggregating contextual information about perceptions before and after the Swin transformer model using CNN. It uses single-point channel information interaction as the primary and local spatial information interaction as the secondary for scale fusion based on capturing visual dependence through self-attention, which improves spatial-to-channel feature expression and increases the utilization of ship information from SAR images. Secondly, the AFF module is a weighted selection fusion of each high-level feature in the feature pyramid with its adjacent shallow-level features using learnable adaptive weights, allowing the ship information of SAR images to be focused on the feature maps at more scales and improving the recognition and localization capability for ships in SAR images. Finally, the ablation study conducted on the SSDD dataset validates the effectiveness of the two components proposed in the ESTDNet detector. 
Moreover, the experiments executed on two public datasets consisting of SSDD and SARShip demonstrate that the ESTDNet detector outperforms the state-of-the-art methods, which provides a new idea for ship detection in SAR images.<\/jats:p>","DOI":"10.3390\/rs14133186","type":"journal-article","created":{"date-parts":[[2022,7,4]],"date-time":"2022-07-04T20:59:18Z","timestamp":1656968358000},"page":"3186","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":45,"title":["Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion"],"prefix":"10.3390","volume":"14","author":[{"given":"Kuoyang","family":"Li","sequence":"first","affiliation":[{"name":"School of Aerospace Science and Technology, Xidian University, Xi\u2019an 710126, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8681-5889","authenticated-orcid":false,"given":"Min","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Aerospace Science and Technology, Xidian University, Xi\u2019an 710126, China"}]},{"given":"Maiping","family":"Xu","sequence":"additional","affiliation":[{"name":"Shaanxi Academy of Aerospace Technology Application Co., Ltd., Xi\u2019an 710199, China"}]},{"given":"Rui","family":"Tang","sequence":"additional","affiliation":[{"name":"Shaanxi Academy of Aerospace Technology Application Co., Ltd., Xi\u2019an 710199, China"}]},{"given":"Liang","family":"Wang","sequence":"additional","affiliation":[{"name":"Shaanxi Academy of Aerospace Technology Application Co., Ltd., Xi\u2019an 710199, China"}]},{"given":"Hai","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Aerospace Science and Technology, Xidian University, Xi\u2019an 710126, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Fan, Y., Wang, F., and Wang, H. (2022). 
A Transformer-Based Coarse-to-Fine Wide-Swath SAR Image Registration Method under Weak Texture Conditions. Remote Sens., 14.","DOI":"10.3390\/rs14051175"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"141662","DOI":"10.1109\/ACCESS.2019.2943241","article-title":"A Lightweight Feature Optimizing Network for Ship Detection in SAR Image","volume":"7","author":"Zhang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_3","first-page":"1097","article-title":"ImageNet Classification with Deep Convolutional Neural Networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1109\/LSP.2021.3049997","article-title":"Two-Stream Encoder GAN With Progressive Training for Co-Saliency Detection","volume":"28","author":"Qian","year":"2021","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lin, S., Zhang, M., Cheng, X., Wang, L., Xu, M., and Wang, H. (2022). Hyperspectral Anomaly Detection via Dual Dictionaries Construction Guided by Two-Stage Complementary Decision. Remote Sens., 14.","DOI":"10.3390\/rs14081784"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, Faster, Stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_8","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. 
arXiv."},{"key":"ref_9","unstructured":"Bochkovskiy, A., Wang, C.-Y., and Liao, H.-Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 8\u201316). SSD: Single Shot MultiBox Detector. Proceedings of the 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/TPAMI.2018.2858826","article-title":"Focal Loss for Dense Object Detection","volume":"42","author":"Lin","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast R-CNN. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201323). Cascade R-CNN: Delving Into High Quality Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.-Y., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional Block Attention Module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_19","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-End Object Detection with Transformers. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021). Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. 
arXiv.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_22","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Dai, Z., Cai, B., Lin, Y., and Chen, J. (2021, January 20\u201325). UP-DETR: Unsupervised Pre-Training for Object Detection with Transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00165"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wang, J., Lu, C., and Jiang, W. (2018). Simultaneous Ship Detection and Orientation Estimation in SAR Images Based on Attention Module and Angle Regression. Sensors, 18.","DOI":"10.3390\/s18092851"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Chang, Y.-L., Anagaw, A., Chang, L., Wang, Y., Hsiao, C.-Y., and Lee, W.-H. (2019). Ship Detection Based on YOLOv2 for SAR Imagery. Remote Sens., 11.","DOI":"10.3390\/rs11070786"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Qian, X., Lin, S., Cheng, G., Yao, X., Ren, H., and Wang, W. (2020). Object Detection in Remote Sensing Images Based on Improved Bounding Box Regression and Multi-Level Features Fusion. Remote Sens., 12.","DOI":"10.3390\/rs12010143"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Su, N., He, J., Yan, Y., Zhao, C., and Xing, X. (2022). SII-Net: Spatial Information Integration Network for Small Target Detection in SAR Images. Remote Sens., 14.","DOI":"10.3390\/rs14030442"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Li, J., Qu, C., and Shao, J. (2017, January 13\u201314). Ship Detection in SAR Images Based on an Improved Faster R-CNN. 
Proceedings of the SAR in Big Data Era (BIGSARDATA), Beijing, China.","DOI":"10.1109\/BIGSARDATA.2017.8124934"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wang, Y., Wang, C., Zhang, H., Dong, Y., and Wei, S. (2019). A SAR Dataset of Ship Detection for Deep Learning under Complex Backgrounds. Remote Sens., 11.","DOI":"10.3390\/rs11070765"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zhang, T., and Zhang, X. (2019). High-Speed Ship Detection in SAR Images Based on a Grid Convolutional Neural Network. Remote Sens., 11.","DOI":"10.3390\/rs11101206"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhou, K., Zhang, M., Wang, H., and Tan, J. (2022). Ship Detection in SAR Images Based on Multi-Scale Feature Extraction and Adaptive Feature Fusion. Remote Sens., 14.","DOI":"10.3390\/rs14030755"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zhang, T., Zhang, X., and Ke, X. (2021). Quad-FPN: A Novel Quad Feature Pyramid Network for SAR Ship Detection. Remote Sens., 13.","DOI":"10.3390\/rs13142771"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"8983","DOI":"10.1109\/TGRS.2019.2923988","article-title":"Dense Attention Pyramid Networks for Multi-Scale Ship Detection in SAR Images","volume":"57","author":"Cui","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Xia, R., Chen, J., Huang, Z., Wan, H., Wu, B., Sun, L., Yao, B., Xiang, H., and Xing, M. (2022). CRTransSar: A Visual Transformer Based on Contextual Joint Representation Learning for SAR Ship Detection. Remote Sens., 14.","DOI":"10.3390\/rs14061488"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"666","DOI":"10.1109\/JSTARS.2021.3137390","article-title":"Ships Detection in SAR Images Based on Anchor-Free Model With Mask Guidance Features","volume":"15","author":"Qu","year":"2022","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. 
Remote Sens."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Feng, Y., Chen, J., Huang, Z., Wan, H., Xia, R., Wu, B., Sun, L., and Xing, M. (2022). A Lightweight Position-Enhanced Anchor-Free Algorithm for SAR Ship Detection. Remote Sens., 14.","DOI":"10.3390\/rs14081908"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Rostami, M., Kolouri, S., Eaton, E., and Kim, K. (2019). Deep Transfer Learning for Few-Shot SAR Image Classification. Remote Sens., 11.","DOI":"10.20944\/preprints201905.0030.v1"},{"key":"ref_38","first-page":"135","article-title":"Ship Detection Based on Small Sample Learning","volume":"108","author":"Hao","year":"2020","journal-title":"J. Coast. Res."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhang, H., Zhang, X., Meng, G., Guo, C., and Jiang, Z. (2022). Few-Shot Multi-Class Ship Detection in Remote Sensing Images Using Attention Feature Map and Multi-Relation Detector. Remote Sens., 14.","DOI":"10.3390\/rs14122790"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Zhoa, J., and Liang, X. (2020, January 27\u201328). Zero-shot Learning Based on Semantic Embedding for Ship Detection. Proceedings of the 2020 3rd International Conference on Unmanned Systems (ICUS), Harbin, China.","DOI":"10.1109\/ICUS50048.2020.9274981"},{"key":"ref_41","first-page":"76","article-title":"Few shot object detection in remote sensing images","volume":"Volume 11862","author":"Bruzzone","year":"2021","journal-title":"Image and Signal Processing for Remote Sensing XXVII"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201322). Path Aggregation Network for Instance Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Kim, K., and Lee, H.S. (2020, January 23\u201328). 
Probabilistic Anchor Assignment with IoU Prediction for Object Detection. Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK.","DOI":"10.1007\/978-3-030-58595-2_22"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Zhang, S., Chi, C., Yao, Y., Lei, Z., and Li, S.Z. (2020, January 14\u201319). Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00978"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Feng, C., Zhong, Y., Gao, Y., Scott, M.R., and Huang, W. TOOD: Task-Aligned One-Stage Object Detection. Proceedings of the 2021 IEEE International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00349"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Chen, Q., Wang, Y., Yang, T., Zhang, X., Cheng, J., and Sun, J. (2021, January 19\u201325). You Only Look One-Level Feature. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01284"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft COCO: Common Objects in Context. 
Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/13\/3186\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:42:10Z","timestamp":1760139730000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/13\/3186"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,2]]},"references-count":47,"journal-issue":{"issue":"13","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["rs14133186"],"URL":"https:\/\/doi.org\/10.3390\/rs14133186","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,2]]}}}