{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T19:34:58Z","timestamp":1780601698291,"version":"3.54.1"},"reference-count":55,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2021,3,30]],"date-time":"2021-03-30T00:00:00Z","timestamp":1617062400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Key R&D Program of China","award":["2019YFB1405900"],"award-info":[{"award-number":["2019YFB1405900"]}]},{"name":"the Fundamental Research Funds for the Central Universities and USTB-NTUT Joint Research Program","award":["-"],"award-info":[{"award-number":["-"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Object detection is a significant and challenging problem in the study of remote sensing. Since remote sensing images are typically captured with a bird\u2019s-eye view, the aspect ratios of objects in the same category may obey a Gaussian distribution. Generally, existing object detection methods ignore exploring the distribution character of aspect ratios for improving performance in remote sensing tasks. In this paper, we propose a novel Self-Adaptive Aspect Ratio Anchor (SARA) to explicitly explore aspect ratio variations of objects in remote sensing images. To be concrete, our SARA can self-adaptively learn an appropriate aspect ratio for each category. In this way, we can only utilize a simple squared anchor (related to the strides of feature maps in Feature Pyramid Networks) to regress objects in various aspect ratios. Finally, we adopt an Oriented Box Decoder (OBD) to align the feature maps and encode the orientation information of oriented objects. 
Our method achieves a promising mAP value of 79.91% on the DOTA dataset.<\/jats:p>","DOI":"10.3390\/rs13071318","type":"journal-article","created":{"date-parts":[[2021,3,31]],"date-time":"2021-03-31T00:13:10Z","timestamp":1617149590000},"page":"1318","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Self-Adaptive Aspect Ratio Anchor for Oriented Object Detection in Remote Sensing Images"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4632-0160","authenticated-orcid":false,"given":"Jie-Bo","family":"Hou","sequence":"first","affiliation":[{"name":"School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaobin","family":"Zhu","sequence":"additional","affiliation":[{"name":"School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xu-Cheng","family":"Yin","sequence":"additional","affiliation":[{"name":"School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2021,3,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ye, X., Xiong, F., Lu, J., Zhou, J., and Qian, Y. (2020). F3-Net: Feature Fusion and Filtration Network for Object Detection in Optical Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12244027"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Qiu, H., Li, H., Wu, Q., Meng, F., Ngan, K.N., and Shi, H. (2019). A2RMNet: Adaptively Aspect Ratio Multi-Scale Network for Object Detection in Remote Sensing Images. 
Remote Sens., 11.","DOI":"10.3390\/rs11131594"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Xiao, Z., Qian, L., Shao, W., Tan, X., and Wang, K. (2020). Axis Learning for Orientated Objects Detection in Aerial Images. Remote Sens., 12.","DOI":"10.3390\/rs12060908"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yang, X., Yang, J., Yan, J., Zhang, Y., Zhang, T., Guo, Z., Sun, X., and Fu, K. (November, January 27). SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00832"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Li, C., Luo, B., Hong, H., Su, X., Wang, Y., Liu, J., Wang, C., Zhang, J., and Wei, L. (2020). Object Detection Based on Global-Local Saliency Constraint in Aerial Images. Remote Sens., 12.","DOI":"10.3390\/rs12091435"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wang, J., Ding, J., Guo, H., Cheng, W., Pan, T., and Yang, W. (2019). Mask OBB: A Semantic Attention-Based Mask Oriented Bounding Box Representation for Multi-Category Object Detection in Aerial Images. Remote Sens., 11.","DOI":"10.3390\/rs11242930"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhu, K., Chen, G., Tan, X., Zhang, L., Dai, F., Liao, P., and Gong, Y. (2019). Geospatial Object Detection on High Resolution Remote Sensing Imagery Based on Double Multi-Scale Feature Pyramid Network. Remote Sens., 11.","DOI":"10.3390\/rs11070755"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1074","DOI":"10.1109\/LGRS.2016.2565705","article-title":"Ship Rotated Bounding Box Space for Ship Extraction From High-Resolution Optical Satellite Images With Complex Backgrounds","volume":"13","author":"Liu","year":"2016","journal-title":"IEEE Geosci. Remote Sens. 
Lett."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Liu, Z., Hu, J., Weng, L., and Yang, Y. (2017, January 17\u201320). Rotated region based CNN for ship detection. Proceedings of the 2017 IEEE International Conference on Image Processing, ICIP 2017, Beijing, China.","DOI":"10.1109\/ICIP.2017.8296411"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Kwan, C., Chou, B., Yang, J., Rangamani, A., Tran, T.D., Zhang, J., and Etienne-Cummings, R. (2019). Deep Learning-Based Target Tracking and Classification for Low Quality Videos Using Coded Aperture Cameras. Sensors, 19.","DOI":"10.3390\/s19173702"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.E., Fu, C., and Berg, A.C. (2016, January 11\u201314). SSD: Single Shot MultiBox Detector. Proceedings of the Computer Vision - ECCV 2016\u201414th European Conference, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Nguyen, P.H., Arsalan, M., Koo, J.H., Naqvi, R.A., Truong, N.Q., and Park, K.R. (2018). LightDenseYOLO: A Fast and Accurate Marker Tracker for Autonomous UAV Landing by Visible Light Camera Sensor on Drone. Sensors, 18.","DOI":"10.3390\/s18061703"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Lin, T., Goyal, P., Girshick, R.B., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal Loss for Dense Object Detection. 
Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"3111","DOI":"10.1109\/TMM.2018.2818020","article-title":"Arbitrary-Oriented Scene Text Detection via Rotation Proposals","volume":"20","author":"Ma","year":"2018","journal-title":"IEEE Trans. Multimed."},{"key":"ref_17","unstructured":"Han, J., Ding, J., Li, J., and Xia, G. (2020). Align Deep Features for Oriented Object Detection. arXiv."},{"key":"ref_18","unstructured":"Tang, T., Liu, Y., Zheng, Y., Zhu, X., and Zhao, Y. (2020). Rotating Objects Detection in Aerial Images via Attention Denoising and Angle Loss Refining. DEStech Trans. Comput. Sci. Eng."},{"key":"ref_19","unstructured":"Yang, X., Yan, J., Yang, X., Tang, J., Liao, W., and He, T. (2020). SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing. arXiv."},{"key":"ref_20","unstructured":"Li, C., Xu, C., Cui, Z., Wang, D., Jie, Z., Zhang, T., and Yang, J. (2019, January 16\u201320). Learning Object-Wise Semantic Representation for Detection in Remote Sensing Imagery. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2019, Long Beach, CA, USA."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1016\/j.isprsjprs.2020.01.025","article-title":"Rotation-aware and multi-scale convolutional neural network for object detection in remote sensing images","volume":"161","author":"Fu","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Li, Y., Huang, Q., Pei, X., Jiao, L., and Shang, R. (2020). RADet: Refine Feature Pyramid Network and Multi-Layer Attention Network for Arbitrary-Oriented Object Detection of Remote Sensing Images. 
Remote Sens., 12.","DOI":"10.3390\/rs12030389"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Xiao, Z., Wang, K., Wan, Q., Tan, X., Xu, C., and Xia, F. (2021). A2S-Det: Efficiency Anchor Matching in Aerial Image Oriented Object Detection. Remote Sens., 13.","DOI":"10.3390\/rs13010073"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Ming, Q., Zhou, Z., Miao, L., Zhang, H., and Li, L. (2020). Dynamic Anchor Learning for Arbitrary-Oriented Object Detection. arXiv.","DOI":"10.1609\/aaai.v35i3.16336"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Ye, Q., Qiu, Q., and Jiao, J. (2017, January 21\u201326). Oriented Response Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.527"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Ding, J., Xue, N., Long, Y., Xia, G., and Lu, Q. (2019, January 16\u201320). Learning RoI Transformer for Oriented Object Detection in Aerial Images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00296"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R.B. (2017, January 22\u201329). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, Faster, Stronger. 
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3676","DOI":"10.1109\/TIP.2018.2825107","article-title":"Textboxes++: A single-shot oriented scene text detector","volume":"27","author":"Liao","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","unstructured":"Yang, X., Liu, Q., Yan, J., and Li, A. (2019). R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_32","unstructured":"Huang, L., Yang, Y., Deng, Y., and Yu, Y. (2015). DenseBox: Unifying Landmark Localization with End to End Object Detection. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhou, X., Yao, C., Wen, H., Wang, Y., Zhou, S., He, W., and Liang, J. (2017, January 21\u201326). EAST: An Efficient and Accurate Scene Text Detector. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.283"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Yu, J., Jiang, Y., Wang, Z., Cao, Z., and Huang, T.S. (2016, January 15\u201319). UnitBox: An Advanced Object Detection Network. Proceedings of the 2016 ACM Conference on Multimedia Conference, MM 2016, Amsterdam, The Netherlands.","DOI":"10.1145\/2964284.2967274"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"He, W., Zhang, X.Y., Yin, F., and Liu, C.L. (2017, January 22\u201329). Deep Direct Regression for Multi-oriented Scene Text Detection. 
Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy.","DOI":"10.1109\/ICCV.2017.87"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"5406","DOI":"10.1109\/TIP.2018.2855399","article-title":"Multi-oriented and multi-lingual scene text detection with direct regression","volume":"27","author":"He","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). FCOS: Fully Convolutional One-Stage Object Detection. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00972"},{"key":"ref_38","unstructured":"Zhou, X., Wang, D., and Kr\u00e4henb\u00fchl, P. (2019). Objects as Points. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhu, C., He, Y., and Savvides, M. (2019, January 16\u201320). Feature Selective Anchor-Free Module for Single-Shot Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00093"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Lin, T., Doll\u00e1r, P., Girshick, R.B., He, K., Hariharan, B., and Belongie, S.J. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"765","DOI":"10.1007\/978-3-030-01264-9_45","article-title":"CornerNet: Detecting Objects as Paired Keypoints","volume":"Volume 11218","author":"Ferrari","year":"2018","journal-title":"Proceedings of the Computer Vision-ECCV 2018\u201415th European Conference"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., and Tian, Q. 
(27\u20132, January 27). CenterNet: Keypoint Triplets for Object Detection. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00667"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhou, X., Zhuo, J., and Kr\u00e4henb\u00fchl, P. (2019, January 16\u201320). Bottom-Up Object Detection by Grouping Extreme and Center Points. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00094"},{"key":"ref_44","unstructured":"Wei, H., Zhou, L., Zhang, Y., Li, H., Guo, R., and Wang, H. (2019). Oriented Objects as pairs of Middle Lines. arXiv."},{"key":"ref_45","unstructured":"Deng, D., Liu, H., Li, X., and Cai, D. (2018, January 2\u20137). PixelLink: Detecting Scene Text via Instance Segmentation. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, LA, USA."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"5566","DOI":"10.1109\/TIP.2019.2900589","article-title":"TextField: Learning a Deep Direction Field for Irregular Scene Text Detection","volume":"28","author":"Xu","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Tian, Z., Shu, M., Lyu, P., Li, R., Zhou, C., Shen, X., and Jia, J. (2019, January 16\u201320). Learning Shape-Aware Embedding for Scene Text Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00436"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Hou, W., Lu, T., Yu, G., and Shao, S. (2019, January 16\u201320). 
Shape Robust Text Detection with Progressive Scale Expansion Network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00956"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., and Wei, Y. (2017, January 22\u201329). Deformable Convolutional Networks. Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy.","DOI":"10.1109\/ICCV.2017.89"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Xia, G., Bai, X., Ding, J., Zhu, Z., Belongie, S.J., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18\u201322). DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Pan, X., Ren, Y., Sheng, K., Dong, W., Yuan, H., Guo, X., Ma, C., and Xu, C. (2020, January 13\u201319). Dynamic Refinement Network for Oriented and Densely Packed Object Detection. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01122"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Azimi, S.M., Vig, E., Bahmanyar, R., K\u00f6rner, M., and Reinartz, P. (2018, January 2\u20136). Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery. 
Proceedings of the Computer Vision\u2014ACCV 2018\u201414th Asian Conference on Computer Vision, Perth, Australia.","DOI":"10.1007\/978-3-030-20893-6_10"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"10015","DOI":"10.1109\/TGRS.2019.2930982","article-title":"CAD-Net: A Context-Aware Detection Network for Objects in Remote Sensing Imagery","volume":"57","author":"Zhang","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_55","unstructured":"Xu, Y., Fu, M., Wang, Q., Wang, Y., Chen, K., Xia, G., and Bai, X. (2019). Gliding vertex on the horizontal bounding box for multi-oriented object detection. arXiv."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/7\/1318\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:33:10Z","timestamp":1760362390000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/7\/1318"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,30]]},"references-count":55,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2021,4]]}},"alternative-id":["rs13071318"],"URL":"https:\/\/doi.org\/10.3390\/rs13071318","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,30]]}}}