{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:21:26Z","timestamp":1760235686397,"version":"build-2065373602"},"reference-count":60,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2021,9,17]],"date-time":"2021-09-17T00:00:00Z","timestamp":1631836800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U1903213"],"award-info":[{"award-number":["U1903213"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>To detect rotated objects in remote sensing images, researchers have proposed a series of arbitrary-oriented object detection methods, which place multiple anchors with different angles, scales, and aspect ratios on the images. However, a major difference between remote sensing images and natural images is the small probability of overlap between objects in the same category, so the anchor-based design can introduce much redundancy during the detection process. In this paper, we convert the detection problem to a center point prediction problem, where the pre-defined anchors can be discarded. By directly predicting the center point, orientation, and corresponding height and width of the object, our methods can simplify the design of the model and reduce the computations related to anchors. In order to further fuse the multi-level features and get accurate object centers, a deformable feature pyramid network is proposed, to detect objects under complex backgrounds and various orientations of rotated objects. Experiments and analysis on two remote sensing datasets, DOTA and HRSC2016, demonstrate the effectiveness of our approach. Our best model, equipped with Deformable-FPN, achieved 74.75% mAP on DOTA and 96.59% on HRSC2016 with a single-stage model, single-scale training, and testing. By detecting arbitrarily oriented objects from their centers, the proposed model performs competitively against oriented anchor-based methods.<\/jats:p>","DOI":"10.3390\/rs13183731","type":"journal-article","created":{"date-parts":[[2021,9,21]],"date-time":"2021-09-21T22:35:20Z","timestamp":1632263720000},"page":"3731","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Predicting Arbitrary-Oriented Objects as Points in Remote Sensing Images"],"prefix":"10.3390","volume":"13","author":[{"given":"Jian","family":"Wang","sequence":"first","affiliation":[{"name":"School of Information and Communications Engineering, Xi\u2019an Jiaotong University, Xi\u2019an 710049, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8379-4915","authenticated-orcid":false,"given":"Le","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Information and Communications Engineering, Xi\u2019an Jiaotong University, Xi\u2019an 710049, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7566-1634","authenticated-orcid":false,"given":"Fan","family":"Li","sequence":"additional","affiliation":[{"name":"School of Information and Communications Engineering, Xi\u2019an Jiaotong University, Xi\u2019an 710049, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Xia, G.S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18-22). DOTA: A large-scale dataset for object detection in aerial images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Liu, Z., Yuan, L., Weng, L., and Yang, Y. (2017, January 24\u201326). A high resolution optical satellite image dataset for ship recognition and some new baselines. Proceedings of the International Conference on Pattern Recognition Applications and Methods, SCITEPRESS, Porto, Portugal.","DOI":"10.5220\/0006120603240331"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Karatzas, D., Gomez-Bigorda, L., Nicolaou, A., Ghosh, S., Bagdanov, A., Iwamura, M., Matas, J., Neumann, L., Chandrasekhar, V.R., and Lu, S. (2015, January 23\u201326). ICDAR 2015 competition on robust reading. Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR), Nancy, France.","DOI":"10.1109\/ICDAR.2015.7333942"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Nayef, N., Yin, F., Bizid, I., Choi, H., Feng, Y., Karatzas, D., Luo, Z., Pal, U., Rigaud, C., and Chazalon, J. (2017, January 13\u201315). Icdar2017 robust reading challenge on multi-lingual scene text detection and script identification-rrc-mlt. Proceedings of the 2017 14th IAPR International Conference on Document analysis and Recognition (ICDAR), Kyoto, Japan.","DOI":"10.1109\/ICDAR.2017.237"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Reggiannini, M., Righi, M., Tampucci, M., Lo Duca, A., Bacciu, C., Bedini, L., D\u2019Errico, A., Di Paola, C., Marchetti, A., and Martinelli, M. (2019). Remote sensing for maritime prompt monitoring. J. Mar. Sci. Eng., 7.","DOI":"10.3390\/jmse7070202"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Moroni, D., Pieri, G., and Tampucci, M. (2019). Environmental decision support systems for monitoring small scale oil spills: Existing solutions, best practices and current challenges. J. Mar. Sci. Eng., 7.","DOI":"10.3390\/jmse7010019"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Almulihi, A., Alharithi, F., Bourouis, S., Alroobaea, R., Pawar, Y., and Bouguila, N. (2021). Oil spill detection in SAR images using online extended variational learning of dirichlet process mixtures of gamma distributions. Remote Sens., 13.","DOI":"10.3390\/rs13152991"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhang, L., Yang, X., and Shen, J. (2021). Frequency variability feature for life signs detection and localization in natural disasters. Remote Sens., 13.","DOI":"10.3390\/rs13040796"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhang, T., Zhang, X., Shi, J., and Wei, S. (2019). Depthwise separable convolution neural network for high-speed SAR ship detection. Remote Sens., 11.","DOI":"10.3390\/rs11212483"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Xiao, X., Wang, B., Miao, L., Li, L., Zhou, Z., Ma, J., and Dong, D. (2021). Infrared and visible image object detection via focused feature enhancement and cascaded semantic extension. Remote Sens., 13.","DOI":"10.3390\/rs13132538"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Tong, X., Sun, B., Wei, J., Zuo, Z., and Su, S. (2021). EAAU-Net: Enhanced asymmetric attention U-Net for infrared small target detection. Remote Sens., 13.","DOI":"10.3390\/rs13163200"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1745","DOI":"10.1109\/LGRS.2018.2856921","article-title":"Toward arbitrary-oriented ship detection with rotated region proposal and discrimination networks","volume":"15","author":"Zhang","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2486","DOI":"10.1109\/TGRS.2016.2645610","article-title":"Accurate object localization in remote sensing images based on convolutional neural networks","volume":"55","author":"Long","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1938","DOI":"10.1109\/JSTARS.2021.3049851","article-title":"A novel CNN-based detector for ship detection based on rotatable bounding box in SAR images","volume":"14","author":"Yang","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tian, L., Cao, Y., He, B., Zhang, Y., He, C., and Li, D. (2021). Image enhancement driven by object characteristics and dense feature reuse network for ship target detection in remote sensing imagery. Remote Sens., 13.","DOI":"10.3390\/rs13071327"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Dong, Y., Chen, F., Han, S., and Liu, H. (2021). Ship object detection of remote sensing image based on visual attention. Remote Sens., 13.","DOI":"10.3390\/rs13163192"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Yang, X., and Yan, J. (2020, January 23\u201328). Arbitrary-oriented object detection with circular smooth label. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58598-3_40"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., and Wei, Y. (2017, January 22\u201329). Deformable convolutional networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.89"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhu, X., Hu, H., Lin, S., and Dai, J. (2019, January 16\u201320). Deformable convnets v2: More deformable, better results. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00953"},{"key":"ref_22","first-page":"91","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_24","unstructured":"Dai, J., Li, Y., He, K., and Sun, J. (2016, January 5\u201310). R-fcn: Object detection via region-based fully convolutional networks. Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 8\u201316). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_30","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zhang, S., Wen, L., Bian, X., Lei, Z., and Li, S.Z. (2018, January 18\u201322). Single-Shot refinement neural network for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00442"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, S., and Huang, D. (2018, January 8\u201314). Receptive field block net for accurate and fast object detection. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01252-6_24"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Zhou, X., Zhuo, J., and Krahenbuhl, P. (2019, January 16\u201320). Bottom-up object detection by grouping extreme and center points. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00094"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Zhang, S., Chi, C., Yao, Y., Lei, Z., and Li, S.Z. (2020, January 14\u201319). Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00978"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Law, H., and Deng, J. (2018, January 8\u201314). Cornernet: Detecting objects as paired keypoints. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_45"},{"key":"ref_37","unstructured":"Zhou, X., Wang, D., and Kr\u00e4henb\u00fchl, P. (2019). Objects as points. arXiv."},{"key":"ref_38","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). Fcos: Fully convolutional one-stage object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"3111","DOI":"10.1109\/TMM.2018.2818020","article-title":"Arbitrary-oriented scene text detection via rotation proposals","volume":"20","author":"Ma","year":"2018","journal-title":"IEEE Trans. Multimed."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"3676","DOI":"10.1109\/TIP.2018.2825107","article-title":"Textboxes++: A single-shot oriented scene text detector","volume":"27","author":"Liao","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Azimi, S.M., Vig, E., Bahmanyar, R., K\u00f6rner, M., and Reinartz, P. (2018, January 2\u20136). Towards multi-class object detection in unconstrained remote sensing imagery. Proceedings of the Asian Conference on Computer Vision, Perth, Australia.","DOI":"10.1007\/978-3-030-20893-6_10"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Ding, J., Xue, N., Long, Y., Xia, G.S., and Lu, Q. (2019, January 16\u201320). Learning roi transformer for oriented object detection in aerial images. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00296"},{"key":"ref_43","unstructured":"Yang, X., Liu, Q., Yan, J., Li, A., Zhang, Z., and Yu, G. (2019). R3det: Refined single-stage detector with feature refinement for rotating object. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Li, Y., Mao, H., Liu, R., Pei, X., Jiao, L., and Shang, R. (2021). A lightweight keypoint-based oriented object detection of remote sensing images. Remote Sens., 13.","DOI":"10.3390\/rs13132459"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Ming, Q., Miao, L., Zhou, Z., Song, J., and Yang, X. (2021). Sparse label assignment for oriented object detection in aerial images. Remote Sens., 13.","DOI":"10.3390\/rs13142664"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Qing, Y., Liu, W., Feng, L., and Gao, W. (2021). Improved YOLO network for free-angle remote sensing target detection. Remote Sens., 13.","DOI":"10.3390\/rs13112171"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Yang, X., Sun, H., Fu, K., Yang, J., Sun, X., Yan, M., and Guo, Z. (2018). Automatic Ship Detection in Remote Sensing Images from Google Earth of Complex Scenes Based on Multiscale Rotation Dense Feature Pyramid Networks. Remote Sens., 10.","DOI":"10.3390\/rs10010132"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_50","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"10015","DOI":"10.1109\/TGRS.2019.2930982","article-title":"CAD-Net: A context-aware detection network for objects in remote sensing imagery","volume":"57","author":"Zhang","year":"2019","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Li, C., Xu, C., Cui, Z., Wang, D., Zhang, T., and Yang, J. (2019, January 22\u201325). Feature-attentioned object detection in remote sensing imagery. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803521"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Yang, F., Li, W., Hu, H., Li, W., and Wang, P. (2020). Multi-scale feature integrated attention-based rotation network for object detection in VHR aerial images. Sensors, 20.","DOI":"10.3390\/s20061686"},{"key":"ref_54","unstructured":"Qian, W., Yang, X., Peng, S., Guo, Y., and Yan, J. (2019). Learning modulated loss for rotated object detection. arXiv."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Jiang, Y., Zhu, X., Wang, X., Yang, S., Li, W., Wang, H., Fu, P., and Luo, Z. (2017). R2cnn: Rotational region cnn for orientation robust scene text detection. arXiv.","DOI":"10.1109\/ICPR.2018.8545598"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Yi, J., Wu, P., Liu, B., Huang, Q., Qu, H., and Metaxas, D. (2021, January 5\u20139). Oriented object detection in aerial images with box boundary-aware vectors. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV48630.2021.00220"},{"key":"ref_57","unstructured":"Yang, X., Yang, J., Yan, J., Zhang, Y., Zhang, T., Guo, Z., Sun, X., and Fu, K. (November, January 27). Scrdet: Towards more robust detection for small, cluttered and rotated objects. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"173855","DOI":"10.1109\/ACCESS.2019.2956569","article-title":"SARD: Towards scale-aware rotated object detection in aerial imagery","volume":"7","author":"Wang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Li, C., Luo, B., Hong, H., Su, X., Wang, Y., Liu, J., Wang, C., Zhang, J., and Wei, L. (2020). Object Detection Based on Global-Local Saliency Constraint in Aerial Images. Remote Sens., 12.","DOI":"10.3390\/rs12091435"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Yang, X., Hou, L., Zhou, Y., Wang, W., and Yan, J. (2021, January 19\u201325). Dense label encoding for boundary discontinuity free rotation detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01556"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/18\/3731\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:01:24Z","timestamp":1760166084000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/18\/3731"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,17]]},"references-count":60,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2021,9]]}},"alternative-id":["rs13183731"],"URL":"https:\/\/doi.org\/10.3390\/rs13183731","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2021,9,17]]}}}