{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T20:34:06Z","timestamp":1776458046193,"version":"3.51.2"},"reference-count":26,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2019,2,1]],"date-time":"2019-02-01T00:00:00Z","timestamp":1548979200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"publisher","award":["41701508"],"award-info":[{"award-number":["41701508"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"publisher","award":["41801349"],"award-info":[{"award-number":["41801349"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Recently, methods based on Faster region-based convolutional neural network (R-CNN) have been popular in multi-class object detection in remote sensing images due to their outstanding detection performance. The methods generally propose candidate region of interests (ROIs) through a region propose network (RPN), and the regions with high enough intersection-over-union (IoU) values against ground truth are treated as positive samples for training. In this paper, we find that the detection result of such methods is sensitive to the adaption of different IoU thresholds. Specially, detection performance of small objects is poor when choosing a normal higher threshold, while a lower threshold will result in poor location accuracy caused by a large quantity of false positives. To address the above issues, we propose a novel IoU-Adaptive Deformable R-CNN framework for multi-class object detection. Specially, by analyzing the different roles that IoU can play in different parts of the network, we propose an IoU-guided detection framework to reduce the loss of small object information during training. Besides, the IoU-based weighted loss is designed, which can learn the IoU information of positive ROIs to improve the detection accuracy effectively. Finally, the class aspect ratio constrained non-maximum suppression (CARC-NMS) is proposed, which further improves the precision of the results. Extensive experiments validate the effectiveness of our approach and we achieve state-of-the-art detection performance on the DOTA dataset.<\/jats:p>","DOI":"10.3390\/rs11030286","type":"journal-article","created":{"date-parts":[[2019,2,1]],"date-time":"2019-02-01T11:19:58Z","timestamp":1549019998000},"page":"286","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":128,"title":["IoU-Adaptive Deformable R-CNN: Make Full Use of IoU for Multi-Class Object Detection in Remote Sensing Imagery"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1453-3732","authenticated-orcid":false,"given":"Jiangqiao","family":"Yan","sequence":"first","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Hongqi","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Menglong","family":"Yan","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Wenhui","family":"Diao","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Xian","family":"Sun","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Hao","family":"Li","sequence":"additional","affiliation":[{"name":"Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,2,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Yang, X., Sun, H., Fu, K., Yang, J., Sun, X., Yan, M., and Guo, Z. (2018). Automatic Ship Detection in Remote Sensing Images from Google Earth of Complex Scenes Based on Multiscale Rotation Dense Feature Pyramid Networks. Remote Sens., 10.","DOI":"10.3390\/rs10010132"},{"key":"ref_2","unstructured":"Yang, X., Sun, H., Sun, X., Yan, M., Guo, Z., and Fu, K. (2019, January 27). Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multiscale Rotation Region Convolutional Neural Network. Available online: https:\/\/arxiv.org\/ftp\/arxiv\/papers\/1806\/1806.04828.pdf."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Xia, G., Bai, X., Ding, J., Zhu, Z., Belongie, S.J., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18\u201322). DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. Proceedings of the Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"7405","DOI":"10.1109\/TGRS.2016.2601622","article-title":"Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images","volume":"54","author":"Cheng","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Han, X., Zhong, Y., and Zhang, L. (2017). An Efficient and Robust Integrated Geospatial Object Detection Framework for High Spatial Resolution Remote Sensing Imagery. Remote Sens., 9.","DOI":"10.3390\/rs9070666"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Xu, Z., Xu, X., Wang, L., Yang, R., and Pu, F. (2017). Deformable ConvNet with Aspect Ratio Constrained NMS for Object Detection in Remote Sensing Imagery. Remote Sens., 9.","DOI":"10.3390\/rs9121312"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/j.isprsjprs.2018.02.014","article-title":"Multi-class geospatial object detection based on a position-sensitive balancing framework for high spatial resolution Remote Sensing imagery","volume":"138","author":"Zhong","year":"2018","journal-title":"Isprs J. Photogramm. Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Guo, W., Yang, W., Zhang, H., and Hua, G. (2018). Geospatial Object Detection in High Resolution Satellite Images Based on Multi-Scale Convolutional Neural Network. Remote Sens., 10.","DOI":"10.3390\/rs10010131"},{"key":"ref_9","unstructured":"Ren, S., He, K., Girshick, R.B., and Sun, J. (arXiv, 2015). Faster R-CNN: Towards real-time object detection with region proposal networks, arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","unstructured":"Dai, J., Li, Y., He, K., and Sun, J. (2016). R-FCN: Object Detection via Region-based Fully Convolutional Networks. Neural Inf. Process. Syst., 379\u2013387."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ren, Y., Zhu, C., and Xiao, S. (2018). Deformable Faster R-CNN with Aggregating Multi-Layer Features for Partially Occluded Object Detection in Optical Remote Sensing Images. Remote Sens., 10.","DOI":"10.3390\/rs10091470"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zagoruyko, S., Lerer, A., Lin, T., Pinheiro, P.H.O., Gross, S., Chintala, S., and Dollar, P. (2016, January 19\u201322). A MultiPath Network for Object Detection. Proceedings of the British Machine Vision Conference, York, UK.","DOI":"10.5244\/C.30.15"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Xu, Y., Zhu, M., Li, S., Feng, H., Ma, S., and Che, J. (2018). End-to-End Airport Detection in Remote Sensing Images Combining Cascade Region Proposal Networks and Multi-Threshold Detection Networks. Remote Sens., 10.","DOI":"10.3390\/rs10101516"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201322). Cascade R-CNN: Delving Into High Quality Object Detection. Proceedings of the Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Singh, B., and Davis, L.S. (2018, January 18\u201322). An Analysis of Scale Invariance in Object Detection\u2014SNIP. Proceedings of the Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00377"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhu, C., Tao, R., Luu, K., and Savvides, M. (2018, January 18\u201322). Seeing Small Faces From Robust Anchor\u2019s Perspective. Proceedings of the Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00538"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., and Wei, Y. (2017, January 22\u201329). Deformable Convolutional Networks. Proceedings of the International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.89"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhu, H., Chen, X., Dai, W., Fu, K., Ye, Q., and Jiao, J. (2015, January 27\u201330). Orientation robust object detection in aerial images using deep convolutional neural network. Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada.","DOI":"10.1109\/ICIP.2015.7351502"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1938","DOI":"10.1109\/LGRS.2015.2439517","article-title":"Fast Multiclass Vehicle Detection on Aerial Images","volume":"12","author":"Liu","year":"2015","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1074","DOI":"10.1109\/LGRS.2016.2565705","article-title":"Ship Rotated Bounding Box Space for Ship Extraction From High-Resolution Optical Satellite Images with Complex Backgrounds","volume":"13","author":"Liu","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.jvcir.2015.11.002","article-title":"Vehicle detection in aerial imagery: A small target detection benchmark","volume":"34","author":"Razakarivony","year":"2015","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_23","unstructured":"Azimi, S.M., Vig, E., Bahmanyar, R., Korner, M., and Reinartz, P. (arXiv, 2018). Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery, arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L., Li, K., and Feifei, L. (2009, January 25). ImageNet: A large-scale hierarchical image database. Proceedings of the Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (arXiv, 2017). YOLO9000: Better, Faster, Stronger, arXiv.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 8\u201316). SSD: Single Shot MultiBox Detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/3\/286\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:30:10Z","timestamp":1760185810000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/3\/286"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,2,1]]},"references-count":26,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2019,2]]}},"alternative-id":["rs11030286"],"URL":"https:\/\/doi.org\/10.3390\/rs11030286","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,2,1]]}}}