{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T12:55:46Z","timestamp":1772715346537,"version":"3.50.1"},"reference-count":59,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2018,1,18]],"date-time":"2018-01-18T00:00:00Z","timestamp":1516233600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"The National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No.61331016"],"award-info":[{"award-number":["No.61331016"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"The CETC key laboratory of aerospace information applications","award":["KX162600018"],"award-info":[{"award-number":["KX162600018"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Daily acquisition of large amounts of aerial and satellite images has facilitated subsequent automatic interpretations of these images. One such interpretation is object detection. Despite the great progress made in this domain, the detection of multi-scale objects, especially small objects in high resolution satellite (HRS) images, has not been adequately explored. As a result, the detection performance turns out to be poor. To address this problem, we first propose a unified multi-scale convolutional neural network (CNN) for geospatial object detection in HRS images. It consists of a multi-scale object proposal network and a multi-scale object detection network, both of which share a multi-scale base network. The base network can produce feature maps with different receptive fields to be responsible for objects with different scales. Then, we use the multi-scale object proposal network to generate high quality object proposals from the feature maps. Finally, we use these object proposals with the multi-scale object detection network to train a good object detector. Comprehensive evaluations on a publicly available remote sensing object detection dataset and comparisons with several state-of-the-art approaches demonstrate the effectiveness of the presented method. The proposed method achieves the best mean average precision (mAP) value of 89.6%, runs at 10 frames per second (FPS) on a GTX 1080Ti GPU.<\/jats:p>","DOI":"10.3390\/rs10010131","type":"journal-article","created":{"date-parts":[[2018,1,18]],"date-time":"2018-01-18T12:19:48Z","timestamp":1516277988000},"page":"131","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":159,"title":["Geospatial Object Detection in High Resolution Satellite Images Based on Multi-Scale Convolutional Neural Network"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8616-0221","authenticated-orcid":false,"given":"Wei","family":"Guo","sequence":"first","affiliation":[{"name":"School of Electronic Information, Wuhan University, Wuhan 430072, China"},{"name":"The CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang 050081, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3263-8768","authenticated-orcid":false,"given":"Wen","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Electronic Information, Wuhan University, Wuhan 430072, China"},{"name":"The CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang 050081, China"}]},{"given":"Haijian","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Electronic Information, Wuhan University, Wuhan 430072, China"}]},{"given":"Guang","family":"Hua","sequence":"additional","affiliation":[{"name":"School of Electronic Information, Wuhan University, Wuhan 430072, China"}]}],"member":"1968","published-online":{"date-parts":[[2018,1,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Yu, H., Yang, W., Xia, G., and Liu, G. (2016). A Color-Texture-Structure Descriptor for High-Resolution Satellite Image Classification. Remote Sens., 8.","DOI":"10.3390\/rs8030259"},{"key":"ref_2","unstructured":"Cheng, G., Han, J., Zhou, P., and Guo, L. (2014, January 13\u201318). Scalable multi-class geospatial object detection in high spatial resolution remote sensing images. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.isprsjprs.2016.03.014","article-title":"A Survey on Object Detection in Optical Remote Sensing Images","volume":"117","author":"Cheng","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"4069","DOI":"10.1109\/JSTARS.2014.2308301","article-title":"Detection of buildings in multispectral very high spatial resolution images using the percentage occupancy hit-or-miss transform","volume":"7","author":"Stankov","year":"2014","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1109\/TGRS.2010.2053713","article-title":"A probabilistic framework to detect buildings in aerial and satellite images","volume":"49","author":"Sirmacek","year":"2011","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"4895","DOI":"10.1109\/JSTARS.2015.2467377","article-title":"A Hierarchical Oil Tank Detector with Deep Surrounding Features for High-Resolution Optical Satellite Imagery","volume":"8","author":"Zhang","year":"2015","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1347","DOI":"10.1109\/LGRS.2015.2401600","article-title":"Circular oil tank detection from panchromatic satellite images: A new automated approach","volume":"12","author":"Ok","year":"2015","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1109\/TCSVT.2014.2358031","article-title":"Efficient feature selection and classification for vehicle detection","volume":"25","author":"Wen","year":"2015","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2485","DOI":"10.1016\/j.ijleo.2015.06.024","article-title":"Vehicle detection in remote sensing imagery based on salient information and local shape feature","volume":"126","author":"Yu","year":"2015","journal-title":"Opt. Int. J. Light Electron Opt."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Cai, H., and Su, Y. (2005, January 19\u201320). Airplane detection in remote sensing image with a circle-frequency filter. Proceedings of the International Conference on Space Information Technology, Beijing, China.","DOI":"10.1117\/12.657743"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Bo, S., and Jing, Y. (2010, January 16\u201318). Region-based airplane detection in remotely sensed imagery. Proceedings of the 2010 3rd International Congress on Image and Signal Processing (CISP), Yantai, China.","DOI":"10.1109\/CISP.2010.5647478"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2768","DOI":"10.1016\/j.ijleo.2013.12.003","article-title":"An automated airplane detection system for large panchromatic image with high spatial resolution","volume":"125","author":"An","year":"2014","journal-title":"Opt. Int. J. Light Electron Opt."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2486","DOI":"10.1109\/TGRS.2016.2645610","article-title":"Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks","volume":"55","author":"Long","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"14680","DOI":"10.3390\/rs71114680","article-title":"Transferring Deep Convolutional Neural Networks for the Scene Classification of High-Resolution Remote Sensing Imagery","volume":"7","author":"Hu","year":"2015","journal-title":"Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1109\/LGRS.2010.2055033","article-title":"Satellite Image Classification via Two-layer Sparse Coding with Biased Image Representation","volume":"8","author":"Dai","year":"2011","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 24\u201327). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 13\u201316). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards real time object detection with region proposal networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., and Reed, S. (2016, January 27\u201330). SSD: Single Shot MultiBox Detector. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Gao, Y., Guo, S., Huang, K., Chen, J., Gong, Q., Zou, Y., Bai, T., and Overett, G. (2017, January 11\u201314). Scale Optimization for Full-Image-CNN Vehicle Detection. Proceedings of the IEEE Intelligent Vehicles Symposium (IV), Redondo Beach, CA, USA.","DOI":"10.1109\/IVS.2017.7995812"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Han, X., Zhong, Y., and Zhang, L. (2017). An Efficient and Robust Integrated Geospatial Object Detection Framework for High Spatial Resolution Remote Sensing Imagery. Remote Sens., 7.","DOI":"10.3390\/rs9070666"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Lin, H., Shi, Z., and Zou, Z. (2017). Maritime Semantic Labeling of Optical Remote Sensing Images with Multi-Scale Fully Convolutional Network. Remote Sens., 9.","DOI":"10.3390\/rs9050480"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/S0031-3203(96)00068-4","article-title":"Object detection using Gabor filters","volume":"30","author":"Jain","year":"1997","journal-title":"Pattern Recognit."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1016\/j.isprsjprs.2015.01.013","article-title":"Water flow based geometric active deformable model for road network","volume":"102","author":"Leninisha","year":"2015","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.isprsjprs.2013.09.004","article-title":"Automated detection of buildings from single VHR multispectral images using shadow information and graph cuts","volume":"86","author":"Ok","year":"2013","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1701","DOI":"10.1109\/TGRS.2012.2207123","article-title":"Automated detection of arbitrarily shaped buildings in complex environments from monocular VHR optical satellite imagery","volume":"51","author":"Ok","year":"2013","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1016\/j.isprsjprs.2013.09.014","article-title":"Geographic object-based image analysis-towards a new paradigm","volume":"87","author":"Blaschke","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3542","DOI":"10.1016\/j.patcog.2015.04.018","article-title":"Feature representation for statistical-learning-based object detection: A review","volume":"48","author":"Li","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"9705","DOI":"10.3390\/rs70809705","article-title":"Identification of Forested Landslides Using LiDAR Data, Object-based Image Analysis, and Machine Learning Algorithms","volume":"7","author":"Li","year":"2015","journal-title":"Remote Sens."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_32","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201326). Histograms of Oriented Gradients for Human Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_33","unstructured":"Liao, S., Zhu, X., Lei, Z., Zhang, L., and Li, S. (2007, January 27\u201329). Learning multi-scale block local binary patterns for face recognition. Proceedings of the International Conference on Biometrics (ICB), Seoul, Korea."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2037","DOI":"10.1109\/TPAMI.2006.244","article-title":"Face description with local binary patterns: Application to face recognition","volume":"28","author":"Ahonen","year":"2006","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1080\/01431161.2012.705443","article-title":"Automatic landslide detection from remote-sensing imagery using a scene classification method based on BoVW and pLSA","volume":"34","author":"Cheng","year":"2013","journal-title":"Int. J. Remote Sens."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1109\/LGRS.2011.2161569","article-title":"Automatic Target Detection in High-Resolution Remote Sensing Images Using Spatial Sparse Coding Bag-of-Words Model","volume":"9","author":"Sun","year":"2011","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1109\/TGRS.2014.2321557","article-title":"Hyperspectral image de-noising via sparse representation and low-rank constraint","volume":"53","author":"Zhao","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"4472","DOI":"10.1109\/TGRS.2015.2400449","article-title":"Learning High-level Features for satellite Image Classification with Limited Labeled Samples","volume":"53","author":"Yang","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"6844","DOI":"10.1109\/TGRS.2014.2303895","article-title":"A discriminative metric learning based anomaly detection method","volume":"52","author":"Du","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Ren, X., and Ramanan, D. (2013, January 25\u201327). Histograms of Sparse Codes for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.417"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-Vector Networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/72.80344","article-title":"Adaptive nearest neighbor pattern classification","volume":"2","author":"Geva","year":"2002","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_43","unstructured":"Tim, K. (2013, January 25\u201328). Random decision forests. Proceedings of the International Conference on Document Analysis and Recognition, Washington, DC, USA."},{"key":"ref_44","unstructured":"Kirzhevsky, A., Sutskever, I., and Hinton, G. (2012, January 3\u20138). ImageNet classification with deep convolutional neural networks. Proceedings of the International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"1074","DOI":"10.1109\/LGRS.2016.2565705","article-title":"Ship Rotated Bounding Box Space for Ship Extraction from High-Resolution Optical Satellite Images with Complex Backgrounds","volume":"13","author":"Liu","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_46","unstructured":"Zhang, S., Wen, L., Bian, X., Lei, Z., and Li, S. (arXiv, 2014). Single-Shot Refinement Neural Network for Object Detection, arXiv."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Cai, Z., Fan, Q., Feris, R., and Vasconcelos, N. (2016, January 8\u201316). A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection. Proceedings of the IEEE European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46493-0_22"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Lin, T., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_49","unstructured":"Fu, C., Liu, W., Ranga, A., Tyagi, A., and Berg, A. (arXiv, 2017). DSSD: Deconvolutional Single Shot Detector, arXiv."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Kong, T., Sun, F., Yao, A., Liu, H., Lu, M., and Chen, Y. (2017, January 21\u201326). Ron: Reverse connection with objectness prior networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.557"},{"key":"ref_51","unstructured":"Shrivastava, A., Sukthankar, R., Malik, J., and Gupta, A. (arXiv, 2016). Beyond Skip Connections: Top-Down Modulation for Object Detection, arXiv."},{"key":"ref_52","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2014). Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Shrivastava, A., Gupta, A., and Girshick, R. (2015, January 7\u201312). Training Region-based Object Detectors with Online Hard Example Mining. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2016.89"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"6026","DOI":"10.3390\/rs5116026","article-title":"Exploring the use of Google Earth imagery and object-based methods in land use\/cover mapping","volume":"5","author":"Hu","year":"2013","journal-title":"Remote Sens."},{"key":"ref_56","unstructured":"(2017, June 26). NWPU VHR-10 Dataset. Available online: http:\/\/www.escience.cn\/people\/gongcheng\/NWPU-VHR-10.html."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1109\/LGRS.2009.2035644","article-title":"Object classification of aerial images with bag-of-visual words","volume":"7","author":"Xu","year":"2010","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.isprsjprs.2014.10.002","article-title":"Multi-class geospatial object detection and geographic image classification based on collection of part detectors","volume":"98","author":"Cheng","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"7405","DOI":"10.1109\/TGRS.2016.2601622","article-title":"Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images","volume":"54","author":"Cheng","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/1\/131\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T14:51:43Z","timestamp":1760194303000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/1\/131"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,1,18]]},"references-count":59,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2018,1]]}},"alternative-id":["rs10010131"],"URL":"https:\/\/doi.org\/10.3390\/rs10010131","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,1,18]]}}}