{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T13:58:07Z","timestamp":1760623087459,"version":"build-2065373602"},"reference-count":54,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2018,5,24]],"date-time":"2018-05-24T00:00:00Z","timestamp":1527120000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Geospatial object detection from high spatial resolution (HSR) remote sensing imagery is a heated and challenging problem in the field of automatic image interpretation. Despite convolutional neural networks (CNNs) having facilitated the development in this domain, the computation efficiency under real-time application and the accurate positioning on relatively small objects in HSR images are two noticeable obstacles which have largely restricted the performance of detection methods. To tackle the above issues, we first introduce semantic segmentation-aware CNN features to activate the detection feature maps from the lowest level layer. In conjunction with this segmentation branch, another module which consists of several global activation blocks is proposed to enrich the semantic information of feature maps from higher level layers. Then, these two parts are integrated and deployed into the original single shot detection framework. Finally, we use the modified multi-scale feature maps with enriched semantics and multi-task training strategy to achieve end-to-end detection with high efficiency. Extensive experiments and comprehensive evaluations on a publicly available 10-class object detection dataset have demonstrated the superiority of the presented method.<\/jats:p>","DOI":"10.3390\/rs10060820","type":"journal-article","created":{"date-parts":[[2018,5,28]],"date-time":"2018-05-28T03:54:21Z","timestamp":1527479661000},"page":"820","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":57,"title":["Geospatial Object Detection in Remote Sensing Imagery Based on Multiscale Single-Shot Detector with Activated Semantics"],"prefix":"10.3390","volume":"10","author":[{"given":"Shiqi","family":"Chen","sequence":"first","affiliation":[{"name":"Science and Technology on Automatic Target Recognition Laboratory, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Ronghui","family":"Zhan","sequence":"additional","affiliation":[{"name":"Science and Technology on Automatic Target Recognition Laboratory, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Jun","family":"Zhang","sequence":"additional","affiliation":[{"name":"Science and Technology on Automatic Target Recognition Laboratory, National University of Defense Technology, Changsha 410073, China"}]}],"member":"1968","published-online":{"date-parts":[[2018,5,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3892","DOI":"10.1109\/JSTARS.2014.2319195","article-title":"AIS-Based Evaluation of Target Detectors and SAR Sensors Characteristics for Maritime Surveillance","volume":"8","author":"Pelich","year":"2015","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"3978","DOI":"10.1109\/TGRS.2007.907109","article-title":"A multiple conditional random field\u2019s ensemble framework for urban area detection in remote sensing optical images","volume":"45","author":"Zhong","year":"2007","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"9705","DOI":"10.3390\/rs70809705","article-title":"Identification of Forested Landslides Using LiDAR Data, Object-based Image Analysis, and Machine Learning Algorithms","volume":"7","author":"Li","year":"2015","journal-title":"Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.isprsjprs.2016.03.014","article-title":"A Survey on Object Detection in Optical Remote Sensing Images","volume":"117","author":"Cheng","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"950","DOI":"10.1109\/TGRS.2017.2756911","article-title":"Large-scale remote sensing image retrieval by deep hashing neural networks","volume":"56","author":"Li","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1016\/j.ins.2017.07.010","article-title":"Feature guided Gaussian mixture model with semi-supervised EM and local geometric constraint for retinal image registration","volume":"471","author":"Ma","year":"2017","journal-title":"Inf. Sci."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1109\/LGRS.2015.2503142","article-title":"Unsupervised Multilayer Feature Learning for Satellite Image Scene Classification","volume":"13","author":"Li","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"6469","DOI":"10.1109\/TGRS.2015.2441954","article-title":"Robust Feature Matching for Remote Sensing Image Registration via Locally Linear Transforming","volume":"53","author":"Ma","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"4069","DOI":"10.1109\/JSTARS.2014.2308301","article-title":"Detection of buildings in multispectral very high spatial resolution images using the percentage occupancy hit-or-miss transform","volume":"7","author":"Stankov","year":"2014","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/S0031-3203(96)00068-4","article-title":"Object detection using Gabor filters","volume":"30","author":"Jain","year":"1997","journal-title":"Pattern Recognit."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1016\/j.isprsjprs.2015.01.013","article-title":"Water flow based geometric active deformable model for road network","volume":"102","author":"Leninisha","year":"2015","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.isprsjprs.2013.09.004","article-title":"Automated detection of buildings from single VHR multispectral images using shadow information and graph cuts","volume":"86","author":"Ok","year":"2013","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1701","DOI":"10.1109\/TGRS.2012.2207123","article-title":"Automated detection of arbitrarily shaped buildings in complex environments from monocular VHR optical satellite imagery","volume":"51","author":"Ok","year":"2013","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1016\/j.isprsjprs.2013.09.014","article-title":"Geographic object-based image analysis-towards a new paradigm","volume":"87","author":"Blaschke","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"3542","DOI":"10.1016\/j.patcog.2015.04.018","article-title":"Feature representation for statistical-learning-based object detection: A review","volume":"48","author":"Li","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Siva, P., Russell, C., and Xiang, T. (2012, January 7\u201313). In defense of negative mining for annotating weakly labeled data. Proceedings of the European Conference on Computer Vision, Firenze, Italy.","DOI":"10.1007\/978-3-642-33712-3_43"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.isprsjprs.2013.08.001","article-title":"Object detection in remote sensing imagery using a discriminatively trained mixture model","volume":"85","author":"Cheng","year":"2013","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-Vector Networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1346","DOI":"10.1109\/TGRS.2014.2337883","article-title":"A sparse representation-based binary hypothesis model for target detection in hyperspectral images","volume":"53","author":"Zhang","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/72.80344","article-title":"Adaptive nearest neighbor pattern classification","volume":"2","author":"Geva","year":"2002","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_21","unstructured":"Tim, K. (2013, January 25\u201328). Random decision forests. Proceedings of the International Conference on Document Analysis and Recognition, Washington, DC, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.neucom.2015.09.116","article-title":"Deep learning for visual understanding: A review","volume":"187","author":"Guo","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1134\/S1054661816010065","article-title":"A survey of deep learning methods and software tools for image classification and object detection","volume":"26","author":"Druzhkov","year":"2016","journal-title":"Pattern Recognit. Image Anal."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 24\u201327). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 13\u201316). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards real time object detection with region proposal networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., and Reed, S. (2016, January 27\u201330). SSD: Single Shot MultiBox Detector. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Cai, Z., Fan, Q., Feris, R.S., and Vasconcelos, N. (2016). A unified multi-scale deep convolutional neural network for fast object detection. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46493-0_22"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","article-title":"Fully convolutional networks for semantic segmentation","volume":"39","author":"Shelhamer","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Han, X., Zhong, Y., and Zhang, L. (2017). An Efficient and Robust Integrated Geospatial Object Detection Framework for High Spatial Resolution Remote Sensing Imagery. Remote Sens., 9.","DOI":"10.3390\/rs9070666"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"7405","DOI":"10.1109\/TGRS.2016.2601622","article-title":"Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images","volume":"54","author":"Cheng","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_35","unstructured":"Dai, J., Li, Y., He, K., and Sun, J. (arXiv, 2016). R-FCN: Object Detection via Region-based Fully Convolutional Networks, arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Kang, M., Ji, K., Leng, X., and Lin, Z. (2017). Contextual Region-based Convolutional Neural Network with Multilayer Fusion for SAR Ship Detection. Remote Sens., 9.","DOI":"10.3390\/rs9080860"},{"key":"ref_37","unstructured":"Fu, C.Y., Liu, W., Ranga, A., Tyagi, A., and Berg, A.C. (arXiv, 2017). DSSD: Deconvolutional single shot detector, arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Shen, Z., Liu, Z., Li, J., Jiang, Y.G., Chen, Y., and Xue, X. (arXiv, 2017). Dsod: Learning deeply supervised object detectors from scratch, arXiv.","DOI":"10.1109\/ICCV.2017.212"},{"key":"ref_39","unstructured":"Shen, Z., Shi, H., Feris, R., Cao, L., Yan, S., Liu, D., Wang, X., Xue, X., and Huang, T.S. (arXiv, 2017). Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids, arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (arXiv, 2017). Focal loss for dense object detection, arXiv.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Gidaris, S., and Komodakis, N. (2015, January 7\u201313). Object detection via a multi-region and semantic segmentation-aware cnn model. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.135"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Shrivastava, A., and Gupta, A. (2016). Contextual priming and feedback for faster r-cnn. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46448-0_20"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Qian, S., Xie, C., Shen, W., Wang, B., and Yuille, A.L. (arXiv, 2017). Single-Shot Object Detection with Enriched Semantics, arXiv.","DOI":"10.1109\/CVPR.2018.00609"},{"key":"ref_44","unstructured":"Yu, F., and Koltun, V. (arXiv, 2016). Multi-scale context aggregation by dilated convolutions, arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (arXiv, 2017). Squeeze-and-excitation networks, arXiv.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (arXiv, 2015). Deep residual learning for image recognition, arXiv.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_47","unstructured":"(2018, April 20). NWPU VHR-10 Dataset. Available online: http:\/\/www.escience.cn\/people\/gongcheng\/NWPU-VHR-10.html."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1109\/LGRS.2009.2035644","article-title":"Object classification of aerial images with bag-of-visual words","volume":"7","author":"Xu","year":"2010","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1109\/LGRS.2011.2161569","article-title":"Automatic target detection in high-resolution remote sensing images using spatial sparse coding bag-of-words model","volume":"9","author":"Sun","year":"2012","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.isprsjprs.2013.12.011","article-title":"Efficient, simultaneous detection of multi-class geospatial targets based on visual saliency modeling and discriminative learning of sparse coding","volume":"89","author":"Han","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.isprsjprs.2014.10.002","article-title":"Multi-class geospatial object detection and geographic image classification based on collection of part detectors","volume":"98","author":"Cheng","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Guo, W., Yang, W., Zhang, H., and Hua, G. (2018). Geospatial Object Detection in High Resolution Satellite Images Based on Multi-Scale Convolutional Neural Network. Remote Sens., 10.","DOI":"10.3390\/rs10010131"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (arXiv, 2014). Caffe: Convolutional Architecture for Fast Feature Embedding, arXiv.","DOI":"10.1145\/2647868.2654889"},{"key":"ref_54","first-page":"249","article-title":"Understanding the difficulty of training deep feedforward neural networks","volume":"9","author":"Glorot","year":"2010","journal-title":"J. Mach. Learn. Res."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/6\/820\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:05:51Z","timestamp":1760195151000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/6\/820"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,5,24]]},"references-count":54,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2018,6]]}},"alternative-id":["rs10060820"],"URL":"https:\/\/doi.org\/10.3390\/rs10060820","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2018,5,24]]}}}