{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T20:59:49Z","timestamp":1762376389533,"version":"build-2065373602"},"reference-count":52,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2018,5,9]],"date-time":"2018-05-09T00:00:00Z","timestamp":1525824000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61572307"],"award-info":[{"award-number":["61572307"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Recently, many researchers have been dedicated to using convolutional neural networks (CNNs) to extract global-context features (GCFs) for remote-sensing scene classification. Commonly, accurate classification of scenes requires knowledge about both the global context and local objects. However, unlike the natural images in which the objects cover most of the image, objects in remote-sensing images are generally small and decentralized. Thus, it is hard for vanilla CNNs to focus on both global context and small local objects. To address this issue, this paper proposes a novel end-to-end CNN by integrating the GCFs and local-object-level features (LOFs). The proposed network includes two branches, the local object branch (LOB) and global semantic branch (GSB), which are used to generate the LOFs and GCFs, respectively. Then, the concatenation of features extracted from the two branches allows our method to be more discriminative in scene classification. Three challenging benchmark remote-sensing datasets were extensively experimented on; the proposed approach outperformed the existing scene classification methods and achieved state-of-the-art results for all three datasets.<\/jats:p>","DOI":"10.3390\/rs10050734","type":"journal-article","created":{"date-parts":[[2018,5,10]],"date-time":"2018-05-10T03:48:27Z","timestamp":1525924107000},"page":"734","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":81,"title":["Improving Remote Sensing Scene Classification by Integrating Global-Context and Local-Object Features"],"prefix":"10.3390","volume":"10","author":[{"given":"Dan","family":"Zeng","sequence":"first","affiliation":[{"name":"Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute of Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4213-7641","authenticated-orcid":false,"given":"Shuaijun","family":"Chen","sequence":"additional","affiliation":[{"name":"Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute of Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China"}]},{"given":"Boyang","family":"Chen","sequence":"additional","affiliation":[{"name":"National Satellite Meteorological Center, No. 46, Zhongguancun South Street, Haidian District, Beijing 100081, China"}]},{"given":"Shuying","family":"Li","sequence":"additional","affiliation":[{"name":"The 16th Institute, China Aerospace Science and Technology Corporation, 108 West Hangtian Road, Xi\u2019an 710100, China"}]}],"member":"1968","published-online":{"date-parts":[[2018,5,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"6026","DOI":"10.3390\/rs5116026","article-title":"Exploring the use of Google Earth imagery and object-based methods in land use\/cover mapping","volume":"5","author":"Hu","year":"2013","journal-title":"Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1109\/MGRS.2016.2540798","article-title":"Deep learning for remote sensing data: A technical tutorial on the state of the art","volume":"4","author":"Zhang","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"4472","DOI":"10.1109\/TGRS.2015.2400449","article-title":"Learning high-level features for satellite image classification with limited labeled samples","volume":"53","author":"Yang","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"8588","DOI":"10.1080\/01431161.2013.845925","article-title":"Extreme value theory-based calibration for the fusion of multiple features in high-resolution satellite scene classification","volume":"34","author":"Shao","year":"2013","journal-title":"Int. J. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A benchmark data set for performance evaluation of aerial scene classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Shao, W., Yang, W., Xia, G.S., and Liu, G. (2013, January 16\u201318). A Hierarchical Scheme of Multiple Feature Fusion for High-Resolution Satellite Scene Categorization. Proceedings of the International Conference on Computer Vision Systems, St. Petersburg, Russia.","DOI":"10.1007\/978-3-642-39402-7_33"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2008, January 12\u201315). Comparing SIFT descriptors and Gabor texture features for classification of remote sensed imagery. Proceedings of the 15th IEEE International Conference on Image Processing (ICIP), San Diego, CA, USA.","DOI":"10.1109\/ICIP.2008.4712139"},{"key":"ref_9","unstructured":"Dos Santos, J.A., Penatti, O.A.B., and da Silva Torres, R. (2010, January 17\u201321). Evaluating the Potential of Texture and Color Descriptors for Remote Sensing Image Retrieval and Classification. Proceedings of the VISAPP, Angers, France."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1007\/BF00130487","article-title":"Color indexing","volume":"7","author":"Swain","year":"1991","journal-title":"Int. J. Comput. Vis."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"971","DOI":"10.1109\/TPAMI.2002.1017623","article-title":"Multiresolution gray-scale and rotation invariant texture classification with local binary patterns","volume":"24","author":"Ojala","year":"2002","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1007\/s11760-015-0804-2","article-title":"Land-use scene classification using multi-scale completed local binary patterns","volume":"10","author":"Chen","year":"2016","journal-title":"Signal Image Video Process."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3180","DOI":"10.1016\/j.patcog.2015.02.001","article-title":"Learning LBP structure by maximizing the conditional mutual information","volume":"48","author":"Ren","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1899","DOI":"10.1109\/JSTARS.2012.2228254","article-title":"Indexing of remote sensing images with different resolutions by multiple features","volume":"6","author":"Luo","year":"2013","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 3\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1109\/TGRS.2014.2351395","article-title":"Pyramid of spatial relatons for scene-level land use classification","volume":"53","author":"Chen","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"424","DOI":"10.1016\/j.patcog.2012.07.017","article-title":"Scene classification using a multi-resolution bag-of-features model","volume":"46","author":"Zhou","year":"2013","journal-title":"Pattern Recognit."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhao, B., Zhong, Y., Zhang, L., and Huang, B. (2016). The fisher kernel coding framework for high spatial resolution scene classification. Remote Sens., 8.","DOI":"10.3390\/rs8020157"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2108","DOI":"10.1109\/TGRS.2015.2496185","article-title":"Dirichlet-derived multiple topic scene classification model for high spatial resolution remote sensing imagery","volume":"54","author":"Zhao","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Wu, H., Liu, B., Su, W., Zhang, W., and Sun, J. (2016). Hierarchical coding vectors for scene level land-use classification. Remote Sens., 8.","DOI":"10.3390\/rs8050436"},{"key":"ref_21","unstructured":"Yang, Y., and Newsam, S. (2011, January 6\u201313). Spatial pyramid co-occurrence for image classification. Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2296","DOI":"10.1080\/01431161.2014.890762","article-title":"A 2-D wavelet decomposition-based bag-of-visual-words model for land-use scene classification","volume":"35","author":"Zhao","year":"2014","journal-title":"Int. J. Remote Sens."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"4620","DOI":"10.1109\/JSTARS.2014.2339842","article-title":"Land-use scene classification using a concentric circle-structured multiscale bag-of-visual-words model","volume":"7","author":"Zhao","year":"2014","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_24","unstructured":"Lazebnik, S., Schmid, C., and Ponce, J. (2006, January 17\u201322). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New York, NY, USA."},{"key":"ref_25","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2077","DOI":"10.1109\/LGRS.2017.2751559","article-title":"Locality adaptive discriminant analysis for spectral-spatial classification of hyperspectral images","volume":"14","author":"Wang","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Bosch, A., Zisserman, A., and Munoz, X. (2006, January 7\u201313). Scene Classification via pLSA. Proceedings of the European Conference on Computer Vision (ECCV), Graz, Austria.","DOI":"10.1007\/11744085_40"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wan, J., and Yuan, Y. (2017). Deep metric learning for crowdedness regression. IEEE Trans. Circ. Syst. Video Technol.","DOI":"10.1109\/TCSVT.2017.2703920"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2609","DOI":"10.1109\/TIP.2018.2806279","article-title":"YoTube: Searching Action Proposal via Recurrent and Static Regression Networks","volume":"27","author":"Zhu","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20138). ImageNet classification with deep convolutional neural networks. Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NY, USA."},{"key":"ref_31","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very deep convolutional networks for large-scale image recognition. Proceedings of the International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., and Anguelov, D. (2015, January 11\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Girshick, R. (arXiv, 2015). Fast r-cnn, arXiv.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2014, January 6\u201312). Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10578-9_23"},{"key":"ref_35","unstructured":"Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., and Oliva, A. (2014, January 8\u201313). Learning deep features for scene recognition using places database. In Proceedings of Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_36","unstructured":"Herranz, L., Jiang, S., and Li, X. (July, January 26). Scene recognition with CNNs: Objects, scales and dataset bias. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2889","DOI":"10.1109\/JSTARS.2017.2683799","article-title":"Fusing local and global features for high-resolution scene classification","volume":"10","author":"Bian","year":"2017","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1109\/TGRS.2013.2241444","article-title":"Unsupervised feature learning for aerial scene classification","volume":"52","author":"Cheriyadat","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1793","DOI":"10.1109\/TGRS.2015.2488681","article-title":"Scene classification via a gradient boosting random convolutional network framework","volume":"54","author":"Zhang","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Liu, Y., Zhong, Y., Fei, F., Zhu, Q., and Qin, Q. (2018). Scene Classification Based on a Deep Random-Scale Stretched Convolutional Neural Network. Remote Sens., 10.","DOI":"10.3390\/rs10030444"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1895","DOI":"10.1109\/LGRS.2016.2616440","article-title":"Deep filter banks for land-use scene classification","volume":"13","author":"Wu","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_42","unstructured":"Castelluccio, M., Poggi, G., Sansone, C., and Verdoliva, L. (arXiv, 2015). Land use classification in remote sensing images by convolutional neural networks, arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1016\/j.patcog.2016.07.001","article-title":"Towards better exploiting convolutional neural networks for remote sensing scene classification","volume":"61","author":"Nogueira","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"14680","DOI":"10.3390\/rs71114680","article-title":"Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery","volume":"7","author":"Hu","year":"2015","journal-title":"Remote Sens."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"4775","DOI":"10.1109\/TGRS.2017.2700322","article-title":"Deep feature fusion for VHR remote sensing scene classification","volume":"55","author":"Chaib","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"2321","DOI":"10.1109\/LGRS.2015.2475299","article-title":"Deep learning based feature selection for remote sensing scene classification","volume":"12","author":"Zou","year":"2015","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Zitnick, C.L., and Dollar, P. (2014, January 6\u201312). Edge boxes: Locating object proposals from edges. Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_26"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (2014, January 3\u20137). Caffe: Convolutional architecture for fast feature embedding. Proceedings of the ACM International Conference on Multimedia, Orlando, FL, USA.","DOI":"10.1145\/2647868.2654889"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Anwer, R.M., Khan, F.S., van de Weijer, J., Monlinier, M., and Laaksonen, J. (arXiv, 2017). Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification, arXiv.","DOI":"10.1016\/j.isprsjprs.2018.01.023"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"8639367","DOI":"10.1155\/2018\/8639367","article-title":"A Two-Stream Deep Fusion Framework for High-Resolution Aerial Scene Classification","volume":"2018","author":"Yu","year":"2018","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1016\/j.ins.2016.02.021","article-title":"Scene classification using local and global features with collaborative representation fusion","volume":"348","author":"Zou","year":"2016","journal-title":"Inf. Sci."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/5\/734\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:04:00Z","timestamp":1760195040000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/5\/734"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,5,9]]},"references-count":52,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2018,5]]}},"alternative-id":["rs10050734"],"URL":"https:\/\/doi.org\/10.3390\/rs10050734","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2018,5,9]]}}}