{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T21:42:10Z","timestamp":1774993330978,"version":"3.50.1"},"reference-count":59,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2020,12,31]],"date-time":"2020-12-31T00:00:00Z","timestamp":1609372800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"the National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["41971284"],"award-info":[{"award-number":["41971284"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"the China Postdoctoral Science Foundation","award":["2016M590716 and 2017T100581"],"award-info":[{"award-number":["2016M590716 and 2017T100581"]}]},{"name":"the Fundamental Research Funds for the Central Universities","award":["2042020kf0218"],"award-info":[{"award-number":["2042020kf0218"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Although the deep semantic segmentation network (DSSN) has been widely used in remote sensing (RS) image semantic segmentation, it still does not fully mind the spatial relationship cues between objects when extracting deep visual features through convolutional filters and pooling layers. In fact, the spatial distribution between objects from different classes has a strong correlation characteristic. For example, buildings tend to be close to roads. In view of the strong appearance extraction ability of DSSN and the powerful topological relationship modeling capability of the graph convolutional neural network (GCN), a DSSN-GCN framework, which combines the advantages of DSSN and GCN, is proposed in this paper for RS image semantic segmentation. To lift the appearance extraction ability, this paper proposes a new DSSN called the attention residual U-shaped network (AttResUNet), which leverages residual blocks to encode feature maps and the attention module to refine the features. As far as GCN, the graph is built, where graph nodes are denoted by the superpixels and the graph weight is calculated by considering the spectral information and spatial information of the nodes. The AttResUNet is trained to extract the high-level features to initialize the graph nodes. Then the GCN combines features and spatial relationships between nodes to conduct classification. It is worth noting that the usage of spatial relationship knowledge boosts the performance and robustness of the classification module. In addition, benefiting from modeling GCN on the superpixel level, the boundaries of objects are restored to a certain extent and there are less pixel-level noises in the final classification result. Extensive experiments on two publicly open datasets show that DSSN-GCN model outperforms the competitive baseline (i.e., the DSSN model) and the DSSN-GCN when adopting AttResUNet achieves the best performance, which demonstrates the advance of our method.<\/jats:p>","DOI":"10.3390\/rs13010119","type":"journal-article","created":{"date-parts":[[2020,12,31]],"date-time":"2020-12-31T14:31:49Z","timestamp":1609425109000},"page":"119","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":71,"title":["Combining Deep Semantic Segmentation Network and Graph Convolutional Neural Network for Semantic Segmentation of Remote Sensing Imagery"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2234-1792","authenticated-orcid":false,"given":"Song","family":"Ouyang","sequence":"first","affiliation":[{"name":"School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8203-1246","authenticated-orcid":false,"given":"Yansheng","family":"Li","sequence":"additional","affiliation":[{"name":"School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China"}]}],"member":"1968","published-online":{"date-parts":[[2020,12,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"042609","DOI":"10.1117\/1.JRS.11.042609","article-title":"A Comprehensive survey of deep learning in remote sensing: Theories, tools, and challenges for the community","volume":"11","author":"Ball","year":"2017","journal-title":"J. Appl. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Liu, B., Du, S., Du, S., and Zhang, X. (2020). Incorporating Deep Features into GEOBIA Paradigm for Remote Sensing Imagery Classification: A Patch-Based Approach. Remote Sens., 12.","DOI":"10.3390\/rs12183007"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.isprsjprs.2018.08.011","article-title":"Deep learning for remotely sensed data","volume":"145","author":"Mountrakis","year":"2018","journal-title":"J. Photogramm. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1016\/j.isprsjprs.2019.04.015","article-title":"Deep learning in remote sensing applications: A meta-analysis and review","volume":"152","author":"Ma","year":"2019","journal-title":"J. Photogramm. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"112045","DOI":"10.1016\/j.rse.2020.112045","article-title":"Accurate cloud detection in high-resolution remote sensing imagery by weakly supervised deep learning","volume":"250","author":"Li","year":"2020","journal-title":"Remote Sens. Environ."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1016\/j.isprsjprs.2018.09.014","article-title":"Deep networks under scene-level supervision for multi-class geospatial object detection from remote sensing images","volume":"146","author":"Li","year":"2018","journal-title":"J. Photogramm. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/MGRS.2017.2762307","article-title":"Deep Learning in Remote Sensing","volume":"5","author":"Zhu","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1109\/LGRS.2015.2503142","article-title":"Unsupervised multilayer feature learning for satellite image scene classification","volume":"13","author":"Li","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.inffus.2020.10.008","article-title":"Image retrieval from remote sensing big data: A survey","volume":"67","author":"Li","year":"2021","journal-title":"Inf. Fusion."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, Y., and Zhu, Z. (2020). Error-tolerant deep learning for remote sensing image scene classification. IEEE Trans. Cybern., in press.","DOI":"10.1109\/TCYB.2020.2989241"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"6521","DOI":"10.1109\/TGRS.2018.2839705","article-title":"Learning source-invariant deep hashing convolutional neural networks for cross-source remote sensing image retrieval","volume":"56","author":"Li","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"950","DOI":"10.1109\/TGRS.2017.2756911","article-title":"Large-scale remote sensing image retrieval by deep hashing neural networks","volume":"56","author":"Li","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/j.knosys.2016.01.028","article-title":"Supervised remote sensing image segmentation using boosted convolutional neural networks","volume":"99","author":"Basaeed","year":"2016","journal-title":"Knowl. Based Syst."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1109\/MSP.2013.2279179","article-title":"Advances in hyperspectral image classification","volume":"31","author":"Tuia","year":"2014","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Noh, H., Hong, S., and Han, B. (2015, January 11\u201318). Learning deconvolutional network for semantic segmentation. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.178"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_20","unstructured":"Badrinarayanan, V., Kendall, A., and Cipolla, R. (2016, January 27\u201330). SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2018, January 18\u201322). Mask R-CNN. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid Scene Parsing Network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Lin, G., Milan, A., Shen, C., and Reid, I. (2016, January 27\u201330). RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2017.549"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Chen, L., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Roy, A.G., Navab, N., and Wachinger, C. (2018, January 16\u201320). Concurrent Spatial and Channel \u2018Squeeze & Excitation\u2019 in Fully Convolutional Networks. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Granada, Spain.","DOI":"10.1007\/978-3-030-00928-1_48"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., Albanie, S., Sun, G., and Wu, E. (2018, January 18\u201322). Squeeze-and-Excitation Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_28","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., and Lee, M. (2018, January 4\u20136). Attention U-Net: Learning Where to Look for the Pancreas. Proceedings of the International Conference on Medical Imaging with Deep Learning, Amsterdam, The Netherlands."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Li, H., Qiu, K., Chen, L., Mei, X., Hong, L., and Tao, C. (2020). SCAttNet: Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images. IEEE Geosci. Remote Sens. Lett., 1\u20135.","DOI":"10.1109\/LGRS.2020.2988294"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional Block Attention Module. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.isprsjprs.2019.02.006","article-title":"Semantic segmentation of slums in satellite images using transfer learning on fully convolutional neural networks","volume":"150","author":"Wurm","year":"2019","journal-title":"J. Photogramm. Remote Sens."},{"key":"ref_32","unstructured":"Sherrah, J. (2016). Fully Convolutional Networks for Dense Semantic Labelling of High-Resolution Aerial Imagery. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Marmanis, D., Wegner, J.D., Galliani, S., Schindler, K., Datcu, M., and Stilla, U. (2016, January 12\u201319). Semantic segmentation of aerial images with an ensemble of CNSS. Proceedings of the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Prague, Czech Republic.","DOI":"10.5194\/isprs-annals-III-3-473-2016"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliez, P. (2016, January 10\u201315). Fully Convolutional Neural Networks for Remote Sensing Image Classification. Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China.","DOI":"10.1109\/IGARSS.2016.7730322"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Kampffmeyer, M., Salberg, A.B., and Jenssen, R. (2016, January 27\u201330). Semantic segmentation of small objects and modeling of uncertainty in urban remote sensing images using deep convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPRW.2016.90"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, C., and Li, L. (2020). Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation. Remote Sens., 12.","DOI":"10.3390\/rs12182932"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Audebert, N., Saux, B.L., and Lef\u00e8vre, S. (2016, January 20\u201324). Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-Scale Deep Networks. Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan.","DOI":"10.1007\/978-3-319-54181-5_12"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhang, M., Hu, X., Zhao, L., Lv, Y., Luo, M., and Pang, S. (2017). Learning dual multi-scale manifold ranking for semantic segmentation of high resolution images. Remote Sens., 9.","DOI":"10.20944\/preprints201704.0061.v1"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Pan, X., Gao, L., Andrea, M., Zhang, B., Fan, Y., and Paolo, G. (2018). Semantic Labeling of High Resolution Aerial Imagery and LiDAR Data with Fine Segmentation Network. Remote Sens., 10.","DOI":"10.3390\/rs10050743"},{"key":"ref_40","unstructured":"Chen, K., Fu, K., Gao, X., Yan, M., Zhang, W., Zhang, Y., and Sun, X. (August, January 28). Effective fusion of multi-modal data with group convolutions for semantic segmentation of aerial imagery. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Yokohama, Japan."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1016\/j.isprsjprs.2017.11.009","article-title":"Classification with an edge: Improving semantic image segmentation with boundary detection","volume":"135","author":"Marmanis","year":"2018","journal-title":"J. Photogramm. Remote Sens."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1501","DOI":"10.3390\/rs12091501","article-title":"Remote Sensing Image Semantic Segmentation Based on Edge Information Guidance","volume":"12","author":"Chu","year":"2020","journal-title":"Remote Sens."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"863","DOI":"10.3233\/SW-190362","article-title":"Semantic referee: A neural-symbolic framework for enhancing geospatial semantic segmentation","volume":"10","author":"Alirezaie","year":"2019","journal-title":"Semant. Web."},{"key":"ref_44","unstructured":"Yong, L., Wang, R., Shan, S., and Chen, X. (2018, January 18\u201322). Structure inference net: Object detection using scene-level context and instance-level relationships. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","article-title":"The graph neural network model","volume":"20","author":"Scarselli","year":"2009","journal-title":"IEEE Trans Neural Netw."},{"key":"ref_46","unstructured":"Gori, M., Monfardini, G., and Scarselli, F. (August, January 31). A new model for learning in graph domains. Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, Montreal, Canada."},{"key":"ref_47","unstructured":"Kipf, T., and Welling, M. (2017, January 24\u201326). Semi-supervised classification with graph convolutional networks. Proceedings of the international conference on learning representations, Toulon, France."},{"key":"ref_48","unstructured":"Niepert, M., Ahmed, M., and Kutzkov, K. (2016, January 19\u201324). Learning Convolutional Neural Networks for Graphs. Proceedings of the International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Li, G., M\u00fcller, M., Thabet, A., and Ghanem, B. (2019, January 27\u201328). DeepGCNs: Can GCNs Go as Deep as CNNs?. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00936"},{"key":"ref_50","unstructured":"Veli\u010dkovi\u0107, P., Cucurull, G., and Casanova, A. (May, January 30). Graph Attention Networks. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Lu, Y., Chen, Y., Zhao, D., and Chen, J. (2020). Graph-FCN for image semantic segmentation. arXiv.","DOI":"10.1007\/978-3-030-22796-8_11"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Li, Y., Chen, R., and Zhang, Y. (2020, January 19\u201324). A CNN-GCN framework for multi-label aerial image scene classification. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Hawaii, HI, USA.","DOI":"10.1109\/IGARSS39084.2020.9323487"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Shao, Z., Yang, K., Zhou, W., and Hu, B. (2018). Performance evaluation of single-label and multi-label remote sensing image retrieval using a dense labeling dataset. Remote Sens., 10.","DOI":"10.3390\/rs10060964"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Demir, I., Koperski, K., Lindenbaum, D., Pang, G., Huang, J., Basu, S., Hughes, F., Tuia, D., and Raska, R. (2018, January 18\u201322). Deepglobe 2018: A challenge to parse the earth through satellite images. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00031"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"911","DOI":"10.1080\/15481603.2019.1587890","article-title":"Ontologies to interpret remote sensing images: Why do we need them?","volume":"56","author":"Arvor","year":"2019","journal-title":"Gisci. Remote Sens."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Gu, H., Li, H., and Yan, L. (2017). An Object-Based Semantic Classification Method for High Resolution Remote Sensing Imagery Using Ontology. Remote Sens., 9.","DOI":"10.3390\/rs9040329"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Garcia-Garcia, A., Orts-Escolano, S., Oprea, S.O., Villena-Martinez, V., and Garcia-Rodriguez, J. (2017). A Review on Deep Learning Techniques Applied to Semantic Segmentation. arXiv.","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"2274","DOI":"10.1109\/TPAMI.2012.120","article-title":"SLIC Superpixels Compared to State-of-the-art Superpixel Methods","volume":"34","author":"Achanta","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/1\/119\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:48:39Z","timestamp":1760179719000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/1\/119"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,31]]},"references-count":59,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,1]]}},"alternative-id":["rs13010119"],"URL":"https:\/\/doi.org\/10.3390\/rs13010119","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12,31]]}}}