{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T22:41:28Z","timestamp":1775169688220,"version":"3.50.1"},"reference-count":50,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2018,12,22]],"date-time":"2018-12-22T00:00:00Z","timestamp":1545436800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61671054"],"award-info":[{"award-number":["61671054"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61473034"],"award-info":[{"award-number":["61473034"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Open Project Program of the National Laboratory of Pattern Recognition","award":["201800027"],"award-info":[{"award-number":["201800027"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Dense semantic labeling is significant in high-resolution remote sensing imagery research and it has been widely used in land-use analysis and environment protection. With the recent success of fully convolutional networks (FCN), various types of network architectures have largely improved performance. Among them, atrous spatial pyramid pooling (ASPP) and encoder-decoder are two successful ones. The former structure is able to extract multi-scale contextual information and multiple effective field-of-view, while the latter structure can recover the spatial information to obtain sharper object boundaries. 
In this study, we propose a more efficient fully convolutional network by combining the advantages from both structures. Our model utilizes the deep residual network (ResNet) followed by ASPP as the encoder and combines two scales of high-level features with corresponding low-level features as the decoder at the upsampling stage. We further develop a multi-scale loss function to enhance the learning procedure. In the postprocessing, a novel superpixel-based dense conditional random field is employed to refine the predictions. We evaluate the proposed method on the Potsdam and Vaihingen datasets and the experimental results demonstrate that our method performs better than other machine learning or deep learning methods. Compared with the state-of-the-art DeepLab_v3+ our model gains 0.4% and 0.6% improvements in overall accuracy on these two datasets respectively.<\/jats:p>","DOI":"10.3390\/rs11010020","type":"journal-article","created":{"date-parts":[[2018,12,24]],"date-time":"2018-12-24T10:37:49Z","timestamp":1545647869000},"page":"20","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":96,"title":["Dense Semantic Labeling with Atrous Spatial Pyramid Pooling and Decoder for High-Resolution Remote Sensing Imagery"],"prefix":"10.3390","volume":"11","author":[{"given":"Yuhao","family":"Wang","sequence":"first","affiliation":[{"name":"School of Automation &amp; Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China"},{"name":"Key Laboratory of Knowledge Automation for Industrial Processes, Ministry of Education, Beijing 100083, China"}]},{"given":"Binxiu","family":"Liang","sequence":"additional","affiliation":[{"name":"School of Automation &amp; Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China"},{"name":"Key Laboratory of Knowledge Automation for Industrial Processes, Ministry of Education, Beijing 100083, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3125-3847","authenticated-orcid":false,"given":"Meng","family":"Ding","sequence":"additional","affiliation":[{"name":"Thermo Fisher Scientific, Richardson, TX 75081, USA"}]},{"given":"Jiangyun","family":"Li","sequence":"additional","affiliation":[{"name":"School of Automation &amp; Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China"},{"name":"Key Laboratory of Knowledge Automation for Industrial Processes, Ministry of Education, Beijing 100083, China"}]}],"member":"1968","published-online":{"date-parts":[[2018,12,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1109\/JPROC.2012.2211551","article-title":"Land-cover mapping by Markov modeling of spatial-contextual information in very-high-resolution remote sensing images","volume":"101","author":"Moser","year":"2013","journal-title":"Proc. IEEE"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Xu, Y., Wu, L., Xie, Z., and Chen, Z. (2018). Building extraction in very high resolution remote sensing imagery using deep learning and guided filters. Remote Sens., 10.","DOI":"10.3390\/rs10010144"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"920","DOI":"10.3390\/rs10060920","article-title":"High-resolution remote sensing image classification method based on convolutional neural network and restricted conditional random field","volume":"10","author":"Xin","year":"2018","journal-title":"Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"473","DOI":"10.5194\/isprs-annals-III-3-473-2016","article-title":"Semantic segmentation of aerial images with an ensemble of CNNs","volume":"3","author":"Marmanis","year":"2016","journal-title":"ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. 
Sci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"389","DOI":"10.5721\/EuJRS20144723","article-title":"A review of remote sensing image classification techniques: The role of spatio-contextual information","volume":"47","author":"Li","year":"2014","journal-title":"Eur. J. Remote Sens."},{"key":"ref_6","unstructured":"Kampffmeyer, M., Arnt-Borre, S., and Robert, J. (July, January 26). Semantic segmentation of small objects and modeling of uncertainty in urban remote sensing images using deep convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, NV, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1109\/TGRS.2016.2616585","article-title":"Dense semantic labeling of subdecimeter resolution images with convolutional neural networks","volume":"55","author":"Michele","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_8","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the Computer IEEE Computer Society Conference on Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cviu.2007.09.014","article-title":"Speeded-up robust features (SURF)","volume":"110","author":"Herbert","year":"2008","journal-title":"Comput. Vis. 
Image Underst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.isprsjprs.2007.05.011","article-title":"Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features","volume":"62","author":"Inglada","year":"2007","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.isprsjprs.2016.01.011","article-title":"Random forest in remote sensing: A review of applications and future directions","volume":"114","author":"Mariana","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1109\/LGRS.2009.2025059","article-title":"Unsupervised change detection in satellite images using principal component analysis and k-means clustering","volume":"3","author":"Turgay","year":"2009","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Wang, H., Wang, Y., Zhang, Q., Xiang, S., and Pan, C. (2017). Gated convolutional neural network for semantic segmentation in high-resolution images. Remote Sens., 9.","DOI":"10.3390\/rs9050446"},{"key":"ref_15","unstructured":"Yansong, L., Sankaranarayanan, P., Sildomar, T.M., and Eli, S. (2017, January 21\u201326). Dense semantic labeling of very-high-resolution aerial imagery and LiDAR with fully-convolutional neural networks and higher-order CRFs. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Honolulu, HI, USA."},{"key":"ref_16","unstructured":"Hyeonwoo, N., Seunghoon, H., and Bohyung, H. (2015, January 3\u20137). Learning Deconvolution Network for Semantic Segmentation. Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA."},{"key":"ref_17","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). 
Imagenet classification with deep convolutional neural networks. Proceedings of the International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Li, F.-F. (2009, January 20\u201325). ImageNet: A large-scale hierarchical image database. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_19","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2014). Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv."},{"key":"ref_20","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1080\/2150704X.2017.1410291","article-title":"Semantic pixel labelling in remote sensing images using a deep convolutional encoder-decoder model","volume":"9","author":"Wei","year":"2018","journal-title":"Remote Sens. Lett."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Fu, G., Liu, C., Zhou, R., Sun, T., and Zhang, Q. 
(2017). Classification for High Resolution Remote Sensing Imagery Using a Fully Convolutional Network. Remote Sens., 9.","DOI":"10.3390\/rs9050498"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Peng, C., Zhang, X., Yu, G., Luo, G., and Sun, J. (2017, January 21\u201326). Large kernel matters-improve semantic segmentation by global convolutional network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.189"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (arXiv, 2018). Encoder-decoder with atrous separable convolution for semantic image segmentation, arXiv.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 8\u201310). 
Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliex, P. (2016, January 10\u201315). Fully convolutional networks for remote sensing image classification. Proceedings of the IEEE International Conference on Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China.","DOI":"10.1109\/IGARSS.2016.7730322"},{"key":"ref_32","unstructured":"Fisher, Y., and Vladlen, K. (arXiv, 2015). Multi-Scale Context Aggregation by Dilated Convolutions, arXiv."},{"key":"ref_33","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (arXiv, 2017). Rethinking atrous convolution for semantic image segmentation, arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1109\/TIT.1981.1056373","article-title":"Properties of cross-entropy minimization","volume":"27","author":"Shore","year":"1987","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_35","unstructured":"Bottou, L. (2010, January 22\u201327). Large-scale machine learning with stochastic gradient descent. Proceedings of the 19th International Conference on Computational Statistics, Paris, France."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"2988","DOI":"10.1016\/j.patcog.2015.04.019","article-title":"CRF learning with CNN features for image segmentation","volume":"48","author":"Liu","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Alam, F.I., Zhou, J., Liew, A.W.C., and Jia, X.P. (2016, January 10\u201315). CRF learning with CNN features for hyperspectral image segmentation. 
Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China.","DOI":"10.1109\/IGARSS.2016.7730798"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1103\/RevModPhys.54.235","article-title":"The potts model","volume":"54","author":"Wu","year":"1982","journal-title":"Rev. Mod. Phys."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"2274","DOI":"10.1109\/TPAMI.2012.120","article-title":"Slic superpixels compared to state-of-the-art superpixel methods","volume":"34","author":"Achanta","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Van den Bergh, M., Boix, X., Roig, G., de Capitani, B., and Van Gool, L. (2012, January 7\u201313). Seeds: Superpixels extracted via energy-driven sampling. Proceedings of the 12th European Conference on Computer Vision-Volume Part VII, Florence, Italy.","DOI":"10.1007\/978-3-642-33786-4_2"},{"key":"ref_41","unstructured":"Gerke, M. (2015). Use of the Stair Vision Library within the ISPRS 2D Semantic Labeling Benchmark (Vaihingen), University of Twente. Technical Report."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Liu, Y., Ren, Q., Geng, J., Ding, M., and Li, J. (2018). Efficient Patch-Wise Semantic Segmentation for Large-Scale Remote Sensing Images. Sensors, 18.","DOI":"10.3390\/s18103232"},{"key":"ref_43","unstructured":"Lin, M., Chen, Q., and Yan, S. (arXiv, 2013). Network in network, arXiv."},{"key":"ref_44","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., and Isard, M. (2016, January 2\u20134). Tensorflow: A system for large-scale machine learning. Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, Savannah, GA, USA."},{"key":"ref_45","unstructured":"Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (arXiv, 2014). 
Semantic image segmentation with deep convolutional nets and fully connected crfs, arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_47","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (July, January 26). The cityscapes dataset for semantic urban scene understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (CVPR), Las Vegas, NV, USA."},{"key":"ref_48","unstructured":"Sherrah, J. (arXiv, 2016). Fully convolution networks for dense semantic labelling of high-resolution aerial imagery, arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Piramanayagam, S., Saber, E., Schwartzkopf, W., and Koehler, F. (2018). Supervised Classification of Multisensor Remotely Sensed Images Using a Deep Learning Framework. Remote Sens., 10.","DOI":"10.3390\/rs10091429"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Zhao, W., Fu, Y., Wei, X., and Wang, H. (2018). An Improved Image Semantic Segmentation Method Based on Superpixels and Conditional Random Fields. Appl. 
Sci., 8.","DOI":"10.3390\/app8050837"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/1\/20\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:35:45Z","timestamp":1760196945000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/1\/20"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,22]]},"references-count":50,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,1]]}},"alternative-id":["rs11010020"],"URL":"https:\/\/doi.org\/10.3390\/rs11010020","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,12,22]]}}}