{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,19]],"date-time":"2025-10-19T16:04:07Z","timestamp":1760889847122,"version":"build-2065373602"},"reference-count":54,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2019,5,15]],"date-time":"2019-05-15T00:00:00Z","timestamp":1557878400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61771351"],"award-info":[{"award-number":["61771351"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Semantic labeling for high resolution aerial images is a fundamental and necessary task in remote sensing image analysis. It is widely used in land-use surveys, change detection, and environmental protection. Recent researches reveal the superiority of Convolutional Neural Networks (CNNs) in this task. However, multi-scale object recognition and accurate object localization are two major problems for semantic labeling methods based on CNNs in high resolution aerial images. To handle these problems, we design a Context Fuse Module, which is composed of parallel convolutional layers with kernels of different sizes and a global pooling branch, to aggregate context information at multiple scales. We propose an Attention Mix Module, which utilizes a channel-wise attention mechanism to combine multi-level features for higher localization accuracy. We further employ a Residual Convolutional Module to refine features in all feature levels. Based on these modules, we construct a new end-to-end network for semantic labeling in aerial images. We evaluate the proposed network on the ISPRS Vaihingen and Potsdam datasets. 
Experimental results demonstrate that our network outperforms other competitors on both datasets with only raw image data.<\/jats:p>","DOI":"10.3390\/rs11101158","type":"journal-article","created":{"date-parts":[[2019,5,15]],"date-time":"2019-05-15T11:37:40Z","timestamp":1557920260000},"page":"1158","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":32,"title":["Context Aggregation Network for Semantic Labeling in Aerial Images"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7822-5216","authenticated-orcid":false,"given":"Wensheng","family":"Cheng","sequence":"first","affiliation":[{"name":"School of Electronic Information, Wuhan University, Wuhan 430072, China"},{"name":"The CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang 050081, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3263-8768","authenticated-orcid":false,"given":"Wen","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Electronic Information, Wuhan University, Wuhan 430072, China"},{"name":"The CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang 050081, China"}]},{"given":"Min","family":"Wang","sequence":"additional","affiliation":[{"name":"The CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang 050081, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3359-3565","authenticated-orcid":false,"given":"Gang","family":"Wang","sequence":"additional","affiliation":[{"name":"The CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang 050081, China"}]},{"given":"Jinyong","family":"Chen","sequence":"additional","affiliation":[{"name":"The CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang 050081, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2019,5,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1592","DOI":"10.1109\/TGRS.2014.2345739","article-title":"Multiple feature learning for hyperspectral image classification","volume":"53","author":"Li","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1109\/TGRS.2014.2318332","article-title":"Spectral\u2013spatial classification of hyperspectral data via morphological component analysis-based image separation","volume":"53","author":"Xue","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3083","DOI":"10.1109\/TGRS.2015.2511197","article-title":"Multiple morphological component analysis based decomposition for remote sensing image classification","volume":"54","author":"Xu","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1109\/JPROC.2012.2211551","article-title":"Land-cover mapping by Markov modeling of spatial-contextual information in very-high-resolution remote sensing images","volume":"101","author":"Moser","year":"2013","journal-title":"Proc. IEEE"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"884","DOI":"10.1109\/TCYB.2016.2531179","article-title":"Joint dictionary learning for multispectral change detection","volume":"47","author":"Lu","year":"2017","journal-title":"IEEE Trans. 
Cybern."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1777","DOI":"10.3390\/rs3081777","article-title":"Segment-based land cover mapping of a suburban area-comparison of high-resolution remotely sensed datasets using classification trees and test field points","volume":"3","author":"Matikainen","year":"2011","journal-title":"Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2320","DOI":"10.1016\/j.rse.2011.04.032","article-title":"Mapping urbanization dynamics at regional and global scales using multi-temporal dmsp\/ols nighttime light data","volume":"115","author":"Zhang","year":"2011","journal-title":"Remote Sens. Environ."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"920","DOI":"10.3390\/rs10060920","article-title":"High-resolution remote sensing image classification method based on convolutional neural network and restricted conditional random field","volume":"10","author":"Xin","year":"2018","journal-title":"Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_10","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the IEEE Conference on Vision and Pattern Recognition (CVPR), San Diego, CA, USA."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Rosten, E., and Drummond, T. (2006, January 7\u201313). Machine learning for high-speed corner detection. 
Proceedings of the European Conference on Computer Vision (ECCV), Graz, Austria.","DOI":"10.1007\/11744023_34"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1109\/LGRS.2009.2025059","article-title":"Unsupervised change detection in satellite images using principal component analysis and k-means clustering","volume":"3","author":"Turgay","year":"2009","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.isprsjprs.2007.05.011","article-title":"Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features","volume":"62","author":"Inglada","year":"2007","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.isprsjprs.2016.01.011","article-title":"Random forest in remote sensing: A review of applications and future directions","volume":"114","author":"Mariana","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Li, F.-F. (2009, January 20\u201325). ImageNet: A large-scale hierarchical image database. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_16","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2014). Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv."},{"key":"ref_17","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. 
(2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_19","unstructured":"Mnih, V. (2013). Machine Learning for Aerial Image Labeling. [Ph.D. Thesis, University of Toronto]."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2868","DOI":"10.1109\/JSTARS.2016.2582921","article-title":"Semantic labeling of aerial and satellite imagery","volume":"9","author":"Paisitkriangkrai","year":"2016","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Nogueira, K., Mura, M.D., Chanussot, J., Schwartz, W.R., and Santos, J.A.D. (2016, January 4\u20138). Learning to semantically segment high-resolution remote sensing images. Proceedings of IEEE International Conference on Pattern Recognition (ICPR), Cancun, Mexico.","DOI":"10.1109\/ICPR.2016.7900187"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.isprsjprs.2017.05.002","article-title":"Simultaneous extraction of roads and buildings in remote sensing imagery with convolutional neural networks","volume":"130","author":"Alshehhi","year":"2017","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1016\/j.isprsjprs.2017.07.014","article-title":"A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification","volume":"140","author":"Zhang","year":"2018","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1109\/TGRS.2016.2612821","article-title":"Convolutional neural networks for large-scale remote-sensing image classification","volume":"55","author":"Maggiori","year":"2017","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Mostajabi, M., Yadollahpour, P., and Shakhnarovich, G. (2015, January 8\u201310). Feedforward semantic segmentation with zoom-out features. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298959"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4141","DOI":"10.1109\/TGRS.2017.2689018","article-title":"Superpixel-based multiple local cnn for panchromatic and multispectral image classification","volume":"55","author":"Zhao","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 8\u201310). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Noh, H., Hong, S., and Han, B. (2015, January 11\u201318). Learning deconvolution network for semantic segmentation. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.178"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. 
Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Pohlen, T., Hermans, A., Mathias, M., and Leibe, B. (2017, January 21\u201326). Full-resolution residual networks for semantic segmentation in street scenes. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.353"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Lin, G., Milan, A., Shen, C., and Reid, I. (2017, January 21\u201326). Refinenet: Multi-path refinement networks for high-resolution semantic segmentation. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.549"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_34","unstructured":"Chen, L., Papandreou, G., Kokkinos, I., Murphy, K.P., and Yuille, A.L. (2015, January 7\u20139). Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_35","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (arXiv, 2017). Rethinking atrous convolution for semantic image segmentation, arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Peng, C., Zhang, X., Yu, G., Luo, G., and Sun, J. (2017, January 21\u201326). Large Kernel Matters\u2013Improve Semantic Segmentation by Global Convolutional Network. 
Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.189"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zhang, H., Dana, K., Shi, J., Zhang, Z., Wang, X., Tyagi, A., and Agrawal, A. (2018, January 18\u201322). Context encoding for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00747"},{"key":"ref_38","unstructured":"Yu, F., and Koltun, V. (2016, January 2\u20134). Multi-scale context aggregation by dilated convolutions. Proceedings of the International Conference on Learning Representations (ICLR), Caribe Hilton, San Juan, Puerto Rico."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., and Cottrell, G. (2018, January 12\u201315). Understanding convolution for semantic segmentation. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00163"},{"key":"ref_40","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 7\u20139). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of International Conference on Machine Learning (ICML), San Diego, CA, USA."},{"key":"ref_41","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the International Conference on Neural Information Processing Systems (NeurIPS), Lake Tahoe, NV, USA."},{"key":"ref_42","unstructured":"Liu, W., Rabinovich, A., and Berg, A.C. (2016, January 2\u20134). Parsenet: Looking wider to see better. 
Proceedings of the International Conference on Learning Representations (ICLR), Caribe Hilton, San Juan, Puerto Rico."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., and Sang, N. (2018, January 18\u201322). Learning a discriminative feature network for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00199"},{"key":"ref_46","unstructured":"ISPRS (2019, May 13). International Society For Photogrammetry And Remote Sensing. 2D Semantic Labeling Challenge. Available online: http:\/\/www2.isprs.org\/commissions\/comm3\/wg4\/semantic-labeling.html."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Piramanayagam, S., Saber, E., Schwartzkopf, W., and Koehler, F. (2018). Supervised Classification of Multisensor Remotely Sensed Images Using a Deep Learning Framework. Remote Sens., 10.","DOI":"10.3390\/rs10091429"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Liu, Y., Fan, B., Wang, L., Bai, J., Xiang, S., and Pan, C. (2017, January 17\u201320). Context-aware cascade network for semantic labeling in VHR image. 
Proceedings of the IEEE International Conference on Image Processing (ICIP), Beijing, China.","DOI":"10.1109\/ICIP.2017.8296346"},{"key":"ref_49","unstructured":"Gerke, M. (2015). Use of the Stair Vision Library within the ISPRS 2d Semantic Labeling Benchmark (Vaihingen), University of Twente. Technical Report."},{"key":"ref_50","unstructured":"Gould, S., Russakovsky, O., Goodfellow, I., and Baumstarck, P. (2011). The Stair Vision Library (v2.5), Stanford University."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1109\/TGRS.2016.2616585","article-title":"Dense semantic labeling of subdecimeter resolution images with convolutional neural networks","volume":"55","author":"Volpi","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_52","unstructured":"Sherrah, J. (arXiv, 2016). Fully convolutional networks for dense semantic labelling of high-resolution aerial imagery, arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1016\/j.isprsjprs.2017.11.009","article-title":"Classification with an edge: Improving semantic image segmentation with boundary detection","volume":"135","author":"Marmanis","year":"2018","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.isprsjprs.2018.04.014","article-title":"Algorithms for semantic segmentation of multispectral remote sensing imagery using deep learning","volume":"145","author":"Kemker","year":"2018","journal-title":"ISPRS J. Photogramm. 
Remote Sens."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/10\/1158\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:51:59Z","timestamp":1760187119000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/10\/1158"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,15]]},"references-count":54,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2019,5]]}},"alternative-id":["rs11101158"],"URL":"https:\/\/doi.org\/10.3390\/rs11101158","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2019,5,15]]}}}