{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:01:53Z","timestamp":1760148113993,"version":"build-2065373602"},"reference-count":48,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2023,3,30]],"date-time":"2023-03-30T00:00:00Z","timestamp":1680134400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62071456"],"award-info":[{"award-number":["62071456"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Semantic segmentation has played an essential role in remote sensing image interpretation for decades. Although there has been tremendous success in such segmentation with the development of deep learning in the field, several limitations still exist in the current encoder\u2013decoder models. First, the potential interdependencies of the context contained in each layer of the encoder\u2013decoder architecture are not well utilized. Second, multi-scale features are insufficiently used, because the upper-layer and lower-layer features are not directly connected in the decoder part. In order to solve those limitations, a global attention gate (GAG) module is proposed to fully utilize the interdependencies of the context and multi-scale features, and then a global multi-attention UResNeXt (GMAUResNeXt) module is presented for the semantic segmentation of remote sensing images. 
GMAUResNeXt uses GAG in each layer of the decoder part to generate the global attention gate (for utilizing the context features) and connects each global attention gate with the uppermost layer in the decoder part by using the Hadamard product (for utilizing the multi-scale features). Both qualitative and quantitative experimental results demonstrate that use of GAG in each layer lets the model focus on a certain pattern, which can help improve the effectiveness of semantic segmentation of remote sensing images. Compared with state-of-the-art methods, GMAUResNeXt not only outperforms MDCNN by 0.68% on the Potsdam dataset with respect to the overall accuracy but also outperforms MANet by 3.19% on the GaoFen image dataset. GMAUResNeXt achieves better performance and more accurate segmentation results than the state-of-the-art models.<\/jats:p>","DOI":"10.3390\/rs15071836","type":"journal-article","created":{"date-parts":[[2023,3,30]],"date-time":"2023-03-30T02:23:46Z","timestamp":1680143026000},"page":"1836","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Global Multi-Attention UResNeXt for Semantic Segmentation of High-Resolution Remote Sensing Images"],"prefix":"10.3390","volume":"15","author":[{"given":"Zhong","family":"Chen","sequence":"first","affiliation":[{"name":"National Key Laboratory of Science and Technology on Multi-Spectral Information Processing, Key Laboratory for Image Information Processing and Intelligence Control of Education Ministry, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China"}]},{"given":"Jun","family":"Zhao","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Science and Technology on Multi-Spectral Information Processing, Key Laboratory for Image Information Processing and Intelligence Control of Education Ministry, School of Artificial Intelligence and Automation, Huazhong 
University of Science and Technology, Wuhan 430074, China"}]},{"given":"He","family":"Deng","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1109\/LGRS.2010.2051533","article-title":"Automatic Urban Water-Body Detection and Segmentation From Sparse ALSM Data via Spatially Constrained Model-Driven Clustering","volume":"8","author":"Yuan","year":"2011","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1016\/j.isprsjprs.2014.04.023","article-title":"Improved maize cultivated area estimation over a large scale combining MODIS\u2013EVI time series data and crop phenological information","volume":"94","author":"Zhang","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"6732","DOI":"10.1109\/TGRS.2016.2589279","article-title":"Adaptive Coherency Matrix Estimation for Polarimetric SAR Imagery Based on Local Heterogeneity Coefficients","volume":"54","author":"Yang","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","unstructured":"Pereira, F., Burges, C., Bottou, L., and Weinberger, K. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., and Liang, J. (2018, January 20). UNet++: A Nested U-Net Architecture for Medical Image Segmentation. Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Granada, Spain.","DOI":"10.1007\/978-3-030-00889-5_1"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1007\/s13735-017-0141-z","article-title":"A review of semantic segmentation using deep neural networks","volume":"7","author":"Guo","year":"2018","journal-title":"Int. J. Multimed. Inf. Retr."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/s10462-020-09854-1","article-title":"Deep semantic segmentation of natural and medical images: A review","volume":"54","author":"Abhishek","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1016\/j.isprsjprs.2021.05.004","article-title":"An attention-fused network for semantic segmentation of very-high-resolution remote sensing imagery","volume":"177","author":"Yang","year":"2021","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_11","first-page":"1","article-title":"Multiattention Network for Semantic Segmentation of Fine-Resolution Remote Sensing Images","volume":"60","author":"Li","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). 
U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/j.neucom.2016.12.002","article-title":"Exploiting the complementary strengths of multi-layer CNN features for image retrieval","volume":"237","author":"Yu","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1038\/nrn755","article-title":"Control of goal-directed and stimulus-driven attention in the brain","volume":"3","author":"Corbetta","year":"2002","journal-title":"Nat. Rev. Neurosci."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.neucom.2021.03.091","article-title":"A review on the attention mechanism of deep learning","volume":"452","author":"Niu","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Cho, K., Merrienboer, B.V., Gulcehre, C., Bougares, F., Schwenk, H., and Bengio, Y. (2014, January 25\u201329). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Proceedings of the EMNLP, Doha, Qatar.","DOI":"10.3115\/v1\/D14-1179"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., and Lu, H. (2019, January 15\u201320). Dual Attention Network for Scene Segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00326"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, X., Girshick, R., Gupta, A., and He, K. (2018, January 18\u201323). Non-Local Neural Networks. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00813"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Yang, Z., He, X., Gao, J., Deng, L., and Smola, A. (2016, January 27\u201330). Stacked Attention Networks for Image Question Answering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.10"},{"key":"ref_20","unstructured":"Yu, Y., Ji, Z., Fu, Y., Guo, J., Pang, Y., and Zhang, Z.M. (2018, January 3\u20138). Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning. Proceedings of the Advances in Neural Information Processing Systems, Montr\u00e9al, QC, Canada."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1109\/JBHI.2020.2986926","article-title":"Multi-Scale Self-Guided Attention for Medical Image Segmentation","volume":"25","author":"Sinha","year":"2021","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"5367","DOI":"10.1109\/TGRS.2020.2964675","article-title":"Semantic Segmentation of Large-Size VHR Remote Sensing Images Using a Two-Stage Multiscale Training Architecture","volume":"58","author":"Ding","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_23","unstructured":"Miech, A., Laptev, I., and Sivic, J. (2017). Learnable pooling with context gating for video classification. arXiv."},{"key":"ref_24","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention u-net: Learning where to look for the pancreas. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Dollar, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated Residual Transformations for Deep Neural Networks. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_26","first-page":"3523","article-title":"Image Segmentation Using Deep Learning: A Survey","volume":"44","author":"Minaee","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_27","unstructured":"Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2014). Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv."},{"key":"ref_28","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.isprsjprs.2020.01.013","article-title":"ResUNet-a: A deep learning framework for semantic segmentation of remotely sensed data","volume":"162","author":"Diakogiannis","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Sun, K., Xiao, B., Liu, D., and Wang, J. (2019, January 15\u201320). Deep High-Resolution Representation Learning for Human Pose Estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00584"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Xu, Z., Zhang, W., Zhang, T., and Li, J. (2021). HRCNet: High-Resolution Context Extraction Network for Semantic Segmentation of Remote Sensing Images. Remote Sens., 13.","DOI":"10.3390\/rs13122290"},{"key":"ref_32","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L.u., and Polosukhin, I. (2017, January 4\u20139). Attention is All you Need. 
Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, J., Lin, S., Ding, L., and Bruzzone, L. (2020). Multi-Scale Context Aggregation for Semantic Segmentation of Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12040701"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional Block Attention Module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_35","unstructured":"Ramachandran, P., Parmar, N., Vaswani, A., Bello, I., Levskaya, A., and Shlens, J. (2019, January 8\u201314). Stand-Alone Self-Attention in Vision Models. Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, H., Zhu, Y., Green, B., Adam, H., Yuille, A., and Chen, L.C. (2020, January 23\u201328). Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation. Proceedings of the Computer Vision\u2014ECCV 2020, Glasgow, UK.","DOI":"10.1007\/978-3-030-58548-8_7"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Cho, K., Van Merri\u00ebnboer, B., Bahdanau, D., and Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv.","DOI":"10.3115\/v1\/W14-4012"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). Xception: Deep Learning With Depthwise Separable Convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wang, Y., Liang, B., Ding, M., and Li, J. (2019). Dense Semantic Labeling with Atrous Spatial Pyramid Pooling and Decoder for High-Resolution Remote Sensing Imagery. Remote Sens., 11.","DOI":"10.3390\/rs11010020"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"(2018). Land cover mapping at very high resolution with rotation equivariant CNNs: Towards small yet accurate models. ISPRS J. Photogramm. Remote Sens., 145, 96\u2013107.","DOI":"10.1016\/j.isprsjprs.2018.01.021"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"111322","DOI":"10.1016\/j.rse.2019.111322","article-title":"Land-cover classification with high-resolution remote sensing images using transferable deep models","volume":"237","author":"Tong","year":"2020","journal-title":"Remote Sens. Environ."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1016\/j.isprsjprs.2017.11.009","article-title":"Classification with an edge: Improving semantic image segmentation with boundary detection","volume":"135","author":"Marmanis","year":"2018","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"4507","DOI":"10.1109\/TGRS.2018.2822783","article-title":"VPRS-Based Regional Decision Fusion of CNN and MRF Classifications for Very Fine Resolution Remotely Sensed Images","volume":"56","author":"Zhang","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_46","unstructured":"Sherrah, J. (2016). 
Fully convolutional networks for dense semantic labelling of high-resolution aerial imagery. arXiv."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid Scene Parsing Network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_48","unstructured":"Wu, H., Zhang, J., Huang, K., Liang, K., and Yu, Y. (2019). Fastfcn: Rethinking dilated convolution in the backbone for semantic segmentation. arXiv."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/7\/1836\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:06:48Z","timestamp":1760123208000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/7\/1836"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,30]]},"references-count":48,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2023,4]]}},"alternative-id":["rs15071836"],"URL":"https:\/\/doi.org\/10.3390\/rs15071836","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2023,3,30]]}}}