{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T17:37:43Z","timestamp":1777484263909,"version":"3.51.4"},"reference-count":68,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,10,27]],"date-time":"2022-10-27T00:00:00Z","timestamp":1666828800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Semantic segmentation for remote sensing images (RSIs) plays an important role in many applications, such as urban planning, environmental protection, agricultural valuation, and military reconnaissance. With the boom in remote sensing technology, numerous RSIs are generated; this is difficult for current complex networks to handle. Efficient networks are the key to solving this challenge. Many previous works aimed at designing lightweight networks or utilizing pruning and knowledge distillation methods to obtain efficient networks, but these methods inevitably reduce the ability of the resulting models to characterize spatial and semantic features. We propose an effective deep supervision-based simple attention network (DSANet) with spatial and semantic enhancement losses to handle these problems. In the network, (1) a lightweight architecture is used as the backbone; (2) deep supervision modules with improved multiscale spatial detail (MSD) and hierarchical semantic enhancement (HSE) losses synergistically strengthen the obtained feature representations; and (3) a simple embedding attention module (EAM) with linear complexity performs long-range relationship modeling. Experiments conducted on two public RSI datasets (the ISPRS Potsdam dataset and Vaihingen dataset) exhibit the substantial advantages of the proposed approach. 
Our method achieves 79.19% mean intersection over union (mIoU) on the ISPRS Potsdam test set and 72.26% mIoU on the Vaihingen test set with speeds of 470.07 FPS on 512 \u00d7 512 images and 5.46 FPS on 6000 \u00d7 6000 images using an RTX 3090 GPU.<\/jats:p>","DOI":"10.3390\/rs14215399","type":"journal-article","created":{"date-parts":[[2022,10,27]],"date-time":"2022-10-27T22:36:17Z","timestamp":1666910177000},"page":"5399","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["DSANet: A Deep Supervision-Based Simple Attention Network for Efficient Semantic Segmentation in Remote Sensing Imagery"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6229-7995","authenticated-orcid":false,"given":"Wenxu","family":"Shi","sequence":"first","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5440-4081","authenticated-orcid":false,"given":"Qingyan","family":"Meng","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"},{"name":"Key Laboratory of Earth Observation of Hainan Province, Hainan Research Institute, Aerospace Information Research Institute, Chinese Academy of Sciences, Sanya 572029, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5073-1694","authenticated-orcid":false,"given":"Linlin","family":"Zhang","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, 
China"},{"name":"Key Laboratory of Earth Observation of Hainan Province, Hainan Research Institute, Aerospace Information Research Institute, Chinese Academy of Sciences, Sanya 572029, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maofan","family":"Zhao","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chen","family":"Su","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100049, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tam\u00e1s","family":"Jancs\u00f3","sequence":"additional","affiliation":[{"name":"Alba Regia Technical Faculty, Obuda University, Budai ut 45, 8001 Szekesfehervar, Hungary"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"5","DOI":"10.3389\/fenvs.2015.00045","article-title":"A survey of remote-sensing big data","volume":"3","author":"Liu","year":"2015","journal-title":"Front. Environ. Sci."},{"key":"ref_2","first-page":"1","article-title":"3D data management: Controlling data volume, velocity and variety","volume":"6","author":"Laney","year":"2001","journal-title":"META Group Res. Note"},{"key":"ref_3","first-page":"217","article-title":"A segmentation and classification approach of IKONOS-2 imagery for land cover mapping to assist flood risk and flood damage assessment","volume":"4","author":"Jong","year":"2003","journal-title":"Int. J. Appl. Earth Obs. 
Geoinf."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1016\/j.rse.2017.11.024","article-title":"Supervised methods of image segmentation accuracy assessment in land cover mapping","volume":"205","author":"Costa","year":"2018","journal-title":"Remote Sens. Environ."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1080\/01431160601075582","article-title":"Object-based change detection using correlation image analysis and image segmentation","volume":"29","author":"Im","year":"2008","journal-title":"Int. J. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"4434","DOI":"10.1080\/01431161.2011.648285","article-title":"Object-based change detection","volume":"33","author":"Chen","year":"2012","journal-title":"Int. J. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"112480","DOI":"10.1016\/j.rse.2021.112480","article-title":"Mapping large-scale and fine-grained urban functional zones from VHR images using a multi-scale semantic segmentation network and object based approach","volume":"261","author":"Du","year":"2021","journal-title":"Remote Sens. Environ."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Wang, J., Hu, X., Meng, Q., Zhang, L., Wang, C., Liu, X., and Zhao, M. (2021). Developing a method to extract building 3d information from GF-7 data. Remote Sens., 13.","DOI":"10.3390\/rs13224532"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1109\/JSTARS.2010.2074186","article-title":"A multilevel hierarchical image segmentation method for urban impervious surface mapping using very high resolution imagery","volume":"4","author":"Li","year":"2010","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. 
Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1109\/LGRS.2018.2794545","article-title":"Automatic water-body segmentation from high-resolution satellite images via deep networks","volume":"15","author":"Miao","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1109\/LGRS.2018.2802944","article-title":"Road extraction by deep residual u-net","volume":"15","author":"Zhang","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2021.3064606","article-title":"Hed-unet: Combined segmentation and edge detection for monitoring the antarctic coastline","volume":"60","author":"Heidler","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_15","unstructured":"Chen, L.-C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. 
Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"6195","DOI":"10.1109\/TGRS.2019.2904868","article-title":"CDnet: CNN-based cloud detection for remote sensing imagery","volume":"57","author":"Yang","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"8287","DOI":"10.1109\/JSTARS.2021.3104382","article-title":"Light-weight semantic segmentation network for UAV remote sensing images","volume":"14","author":"Liu","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_19","first-page":"562","article-title":"Deeply-Supervised Nets","volume":"Volume 38","author":"Lebanon","year":"2015","journal-title":"Machine Learning Research, Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Deng, C., Liang, L., Su, Y., He, C., and Cheng, J. (2021, January 11\u201316). Semantic segmentation for high-resolution remote sensing images by light-weight network. Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium.","DOI":"10.1109\/IGARSS47720.2021.9554244"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, S., Ding, W., Liu, C., Liu, Y., Wang, Y., and Li, H. (2018). ERN: Edge loss reinforced semantic segmentation network for remote sensing images. Remote Sens., 10.","DOI":"10.3390\/rs10091339"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1109\/ACCESS.2021.3082076","article-title":"Neighborloss: A loss function considering spatial correlation for semantic segmentation of remote sensing image","volume":"9","author":"Yuan","year":"2021","journal-title":"IEEE Access"},{"key":"ref_23","unstructured":"Simonyan, K., and Zisserman, A. (2014). 
Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Maaten, L.V.D., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_26","unstructured":"Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., and Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv."},{"key":"ref_27","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20138). Imagenet classification with deep convolutional neural networks. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_28","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.-C. (2018, January 18\u201322). Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (2019, January 16\u201320). Searching for mobilenetv3. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Long Beach, CA, USA.","DOI":"10.1109\/ICCV.2019.00140"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201322). Shufflenet: An extremely efficient convolutional neural network for mobile devices. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Ma, N., Zhang, X., Zheng, H.-T., and Sun, J. (2018, January 14\u201318). Shufflenet v2: Practical guidelines for efficient CNN architecture design. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"ref_33","unstructured":"Paszke, A., Chaurasia, A., Kim, S., and Culurciello, E. (2016). Enet: A deep neural network architecture for real-time semantic segmentation. arXiv."},{"key":"ref_34","unstructured":"Li, G., Yun, I., Kim, J., and Kim, J. (2019, January 9\u201312). Dabnet: Depth-wise asymmetric bottleneck for real-time semantic segmentation. Proceedings of the British Machine Vision Conference (BMVC), Cardiff, UK."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Lo, S.-Y., Hang, H.-M., Chan, S.-W., and Lin, J.-J. (2019, January 21\u201325). Efficient dense modules of asymmetric convolution for real-time semantic segmentation. Proceedings of the ACM Multimedia Asia, Nice, France.","DOI":"10.1145\/3338533.3366558"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1109\/TITS.2017.2750080","article-title":"Erfnet: Efficient residual factorized convnet for real-time semantic segmentation","volume":"19","author":"Romera","year":"2017","journal-title":"IEEE Trans. Intell. Transp. 
Syst."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1183","DOI":"10.1109\/TII.2018.2849348","article-title":"Fast semantic segmentation for scene perception","volume":"15","author":"Zhang","year":"2018","journal-title":"IEEE Trans. Industr. Inform."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Fan, M., Lai, S., Huang, J., Wei, X., Chai, Z., Luo, J., and Wei, X. (2021, January 11\u201317). Rethinking bisenet for real-time semantic segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Montreal, QC, Canada.","DOI":"10.1109\/CVPR46437.2021.00959"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Chaurasia, A., and Culurciello, E. (2017, January 22\u201329). Linknet: Exploiting encoder representations for efficient semantic segmentation. Proceedings of the 2017 IEEE Visual Communications and Image Processing (VCIP), Venice, Italy.","DOI":"10.1109\/VCIP.2017.8305148"},{"key":"ref_40","unstructured":"Liu, M., and Yin, H. (2019). Feature pyramid encoding network for real-time semantic segmentation. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wang, Y., Zhou, Q., Liu, J., Xiong, J., Gao, G., Wu, X., and Latecki, L.J. (2019, January 22\u201325). Lednet: A lightweight encoder-decoder network for real-time semantic segmentation. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803154"},{"key":"ref_42","unstructured":"Poudel, R.P., Liwicki, S., and Cipolla, R. (2019). Fast-scnn: Fast semantic segmentation network. arXiv."},{"key":"ref_43","unstructured":"Poudel, R.P., Bonde, U., Liwicki, S., and Zach, C. (2018). Contextnet: Exploring context and detail for semantic segmentation in real-time. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., and Sang, N. (2018, January 8\u201314). 
Bisenet: Bilateral segmentation network for real-time semantic segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01261-8_20"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"3051","DOI":"10.1007\/s11263-021-01515-2","article-title":"Bisenet v2: Bilateral network with guided aggregation for real-time semantic segmentation","volume":"129","author":"Yu","year":"2021","journal-title":"Int. J. Comput. Vis."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Li, H., Xiong, P., Fan, H., and Sun, J. (2019, January 15\u201320). Dfanet: Deep feature aggregation for real-time semantic segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00975"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Li, X., You, A., Zhu, Z., Zhao, H., Yang, M., Yang, K., Tan, S., and Tong, Y. (2020). Semantic flow for fast and accurate scene parsing. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-030-58452-8_45"},{"key":"ref_48","unstructured":"Hong, Y., Pan, H., Sun, W., and Jia, Y. (2021). Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_50","unstructured":"Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2014). Semantic image segmentation with deep convolutional nets and fully connected crfs. 
arXiv."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., and Lu, H. (2019, January 15\u201320). Dual attention network for scene segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00326"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Yuan, Y., Chen, X., and Wang, J. (2020). Object-contextual representations for semantic segmentation. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-030-58539-6_11"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_55","unstructured":"Park, J., Woo, S., Lee, J.-Y., and Kweon, I.S. (2018). Bam: Bottleneck attention module. arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.-Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_57","first-page":"1","article-title":"Multilayer feature fusion network with spatial attention and gated mechanism for remote sensing scene classification","volume":"19","author":"Meng","year":"2022","journal-title":"IEEE Geosci. Remote Sens. 
Lett."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Ambartsoumian, A., and Popowich, F. (2018). Self-attention: A better building block for sentiment analysis neural network classifiers. arXiv.","DOI":"10.18653\/v1\/W18-6219"},{"key":"ref_59","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_60","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16 \u00d7 16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.-P., Song, K., Liang, D., Lu, T., Luo, P., and Shao, L. (2021, January 11\u201317). Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1016\/j.isprsjprs.2021.09.005","article-title":"Abcnet: Attentive bilateral contextual network for efficient semantic segmentation of fine-resolution remotely sensed imagery","volume":"181","author":"Li","year":"2021","journal-title":"ISPRS J. Photogramm. 
Remote Sens."},{"key":"ref_64","unstructured":"Wang, S., Li, B.Z., Khabsa, M., Fang, H., and Ma, H. (2020). Linformer: Self-attention with linear complexity. arXiv."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Guo, M.-H., Liu, Z.-N., Mu, T.-J., and Hu, S.-M. (2021). Beyond self-attention: External attention using two linear layers for visual tasks. arXiv.","DOI":"10.1109\/TPAMI.2022.3211006"},{"key":"ref_66","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1109\/TGRS.2016.2612821","article-title":"Convolutional Neural Networks for Large-Scale Remote-Sensing Image Classification","volume":"55","author":"Maggiori","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"1169","DOI":"10.1109\/TIP.2020.3042065","article-title":"Cgnet: A light-weight context guided network for semantic segmentation","volume":"30","author":"Wu","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Wang, Y., Zhou, Q., Xiong, J., Wu, X., and Jin, X. (2019). Esnet: An efficient symmetric network for real-time semantic segmentation. 
Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Springer.","DOI":"10.1007\/978-3-030-31723-2_4"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5399\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:04:34Z","timestamp":1760144674000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5399"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,27]]},"references-count":68,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["rs14215399"],"URL":"https:\/\/doi.org\/10.3390\/rs14215399","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,27]]}}}