{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T18:26:41Z","timestamp":1771698401056,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,10,27]],"date-time":"2022-10-27T00:00:00Z","timestamp":1666828800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["XDA19060205"],"award-info":[{"award-number":["XDA19060205"]}]},{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["XDA19020305"],"award-info":[{"award-number":["XDA19020305"]}]},{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["XDA19020104"],"award-info":[{"award-number":["XDA19020104"]}]},{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["2019YFC0507405"],"award-info":[{"award-number":["2019YFC0507405"]}]},{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["CAS-WX2022GC-0106"],"award-info":[{"award-number":["CAS-WX2022GC-0106"]}]},{"name":"Key Research Development Program of China","award":["XDA19060205"],"award-info":[{"award-number":["XDA19060205"]}]},{"name":"Key Research Development Program of China","award":["XDA19020305"],"award-info":[{"award-number":["XDA19020305"]}]},{"name":"Key Research Development Program of China","award":["XDA19020104"],"award-info":[{"award-number":["XDA19020104"]}]},{"name":"Key Research Development Program of China","award":["2019YFC0507405"],"award-info":[{"award-number":["2019YFC0507405"]}]},{"name":"Key Research Development Program of China","award":["CAS-WX2022GC-0106"],"award-info":[{"award-number":["CAS-WX2022GC-0106"]}]},{"name":"Special Project of Informatization of Chinese Academy of Sciences","award":["XDA19060205"],"award-info":[{"award-number":["XDA19060205"]}]},{"name":"Special Project of Informatization of Chinese Academy of Sciences","award":["XDA19020305"],"award-info":[{"award-number":["XDA19020305"]}]},{"name":"Special Project of Informatization of Chinese Academy of Sciences","award":["XDA19020104"],"award-info":[{"award-number":["XDA19020104"]}]},{"name":"Special Project of Informatization of Chinese Academy of Sciences","award":["2019YFC0507405"],"award-info":[{"award-number":["2019YFC0507405"]}]},{"name":"Special Project of Informatization of Chinese Academy of Sciences","award":["CAS-WX2022GC-0106"],"award-info":[{"award-number":["CAS-WX2022GC-0106"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>In recent years, with the extensive application of deep learning in images, the task of remote sensing image change detection has witnessed a significant improvement. Several excellent methods based on Convolutional Neural Networks and emerging transformer-based methods have achieved impressive accuracy. However, Convolutional Neural Network-based approaches have difficulties in capturing long-range dependencies because of their natural limitations in effective receptive field acquisition unless deeper networks are employed, introducing other drawbacks such as an increased number of parameters and loss of shallow information. The transformer-based methods can effectively learn the relationship between different regions, but the computation is inefficient. Thus, in this paper, a multi-scale feature aggregation via transformer (MFATNet) is proposed for remote sensing image change detection. To obtain a more accurate change map after learning the intra-relationships of feature maps at different scales through the transformer, MFATNet aggregates the multi-scale features. Moreover, the Spatial Semantic Tokenizer (SST) is introduced to obtain refined semantic tokens before feeding into the transformer structure to make it focused on learning more crucial pixel relationships. To fuse low-level features (more fine-grained localization information) and high-level features (more accurate semantic information), and to alleviate the localization and semantic gap between high and low features, the Intra- and Inter-class Channel Attention Module (IICAM) are integrated to further determine more convincing change maps. Extensive experiments are conducted on LEVIR-CD, WHU-CD, and DSIFN-CD datasets. Intersection over union (IoU) of 82.42 and F1 score of 90.36, intersection over union (IoU) of 79.08 and F1 score of 88.31, intersection over union (IoU) of 77.98 and F1 score of 87.62, respectively, are achieved. The experimental results achieved promising performance compared to certain previous state-of-the-art change detection methods.<\/jats:p>","DOI":"10.3390\/rs14215379","type":"journal-article","created":{"date-parts":[[2022,10,27]],"date-time":"2022-10-27T22:36:17Z","timestamp":1666910177000},"page":"5379","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["MFATNet: Multi-Scale Feature Aggregation via Transformer for Remote Sensing Image Change Detection"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3804-524X","authenticated-orcid":false,"given":"Zan","family":"Mao","sequence":"first","affiliation":[{"name":"Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"University of the Chinese Academy of Sciences, Beijing 100049, China"}]},{"given":"Xinyu","family":"Tong","sequence":"additional","affiliation":[{"name":"Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"University of the Chinese Academy of Sciences, Beijing 100049, China"}]},{"given":"Ze","family":"Luo","sequence":"additional","affiliation":[{"name":"Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China"}]},{"given":"Honghai","family":"Zhang","sequence":"additional","affiliation":[{"name":"Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"989","DOI":"10.1080\/01431168908903939","article-title":"Review article digital change detection techniques using remotely-sensed data","volume":"10","author":"Singh","year":"1989","journal-title":"Int. J. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2020.3034752","article-title":"Remote sensing image change detection with transformers","volume":"60","author":"Chen","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Bandara, W.G.C., and Patel, V.M. (2022). A transformer-based siamese network for change detection. arXiv.","DOI":"10.1109\/IGARSS46834.2022.9883686"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Chen, H., and Shi, Z. (2020). A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection. Remote Sens., 12.","DOI":"10.3390\/rs12101662"},{"key":"ref_5","unstructured":"Daudt, R.C., Le Saux, B., and Boulch, A. (2018, January 7\u201310). Fully convolutional siamese networks for change detection. Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Shi, W., Zhang, M., Zhang, R., Chen, S., and Zhan, Z. (2020). Change Detection Based on Artificial Intelligence: State-of-the-Art and Challenges. Remote Sens., 12.","DOI":"10.3390\/rs12101688"},{"key":"ref_7","first-page":"1","article-title":"MSTDSNet-CD: Multi-scale Swin Transformer and Deeply Supervised Network for Change Detection of the Fast-Growing Urban Regions","volume":"19","author":"Song","year":"2022","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Daudt, R.C., Le Saux, B., Boulch, A., and Gousseau, Y. (2018, January 22\u201327). Urban change detection for multispectral earth observation using Convolutional Neural Networks. Proceedings of the IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain.","DOI":"10.1109\/IGARSS.2018.8518015"},{"key":"ref_9","first-page":"1","article-title":"SNUNet-CD: A densely connected siamese network for change detection of VHR images","volume":"19","author":"Fang","year":"2021","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2020.3034752","article-title":"Adversarial Instance Augmentation for Building Change Detection in Remote Sensing Images","volume":"60","author":"Chen","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"5891","DOI":"10.1109\/TGRS.2020.3011913","article-title":"SemiCDNet: A semisupervised Convolutional Neural Network for change detection in high resolution remote-sensing images","volume":"59","author":"Peng","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1194","DOI":"10.1109\/JSTARS.2020.3037893","article-title":"DASNet: Dual attentive fully convolutional siamese networks for change detection in high-resolution satellite images","volume":"14","author":"Chen","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/TGRS.2018.2858817","article-title":"Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set","volume":"57","author":"Ji","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1109\/LGRS.2020.2988032","article-title":"Building change detection for remote sensing images using a dual-task constrained deep siamese convolutional network model","volume":"18","author":"Liu","year":"2020","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_15","first-page":"640","article-title":"Fully Convolutional Networks for Semantic Segmentation","volume":"39","author":"Long","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation, Springer International Publishing.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Peng, D., Zhang, Y., and Guan, H. (2019). End-to-end change detection for high resolution satellite images using improved UNet++. Remote Sens., 11.","DOI":"10.3390\/rs11111382"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1016\/j.isprsjprs.2020.06.003","article-title":"A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images","volume":"166","author":"Zhang","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1109\/LGRS.2018.2869608","article-title":"Triplet-based semantic relation learning for aerial remote sensing image change detection","volume":"16","author":"Zhang","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"7232","DOI":"10.1109\/TGRS.2020.2981051","article-title":"A feature difference Convolutional Neural Network-based change detection method","volume":"58","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_21","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.P., Song, K., Liang, D., Lu, T., Luo, P., and Shao, L. (2021, January 10\u201317). Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Ke, Q., and Zhang, P. (2022). Hybrid-TransCD: A Hybrid Transformer Remote Sensing Image Change Detection Network via Token Aggregation. ISPRS Int. J. Geo-Inf., 11.","DOI":"10.3390\/ijgi11040263"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Nemoto, K., Hamaguchi, R., Sato, M., Fujita, A., Imaizumi, T., and Hikosaka, S. (2017, January 11\u201312). Building change detection via a combination of CNNs using only RGB aerial imageries. Proceedings of the Remote Sensing Technologies and Applications in Urban Environments II, Warsaw, Poland.","DOI":"10.1117\/12.2277912"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Ji, S., Shen, Y., Lu, M., and Zhang, Y. (2019). Building instance change detection from large-scale aerial images using Convolutional Neural Networks and simulated samples. Remote Sens., 11.","DOI":"10.3390\/rs11111343"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Liu, R., Kuffer, M., and Persello, C. (2019). The temporal dynamics of slums employing a CNN-based change detection approach. Remote Sens., 11.","DOI":"10.3390\/rs11232844"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Rahman, F., Vasu, B., Van Cor, J., Kerekes, J., and Savakis, A. (2018, January 26\u201329). Siamese network with multi-level features for patch-based change detection in satellite imagery. Proceedings of the 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Anaheim, CA, USA.","DOI":"10.1109\/GlobalSIP.2018.8646512"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., and Liang, J. (2018). Unet++: A nested u-net architecture for medical image segmentation. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer.","DOI":"10.1007\/978-3-030-00889-5_1"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Springer.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Bem, P.P.D., J\u00fanior, O., Guimares, R.F., and Gomes, R. (2020). Change Detection of Deforestation in the Brazilian Amazon Using Landsat Data and Convolutional Neural Networks. Remote Sens., 12.","DOI":"10.3390\/rs12060901"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1797","DOI":"10.1109\/LGRS.2019.2955309","article-title":"PPCNET: A combined patch-level and pixel-level end-to-end deep network for high-resolution remote sensing image change detection","volume":"17","author":"Bao","year":"2020","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Diakogiannis, F.I., Waldner, F., and Caccetta, P. (2021). Looking for change? Roll the Dice and demand Attention. Remote Sens., 13.","DOI":"10.3390\/rs13183707"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Fang, B., Pan, L., and Kou, R. (2019). Dual learning-based siamese framework for change detection using bi-temporal VHR optical remote sensing images. Remote Sens., 11.","DOI":"10.3390\/rs11111292"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1790","DOI":"10.1109\/TGRS.2019.2948659","article-title":"From W-Net to CDGAN: Bitemporal change detection via deep learning techniques","volume":"58","author":"Hou","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Jiang, H., Hu, X., Li, K., Zhang, J., Gong, J., and Zhang, M. (2020). PGA-SiamNet: Pyramid feature-based attention-guided siamese network for remote sensing orthoimagery building change detection. Remote Sens., 12.","DOI":"10.3390\/rs12030484"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"7296","DOI":"10.1109\/TGRS.2020.3033009","article-title":"Optical remote sensing image change detection based on attention mechanism and image difference","volume":"59","author":"Peng","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"565","DOI":"10.5194\/isprs-archives-XLII-2-565-2018","article-title":"Change detection in remote sensing images using conditional adversarial networks","volume":"42","author":"Lebedev","year":"2018","journal-title":"Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_41","first-page":"8003605","article-title":"Using adversarial network for multiple change detection in bitemporal remote sensing imagery","volume":"19","author":"Zhao","year":"2020","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1845","DOI":"10.1109\/LGRS.2017.2738149","article-title":"Change detection based on deep siamese convolutional network for optical aerial images","volume":"14","author":"Zhan","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_43","unstructured":"Wu, B., Xu, C., Dai, X., Wan, A., Zhang, P., Yan, Z., Tomizuka, M., Gonzalez, J., Keutzer, K., and Vajda, P. (2020). Visual transformers: Token-based image representation and processing for computer vision. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2021, January 10\u201317). Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Montreal, QC, Canada.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_45","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Zhang, D., Zhang, H., Tang, J., Wang, M., Hua, X., and Sun, Q. (2020, January 23\u201328). Feature pyramid transformer. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58604-1_20"},{"key":"ref_47","unstructured":"Hendrycks, D., and Gimpel, K. (2016). Gaussian error linear units (gelus). arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"10990","DOI":"10.1109\/JSTARS.2021.3119654","article-title":"STransFuse: Fusing swin transformer and Convolutional Neural Network for remote sensing image semantic segmentation","volume":"14","author":"Gao","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5379\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:03:56Z","timestamp":1760144636000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5379"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,27]]},"references-count":48,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["rs14215379"],"URL":"https:\/\/doi.org\/10.3390\/rs14215379","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,27]]}}}