{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T03:17:18Z","timestamp":1781061438969,"version":"3.54.1"},"reference-count":53,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2022,4,11]],"date-time":"2022-04-11T00:00:00Z","timestamp":1649635200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["41801308"],"award-info":[{"award-number":["41801308"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Natural Science Foundation of Shandong Province","award":["ZR202103070314"],"award-info":[{"award-number":["ZR202103070314"]}]},{"name":"the Open Research Fund of National Earth Observation Data Center","award":["NODAOP2020008"],"award-info":[{"award-number":["NODAOP2020008"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Building contour extraction from high-resolution remote sensing images is a basic task for the reasonable planning of regional construction. Recently, building segmentation methods based on the U-Net network have become popular as they largely improve the segmentation accuracy by applying \u2018skip connection\u2019 to combine high-level and low-level feature information more effectively. Meanwhile, researchers have demonstrated that introducing an attention mechanism into U-Net can enhance local feature expression and improve the performance of building extraction in remote sensing images. 
In this paper, we explore the effectiveness of the original attention gate module and propose a novel Attention Gate module (AG) based on adjusting the position of the \u2018Resampler\u2019 in an attention gate to the Sigmoid function for the building extraction task. Based on AG, we further propose a novel Attention Gates U network (AGs-Unet), which can automatically learn different forms of building structures in high-resolution remote sensing images and extract building contours efficiently. AGs-Unet integrates attention gates with a single U-Net network: a series of attention gate modules is added to the \u2018skip connection\u2019 to suppress irrelevant and noisy feature responses in the input image and highlight the dominant features of the buildings. AGs-Unet improves the feature selection of the attention map to enhance feature learning and also attends to the feature information of small-scale buildings. We conducted experiments on the WHU building dataset and the INRIA Aerial Image Labeling dataset, comparing the proposed AGs-Unet model with several classic models (such as FCN8s, SegNet, U-Net, and DANet) and two state-of-the-art models (PISANet and ARC-Net). The extraction accuracy of each model is evaluated using three indexes: overall accuracy, precision, and intersection over union. 
Experimental results show that the proposed AGs-Unet model can improve the quality of building extraction from high-resolution remote sensing images effectively in terms of prediction performance and result accuracy.<\/jats:p>","DOI":"10.3390\/s22082932","type":"journal-article","created":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T22:48:45Z","timestamp":1649803725000},"page":"2932","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":53,"title":["AGs-Unet: Building Extraction Model for High Resolution Remote Sensing Images Based on Attention Gates U Network"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1142-6405","authenticated-orcid":false,"given":"Mingyang","family":"Yu","sequence":"first","affiliation":[{"name":"School of Surveying and Geo-Informatics, Shandong Jianzhu University, Jinan 250101, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4493-3227","authenticated-orcid":false,"given":"Xiaoxian","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Surveying and Geo-Informatics, Shandong Jianzhu University, Jinan 250101, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wenzhuo","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Surveying and Geo-Informatics, Shandong Jianzhu University, Jinan 250101, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3041-3557","authenticated-orcid":false,"given":"Yaohui","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Surveying and Geo-Informatics, Shandong Jianzhu University, Jinan 250101, China"},{"name":"Hebei Key Laboratory of Earthquake Dynamics, Sanhe 065201, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2022,4,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/j.rse.2003.10.022","article-title":"Impacts of imagery temporal frequency on land-cover change detection monitoring","volume":"89","author":"Lunetta","year":"2004","journal-title":"Remote Sens. Environ."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Wu, T., Hu, Y., Peng, L., and Chen, R. (2020). Improved Anchor-Free Instance Segmentation for Building Extraction from High-Resolution Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12182910"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"101577","DOI":"10.1016\/j.ijdrr.2020.101577","article-title":"Scenario-based seismic vulnerability and hazard analyses to help direct disaster risk reduction in rural Weinan, China","volume":"48","author":"Liu","year":"2020","journal-title":"Int. J. Disaster Risk Reduct."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Sun, S., Mu, L., Wang, L., Liu, P., Liu, X., and Zhang, Y. (2021). Semantic Segmentation for Buildings of Large Intra-Class Variation in Remote Sensing Images with O-GAN. Remote Sens., 13.","DOI":"10.3390\/rs13030475"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1109\/MSP.2013.2279179","article-title":"Advances in Hyperspectral Image Classification: Earth Monitoring with Statistical Learning Methods","volume":"31","author":"Tuia","year":"2014","journal-title":"IEEE Signal Processing Mag."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1080\/19475705.2018.1524400","article-title":"Seismic vulnerability assessment at urban scale using data mining and GIScience technology: Application to Urumqi (China)","volume":"10","author":"Liu","year":"2019","journal-title":"Geomat. Nat. Hazards Risk"},{"key":"ref_7","unstructured":"Mnih, V. (2013). 
Machine Learning for Aerial Image Labeling. [Ph.D. Thesis, University of Toronto]."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4236\/ijg.2019.101001","article-title":"A Review of Researches on Deep Learning in Remote Sensing Application","volume":"10","author":"Zhu","year":"2019","journal-title":"Int. J. Geosci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1842","DOI":"10.1109\/JSTARS.2020.2991391","article-title":"Refined Extraction Of Building Outlines From High-Resolution Remote Sensing Imagery Based on a Multifeature Convolutional Neural Network and Morphological Filtering","volume":"13","author":"Xie","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"4287","DOI":"10.1109\/TGRS.2020.3014312","article-title":"Scene-Driven Multitask Parallel Attention Network for Building Extraction in High-Resolution Remote Sensing Images","volume":"59","author":"Guo","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Chen, Q., Zhang, Y., Li, X., and Tao, P. (2022). Extracting Rectified Building Footprints from Traditional Orthophotos: A New Workflow. Sensors, 22.","DOI":"10.3390\/s22010207"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Wang, Y., Li, S., Lin, Y., and Wang, M. (2021). Lightweight Deep Neural Network Method for Water Body Extraction from High-Resolution Remote Sensing Images with Multisensors. Sensors, 21.","DOI":"10.3390\/s21217397"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1156","DOI":"10.1109\/TGRS.2008.2008440","article-title":"Urban-Area and Building Detection Using SIFT Keypoints and Graph Theory","volume":"47","author":"Sirmacek","year":"2009","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1109\/JSTARS.2011.2168195","article-title":"Morphological Building\/Shadow Index for Building Extraction From High-Resolution Imagery Over Urban Areas","volume":"5","author":"Huang","year":"2012","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1388","DOI":"10.1109\/LGRS.2016.2590481","article-title":"A Morphological Building Detection Framework for High-Resolution Optical Imagery Over Urban Areas","volume":"13","author":"Zhang","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_16","first-page":"150","article-title":"Automatic urban building boundary extraction from high resolution aerial images using an innovative model of active contours","volume":"12","author":"Ahmadi","year":"2010","journal-title":"Int. J. Appl. Earth Obs."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1127","DOI":"10.1080\/01431161.2016.1148283","article-title":"Building extraction in satellite images using active contours and colour features","volume":"37","author":"Liasis","year":"2016","journal-title":"Int. J. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.isprsjprs.2013.09.004","article-title":"Automated detection of buildings from single VHR multispectral images using shadow information and graph cuts","volume":"86","author":"Ok","year":"2013","journal-title":"ISPRS J. Photogramm."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1109\/JSTARS.2016.2603184","article-title":"Building Extraction from Remotely Sensed Images by Integrating Saliency Cue","volume":"10","author":"Li","year":"2017","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. 
Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.isprsjprs.2007.05.011","article-title":"Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features","volume":"62","author":"Inglada","year":"2007","journal-title":"ISPRS J. Photogramm."},{"key":"ref_21","first-page":"58","article-title":"Building extraction from high-resolution optical spaceborne images using the integration of support vector machine (SVM) classification, Hough transformation and perceptual grouping","volume":"34","author":"Turker","year":"2015","journal-title":"Int. J. Appl. Earth Obs."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1016\/j.patcog.2017.10.013","article-title":"Recent advances in convolutional neural networks","volume":"77","author":"Gu","year":"2018","journal-title":"Pattern Recognit."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. Acm"},{"key":"ref_24","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"154997","DOI":"10.1109\/ACCESS.2020.3015701","article-title":"ARC-Net: An Efficient Network for Building Extraction From High-Resolution Aerial Images","volume":"8","author":"Liu","year":"2020","journal-title":"IEEE Access"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhou, D., Wang, G., He, G., Long, T., Yin, R., Zhang, Z., Chen, S., and Luo, B. (2020). Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network. Sensors, 20.","DOI":"10.3390\/s20247241"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chen, K., Zou, Z., and Shi, Z. (2021). Building Extraction from Remote Sensing Images with Sparse Token Transformers. Remote Sens., 13.","DOI":"10.3390\/rs13214441"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/TGRS.2018.2858817","article-title":"Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set","volume":"57","author":"Ji","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Guo, M., Liu, H., Xu, Y., and Huang, Y. (2020). Building Extraction Based on U-Net with an Attention Block and Multiple Losses. Remote Sens., 12.","DOI":"10.3390\/rs12091400"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 8\u201310). Fully convolutional networks for semantic segmentation. 
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., and Lu, H. (2019, January 15\u201320). Dual attention network for scene segmentation. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00326"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Li, C., Fu, L., Zhu, Q., Zhu, J., Fang, Z., Xie, Y., Guo, Y., and Gong, Y. (2021). Attention Enhanced U-Net for Building Extraction from Farmland Based on Google and WorldView-2 Remote Sensing Images. Remote Sens., 13.","DOI":"10.3390\/rs13214411"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2611","DOI":"10.1109\/JSTARS.2021.3058097","article-title":"Attention-Gate-Based Encoder-Decoder Network for Automatical Building Extraction","volume":"14","author":"Deng","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_38","unstructured":"Guo, M., Xu, T., Liu, J., Liu, Z., Jiang, P., Mu, T., Zhang, S., Martin, R.R., Cheng, M., and Hu, S. (2021). Attention Mechanisms in Computer Vision: A Survey. 
arXiv."},{"key":"ref_39","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., and Jones, L. (2017, January 4\u20139). Attention is all you need. Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_40","unstructured":"Bahdanau, D., Cho, K., and Bengio, Y. (2014). Neural Machine Translation by Jointly Learning to Align and Translate. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., and Zhang, L. (2017, January 21\u201326). Bottom-up and top-down attention for image captioning and visual question answering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2018.00636"},{"key":"ref_42","unstructured":"Stollenga, M., Masci, J., Gomez, F., and Schmidhuber, J. (2014, January 8\u201313). Deep networks with internal selective attention through feedback connections. Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_43","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention U-Net: Learning Where to Look for the Pancreas. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., Albanie, S., Sun, G., and Wu, E. (2018, January 18\u201323). Squeeze-and-Excitation Networks. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Cao, Y., Xu, J., Lin, S., Wei, F., and Hu, H. (2019, January 27\u201328). GCNet: Non-local networks meet squeeze-excitation networks and beyond. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision Workshops (ICCVW), Seoul, Korea.","DOI":"10.1109\/ICCVW.2019.00246"},{"key":"ref_47","unstructured":"Jetley, S., Lord, N.A., Lee, N., and Torr, P.H.S. (2018). Learn To Pay Attention. arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"731","DOI":"10.5194\/isprs-archives-XLIII-B2-2020-731-2020","article-title":"Building Outline Delineation: From Very High Resolution Remote Sensing Imagery to Polygons with an Improved End-To-End Learning Framework","volume":"XLIII-B2-2020","author":"Zhao","year":"2020","journal-title":"ISPRS\u2014Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliez, P. (2017, January 23\u201328). Can semantic labeling methods generalize to any city the inria aerial image labeling benchmark. Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA.","DOI":"10.1109\/IGARSS.2017.8127684"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L. (2018, January 18\u201322). MobileNetV2: Inverted residuals and linear bottlenecks. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_51","unstructured":"Han, S., Pool, J., Tran, J., and Dally, W.J. (2015). Learning both Weights and Connections for Efficient Neural Networks. 
arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"128774","DOI":"10.1109\/ACCESS.2019.2940527","article-title":"Automatic Building Extraction on High-Resolution Remote Sensing Imagery Using Deep Convolutional Encoder-Decoder With Spatial Pyramid Pooling","volume":"7","author":"Liu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Dixit, M., Chaurasia, K., and Mishra, V.K. (2021). Automatic Building Extraction from High-Resolution Satellite Images Using Deep Learning Techniques, Springer.","DOI":"10.1007\/978-981-15-7533-4_61"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/8\/2932\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:51:57Z","timestamp":1760136717000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/8\/2932"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,11]]},"references-count":53,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2022,4]]}},"alternative-id":["s22082932"],"URL":"https:\/\/doi.org\/10.3390\/s22082932","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,11]]}}}