{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T13:17:56Z","timestamp":1771075076215,"version":"3.50.1"},"reference-count":59,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2022,8,12]],"date-time":"2022-08-12T00:00:00Z","timestamp":1660262400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Science Foundation for Distinguished Young Scholars of Henan Province","award":["212300410014"],"award-info":[{"award-number":["212300410014"]}]},{"name":"Natural Science Foundation for Distinguished Young Scholars of Henan Province","award":["2021SJGLX299"],"award-info":[{"award-number":["2021SJGLX299"]}]},{"name":"Practice Projects of Higher Education Reform in Henan Province","award":["212300410014"],"award-info":[{"award-number":["212300410014"]}]},{"name":"Practice Projects of Higher Education Reform in Henan Province","award":["2021SJGLX299"],"award-info":[{"award-number":["2021SJGLX299"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>There remain several challenges in the task of extracting buildings from aerial imagery using convolutional neural networks (CNNs). First, the tremendous complexity of existing building extraction networks impedes their practical application. In addition, it is difficult for networks to sufficiently utilize the various building features in different images. To address these challenges, we propose an efficient network called MSL-Net that focuses on both multiscale building features and multilevel image features. First, we use depthwise separable convolution (DSC) to significantly reduce the network complexity, and then we embed a group normalization (GN) layer in the inverted residual structure to alleviate network performance degradation. 
Furthermore, we extract multiscale building features through an atrous spatial pyramid pooling (ASPP) module and apply long skip connections to establish long-distance dependence to fuse features at different levels of the given image. Finally, we add a deformable convolution network layer before the pixel classification step to enhance the feature extraction capability of MSL-Net for buildings with irregular shapes. The experimental results obtained on three publicly available datasets demonstrate that our proposed method achieves state-of-the-art accuracy with a faster inference speed than that of competing approaches. Specifically, the proposed MSL-Net achieves 90.4%, 81.1% and 70.9% intersection over union (IoU) values on the WHU Building Aerial Imagery dataset, Inria Aerial Image Labeling dataset and Massachusetts Buildings dataset, respectively, with an inference speed of 101.4 frames per second (FPS) for an input image of size 3 \u00d7 512 \u00d7 512 on an NVIDIA RTX 3090 GPU. With an excellent tradeoff between accuracy and speed, our proposed MSL-Net may hold great promise for use in building extraction tasks.<\/jats:p>","DOI":"10.3390\/rs14163914","type":"journal-article","created":{"date-parts":[[2022,8,15]],"date-time":"2022-08-15T23:44:03Z","timestamp":1660607043000},"page":"3914","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["MSL-Net: An Efficient Network for Building Extraction from Aerial Imagery"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4984-4920","authenticated-orcid":false,"given":"Yue","family":"Qiu","sequence":"first","affiliation":[{"name":"Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China"}]},{"given":"Fang","family":"Wu","sequence":"additional","affiliation":[{"name":"Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9690-6373","authenticated-orcid":false,"given":"Jichong","family":"Yin","sequence":"additional","affiliation":[{"name":"Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China"}]},{"given":"Chengyi","family":"Liu","sequence":"additional","affiliation":[{"name":"Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6184-1300","authenticated-orcid":false,"given":"Xianyong","family":"Gong","sequence":"additional","affiliation":[{"name":"Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China"}]},{"given":"Andong","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"2691","DOI":"10.1007\/s00521-021-06027-1","article-title":"Recognition and Extraction of High-Resolution Satellite Remote Sensing Image Buildings Based on Deep Learning","volume":"34","author":"Zeng","year":"2022","journal-title":"Neural. Comput. Appl."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"5234","DOI":"10.1080\/01431161.2016.1230287","article-title":"Building Extraction from High-Resolution Satellite Images in Urban Areas: Recent Methods and Strategies Against Significant Challenges","volume":"37","author":"Ghanea","year":"2016","journal-title":"Int. J. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/TGRS.2018.2858817","article-title":"Fully Convolutional Networks for Multisource Building Extraction from an Open Aerial and Satellite Imagery Data Set","volume":"57","author":"Ji","year":"2019","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1016\/j.isprsjprs.2020.10.008","article-title":"An End-to-End Shape Modeling Framework for Vectorized Building Outline Generation from Aerial Images","volume":"170","author":"Chen","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_5","unstructured":"Katartzis, A., Sahli, H., Nyssen, E., and Cornelis, J. (2001, January 9\u201313). Detection of Buildings from a Single Airborne Image Using a Markov Random Field Model. Proceedings of the IGARSS 2001, Scanning the Present and Resolving the Future, IEEE 2001 International Geoscience and Remote Sensing Symposium (Cat. No.01CH37217), Sydney, Australia."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2386","DOI":"10.1109\/TGRS.2005.853570","article-title":"Rectangular Building Extraction from Stereoscopic Airborne Radar Images","volume":"43","author":"Simonetto","year":"2005","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_7","unstructured":"Jung, C.R., and Schramm, R. (2004, January 20\u201320). Rectangle Detection Based on a Windowed Hough Transform. Proceedings of the 17th Brazilian Symposium on Computer Graphics and Image Processing, Curitiba, Brazil."},{"key":"ref_8","unstructured":"Li, L. (2011). Research on Shadow-Based Building Extraction from High Resolution Remote Sensing Images. [Master\u2019s Thesis, Hunan University of Science and Technology]."},{"key":"ref_9","first-page":"503","article-title":"Building Extraction from Airborne Laser Point Cloud Using NDVI Constrained Watershed Algorithm","volume":"36","author":"Zhao","year":"2016","journal-title":"Acta Optica Sin."},{"key":"ref_10","first-page":"224","article-title":"Remote Sensing Image Segmentation Approach Based on Quarter-Tree and Graph Cut","volume":"36","author":"Zhou","year":"2010","journal-title":"Comput. Eng."},{"key":"ref_11","unstructured":"Wei, D. (2013). 
Research on Buildings Extraction Technology on High Resolution Remote Sensing Images. [Master\u2019s Thesis, Information Engineering University]."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1016\/j.isprsjprs.2010.02.002","article-title":"An Efficient Stochastic Approach for Building Footprint Extraction from Digital Elevation Models","volume":"65","author":"Tournaire","year":"2010","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"185","DOI":"10.5623\/cig2017-401","article-title":"Building Extraction from Fused LiDAR and Hyperspectral Data Using Random Forest Algorithm","volume":"71","author":"Parsian","year":"2017","journal-title":"Geomatica"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"935","DOI":"10.1109\/TGRS.2012.2205156","article-title":"Automatic Detection and Reconstruction of Building Radar Footprints from Single VHR SAR Images","volume":"51","author":"Ferro","year":"2013","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_15","first-page":"2008","article-title":"Urban Building Extraction from High-Resolution Satellite Panchromatic Image Using Clustering and Edge Detection","volume":"Volume 3","author":"Wei","year":"2004","journal-title":"Proceedings of the IEEE International Geoscience and Remote Sensing Symposium"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1109\/JSTARS.2011.2168195","article-title":"Morphological Building\/Shadow Index for Building Extraction from High-Resolution Imagery Over Urban Areas","volume":"5","author":"Huang","year":"2012","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. 
Remote Sens."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"22034","DOI":"10.1109\/ACCESS.2018.2819705","article-title":"Building Extraction from RGB VHR Images Using Shifted Shadow Algorithm","volume":"6","author":"Gao","year":"2018","journal-title":"IEEE Access"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1016\/j.proeng.2011.07.069","article-title":"Use of Digital Surface Model Constructed from Digital Aerial Images to Detect Collapsed Buildings during Earthquake","volume":"14","author":"Maruyama","year":"2011","journal-title":"Procedia Eng."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"240","DOI":"10.1016\/j.isprsjprs.2021.11.005","article-title":"A Coarse-to-Fine Boundary Refinement Network for Building Footprint Extraction from Remote Sensing Imagery","volume":"183","author":"Guo","year":"2022","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2793","DOI":"10.1109\/TPAMI.2017.2750680","article-title":"Learning Building Extraction in Aerial Scenes with Convolutional Networks","volume":"40","author":"Yuan","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"7092","DOI":"10.1109\/TGRS.2017.2740362","article-title":"High-Resolution Aerial Image Labeling with Convolutional Neural Networks","volume":"55","author":"Maggiori","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid Scene Parsing Network. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_24","unstructured":"Chen, L.-C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"406","DOI":"10.11834\/jrs.20209200","article-title":"Classification of High-Resolution Remote Sensing Images Based on Enhanced DeepLab Algorithm and Adaptive Loss Function","volume":"26","author":"Xu","year":"2022","journal-title":"Nat. Remote Sens. Bull."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2245","DOI":"10.11834\/jrs.20210042","article-title":"House Building Extraction from High-Resolution Remote Sensing Images based on IEU-Net","volume":"25","author":"Wang","year":"2021","journal-title":"Nat. Remote Sens. 
Bull."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"101972","DOI":"10.1109\/ACCESS.2021.3097630","article-title":"HA U-Net: Improved Model for Building Extraction from High Resolution Remote Sensing Imagery","volume":"9","author":"Xu","year":"2021","journal-title":"IEEE Access"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"6106","DOI":"10.1109\/TGRS.2020.3022410","article-title":"Multiscale U-Shaped CNN Building Instance Extraction Framework with Edge Constraint for High-Spatial-Resolution Remote Sensing Imagery","volume":"59","author":"Liu","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_31","first-page":"135","article-title":"Multi-Path RSU Network Method for High-Resolution Remote Sensing Image Building Extraction","volume":"51","author":"Zhang","year":"2022","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_32","first-page":"1838","article-title":"High-Resolution Remote Sensing Image Building Extraction Based on PRCUnet","volume":"23","author":"Xu","year":"2021","journal-title":"J. Geo-inf. Sci."},{"key":"ref_33","first-page":"457","article-title":"E-Unet: A Atrous Convolution-Based Neural Network for Building Extraction from High-Resolution Remote Sensing Images","volume":"51","author":"He","year":"2022","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_34","first-page":"490","article-title":"Multi-Scale Dilated Convolutional Pyramid Network for Building Extraction","volume":"41","author":"Zhang","year":"2021","journal-title":"J. Xi\u2019an Univ. Sci. Technol."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Rashidian, V., Baise, L.G., and Koch, M. (August, January 28). Detecting Collapsed Buildings After a Natural Hazard on VHR Optical Satellite Imagery Using U-Net Convolutional Neural Networks. 
Proceedings of the IGARSS 2019\u20142019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan.","DOI":"10.1109\/IGARSS.2019.8899121"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"102994","DOI":"10.1016\/j.autcon.2019.102994","article-title":"Automated Regional Seismic Damage Assessment of Buildings Using an Unmanned Aerial Vehicle and a Convolutional Neural Network","volume":"109","author":"Xiong","year":"2020","journal-title":"Autom. Constr."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Cooner, A.J., Shao, Y., and Campbell, J.B. (2016). Detection of Urban Damage Using Remote Sensing and Machine Learning Algorithms: Revisiting the 2010 Haiti Earthquake. Remote Sens., 8.","DOI":"10.3390\/rs8100868"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1109\/MGRS.2019.2902525","article-title":"Hyperspectral Imaging for Military and Security Applications: Combining Myriad Processing and Sensing Techniques","volume":"7","author":"Shimoni","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.-C. (2018, January 18\u201323). MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_40","unstructured":"Sifre, L. (2014). Rigid-Motion Scattering for Image Classification. [Ph.D. Thesis, \u00c9cole Polytechnique]."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wu, Y., and He, K. (2018, January 8\u201314). Group Normalization. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01261-8_1"},{"key":"ref_42","unstructured":"Srivastava, R.K., Greff, K., and Schmidhuber, J. (2015). Highway Networks. 
arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhu, X., Hu, H., Lin, S., and Dai, J. (2019, January 15\u201320). Deformable ConvNets V2: More Deformable, Better Results. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00953"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_45","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning, Lille, France."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Huang, L., Zhou, Y., Wang, T., Luo, J., and Liu, X. (2022). Delving into the Estimation Shift of Batch Normalization in a Network. arXiv.","DOI":"10.1109\/CVPR52688.2022.00084"},{"key":"ref_47","unstructured":"Yu, F., and Koltun, V. (2015). Multi-Scale Context Aggregation by Dilated Convolutions. arXiv."},{"key":"ref_48","unstructured":"Loshchilov, I., and Hutter, F. (2017). SGDR: Stochastic Gradient Descent with Warm Restarts. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliez, P. (2017, January 23\u201328). Can Semantic Labeling Methods Generalize to Any City? The Inria Aerial Image Labeling Benchmark. Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA.","DOI":"10.1109\/IGARSS.2017.8127684"},{"key":"ref_50","unstructured":"Mnih, V. (2013). Machine Learning for Aerial Image Labeling. [Ph.D. 
Thesis, University of Toronto]."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/TPAMI.2018.2858826","article-title":"Focal Loss for Dense Object Detection","volume":"42","author":"Lin","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Li, X., Sun, X., Meng, Y., Liang, J., Wu, F., and Li, J. (2020). Dice Loss for Data-imbalanced NLP Tasks. arXiv.","DOI":"10.18653\/v1\/2020.acl-main.45"},{"key":"ref_53","first-page":"448","article-title":"Building Extraction via Convolutional Neural Networks from an Open Remote Sensing Building Dataset","volume":"48","author":"Ji","year":"2019","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Yu, M., Chen, X., Zhang, W., and Liu, Y. (2022). AGs-Unet: Building Extraction Model for High Resolution Remote Sensing Images Based on Attention Gates U Network. Sensors, 22.","DOI":"10.3390\/s22082932"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Zhou, D., Wang, G., He, G., Long, T., Yin, R., Zhang, Z., Chen, S., and Luo, B. (2020). Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network. Sensors, 20.","DOI":"10.3390\/s20247241"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Chen, M., Wu, J., Liu, L., Zhao, W., Tian, F., Shen, Q., Zhao, B., and Du, R. (2021). DR-Net: An Improved Network for Building Extraction from High Resolution Remote Sensing Image. Remote Sens., 13.","DOI":"10.3390\/rs13020294"},{"key":"ref_57","first-page":"1","article-title":"A Lightweight Network for Building Extraction from Remote Sensing Images","volume":"60","author":"Huang","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Shao, Z., Tang, P., Wang, Z., Saleem, N., Yam, S., and Sommai, C. (2020). 
BRRNet: A Fully Convolutional Neural Network for Automatic Building Extraction from High-Resolution Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12061050"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Liu, P., Liu, X., Liu, M., Shi, Q., Yang, J., Xu, X., and Zhang, Y. (2019). Building Footprint Extraction from High-Resolution Images via Spatial Residual Inception Convolutional Neural Network. Remote Sens., 11.","DOI":"10.3390\/rs11070830"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/16\/3914\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:07:50Z","timestamp":1760141270000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/16\/3914"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,12]]},"references-count":59,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2022,8]]}},"alternative-id":["rs14163914"],"URL":"https:\/\/doi.org\/10.3390\/rs14163914","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,12]]}}}