{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,17]],"date-time":"2026-07-17T12:22:18Z","timestamp":1784290938771,"version":"3.55.0"},"reference-count":24,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2020,3,6]],"date-time":"2020-03-06T00:00:00Z","timestamp":1583452800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In this paper, we consider building extraction from high spatial resolution remote sensing images. At present, most building extraction methods are based on artificial features. However, the diversity and complexity of buildings mean that building extraction methods still face great challenges, so methods based on deep learning have recently been proposed. In this paper, a building extraction framework based on a convolution neural network and edge detection algorithm is proposed. The method is called Mask R-CNN Fusion Sobel. Because of the outstanding achievement of Mask R-CNN in the field of image segmentation, this paper improves it and then applies it in remote sensing image building extraction. Our method consists of three parts. First, the convolutional neural network is used for rough location and pixel level classification, and the problem of false and missed extraction is solved by automatically discovering semantic features. Second, Sobel edge detection algorithm is used to segment building edges accurately so as to solve the problem of edge extraction and the integrity of the object of deep convolutional neural networks in semantic segmentation. Third, buildings are extracted by the fusion algorithm. 
We use the proposed framework to extract buildings from high-resolution remote sensing images acquired by the Chinese satellite GF-2, and the experiments show that the average IOU (intersection over union) of the proposed method was 88.7% and the average Kappa was 87.8%. Therefore, our method can be applied to the recognition and segmentation of complex buildings and is superior to the classical method in accuracy.<\/jats:p>","DOI":"10.3390\/s20051465","type":"journal-article","created":{"date-parts":[[2020,3,9]],"date-time":"2020-03-09T05:37:34Z","timestamp":1583732254000},"page":"1465","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":86,"title":["An Efficient Building Extraction Method from High Spatial Resolution Remote Sensing Images Based on Improved Mask R-CNN"],"prefix":"10.3390","volume":"20","author":[{"given":"Lili","family":"Zhang","sequence":"first","affiliation":[{"name":"College of Computer and Information Engineering, Hohai University, Nanjing 211100, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jisen","family":"Wu","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Hohai University, Nanjing 211100, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yu","family":"Fan","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Hohai University, Nanjing 211100, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8404-2464","authenticated-orcid":false,"given":"Hongmin","family":"Gao","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Hohai University, Nanjing 211100, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yehong","family":"Shao","sequence":"additional","affiliation":[{"name":"Arts and Science, Ohio University 
Southern, Ironton, OH 45638, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2020,3,6]]},"reference":[{"key":"ref_1","unstructured":"Jung, C.R., and Schramm, R. (2004, January 17\u201320). Rectangle detection based on a windowed Hough transform. Proceedings of the 17th Brazilian Symposium on Computer Graphics and Image Processing, Foz do Igua\u00e7u, Brazil."},{"key":"ref_2","first-page":"150","article-title":"Automatic urban building boundary extraction from high resolution aerial images using an innovative model of active contours","volume":"12","author":"Ahmadi","year":"2010","journal-title":"Int. J. Appl. Earth Obs. Geoinf."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1016\/j.rse.2010.12.017","article-title":"Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery","volume":"115","author":"Myint","year":"2011","journal-title":"Remote Sens. Environ."},{"key":"ref_4","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the Advances in Neural Information Processing Systems 25, Lake Tahoe, NV, USA."},{"key":"ref_5","unstructured":"Li, Z., Peng, C., Yu, G., Zhang, X., Deng, Y., and Sun, J. (2017). Light-head R-CNN: In defense of two-stage object detector. arXiv, Available online: http:\/\/dwz.date\/CyZ."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition","volume":"37","author":"He","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. 
Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015, Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, 5\u20139 October, Munich, Germany, Springer.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_8","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv, Available online: http:\/\/dwz.date\/Czd."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 10). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_11","first-page":"3","article-title":"Study on Building Extraction from High-Resolution Images Using Mbi","volume":"42","author":"Ding","year":"2018","journal-title":"Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2152","DOI":"10.1080\/01431161.2011.606852","article-title":"Unsupervised building detection in complex urban environments from multispectral satellite imagery","volume":"33","author":"Aytekin","year":"2012","journal-title":"Int. J. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"482","DOI":"10.1007\/s11227-016-1890-9","article-title":"Efficient implementation of morphological index for building\/shadow extraction from remotely sensed images","volume":"73","author":"Plaza","year":"2017","journal-title":"J. 
Supercomput."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chen, J., Wang, C., Zhang, H., Wu, F., Zhang, B., and Lei, W. (2017). Automatic detection of low-rise gable-roof building from single submeter SAR images based on local multilevel segmentation. Remote Sens., 9.","DOI":"10.3390\/rs9030263"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1080\/01431161.2010.548410","article-title":"Complex building description and extraction based on Hough transformation and cycle detection","volume":"3","author":"Cui","year":"2012","journal-title":"Remote Sens. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Guo, Z., Chen, Q., Wu, G., Xu, Y., Shibasaki, R., and Shao, X. (2017). Village Building Identification Based on Ensemble Convolutional Neural Networks. Sensors, 17.","DOI":"10.3390\/s17112487"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.isprsjprs.2018.02.006","article-title":"Building instance classification using street view images","volume":"145","author":"Kang","year":"2018","journal-title":"ISPRS J. Photogramm."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Makantasis, K., Karantzalos, K., Doulamis, A., and Loupos, K. (2015, January 14\u201316). Deep learning-based man-made object detection from hyperspectral data. 
Proceedings of the 11th International Symposium, ISVC 2015, Las Vegas, NV, USA.","DOI":"10.1007\/978-3-319-27857-5_64"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1016\/j.patcog.2016.07.001","article-title":"Towards Better Exploiting Convolutional Neural Networks for Remote Sensing Scene Classification","volume":"61","author":"Nogueira","year":"2016","journal-title":"Pattern Recognit."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1007\/s00607-018-0609-6","article-title":"PTL-CFS based deep convolutional neural network model for remote sensing classification","volume":"100","author":"Yu","year":"2018","journal-title":"Computing"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 24\u201327). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 13\u201316). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_23","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards real-time object detection with region proposal networks. Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 8\u201310). Fully Convolutional Networks for Semantic Segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/5\/1465\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:04:58Z","timestamp":1760173498000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/5\/1465"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,6]]},"references-count":24,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2020,3]]}},"alternative-id":["s20051465"],"URL":"https:\/\/doi.org\/10.3390\/s20051465","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,6]]}}}