{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T16:38:14Z","timestamp":1764175094308,"version":"build-2065373602"},"reference-count":47,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2021,9,23]],"date-time":"2021-09-23T00:00:00Z","timestamp":1632355200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Building instances extraction is an essential task for surveying and mapping. Challenges still exist in extracting building instances from high-resolution remote sensing imagery mainly because of complex structures, variety of scales, and interconnected buildings. This study proposes a coarse-to-fine contour optimization network to improve the performance of building instance extraction. Specifically, the network contains two special sub-networks: attention-based feature pyramid sub-network (AFPN) and coarse-to-fine contour sub-network. The former sub-network introduces channel attention into each layer of the original feature pyramid network (FPN) to improve the identification of small buildings, and the latter is designed to accurately extract building contours via two cascaded contour optimization learning. Furthermore, the whole network is jointly optimized by multiple losses, that is, a contour loss, a classification loss, a box regression loss and a general mask loss. Experimental results on three challenging building extraction datasets demonstrated that the proposed method outperformed the state-of-the-art methods\u2019 accuracy and quality of building contours.<\/jats:p>","DOI":"10.3390\/rs13193814","type":"journal-article","created":{"date-parts":[[2021,9,27]],"date-time":"2021-09-27T22:16:38Z","timestamp":1632780998000},"page":"3814","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["A Coarse-to-Fine Contour Optimization Network for Extracting Building Instances from High-Resolution Remote Sensing Imagery"],"prefix":"10.3390","volume":"13","author":[{"given":"Fang","family":"Fang","sequence":"first","affiliation":[{"name":"School of Geographic and Information Engineering, China University of Geosciences, Wuhan 430074, China"},{"name":"National Engineering Research Center of Geographic Information System, China University of Geosciences, Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kaishun","family":"Wu","sequence":"additional","affiliation":[{"name":"National Engineering Research Center of Geographic Information System, China University of Geosciences, Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0465-3976","authenticated-orcid":false,"given":"Yuanyuan","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Geographic and Information Engineering, China University of Geosciences, Wuhan 430074, China"},{"name":"National Engineering Research Center of Geographic Information System, China University of Geosciences, Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1829-4006","authenticated-orcid":false,"given":"Shengwen","family":"Li","sequence":"additional","affiliation":[{"name":"School of Geographic and Information Engineering, China University of Geosciences, Wuhan 430074, China"},{"name":"National Engineering Research Center of Geographic Information System, China University of Geosciences, Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2387-5419","authenticated-orcid":false,"given":"Bo","family":"Wan","sequence":"additional","affiliation":[{"name":"School of Geographic and Information Engineering, China University of Geosciences, Wuhan 430074, China"},{"name":"National Engineering Research Center of Geographic Information System, China University of Geosciences, Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanling","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Geographic and Information Engineering, China University of Geosciences, Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daoyuan","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Geographic and Information Engineering, China University of Geosciences, Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,23]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1006\/cviu.1999.0750","article-title":"Automatic object extraction from aerial imagery\u2014A survey focusing on buildings","volume":"74","author":"Mayer","year":"1999","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Shrestha, S., and Vanneschi, L. (2018). Improved fully convolutional network with conditional random fields for building extraction. Remote Sens., 10.","DOI":"10.3390\/rs10071135"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/TGRS.2018.2858817","article-title":"Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set","volume":"57","author":"Ji","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Zhao, K., Kang, J., Jung, J., and Sohn, G. (2018, January 18\u201322). Building extraction from satellite images using mask R-CNN with building boundary regularization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00045"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wen, Q., Jiang, K., Wang, W., Liu, Q., Guo, Q., Li, L., and Wang, P. (2019). Automatic building extraction from Google Earth images under complex backgrounds based on deep instance segmentation network. Sensors, 19.","DOI":"10.3390\/s19020333"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"6106","DOI":"10.1109\/TGRS.2020.3022410","article-title":"Multiscale U-Shaped CNN Building Instance Extraction Framework With Edge Constraint for High-Spatial-Resolution Remote Sensing Imagery","volume":"59","author":"Liu","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Mohanty, S.P., Czakon, J., Kaczmarek, K.A., Pyskir, A., Tarasiewicz, P., Kunwar, S., Rohrbach, J., Luo, D., Prasad, M., and Fleer, S. (2020). Crowdai Mapping Challenge 2018: Baseline with Maskrcnn. Front. Artif. Intell., 3, Available online: https:\/\/www.crowdai.org\/challenges\/mapping-challenge\/dataset_files.","DOI":"10.3389\/frai.2020.534696"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Shao, Z., Tang, P., Wang, Z., Saleem, N., Yam, S., and Sommai, C. (2020). BRRNet: A fully convolutional neural network for automatic building extraction from high-resolution remote sensing images. Remote Sens., 12.","DOI":"10.3390\/rs12061050"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Ma, W., Wan, Y., Li, J., Zhu, S., and Wang, M. (2019). An automatic morphological attribute building extraction approach for satellite high spatial resolution imagery. Remote Sens., 11.","DOI":"10.3390\/rs11030337"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Wagner, F.H., Dalagnol, R., Tarabalka, Y., Segantine, T.Y., Thom\u00e9, R., and Hirye, M.C. (2020). U-net-id, an instance segmentation model for building extraction from satellite images\u2014Case study in the Joanopolis City, Brazil. Remote Sen., 12.","DOI":"10.3390\/rs12101544"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Yang, G., Zhang, Q., and Zhang, G. (2020). EANet: Edge-aware network for the extraction of buildings from aerial images. Remote Sens., 12.","DOI":"10.3390\/rs12132161"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1016\/j.patrec.2004.09.033","article-title":"An improved snake model for building detection from urban aerial images","volume":"26","author":"Peng","year":"2005","journal-title":"Pattern Recognit. Lett."},{"key":"ref_15","unstructured":"Shackelford, A.K., Davis, C.H., and Wang, X. (2004, January 20\u201324). Automated 2-D building footprint extraction from high-resolution satellite multispectral imagery. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Anchorage, AK, USA."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, Q., Huang, X., and Zhang, G. (2017). Urban area extraction by regional and line segment feature fusion and urban morphology analysis. Remote Sens., 9.","DOI":"10.3390\/rs9070663"},{"key":"ref_17","unstructured":"Liu, Z., Cui, S., and Yan, Q. (July, January 30). Building extraction from high resolution satellite imagery based on multi-scale image segmentation and model matching. Proceedings of the International Workshop on Earth Observation and Remote Sensing Applications (EORSA), Beijing, China."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1016\/j.rse.2019.03.033","article-title":"Automatic extraction of built-up area from ZY3 multi-view satellite imagery: Analysis of 45 global cities","volume":"226","author":"Liu","year":"2019","journal-title":"Remote Sens. Environ."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Xu, Y., Wu, L., Xie, Z., and Chen, Z. (2018). Building extraction in very high resolution remote sensing imagery using deep learning and guided filters. Remote Sens., 10.","DOI":"10.3390\/rs10010144"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.isprsjprs.2017.05.002","article-title":"Simultaneous extraction of roads and buildings in remote sensing imagery with convolutional neural networks","volume":"130","author":"Alshehhi","year":"2017","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ye, Z., Fu, Y., Gan, M., Deng, J., Comber, A., and Wang, K. (2019). Building extraction from very high resolution aerial imagery using joint attention deep neural network. Remote Sens., 11.","DOI":"10.3390\/rs11242970"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Duan, Y., and Sun, L. (August, January 28). Buildings Extraction from Remote Sensing Data Using Deep Learning Method Based on Improved U-Net Network. Proceedings of the IGARSS 2019\u20142019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan.","DOI":"10.1109\/IGARSS.2019.8899798"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Liu, P., Liu, X., Liu, M., Shi, Q., Yang, J., Xu, X., and Zhang, Y. (2019). Building footprint extraction from high-resolution images via spatial residual inception convolutional neural network. Remote Sens., 11.","DOI":"10.3390\/rs11070830"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Li, W., He, C., Fang, J., Zheng, J., Fu, H., and Yu, L. (2019). Semantic segmentation-based building footprint extraction using very high-resolution satellite images and multi-source GIS data. Remote Sens., 11.","DOI":"10.3390\/rs11040403"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chen, M., Wu, J., Liu, L., Zhao, W., Tian, F., Shen, Q., Zhao, B., and Du, R. (2021). Dr-net: An improved network for building extraction from high resolution remote sensing image. Remote Sens., 13.","DOI":"10.3390\/rs13020294"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wu, T., Hu, Y., Peng, L., and Chen, R. (2020). Improved anchor-free instance segmentation for building extraction from high-resolution remote sensing images. Remote Sens., 12.","DOI":"10.3390\/rs12182910"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Lee, Y., and Park, J. (2020, January 14\u201319). Centermask: Real-time anchor-free instance segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01392"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201322). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/0262-8856(83)90006-9","article-title":"On the accuracy of the Sobel edge detector","volume":"1","author":"Kittler","year":"1983","journal-title":"Image Vis. Comput."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Peng, S., Jiang, W., Pi, H., Li, X., Bao, H., and Zhou, X. (2020, January 14\u201319). Deep snake for real-time instance segmentation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00856"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Cheng, T., Wang, X., Huang, L., and Liu, W. (2020, January 23\u201328). Boundary-preserving mask R-CNN. Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK.","DOI":"10.1007\/978-3-030-58568-6_39"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Deng, R., Shen, C., Liu, S., Wang, H., and Liu, X. (2018, January 8\u201314). Learning to predict crisp boundaries. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01231-1_35"},{"key":"ref_39","unstructured":"Fang, F., Wu, K., and Zheng, D. (2021). A dataset of building instances of typical cities in China [DB\/OL]. Sci. Data Bank."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1214\/aoms\/1177729586","article-title":"A stochastic approximation method","volume":"22","author":"Robbins","year":"1951","journal-title":"Ann. Math. Stat."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft COCO: Common objects in context. Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Chen, K., Pang, J., Wang, J., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Shi, J., and Ouyang, W. (2019, January 15\u201320). Hybrid task cascade for instance segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00511"},{"key":"ref_44","unstructured":"Wang, X., Zhang, R., Kong, T., Li, L., and Shen, C. (2020). SOLOv2: Dynamic, faster and stronger. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"886","DOI":"10.1109\/TPAMI.2007.1027","article-title":"Laplacian operator-based edge detectors","volume":"29","author":"Wang","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1016\/S0031-3203(00)00023-6","article-title":"On the Canny edge detector","volume":"34","author":"Ding","year":"2001","journal-title":"Pattern Recognit."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Zhang, C.C., Fang, J.D., and Atlantis, P. (2016, January 11\u201314). Edge Detection Based on Improved Sobel Operator. Proceedings of the 2016 International Conference on Computer Engineering and Information Systems, Gdansk, Poland.","DOI":"10.2991\/ceis-16.2016.25"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/19\/3814\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:04:03Z","timestamp":1760166243000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/19\/3814"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,23]]},"references-count":47,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2021,10]]}},"alternative-id":["rs13193814"],"URL":"https:\/\/doi.org\/10.3390\/rs13193814","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2021,9,23]]}}}