{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T17:36:00Z","timestamp":1774632960541,"version":"3.50.1"},"reference-count":55,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2022,9,22]],"date-time":"2022-09-22T00:00:00Z","timestamp":1663804800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Key Research Program of Frontier Sciences of the CAS","award":["QYZDB-SSW-DQC005"],"award-info":[{"award-number":["QYZDB-SSW-DQC005"]}]},{"name":"Key Research Program of Frontier Sciences of the CAS","award":["XDA19040301"],"award-info":[{"award-number":["XDA19040301"]}]},{"name":"Key Research Program of Frontier Sciences of the CAS","award":["CAS-WX2021PY-0109"],"award-info":[{"award-number":["CAS-WX2021PY-0109"]}]},{"name":"Strategic Priority Research Program of the CAS","award":["QYZDB-SSW-DQC005"],"award-info":[{"award-number":["QYZDB-SSW-DQC005"]}]},{"name":"Strategic Priority Research Program of the CAS","award":["XDA19040301"],"award-info":[{"award-number":["XDA19040301"]}]},{"name":"Strategic Priority Research Program of the CAS","award":["CAS-WX2021PY-0109"],"award-info":[{"award-number":["CAS-WX2021PY-0109"]}]},{"name":"Informatization Plan of Chinese Academy of Sciences of the CAS","award":["QYZDB-SSW-DQC005"],"award-info":[{"award-number":["QYZDB-SSW-DQC005"]}]},{"name":"Informatization Plan of Chinese Academy of Sciences of the CAS","award":["XDA19040301"],"award-info":[{"award-number":["XDA19040301"]}]},{"name":"Informatization Plan of Chinese Academy of Sciences of the CAS","award":["CAS-WX2021PY-0109"],"award-info":[{"award-number":["CAS-WX2021PY-0109"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Convolutional neural network (CNN)-based remote sensing (RS) image segmentation has become a widely 
used method for building footprint mapping. Recently, DeeplabV3+, an advanced CNN architecture, has shown satisfactory performance for building extraction in different urban landscapes. However, it faces challenges due to the large amount of labeled data required for model training and the extremely high costs associated with the annotation of unlabeled data. These challenges encouraged us to design a framework for building footprint mapping with less labeled data. In this context, the published studies on RS image segmentation are reviewed first, with a particular emphasis on the use of active learning (AL), incremental learning (IL), transfer learning (TL), and their integration for reducing the cost of data annotation. Based on the literature review, we defined three candidate frameworks by integrating AL strategies (i.e., margin sampling, entropy, and vote entropy), IL, TL, and DeeplabV3+. They examine the efficacy of AL, the efficacy of IL in accelerating AL performance, and the efficacy of both IL and TL in accelerating AL performance, respectively. Additionally, these frameworks enable the iterative selection of image tiles to be annotated, training and evaluation of DeeplabV3+, and quantification of the landscape features of selected image tiles. Then, all candidate frameworks were examined using the WHU aerial building dataset as it has sufficient (i.e., 8188) labeled image tiles with representative buildings (i.e., various densities, areas, roof colors, and shapes of buildings). 
The results support our theoretical analysis: (1) all three AL strategies reduced the number of image tiles by selecting the most informative image tiles, and no significant differences were observed in their performance; (2) image tiles with more buildings and larger building area were proven to be informative for the three AL strategies, which were prioritized during the data selection process; (3) IL can expedite model training by accumulating knowledge from chosen labeled tiles; (4) TL provides a better initial learner by incorporating knowledge from a pre-trained model; (5) DeeplabV3+ incorporated with IL, TL, and AL has the best performance in reducing the cost of data annotation. It achieved good performance (i.e., mIoU of 0.90) using only 10\u201315% of the sample dataset; DeeplabV3+ needs 50% of the sample dataset to realize the equivalent performance. The proposed frameworks concerning DeeplabV3+ and the results imply that integrating TL, AL, and IL in human-in-the-loop building extraction could be considered in real-world applications, especially for building footprint mapping.<\/jats:p>","DOI":"10.3390\/rs14194738","type":"journal-article","created":{"date-parts":[[2022,9,22]],"date-time":"2022-09-22T23:07:55Z","timestamp":1663888075000},"page":"4738","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["A Framework Integrating DeeplabV3+, Transfer Learning, Active Learning, and Incremental Learning for Mapping Building Footprints"],"prefix":"10.3390","volume":"14","author":[{"given":"Zhichao","family":"Li","sequence":"first","affiliation":[{"name":"Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5687-803X","authenticated-orcid":false,"given":"Jinwei","family":"Dong","sequence":"additional","affiliation":[{"name":"Key 
Laboratory of Land Surface Pattern and Simulation, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,9,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"044003","DOI":"10.1088\/1748-9326\/4\/4\/044003","article-title":"A new map of global urban extent from MODIS satellite data","volume":"4","author":"Schneider","year":"2009","journal-title":"Environ. Res. Lett."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"112589","DOI":"10.1016\/j.rse.2021.112589","article-title":"Deep building footprint update network: A semi-supervised method for updating existing building footprint from bi-temporal remote sensing images","volume":"264","author":"Guo","year":"2021","journal-title":"Remote Sens. Environ."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Jochem, W.C., and Tatem, A.J. (2021). Tools for mapping multi-scale settlement patterns of building footprints: An introduction to the R package foot. PLoS ONE, 16.","DOI":"10.1371\/journal.pone.0247535"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.isprsjprs.2019.02.006","article-title":"Semantic segmentation of slums in satellite images using transfer learning on fully convolutional neural networks","volume":"150","author":"Wurm","year":"2019","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Touzani, S., and Granderson, J. (2021). Open Data and Deep Semantic Segmentation for Automated Extraction of Building Footprints. Remote Sens., 13.","DOI":"10.3390\/rs13132578"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2600","DOI":"10.1109\/JSTARS.2018.2835377","article-title":"Building Extraction at Scale Using Convolutional Neural Network: Mapping of the United States","volume":"11","author":"Yang","year":"2018","journal-title":"IEEE J. Sel. Top. 
Appl. Earth Obs. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Li, Z., Xin, Q., Sun, Y., and Cao, M. (2021). A Deep Learning-Based Framework for Automated Extraction of Building Footprint Polygons from Very High-Resolution Aerial Imagery. Remote Sens., 13.","DOI":"10.3390\/rs13183630"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Pasquali, G., Iannelli, G.C., and Dell\u2019Acqua, F. (2019). Building Footprint Extraction from Multispectral, Spaceborne Earth Observation Datasets Using a Structurally Optimized U-Net Convolutional Neural Network. Remote Sens., 11.","DOI":"10.3390\/rs11232803"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"115530","DOI":"10.1016\/j.eswa.2021.115530","article-title":"Dilated-ResUnet: A novel deep learning architecture for building extraction from medium resolution multi-spectral satellite imagery","volume":"184","author":"Dixit","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhao, F., and Zhang, C. (2020, January 11\u201313). Building Damage Evaluation from Satellite Imagery using Deep Learning. Proceedings of the 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI), Las Vegas, NV, USA.","DOI":"10.1109\/IRI49571.2020.00020"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Pan, Z., Xu, J., Guo, Y., Hu, Y., and Wang, G. (2020). Deep Learning Segmentation and Classification for Urban Village Using a Worldview Satellite Image Based on U-Net. Remote Sens., 12.","DOI":"10.3390\/rs12101574"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Wagner, F., Dalagnol, R., Tarabalka, Y., Segantine, T., Thom\u00e9, R., and Hirye, M. (2020). U-Net-Id, an Instance Segmentation Model for Building Extraction from Satellite Images\u2014Case Study in the Joan\u00f3polis City, Brazil. 
Remote Sens., 12.","DOI":"10.3390\/rs12101544"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1501","DOI":"10.1080\/10106049.2020.1778100","article-title":"Automatic building footprint extraction from very high-resolution imagery using deep learning techniques","volume":"37","author":"Rastogi","year":"2020","journal-title":"Geocarto Int."},{"key":"ref_14","unstructured":"Jiwani, A., Ganguly, S., Ding, C., Zhou, N., and Chan, D.M. (2021). A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1080\/17538947.2020.1831087","article-title":"Incorporating DeepLabv3+ and object-based image analysis for semantic segmentation of very high resolution remote sensing images","volume":"14","author":"Du","year":"2021","journal-title":"Int. J. Digit. Earth"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Li, Z., Zhang, S., and Dong, J. (2022). Suggestive Data Annotation for CNN-Based Building Footprint Mapping Based on Deep Active Learning and Landscape Metrics. Remote Sens., 14.","DOI":"10.3390\/rs14133147"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1038\/s41597-020-0542-3","article-title":"A rasterized building footprint dataset for the United States","volume":"7","author":"Heris","year":"2020","journal-title":"Sci. Data"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/TGRS.2018.2858817","article-title":"Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set","volume":"57","author":"Ji","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliez, P. (2017, January 23\u201328). Can semantic labeling methods generalize to any city? The inria aerial image labeling benchmark. 
Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA.","DOI":"10.1109\/IGARSS.2017.8127684"},{"key":"ref_20","unstructured":"Etten, A.V., Lindenbaum, D., and Bacastow, T.M. (2018). SpaceNet: A Remote Sensing Dataset and Challenge Series. arXiv."},{"key":"ref_21","unstructured":"Mnih, V. (2013). Machine Learning for Aerial Image Labeling, University of Toronto."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yang, N., and Tang, H. (2020). GeoBoost: An Incremental Deep Learning Approach toward Global Mapping of Buildings from VHR Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12111794"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"11530","DOI":"10.1109\/JSTARS.2021.3123398","article-title":"A Large-Scale Mapping Scheme for Urban Building From Gaofen-2 Images Using Deep Learning and Hierarchical Approach","volume":"14","author":"Zhou","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Li, J., Meng, L., Yang, B., Tao, C., Li, L., and Zhang, W. (2021). LabelRS: An Automated Toolbox to Make Deep Learning Samples from Remote Sensing Images. Remote Sens., 13.","DOI":"10.3390\/rs13112064"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"15014","DOI":"10.3390\/rs71115014","article-title":"Accurate Annotation of Remote Sensing Images via Active Spectral Clustering with Little Expert Knowledge","volume":"7","author":"Xia","year":"2015","journal-title":"Remote Sens."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3510414","article-title":"A Survey of Deep Active Learning","volume":"54","author":"Ren","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_27","first-page":"2509","article-title":"Human-Machine Collaboration for Fast Land Cover Mapping","volume":"34","author":"Robinson","year":"2020","journal-title":"Proc. AAAI Conf. Artif. 
Intell."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"6440","DOI":"10.1109\/TGRS.2018.2838665","article-title":"Active Learning With Convolutional Neural Networks for Hyperspectral Image Classification Using a New Bayesian Approach","volume":"56","author":"Haut","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Belharbi, S., Ayed, I.B., McCaffrey, L., and Granger, E. (2021, January 3\u20138). Deep Active Learning for Joint Classification & Segmentation with Weak Annotator. Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA.","DOI":"10.1109\/WACV48630.2021.00338"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Siddiqui, Y., Valentin, J., and Nie\u00dfner, M. (2020, January 13\u201319). ViewAL: Active Learning with Viewpoint Entropy for Semantic Segmentation. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00945"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2591","DOI":"10.1109\/TCSVT.2016.2589879","article-title":"Cost-Effective Active Learning for Deep Image Classification","volume":"27","author":"Wang","year":"2017","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"3524","DOI":"10.1109\/JSTARS.2019.2925416","article-title":"Incremental Learning for Semantic Segmentation of Large-Scale Remote Sensing Data","volume":"12","author":"Tasar","year":"2019","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/j.isprsjprs.2020.09.003","article-title":"Active and incremental learning for semantic ALS point cloud segmentation","volume":"169","author":"Lin","year":"2020","journal-title":"ISPRS J. Photogramm. 
Remote Sens."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","article-title":"A Survey on Transfer Learning","volume":"22","author":"Pan","year":"2010","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_35","first-page":"102313","article-title":"Crop type mapping by using transfer learning","volume":"98","author":"Nowakowski","year":"2021","journal-title":"Int. J. Appl. Earth Obs. Geoinf."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"4057","DOI":"10.1080\/01431161.2020.1714774","article-title":"Using convolutional neural networks incorporating hierarchical active learning for target-searching in large-scale remote sensing images","volume":"41","author":"Xu","year":"2020","journal-title":"Int. J. Remote Sens."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Xie, M., Jean, N., Burke, M., Lobell, D., and Ermon, S. (2016, January 12\u201317). Transfer Learning from Deep Features for Remote Sensing and Poverty Mapping. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.9906"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhou, Z., Shin, J., Zhang, L., Gurudu, S., Gotway, M., and Liang, J. (2017, January 21\u201326). Fine-Tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.506"},{"key":"ref_39","unstructured":"Settles, B. (2009). 
Active Learning Literature Survey, Department of Computer Sciences, University of Wisconsin-Madison."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"964","DOI":"10.3390\/rs6020964","article-title":"Comparison of Classification Algorithms and Training Sample Sizes in Urban Land Classification with Landsat Thematic Mapper Imagery","volume":"6","author":"Li","year":"2014","journal-title":"Remote Sens."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1016\/j.isprsjprs.2020.10.018","article-title":"From local to global: A transfer learning-based approach for mapping poplar plantations at national scale using Sentinel-2","volume":"171","author":"Hamrouni","year":"2021","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wang, Z., and Brenning, A. (2021). Active-Learning Approaches for Landslide Mapping Using Support Vector Machines. Remote Sens., 13.","DOI":"10.3390\/rs13132588"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"2993","DOI":"10.1109\/TITS.2017.2665658","article-title":"Road Recognition From Remote Sensing Imagery Using Incremental Learning","volume":"18","author":"Zhang","year":"2017","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Qin, R., and Liu, T. (2022). A Review of Landcover Classification with Very-High Resolution Remotely Sensed Optical Images\u2014Analysis Unit, Model Scalability and Transferability. Remote Sens., 14.","DOI":"10.3390\/rs14030646"},{"key":"ref_45","unstructured":"Ulmas, P., and Liiv, I. (2020). Segmentation of Satellite Imagery using U-Net Models for Land Cover Classification. arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Zhu, Q., Zhong, Y., Guan, Q., Zhang, L., and Li, D. (October, January 26). A Modified D-Linknet with Transfer Learning for Road Extraction from High-Resolution Remote Sensing. 
Proceedings of the IGARSS 2020\u20142020 IEEE International Geoscience and Remote Sensing Symposium, Waikoloa, HI, USA.","DOI":"10.1109\/IGARSS39084.2020.9324236"},{"key":"ref_47","unstructured":"He, K., Girshick, R., and Doll\u00e1r, P. (November, January 27). Rethinking imagenet pre-training. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"105219","DOI":"10.1016\/j.cmpb.2019.105219","article-title":"An incremental learning system for atrial fibrillation detection based on transfer learning and active learning","volume":"187","author":"Shi","year":"2020","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"114417","DOI":"10.1016\/j.eswa.2020.114417","article-title":"A review of deep learning methods for semantic segmentation of remote sensing imagery","volume":"169","author":"Yuan","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Bosch, M. (2019). PyLandStats: An open-source Pythonic library to compute landscape metrics. PLoS ONE, 14.","DOI":"10.1101\/715052"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"2715","DOI":"10.1007\/s10994-021-05972-1","article-title":"Spatial dependence between training and test sets: Another pitfall of classification accuracy assessment in remote sensing","volume":"111","author":"Karasiak","year":"2022","journal-title":"Mach. 
Learn."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1016\/j.neucom.2020.10.115","article-title":"HAL: Hybrid active learning for efficient labeling in medical domain","volume":"456","author":"Wu","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Lin, C., Wang, S., Liu, W., and Tian, Y. (2016). Estimation of Building Density with the Integrated Use of GF-1 PMS and Radarsat-2 Data. Remote Sens., 8.","DOI":"10.3390\/rs8110969"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"108278","DOI":"10.1016\/j.knosys.2022.108278","article-title":"One-shot active learning for image segmentation via contrastive learning and diversity-based sampling","volume":"241","author":"Jin","year":"2022","journal-title":"Knowl. Based Syst."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/19\/4738\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:37:34Z","timestamp":1760143054000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/19\/4738"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,22]]},"references-count":55,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2022,10]]}},"alternative-id":["rs14194738"],"URL":"https:\/\/doi.org\/10.3390\/rs14194738","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,22]]}}}