{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T17:00:06Z","timestamp":1775667606389,"version":"3.50.1"},"reference-count":44,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2021,12,29]],"date-time":"2021-12-29T00:00:00Z","timestamp":1640736000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["41601506"],"award-info":[{"award-number":["41601506"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Fundamental Research Funds for the Central Universities, China","award":["CUG190603"],"award-info":[{"award-number":["CUG190603"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Deep learning techniques such as convolutional neural networks have largely improved the performance of building segmentation from remote sensing images. However, the images for building segmentation are often in the form of traditional orthophotos, where the relief displacement would cause non-negligible misalignment between the roof outline and the footprint of a building; such misalignment poses considerable challenges for extracting accurate building footprints, especially for high-rise buildings. Aiming at alleviating this problem, a new workflow is proposed for generating rectified building footprints from traditional orthophotos. We first use the facade labels, which are prepared efficiently at low cost, along with the roof labels to train a semantic segmentation network. Then, the well-trained network, which employs the state-of-the-art version of EfficientNet as backbone, extracts the roof segments and the facade segments of buildings from the input image. Finally, after clustering the classified pixels into instance-level building objects and tracing out the roof outlines, an energy function is proposed to drive the roof outline to maximally align with the building footprint; thus, the rectified footprints can be generated. The experiments on the aerial orthophotos covering a high-density residential area in Shanghai demonstrate that the proposed workflow can generate obviously more accurate building footprints than the baseline methods, especially for high-rise buildings.<\/jats:p>","DOI":"10.3390\/s22010207","type":"journal-article","created":{"date-parts":[[2021,12,29]],"date-time":"2021-12-29T08:12:15Z","timestamp":1640765535000},"page":"207","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Extracting Rectified Building Footprints from Traditional Orthophotos: A New Workflow"],"prefix":"10.3390","volume":"22","author":[{"given":"Qi","family":"Chen","sequence":"first","affiliation":[{"name":"School of Geography and Information Engineering, China University of Geosciences (Wuhan), Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuanyi","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Geography and Information Engineering, China University of Geosciences (Wuhan), Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xinyuan","family":"Li","sequence":"additional","affiliation":[{"name":"School of Geography and Information Engineering, China University of Geosciences (Wuhan), Wuhan 430074, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5011-9446","authenticated-orcid":false,"given":"Pengjie","family":"Tao","sequence":"additional","affiliation":[{"name":"School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430072, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Boonpook, W., Tan, Y., Ye, Y., Torteeka, P., Torsri, K., and Dong, S. (2018). A deep learning approach on building detection from unmanned aerial vehicle-based images in riverbank monitoring. Sensors, 18.","DOI":"10.3390\/s18113921"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1016\/j.isprsjprs.2020.10.008","article-title":"An end-to-end shape modeling framework for vectorized building outline generation from aerial images","volume":"170","author":"Chen","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"6106","DOI":"10.1109\/TGRS.2020.3022410","article-title":"Multiscale U-shaped CNN building instance extraction framework with edge constraint for high-spatial-resolution remote sensing imagery","volume":"59","author":"Liu","year":"2020","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wu, G., Shao, X., Guo, Z., Chen, Q., Yuan, W., Shi, X., Xu, Y., and Shibasaki, R. (2018). Automatic Building Segmentation of Aerial Imagery Using Multi-Constraint Fully Convolutional Networks. Remote. Sens., 10.","DOI":"10.3390\/rs10030407"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Wen, Q., Jiang, K., Wang, W., Liu, Q., Guo, Q., Li, L., and Wang, P. (2019). Automatic building extraction from google earth images under complex backgrounds based on deep instance segmentation network. Sensors, 19.","DOI":"10.3390\/s19020333"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chen, Z., Li, D., Fan, W., Guan, H., Wang, C., and Li, J. (2021). Self-attention in reconstruction bias U-Net for semantic segmentation of building rooftops in optical remote sensing images. Remote Sens., 13.","DOI":"10.3390\/rs13132524"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Gharibi, H., and Habib, A. (2018). True orthophoto generation from aerial frame images and LiDAR data: An update. Remote Sens., 10.","DOI":"10.3390\/rs10040581"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"095087","DOI":"10.1117\/1.JRS.9.095087","article-title":"Automatic true orthophoto generation based on three-dimensional building model using multiview urban aerial images","volume":"9","author":"Deng","year":"2015","journal-title":"J. Appl. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1016\/j.rse.2018.02.025","article-title":"Multi-sensor feature fusion for very high spatial resolution built-up area extraction in temporary settlements","volume":"209","author":"Pelizari","year":"2018","journal-title":"Remote Sens. Environ."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Guo, Z., Shao, X., Xu, Y., Miyazaki, H., Ohira, W., and Shibasaki, R. (2016). Identification of village building via Google Earth images and supervised machine learning methods. Remote Sens., 8.","DOI":"10.3390\/rs8040271"},{"key":"ref_11","first-page":"58","article-title":"Building extraction from high-resolution optical spaceborne images using the integration of support vector machine (SVM) classification, Hough transformation and perceptual grouping","volume":"34","author":"Turker","year":"2015","journal-title":"Int. J. Appl. Earth Obs. Geoinf."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"7092","DOI":"10.1109\/TGRS.2017.2740362","article-title":"High-resolution aerial image labeling with convolutional neural networks","volume":"55","author":"Maggiori","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1016\/j.isprsjprs.2018.11.011","article-title":"Aerial imagery for roof segmentation: A large-scale dataset towards automatic mapping of buildings","volume":"147","author":"Chen","year":"2019","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1109\/TGRS.2016.2612821","article-title":"Convolutional neural networks for large-scale remote-sensing image classification","volume":"55","author":"Maggiori","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Xu, Y., Wu, L., Xie, Z., and Chen, Z. (2018). Building extraction in very high resolution remote sensing imagery using deep learning and guided filters. Remote Sens., 10.","DOI":"10.3390\/rs10010144"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Liu, P., Liu, X., Liu, M., Shi, Q., Yang, J., Xu, X., and Zhang, Y. (2019). Building footprint extraction from high-resolution images via spatial residual inception convolutional neural network. Remote Sens., 11.","DOI":"10.3390\/rs11070830"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yang, G., Zhang, Q., and Zhang, G. (2020). EANet: Edge-aware network for the extraction of buildings from aerial images. Remote Sens., 12.","DOI":"10.3390\/rs12132161"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"112589","DOI":"10.1016\/j.rse.2021.112589","article-title":"Deep building footprint update network: A semi-supervised method for updating existing building footprint from bi-temporal remote sensing images","volume":"264","author":"Guo","year":"2021","journal-title":"Remote Sens. Environ."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.isprsjprs.2019.02.019","article-title":"Automatic building extraction from high-resolution aerial images and LiDAR data using gated residual refinement network","volume":"151","author":"Huang","year":"2019","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lin, G., Milan, A., Shen, C., and Reid, I. (2017, January 21\u201326). Refinenet: Multi-path refinement networks for high-resolution semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.549"},{"key":"ref_26","unstructured":"Chaudhuri, K., and Salakhutdinov, R. (2019, January 9\u201315). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.isprsjprs.2021.02.014","article-title":"Building outline delineation: From aerial images to polygons with an improved end-to-end learning framework","volume":"175","author":"Zhao","year":"2021","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Demir, I., Koperski, K., Lindenbaum, D., Pang, G., Huang, J., Basu, S., Hughes, F., Tuia, D., and Raskar, R. (2018, January 18\u201322). Deepglobe 2018: A challenge to parse the earth through satellite images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00031"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliez, P. (2017, January 23\u201328). Can semantic labeling methods generalize to any city? the inria aerial image labeling benchmark. Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA.","DOI":"10.1109\/IGARSS.2017.8127684"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Yang, N., and Tang, H. (2020). GeoBoost: An incremental deep learning approach toward global mapping of buildings from VHR remote sensing images. Remote Sens., 12.","DOI":"10.3390\/rs12111794"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Girard, N., Charpiat, G., and Tarabalka, Y. (2018, January 2\u20136). Aligning and updating cadaster maps with aerial images by multi-task, multi-resolution deep learning. Proceedings of the Asian Conference on Computer Vision, Perth, Australia.","DOI":"10.1007\/978-3-030-20873-8_43"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zampieri, A., Charpiat, G., Girard, N., and Tarabalka, Y. (2018, January 8\u201314). Multimodal image alignment through a multiscale chain of neural networks with application to remote sensing. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01270-0_40"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1016\/j.isprsjprs.2019.05.013","article-title":"Improving public data for building segmentation from Convolutional Neural Networks (CNNs) for fused airborne lidar and image data using active contours","volume":"154","author":"Griffiths","year":"2019","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1016\/j.isprsjprs.2018.11.010","article-title":"Correcting rural building annotations in OpenStreetMap using convolutional neural networks","volume":"147","author":"Lobry","year":"2019","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"12334","DOI":"10.3390\/rs61212334","article-title":"Automatic seamline network generation for urban orthophoto mosaicking with the use of a digital surface model","volume":"6","author":"Chen","year":"2014","journal-title":"Remote Sens."},{"key":"ref_36","first-page":"697","article-title":"Perspective correction of building facade images for architectural applications","volume":"22","author":"Soycan","year":"2019","journal-title":"Eng. Sci. Technol. Int. J."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Gong, J., Hu, X., Pang, S., and Li, K. (2019). Patch matching and dense crf-based co-refinement for building change detection from bi-temporal aerial images. Sensors, 19.","DOI":"10.3390\/s19071557"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhuo, X., Fraundorfer, F., Kurz, F., and Reinartz, P. (2018). Optimization of OpenStreetMap building footprints based on semantic information of oblique UAV images. Remote Sens., 10.","DOI":"10.3390\/rs10040624"},{"key":"ref_39","unstructured":"Tan, M., and Le, Q.V. (2021). EfficientNetV2: Smaller Models and Faster Training. arXiv."},{"key":"ref_40","unstructured":"Ramachandran, P., Zoph, B., and Le, Q.V. (2017). Searching for Activation Functions. arXiv, 1\u201313."},{"key":"ref_41","first-page":"8026","article-title":"Pytorch: An imperative style, high-performance deep learning library","volume":"32","author":"Paszke","year":"2019","journal-title":"Proc. Adv. Neural Inf. Process. Syst."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Douglas, D.H., and Peucker, T.K. (1973). Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartogr. Int. J. Geogr. Inf. Geovis.","DOI":"10.3138\/FM57-6770-U75U-7727"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Li, Z., Wegner, J.D., and Lucchi, A. (2019, January 27\u201328). Topological map extraction from overhead images. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00180"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1111\/j.1469-8137.1912.tb05611.x","article-title":"The distribution of the flora in the alpine zone","volume":"11","author":"Jaccard","year":"1912","journal-title":"New Phytol."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/1\/207\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:55:15Z","timestamp":1760169315000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/1\/207"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,29]]},"references-count":44,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1]]}},"alternative-id":["s22010207"],"URL":"https:\/\/doi.org\/10.3390\/s22010207","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,29]]}}}