{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T17:54:07Z","timestamp":1775066047561,"version":"3.50.1"},"reference-count":75,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2022,7,18]],"date-time":"2022-07-18T00:00:00Z","timestamp":1658102400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["61901439"],"award-info":[{"award-number":["61901439"]}]},{"name":"National Natural Science Foundation of China","award":["ZDBS-LY-JSC036"],"award-info":[{"award-number":["ZDBS-LY-JSC036"]}]},{"name":"Key Research Program of Frontier Sciences, Chinese Academy of Science","award":["61901439"],"award-info":[{"award-number":["61901439"]}]},{"name":"Key Research Program of Frontier Sciences, Chinese Academy of Science","award":["ZDBS-LY-JSC036"],"award-info":[{"award-number":["ZDBS-LY-JSC036"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>The generation of topographic classification maps or relative heights from aerial or remote sensing images represents a crucial research tool in remote sensing. On the one hand, from auto-driving, three-dimensional city modeling, road design, and resource statistics to smart cities, each task requires relative height data and classification data of objects. On the other hand, most relative height data acquisition methods currently use multiple images. We find that relative height and geographic classification data can be mutually assisted through data distribution. In recent years, with the rapid development of artificial intelligence technology, it has become possible to estimate the relative height from a single image. It learns implicit mapping relationships in a data-driven manner that may not be explicitly available through mathematical modeling. On this basis, we propose a unified, in-depth learning structure that can generate both estimated relative height maps and semantically segmented maps and perform end-to-end training. Compared with the existing methods, our task is to perform both relative height estimation and semantic segmentation tasks simultaneously. We only need one picture to obtain the corresponding semantically segmented images and relative heights simultaneously. The model\u2019s performance is much better than that of equivalent computational models. We also designed dynamic weights to enable the model to learn relative height estimation and semantic segmentation simultaneously. At the same time, we have conducted good experiments on existing datasets. The experimental results show that the proposed Transformer-based network architecture is suitable for relative height estimation tasks and vastly outperforms other state-of-the-art DL (Deep Learning) methods.<\/jats:p>","DOI":"10.3390\/rs14143450","type":"journal-article","created":{"date-parts":[[2022,7,19]],"date-time":"2022-07-19T00:19:21Z","timestamp":1658189961000},"page":"3450","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Multi-Task Learning of Relative Height Estimation and Semantic Segmentation from Single Airborne RGB Images"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0673-2854","authenticated-orcid":false,"given":"Min","family":"Lu","sequence":"first","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China"},{"name":"School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China"}]},{"given":"Jiayin","family":"Liu","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6494-3639","authenticated-orcid":false,"given":"Feng","family":"Wang","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2063-9816","authenticated-orcid":false,"given":"Yuming","family":"Xiang","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"},{"name":"Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1002\/esp.1210","article-title":"Methods for the visualization of digital elevation models for landform mapping","volume":"30","author":"Smith","year":"2005","journal-title":"Earth Surf. Process. Landforms"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/S0016-7061(00)00046-X","article-title":"Use of combined digital elevation model and satellite radiometric data for regional soil mapping","volume":"97","author":"Dobos","year":"2000","journal-title":"Geoderma"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1016\/S0016-7061(01)00096-9","article-title":"Soil erosion caused by extreme rainfall events: Mapping and quantification in agricultural plots from very detailed digital elevation models","volume":"105","author":"Ramos","year":"2002","journal-title":"Geoderma"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1481","DOI":"10.5194\/hess-11-1481-2007","article-title":"Uncertainties associated with digital elevation models for hydrologic applications: A review","volume":"11","author":"Wechsler","year":"2007","journal-title":"Hydrol. Earth Syst. Sci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2259","DOI":"10.1029\/1999WR900034","article-title":"On the effect of digital elevation model accuracy on hydrology and geomorphology","volume":"35","author":"Walker","year":"1999","journal-title":"Water Resour. Res."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2215","DOI":"10.1007\/s12517-014-1273-6","article-title":"Scale matching of multiscale digital elevation model (DEM) data and the Weather Research and Forecasting (WRF) model: A case study of meteorological simulation in Hong Kong","volume":"7","author":"Zhang","year":"2014","journal-title":"Arab. J. Geosci."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1016\/0341-8162(92)90022-4","article-title":"The digital elevation model of Italy for geomorphology and structural geology","volume":"19","author":"Onorati","year":"1992","journal-title":"Catena"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1016\/S0016-7061(00)00081-1","article-title":"Digital elevation model resolution: Effects on terrain attribute calculation and quantitative soil-landscape modeling","volume":"100","author":"Thompson","year":"2001","journal-title":"Geoderma"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhou, S., Mi, L., Chen, H., and Geng, Y. (2013, January 22\u201323). Building detection in Digital surface model. Proceedings of the IEEE International Conference on Imaging Systems and Techniques (IST), Beijing, China.","DOI":"10.1109\/IST.2013.6729690"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Dawid, W., and Pokonieczny, K. (2020). Analysis of the Possibilities of Using Different Resolution Digital Elevation Models in the Study of Microrelief on the Example of Terrain Passability. Remote Sens., 12.","DOI":"10.3390\/rs12244146"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"\u0160tular, B., Lozi\u0107, E., and Eichert, S. (2021). Airborne LiDAR-derived digital elevation model for archaeology. Remote Sens., 13.","DOI":"10.3390\/rs13091855"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"4748","DOI":"10.1109\/TGRS.2012.2191155","article-title":"Urban digital elevation model reconstruction using very high resolution multichannel InSAR data","volume":"50","author":"Shabou","year":"2012","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1120","DOI":"10.1109\/TGRS.2019.2943919","article-title":"A new baseline linear combination algorithm for generating urban digital elevation models with multitemporal InSAR observations","volume":"58","author":"Luo","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1016\/j.isprsjprs.2016.03.012","article-title":"An automated, open-source pipeline for mass production of digital elevation models (DEMs) from very-high-resolution commercial stereo satellite imagery","volume":"116","author":"Shean","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1016\/S0262-8856(03)00092-1","article-title":"Extraction of digital elevation models from satellite stereo images through stereo matching based on epipolarity and scene geometry","volume":"21","author":"Lee","year":"2003","journal-title":"Image Vis. Comput."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1016\/j.isprsjprs.2014.08.011","article-title":"Sequential digital elevation models of active lava flows from ground-based stereo time-lapse imagery","volume":"97","author":"James","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.neucom.2018.03.037","article-title":"Methods and datasets on semantic segmentation: A review","volume":"304","author":"Yu","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Panagiotou, E., Chochlakis, G., Grammatikopoulos, L., and Charou, E. (2020). Generating Elevation Surface from a Single RGB Remotely Sensed Image Using Deep Learning. Remote Sens., 12.","DOI":"10.3390\/rs12122002"},{"key":"ref_19","unstructured":"Russell, S., and Norvig, P. (2002). Artificial Intelligence: A Modern Approach, Pearson Education, Inc."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"7068349","DOI":"10.1155\/2018\/7068349","article-title":"Deep learning for computer vision: A brief review","volume":"2018","author":"Voulodimos","year":"2018","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_21","unstructured":"Forsyth, D., and Ponce, J. (2011). Computer Vision: A modern Approach., Prentice hall."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chowdhary, K. (2020). Natural language processing. Fundam. Artif. Intell., 603\u2013649.","DOI":"10.1007\/978-81-322-3972-7_19"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"544","DOI":"10.1136\/amiajnl-2011-000464","article-title":"Natural language processing: An introduction","volume":"18","author":"Nadkarni","year":"2011","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1109\/MSP.2019.2918706","article-title":"Speech processing for digital home assistants: Combining signal processing with deep-learning techniques","volume":"36","author":"Watanabe","year":"2019","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/TASL.2011.2173371","article-title":"Introduction to the special section on deep learning for speech and language processing","volume":"20","author":"Yu","year":"2011","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_27","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_30","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1007\/s11263-016-0908-3","article-title":"Deepmatching: Hierarchical deformable dense matching","volume":"120","author":"Revaud","year":"2016","journal-title":"Int. J. Comput. Vis."},{"key":"ref_32","first-page":"297","article-title":"Application of DEM data to Landsat image classification: Evaluation in a tropical wet-dry landscape of Thailand","volume":"66","author":"Eiumnoh","year":"2000","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1257","DOI":"10.3390\/rs1041257","article-title":"Improving Landsat and IRS image classification: Evaluation of unsupervised and supervised classification through band ratios and DEM in a mountainous landscape in Nepal","volume":"1","author":"Bahadur","year":"2009","journal-title":"Remote Sens."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Zhang, Y., and Yu, W. (2022). Comparison of DEM Super-Resolution Methods Based on Interpolation and Neural Networks. Sensors, 22.","DOI":"10.3390\/s22030745"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Zhou, A., Chen, Y., Wilson, J.P., Su, H., Xiong, Z., and Cheng, Q. (2021). An Enhanced Double-Filter Deep Residual Neural Network for Generating Super Resolution DEMs. Remote Sens., 13.","DOI":"10.3390\/rs13163089"},{"key":"ref_36","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. arXiv."},{"key":"ref_37","unstructured":"Kipf, T.N., and Welling, M. (2016). Variational graph auto-encoders. arXiv."},{"key":"ref_38","unstructured":"Eigen, D., Puhrsch, C., and Fergus, R. (2014). Depth map prediction from a single image using a multi-scale deep network. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1023\/A:1007379606734","article-title":"Multitask learning","volume":"28","author":"Caruana","year":"1997","journal-title":"Mach. Learn."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Tsai, Y.M., Chang, Y.L., and Chen, L.G. (2006, January 12\u201315). Block-based vanishing line and vanishing point detection for 3D scene reconstruction. Proceedings of the International Symposium on Intelligent Signal Processing and Communications, Yonago, Japan.","DOI":"10.1109\/ISPACS.2006.364726"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Prados, E., and Faugeras, O. (2006). Shape from shading. Handbook of Mathematical Models in Computer Vision, Springer.","DOI":"10.1007\/0-387-28831-7_23"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1080\/09500340.2014.967321","article-title":"Depth recovery and refinement from a single image using defocus cues","volume":"62","author":"Tang","year":"2015","journal-title":"J. Mod. Opt."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Lowe, D.G. (1999, January 20\u201327). Object recognition from local scale-invariant features. Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece.","DOI":"10.1109\/ICCV.1999.790410"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Bay, H., Tuytelaars, T., and Gool, L.V. (2006, January 7\u201313). Surf: Speeded up robust features. Proceedings of the European Conference on Computer Vision, Graz, Austria.","DOI":"10.1007\/11744023_32"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Lee, J.H., Heo, M., Kim, K.R., and Kim, C.S. (2018, January 18\u201322). Single-image depth estimation based on fourier domain analysis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00042"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Liu, F., Shen, C., and Lin, G. (2015, January 7\u201312). Deep convolutional neural fields for depth estimation from a single image. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299152"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Xu, D., Ricci, E., Ouyang, W., Wang, X., and Sebe, N. (2017, January 21\u201326). Multi-Scale Continuous Crfs as Sequential Deep Networks for Monocular Depth Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.25"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Xu, D., Wang, W., Tang, H., Liu, H., Sebe, N., and Ricci, E. (2018, January 18\u201323). Structured attention guided convolutional neural fields for monocular depth estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00412"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Fu, H., Gong, M., Wang, C., Batmanghelich, K., and Tao, D. (2018, January 18\u201323). Deep ordinal regression network for monocular depth estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00214"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"794","DOI":"10.1109\/LGRS.2018.2806945","article-title":"IMG2DSM: Height simulation from single imagery using conditional generative adversarial net","volume":"15","author":"Ghamisi","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-image translation with conditional adversarial networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.isprsjprs.2019.01.013","article-title":"Height estimation from single aerial images using a deep convolutional encoder-decoder network","volume":"149","author":"Amirkolaee","year":"2019","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Liu, C.J., Krylov, V.A., Kane, P., Kavanagh, G., and Dahyot, R. (2020). IM2ELEVATION: Building height estimation from single-view aerial imagery. Remote Sens., 12.","DOI":"10.3390\/rs12172719"},{"key":"ref_55","unstructured":"Li, X., Wang, M., and Fang, Y. (2020). Height estimation from single aerial images using a deep ordinal regression network. arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Zhang, Y., and Yang, Q. (2021). A survey on multi-task learning. IEEE Trans. Knowl. Data Eng.","DOI":"10.1109\/TKDE.2021.3070203"},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1093\/nsr\/nwx105","article-title":"An overview of multi-task learning","volume":"5","author":"Zhang","year":"2018","journal-title":"Natl. Sci. Rev."},{"key":"ref_58","unstructured":"Liebel, L., and K\u00f6rner, M. (2018). Auxiliary tasks in multi-task learning. arXiv."},{"key":"ref_59","unstructured":"Islam, M., Vibashan, V., and Ren, H. (August, January 31). Ap-mtl: Attention pruned multi-task learning model for real-time instrument detection and segmentation in robot-assisted surgery. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1613\/jair.1.11304","article-title":"Using task descriptions in lifelong machine learning for improved performance and zero-shot transfer","volume":"67","author":"Rostami","year":"2020","journal-title":"J. Artif. Intell. Res."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Song, T.J., Jeong, J., and Kim, J.H. (2022). End-to-End Real-Time Obstacle Detection Network for Safe Self-Driving via Multi-Task Learning. IEEE Trans. on Intell. Transp. Syst., 1\u201312.","DOI":"10.1109\/TITS.2022.3149789"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Srivastava, S., Volpi, M., and Tuia, D. (2017, January 23\u201328). Joint height estimation and semantic labeling of monocular aerial images with CNNs. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Worth, TX, USA.","DOI":"10.1109\/IGARSS.2017.8128167"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"1391","DOI":"10.1109\/LGRS.2019.2947783","article-title":"Multitask learning of height and semantics from aerial images","volume":"17","author":"Carvalho","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Bischke, B., Helber, P., Folz, J., Borth, D., and Dengel, A. (2019, January 22\u201325). Multi-task learning for segmentation of building footprints with deep neural networks. Proceedings of the IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803050"},{"key":"ref_65","first-page":"6000","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_66","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Kirkland, E.J. (2010). Bilinear interpolation. Advanced Computing in Electron Microscopy, Springer.","DOI":"10.1007\/978-1-4419-6533-2"},{"key":"ref_68","unstructured":"Bhat, S.F., Alhashim, I., and Wonka, P. (2021, January 20\u201325). Adabins: Depth estimation using adaptive bins. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA."},{"key":"ref_69","unstructured":"Xu, B., Wang, N., Chen, T., and Li, M. (2015). Empirical evaluation of rectified activations in convolutional network. arXiv."},{"key":"ref_70","unstructured":"Chen, Z., Badrinarayanan, V., Lee, C.Y., and Rabinovich, A. (2018, January 10\u201315). Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Liu, S., Johns, E., and Davison, A.J. (2019, January 15\u201319). End-to-end multi-task learning with attention. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00197"},{"key":"ref_72","first-page":"4701312","article-title":"Synthesizing optical and SAR imagery from land cover maps and auxiliary raster data","volume":"60","author":"Baier","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"1709","DOI":"10.1109\/JSTARS.2019.2911113","article-title":"Advanced multi-sensor optical remote sensing for urban land use and land cover classification: Outcome of the 2018 IEEE GRSS data fusion contest","volume":"12","author":"Xu","year":"2019","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Zhu, J.Y., Park, T., Isola, P., and Efros, A.A. (2017, January 22\u201329). Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.244"},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Karatsiolis, S., Kamilaris, A., and Cole, I. (2021). Img2ndsm: Height estimation from single airborne rgb images with deep learning. Remote Sens., 13.","DOI":"10.3390\/rs13122417"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/14\/3450\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:53:02Z","timestamp":1760140382000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/14\/3450"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,18]]},"references-count":75,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["rs14143450"],"URL":"https:\/\/doi.org\/10.3390\/rs14143450","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,18]]}}}