{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,24]],"date-time":"2026-07-24T15:06:07Z","timestamp":1784905567805,"version":"3.55.0"},"reference-count":50,"publisher":"MDPI AG","issue":"20","license":[{"start":{"date-parts":[[2023,10,11]],"date-time":"2023-10-11T00:00:00Z","timestamp":1696982400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the National Natural Science Foundation of China","award":["42064003"],"award-info":[{"award-number":["42064003"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>In the realm of remote sensing image analysis, the task of road extraction poses significant complexities, especially in the context of intricate scenes and diminutive targets. In response to these challenges, we have developed a novel deep learning network, christened CDAU-Net, designed to discern and delineate these features with enhanced precision. This network takes its structural inspiration from the fundamental architecture of U-Net while introducing innovative enhancements: we have integrated CoordConv convolutions into both the initial layer of the U-Net encoder and the terminal layer of the decoder, thereby facilitating a more efficacious processing of spatial information inherent in remote sensing images. Moreover, we have devised a unique mechanism termed the Deep Dual Cross Attention (DDCA), purposed to capture long-range dependencies within images\u2014a critical factor in remote sensing image analysis. Our network replaces the skip-connection component of the U-Net with this newly designed mechanism, dealing with feature maps of the first four scales in the encoder and generating four corresponding outputs. These outputs are subsequently linked with the decoder stage to further capture the remote dependencies present within the remote sensing imagery. We have subjected CDAU-Net to extensive empirical validation, including testing on the Massachusetts Road Dataset and DeepGlobe Road Dataset. Both datasets encompass a diverse range of complex road scenes, making them ideal for evaluating the performance of road extraction algorithms. The experimental results showcase that whether in terms of accuracy, recall rate, or Intersection over Union (IoU) metrics, the CDAU-Net outperforms existing state-of-the-art methods in the task of road extraction. These findings substantiate the effectiveness and superiority of our approach in handling complex scenes and small targets, as well as in capturing long-range dependencies in remote sensing imagery. In sum, the design of CDAU-Net not only enhances the accuracy of road extraction but also presents new perspectives and possibilities for deep learning analysis of remote sensing imagery.<\/jats:p>","DOI":"10.3390\/rs15204914","type":"journal-article","created":{"date-parts":[[2023,10,11]],"date-time":"2023-10-11T08:18:57Z","timestamp":1697012337000},"page":"4914","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["CDAU-Net: A Novel CoordConv-Integrated Deep Dual Cross Attention Mechanism for Enhanced Road Extraction in Remote Sensing Imagery"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6972-6908","authenticated-orcid":false,"given":"Anchao","family":"Yin","sequence":"first","affiliation":[{"name":"College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2591-6619","authenticated-orcid":false,"given":"Chao","family":"Ren","sequence":"additional","affiliation":[{"name":"College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China"},{"name":"Guangxi Key Laboratory of Spatial Information and Geomatics, Guilin 541106, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6936-808X","authenticated-orcid":false,"given":"Weiting","family":"Yue","sequence":"additional","affiliation":[{"name":"College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hongjuan","family":"Shao","sequence":"additional","affiliation":[{"name":"College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaoqin","family":"Xue","sequence":"additional","affiliation":[{"name":"College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"2558","DOI":"10.1109\/TAES.2021.3053115","article-title":"Decentralized Autonomous Navigation of a UAV Network for Road Traffic Monitoring","volume":"57","author":"Huang","year":"2021","journal-title":"IEEE Trans. Aerosp. Electron. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Baltodano, S., Sibi, S., Martelaro, N., Gowda, N., and Ju, W. (2015, January 1\u20133). The RRADS Platform: A Real Road Autonomous Driving Simulator. Proceedings of the 7th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Nottingham, UK.","DOI":"10.1145\/2799250.2799288"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"11999","DOI":"10.1016\/j.eswa.2010.12.123","article-title":"Real World Representation of a Road Network for Route Planning in GIS","volume":"38","author":"Varshosaz","year":"2011","journal-title":"Expert Syst. Appl."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Salama, A.S., Saleh, B.K., and Eassa, M.M. (2010, January 2\u20134). Intelligent Cross Road Traffic Management System (ICRTMS). Proceedings of the 2010 2nd International Conference on Computer Technology and Development, Cairo, Egypt.","DOI":"10.1109\/ICCTD.2010.5646059"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"101436","DOI":"10.1016\/j.ecoinf.2021.101436","article-title":"Application of Geographical Information System (GIS) in Reducing Accident Blackspots and in Planning of a Safer Urban Road Network: A Review","volume":"66","author":"Singh","year":"2021","journal-title":"Ecol. Inform."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1016\/S0305-9006(03)00066-7","article-title":"Remote Sensing Technology for Mapping and Monitoring Land-Cover and Land-Use Change","volume":"61","author":"Rogan","year":"2004","journal-title":"Prog. Plan."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1814","DOI":"10.1109\/JSTARS.2022.3148139","article-title":"Progress and Challenges in Intelligent Remote Sensing Satellite Systems","volume":"15","author":"Zhang","year":"2022","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lu, J., Liu, H., Yao, Y., Tao, S., Tang, Z., and Lu, J. (2020, January 6\u201310). Hsi Road: A Hyper Spectral Image Dataset for Road Segmentation. Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK.","DOI":"10.1109\/ICME46284.2020.9102890"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1016\/j.isprsjprs.2013.02.006","article-title":"Automatic Fuzzy Object-Based Analysis of VHSR Images for Urban Objects Extraction","volume":"79","author":"Sebari","year":"2013","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"947","DOI":"10.1080\/13658816.2019.1696968","article-title":"Automatic Extraction of Road Intersection Points from USGS Historical Map Series Using Deep Convolutional Neural Networks","volume":"34","author":"Saeedimoghaddam","year":"2020","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Hou, Y., Liu, Z., Zhang, T., and Li, Y. (2021). C-UNet: Complement UNet for Remote Sensing Road Extraction. Sensors, 21.","DOI":"10.3390\/s21062153"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhou, L., Zhang, C., and Wu, M. (2018, January 18\u201323). D-LinkNet: LinkNet with Pretrained Encoder and Dilated Convolution for High Resolution Satellite Imagery Road Extraction. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00034"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5602213","DOI":"10.1109\/TGRS.2023.3237561","article-title":"RADANet: Road Augmented Deformable Attention Network for Road Extraction From Complex High-Resolution Remote-Sensing Images","volume":"61","author":"Dai","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Abdollahi, A., Pradhan, B., Shukla, N., Chakraborty, S., and Alamri, A. (2020). Deep Learning Approaches Applied to Remote Sensing Datasets for Road Extraction: A State-of-the-Art Review. Remote Sens., 12.","DOI":"10.3390\/rs12091444"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1016\/j.ins.2020.05.062","article-title":"Global Context Based Automatic Road Segmentation via Dilated Convolutional Neural Network","volume":"535","author":"Lan","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"8919","DOI":"10.1109\/TGRS.2020.2991733","article-title":"Simultaneous Road Surface and Centerline Extraction from Large-Scale Remote Sensing Images Using CNN-Based Segmentation and Tracing","volume":"58","author":"Wei","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Sun, Z., Zhou, W., Ding, C., and Xia, M. (2022). Multi-Resolution Transformer Network for Building and Road Segmentation of Remote Sensing Image. ISPRS Int. J. Geo-Inf., 11.","DOI":"10.3390\/ijgi11030165"},{"key":"ref_18","first-page":"15908","article-title":"Transformer in Transformer","volume":"34","author":"Han","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_19","unstructured":"Liu, R., Lehman, J., Molino, P., Petroski Such, F., Frank, E., Sergeev, A., and Yosinski, J. (2018). An Intriguing Failing of Convolutional Neural Networks and the Coordconv Solution. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Xu, Y., Xie, Z., Feng, Y., and Chen, Z. (2018). Road Extraction from High-Resolution Remote Sensing Imagery Using Deep Learning. Remote Sens., 10.","DOI":"10.3390\/rs10091461"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1641","DOI":"10.1109\/JSEN.2014.2364854","article-title":"Road Surface Status Classification Using Spectral Analysis of NIR Camera Images","volume":"15","author":"Jonsson","year":"2014","journal-title":"IEEE Sens. J."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"761","DOI":"10.1016\/j.tra.2012.02.008","article-title":"Remoteness and Accessibility in the Vulnerability Analysis of Regional Road Networks","volume":"46","author":"Taylor","year":"2012","journal-title":"Transp. Res. Part A Policy Pract."},{"key":"ref_23","first-page":"635","article-title":"Knowledge-Based Road Interpretation in Aerial Images","volume":"32","author":"Trinder","year":"1998","journal-title":"Int. Arch. Photogramm. Remote Sens."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1109\/TIM.2019.2902003","article-title":"Online Fault Diagnosis Method Based on Transfer Convolutional Neural Networks","volume":"69","author":"Xu","year":"2019","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2015: 18th International Conference, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Ren, Y., Yu, Y., and Guan, H. (2020). DA-CapsUNet: A Dual-Attention Capsule U-Net for Road Extraction from Remote Sensing Imagery. Remote Sens., 12.","DOI":"10.3390\/rs12182866"},{"key":"ref_28","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention U-Net: Learning Where to Look for the Pancreas. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1109\/LGRS.2018.2802944","article-title":"Road Extraction by Deep Residual U-Net","volume":"15","author":"Zhang","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_31","unstructured":"Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2014). Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected Crfs. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected Crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","unstructured":"Chen, L.-C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1016\/j.isprsjprs.2021.03.016","article-title":"A Global Context-Aware and Batch-Independent Network for Road Extraction from VHR Satellite Imagery","volume":"175","author":"Zhu","year":"2021","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"5621414","DOI":"10.1109\/TGRS.2022.3165817","article-title":"Cascaded Multi-Task Road Extraction Network for Road Surface, Centerline, and Edge Extraction","volume":"60","author":"Lu","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Shaw, P., Uszkoreit, J., and Vaswani, A. (2018). Self-Attention with Relative Position Representations. arXiv.","DOI":"10.18653\/v1\/N18-2074"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"6509505","DOI":"10.1109\/LGRS.2022.3171973","article-title":"TransRoadNet: A Novel Road Extraction Method for Remote Sensing Images via Combining High-Level Semantic Feature and Context","volume":"19","author":"Yang","year":"2022","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Miao, C., Liu, C., and Tian, Q. (2022). DCS-TransUperNet: Road Segmentation Network Based on CSwin Transformer with Dual Resolution. Appl. Sci., 12.","DOI":"10.3390\/app12073511"},{"key":"ref_40","first-page":"1","article-title":"Rngdet: Road Network Graph Detection by Transformer in Aerial Images","volume":"60","author":"Xu","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Niu, B., Wen, W., Ren, W., Zhang, X., Yang, L., Wang, S., Zhang, K., Cao, X., and Shen, H. (2020, January 23\u201328). Single Image Super-Resolution via a Holistic Attention Network. Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58610-2_12"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Pan, X., Yang, F., Gao, L., Chen, Z., Zhang, B., Fan, H., and Ren, J. (2019). Building Extraction from High-Resolution Aerial Imagery Using a Generative Adversarial Network with Spatial and Channel Attention Mechanisms. Remote Sens., 11.","DOI":"10.3390\/rs11080917"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Wu, Y., Wu, Y., Wang, B., and Yang, H. (2022). A Remote Sensing Method for Crop Mapping Based on Multiscale Neighborhood Feature Extraction. Remote Sens., 15.","DOI":"10.3390\/rs15010047"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1109\/TIFS.2019.2936913","article-title":"Depth-Wise Separable Convolutions and Multi-Level Pooling for an Efficient Spatial CNN-Based Steganalysis","volume":"15","author":"Zhang","year":"2019","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"ref_45","unstructured":"Mnih, V. (2013). Machine Learning for Aerial Image Labeling, University of Toronto."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Demir, I., Koperski, K., Lindenbaum, D., Pang, G., Huang, J., Basu, S., Hughes, F., Tuia, D., and Raskar, R.D. (2018). A Challenge to Parse the Earth through Satellite Images. arXiv.","DOI":"10.1109\/CVPRW.2018.00031"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/s10479-005-5724-z","article-title":"A Tutorial on the Cross-Entropy Method","volume":"134","author":"Kroese","year":"2005","journal-title":"Ann. Oper. Res."},{"key":"ref_48","unstructured":"Sun, K., Zhao, Y., Jiang, B., Cheng, T., Xiao, B., Liu, D., Mu, Y., Wang, X., Liu, W., and Wang, J. (2023, October 09). High-Resolution Representations for Labeling Pixels and Regions. Available online: https:\/\/arxiv.org\/abs\/1904.04514v1."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Springer.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_50","first-page":"4412612","article-title":"DDU-Net: Dual-Decoder-U-Net for Road Extraction Using High-Resolution Remote Sensing Images","volume":"60","author":"Wang","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/20\/4914\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:04:57Z","timestamp":1760130297000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/20\/4914"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,11]]},"references-count":50,"journal-issue":{"issue":"20","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["rs15204914"],"URL":"https:\/\/doi.org\/10.3390\/rs15204914","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,11]]}}}