{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T14:46:00Z","timestamp":1771944360188,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2024,4,26]],"date-time":"2024-04-26T00:00:00Z","timestamp":1714089600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Nature Science Foundation of China (NSFC)","award":["61975028"],"award-info":[{"award-number":["61975028"]}]},{"name":"National Nature Science Foundation of China (NSFC)","award":["32371864"],"award-info":[{"award-number":["32371864"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Cracks provide the earliest and most immediate visual response to structural deterioration of asphalt pavements. Most of the current methods for crack detection are based on visible light sensors and convolutional neural networks. However, such an approach obviously limits the detection to daytime and good lighting conditions. Therefore, this paper proposes a crack detection technique cross-modal feature alignment of YOLOV5 based on visible and infrared images. The infrared spectrum characteristics of silicate concrete can be an important supplement. The adaptive illumination-aware weight generation module is introduced to compute illumination probability to guide the training of the fusion network. In order to alleviate the problem of weak alignment of the multi-scale feature map, the FA-BIFPN feature pyramid module is proposed. The parallel structure of a dual backbone network takes 40% less time to train than a single backbone network. As determined through validation on FLIR, LLVIP, and VEDAI bimodal datasets, the fused images have more stable performance compared to the visible images. 
In addition, the detector proposed in this paper surpasses the current advanced YOLOV5 unimodal detector and CFT cross-modal fusion module. In the publicly available bimodal road crack dataset, our method is able to detect cracks of 5 pixels with 98.3% accuracy under weak illumination.<\/jats:p>","DOI":"10.3390\/s24092759","type":"journal-article","created":{"date-parts":[[2024,4,26]],"date-time":"2024-04-26T05:13:41Z","timestamp":1714108421000},"page":"2759","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Concrete Highway Crack Detection Based on Visible Light and Infrared Silicate Spectrum Image Fusion"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7686-353X","authenticated-orcid":false,"given":"Jian","family":"Xing","sequence":"first","affiliation":[{"name":"College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-1513-9770","authenticated-orcid":false,"given":"Ying","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guangzhu","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,4,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"De Blasiis, M.R., Di Benedetto, A., and Fiani, M. (2020). Mobile laser scanning data for the evaluation of pavement surface distress. 
Remote Sens., 12.","DOI":"10.3390\/rs12060942"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Zhang, S., Fu, Z., Li, G., and Liu, A. (2023). Lane Crack Detection Based on Saliency. Remote Sens., 15.","DOI":"10.3390\/rs15174146"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Hu, Q., Wang, P., Li, S., Liu, W., Li, Y., Lu, W., Kou, Y., Wei, F., He, P., and Yu, A. (2022). Research on Intelligent Crack Detection in a Deep-Cut Canal Slope in the Chinese South\u2013North Water Transfer Project. Remote Sens., 14.","DOI":"10.3390\/rs14215384"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1016\/j.precisioneng.2022.03.016","article-title":"Machine learning-based evaluation of the damage caused by cracks on concrete structures","volume":"76","author":"Mir","year":"2022","journal-title":"Precis. Eng."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Tian, Y., and Wang, Y. (2021, January 18\u201320). Crack detection method of highway side slope based on computer vision. Proceedings of the 2021 International Conference on Aviation Safety and Information Technology, Changsha, China.","DOI":"10.1145\/3510858.3511375"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"04016067","DOI":"10.1061\/(ASCE)CP.1943-5487.0000645","article-title":"Efficient crack detection method for tunnel lining surface cracks based on infrared images","volume":"31","author":"Yu","year":"2017","journal-title":"J. Comput. Civ. Eng."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1016\/j.autcon.2018.07.008","article-title":"Automatic recognition of asphalt pavement cracks using metaheuristic optimized edge detection algorithms and convolution neural network","volume":"94","author":"Nguyen","year":"2018","journal-title":"Autom. 
Constr."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1525","DOI":"10.1109\/TITS.2019.2910595","article-title":"Feature pyramid and hierarchical boosting network for pavement crack detection","volume":"21","author":"Yang","year":"2019","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1498","DOI":"10.1109\/TIP.2018.2878966","article-title":"Deepcrack: Learning hierarchical convolutional features for crack detection","volume":"28","author":"Zou","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"22179","DOI":"10.1109\/TITS.2022.3177210","article-title":"A detection method for pavement cracks combining object detection and attention mechanism","volume":"23","author":"Yao","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"103831","DOI":"10.1016\/j.autcon.2021.103831","article-title":"Real-time multiple damage mapping using autonomous UAV and deep faster region-based neural networks for GPS-denied structures","volume":"130","author":"Ali","year":"2021","journal-title":"Autom. Constr."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"15901","DOI":"10.1109\/JSEN.2023.3281585","article-title":"Improved YOLOV5-Based UAV Pavement Crack Detection","volume":"23","author":"Xing","year":"2023","journal-title":"IEEE Sens. J."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Salazar, A., Rodr\u00edguez, A., Vargas, N., and Vergara, L. (2022). On training road surface classifiers by data augmentation. Appl. Sci., 12.","DOI":"10.3390\/app12073423"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1016\/j.autcon.2017.06.024","article-title":"Remote sensing of concrete bridge decks using unmanned aerial vehicle infrared thermography","volume":"83","author":"Omar","year":"2017","journal-title":"Autom. 
Constr."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1016\/j.inffus.2021.02.023","article-title":"RFN-Nest: An end-to-end residual fusion network for infrared and visible images","volume":"73","author":"Li","year":"2021","journal-title":"Inf. Fusion"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"29200","DOI":"10.1109\/JSEN.2023.3324451","article-title":"DFECF-DET: All-weather detector based on differential feature enhancement and cross-modal fusion with visible and infrared sensors","volume":"23","author":"Wang","year":"2023","journal-title":"IEEE Sens. J."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"103906","DOI":"10.1016\/j.infrared.2021.103906","article-title":"MAF-YOLO: Multi-modal attention fusion based YOLO for pedestrian detection","volume":"118","author":"Xue","year":"2021","journal-title":"Infrared Phys. Technol."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1016\/j.infrared.2017.07.010","article-title":"A survey of infrared and visual image fusion methods","volume":"85","author":"Jin","year":"2017","journal-title":"Infrared Phys. Technol."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Feroz, S., and Abu Dabous, S. (2021). Uav-based remote sensing applications for bridge condition assessment. Remote Sens., 13.","DOI":"10.3390\/rs13091809"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Li, H., Ding, W., Cao, X., and Liu, C. (2017). Image registration and fusion of visible and infrared integrated camera for medium-altitude unmanned aerial vehicle remote sensing. Remote Sens., 9.","DOI":"10.3390\/rs9050441"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"105213","DOI":"10.1016\/j.autcon.2023.105213","article-title":"Crack detection of masonry structure based on thermal and visible image fusion and semantic segmentation","volume":"158","author":"Huang","year":"2024","journal-title":"Autom. 
Constr."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"102246","DOI":"10.1016\/j.inffus.2024.102246","article-title":"Improving RGB-infrared object detection with cascade alignment-guided transformer","volume":"105","author":"Yuan","year":"2024","journal-title":"Inf. Fusion"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"22145","DOI":"10.1109\/TITS.2022.3142393","article-title":"Asphalt pavement crack detection based on convolutional neural network and infrared thermography","volume":"23","author":"Liu","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1016\/j.trc.2014.05.013","article-title":"Use of infrared thermography for assessing HMA paving and compaction","volume":"46","author":"Plati","year":"2014","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1016\/j.inffus.2020.05.002","article-title":"Object fusion tracking based on visible and infrared images: A comprehensive review","volume":"63","author":"Zhang","year":"2020","journal-title":"Inf. Fusion"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Fang, Q., Han, D., and Wang, Z. (2021). Cross-modality fusion transformer for multispectral object detection. arXiv.","DOI":"10.2139\/ssrn.4227745"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"47773","DOI":"10.1007\/s11042-023-15333-w","article-title":"SLBAF-Net: Super-Lightweight bimodal adaptive fusion network for UAV detection in low recognition environment","volume":"82","author":"Cheng","year":"2023","journal-title":"Multimed. 
Tools Appl."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"102709","DOI":"10.1016\/j.ndteint.2022.102709","article-title":"Effect of different imaging modalities on the performance of a CNN: An experimental study on damage segmentation in infrared, visible, and fused images of concrete structures","volume":"132","author":"Pozzer","year":"2022","journal-title":"NDT E Int."},{"key":"ref_29","first-page":"9314","article-title":"Transportation mode recognition fusing wearable motion, sound, and vision sensors","volume":"20","author":"Richoz","year":"2020","journal-title":"IEEE Sens. J."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"10361","DOI":"10.1007\/s00521-023-08239-z","article-title":"Cross-modality complementary information fusion for multispectral pedestrian detection","volume":"35","author":"Yan","year":"2023","journal-title":"Neural Comput. Appl."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhang, H., Fromont, E., Lef\u00e8vre, S., and Avignon, B. (2021, January 3\u20138). Guided attentive feature fusion for multispectral pedestrian detection. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV48630.2021.00012"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1016\/j.inffus.2018.02.004","article-title":"Infrared and visible image fusion methods and applications: A survey","volume":"45","author":"Ma","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_33","unstructured":"Zheng, Y., Izzat, I.H., and Ziaee, S. (2019). GFD-SSD: Gated fusion double SSD for multispectral pedestrian detection. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.inffus.2018.09.015","article-title":"Cross-modality interactive attention network for multispectral pedestrian detection","volume":"50","author":"Zhang","year":"2019","journal-title":"Inf. 
Fusion"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.patcog.2018.08.005","article-title":"Illumination-aware faster R-CNN for robust multispectral pedestrian detection","volume":"85","author":"Li","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1109\/TPAMI.1980.4767032","article-title":"Disparity analysis of images","volume":"PAMI-2","author":"Barnard","year":"1980","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zhang, L., Liu, Z., Zhu, X., Song, Z., Yang, X., Lei, Z., and Qiao, H. (2021). Weakly aligned feature fusion for multimodal object detection. IEEE Trans. Neural Netw. Learn. Syst., 1\u201315.","DOI":"10.1109\/TNNLS.2021.3105143"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1016\/j.inffus.2018.11.017","article-title":"Fusion of multispectral data through illumination-aware deep neural networks for pedestrian detection","volume":"50","author":"Guan","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Noh, J., Lee, S., Kim, B., and Kim, G. (2018, January 18\u201323). Improving occlusion and hard negative handling for single-stage pedestrian detectors. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00107"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1016\/j.inffus.2023.03.011","article-title":"Multiscale spatial\u2013spectral transformer network for hyperspectral and multispectral image fusion","volume":"96","author":"Jia","year":"2023","journal-title":"Inf. Fusion"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). 
Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Liu, C., Sui, H., Wang, J., Ni, Z., and Ge, L. (2022). Real-time ground-level building damage detection based on lightweight and accurate YOLOv5 using terrestrial images. Remote Sens., 14.","DOI":"10.3390\/rs14122763"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhou, K., Chen, L., and Cao, X. (2020, January 23\u201328). Improving multispectral pedestrian detection by addressing modality imbalance problems. Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58523-5_46"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.jvcir.2015.11.002","article-title":"Vehicle detection in aerial imagery: A small target detection benchmark","volume":"34","author":"Razakarivony","year":"2016","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Jia, X., Zhu, C., Li, M., Tang, W., and Zhou, W. (2021, January 11\u201317). LLVIP: A visible-infrared paired dataset for low-light vision. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00389"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). 
Microsoft coco: Common objects in context. Proceedings of the Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"8906","DOI":"10.1109\/TMM.2023.3243616","article-title":"Dilateformer: Multi-scale dilated transformer for visual recognition","volume":"25","author":"Jiao","year":"2023","journal-title":"IEEE Trans. Multimed."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/9\/2759\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:33:39Z","timestamp":1760106819000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/9\/2759"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,26]]},"references-count":48,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2024,5]]}},"alternative-id":["s24092759"],"URL":"https:\/\/doi.org\/10.3390\/s24092759","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,26]]}}}