{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T01:12:42Z","timestamp":1774314762222,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,1,22]],"date-time":"2023-01-22T00:00:00Z","timestamp":1674345600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62201237"],"award-info":[{"award-number":["62201237"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61971208"],"award-info":[{"award-number":["61971208"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["202101BE070001-008"],"award-info":[{"award-number":["202101BE070001-008"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["202101AU070050"],"award-info":[{"award-number":["202101AU070050"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation","doi-asserted-by":"publisher","award":["62201237"],"award-info":[{"award-number":["62201237"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science 
Foundation","doi-asserted-by":"publisher","award":["61971208"],"award-info":[{"award-number":["61971208"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation","doi-asserted-by":"publisher","award":["202101BE070001-008"],"award-info":[{"award-number":["202101BE070001-008"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation","doi-asserted-by":"publisher","award":["202101AU070050"],"award-info":[{"award-number":["202101AU070050"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Yunnan Fundamental Research Projects","award":["62201237"],"award-info":[{"award-number":["62201237"]}]},{"name":"Yunnan Fundamental Research Projects","award":["61971208"],"award-info":[{"award-number":["61971208"]}]},{"name":"Yunnan Fundamental Research Projects","award":["202101BE070001-008"],"award-info":[{"award-number":["202101BE070001-008"]}]},{"name":"Yunnan Fundamental Research Projects","award":["202101AU070050"],"award-info":[{"award-number":["202101AU070050"]}]},{"name":"Fundamental Research Project of Yunnan Province","award":["62201237"],"award-info":[{"award-number":["62201237"]}]},{"name":"Fundamental Research Project of Yunnan Province","award":["61971208"],"award-info":[{"award-number":["61971208"]}]},{"name":"Fundamental Research Project of Yunnan Province","award":["202101BE070001-008"],"award-info":[{"award-number":["202101BE070001-008"]}]},{"name":"Fundamental Research Project of Yunnan Province","award":["202101AU070050"],"award-info":[{"award-number":["202101AU070050"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Remote sensing object detection based on the combination of infrared and visible images can effectively adapt to the 
around-the-clock operation and changeable illumination conditions. However, most existing infrared and visible object detection networks require two backbone networks to extract the features of the two modalities, respectively. Compared with a single-modality detection network, this greatly increases the computational cost, which limits real-time processing on vehicle and unmanned aerial vehicle (UAV) platforms. Therefore, this paper proposes a local adaptive illumination-driven input-level fusion module (LAIIFusion). Previous methods for illumination perception focus only on global illumination and ignore local differences. In this regard, we design a new illumination perception submodule and redefine the illumination value. With more accurate area selection and label design, the module can perceive the scene illumination condition more effectively. In addition, to address the incomplete alignment between infrared and visible images, a submodule is designed for the rapid estimation of slight shifts. The experimental results show that a single-modality detection algorithm based on LAIIFusion can achieve a large improvement in accuracy with only a small loss of speed. 
On the DroneVehicle dataset, our module combined with YOLOv5L could achieve the best performance.<\/jats:p>","DOI":"10.3390\/rs15030660","type":"journal-article","created":{"date-parts":[[2023,1,23]],"date-time":"2023-01-23T04:19:22Z","timestamp":1674447562000},"page":"660","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":42,"title":["Local Adaptive Illumination-Driven Input-Level Fusion for Infrared and Visible Object Detection"],"prefix":"10.3390","volume":"15","author":[{"given":"Jiawen","family":"Wu","sequence":"first","affiliation":[{"name":"Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China"},{"name":"Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China"}]},{"given":"Tao","family":"Shen","sequence":"additional","affiliation":[{"name":"Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China"},{"name":"Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China"}]},{"given":"Qingwang","family":"Wang","sequence":"additional","affiliation":[{"name":"Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China"},{"name":"Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1573-6235","authenticated-orcid":false,"given":"Zhimin","family":"Tao","sequence":"additional","affiliation":[{"name":"Beijing Anlu International Technology Co., Ltd., Beijing 100043, China"},{"name":"School of Transportation Science and Engineering, Beihang University, Beijing 100191, China"}]},{"given":"Kai","family":"Zeng","sequence":"additional","affiliation":[{"name":"Faculty of 
Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China"},{"name":"Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6895-0385","authenticated-orcid":false,"given":"Jian","family":"Song","sequence":"additional","affiliation":[{"name":"Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China"},{"name":"Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,1,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/MGRS.2020.3041450","article-title":"Methods for Small, Weak Object Detection in Optical High-Resolution Remote Sensing Images: A survey of advances and challenges","volume":"9","author":"Han","year":"2021","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Ding, J., Xue, N., Long, Y., Xia, G., and Lu, Q. (2019, June 15\u201320). Learning RoI Transformer for Oriented Object Detection in Aerial Images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00296"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Wang, Q., Chi, Y., Shen, T., Song, J., Zhang, Z., and Zhu, Y. (2022). Improving RGB-Infrared Object Detection by Reducing Cross-Modality Redundancy. Remote Sens., 14.","DOI":"10.3390\/rs14092020"},{"key":"ref_4","first-page":"4407416","article-title":"A New Spatial-Oriented Object Detection Framework for Remote Sensing Images","volume":"60","author":"Yu","year":"2021","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1109\/LGRS.2018.2802944","article-title":"Road extraction by deep residual U-Net","volume":"15","author":"Zhang","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., and Berg, A.C. (2016, January 8\u201316). SSD: Single shot multibox detector. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Liu, S., Huang, D., and Wang, Y. (2018, January 8\u201314). Receptive field block net for accurate and fast object detection. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01252-6_24"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real- Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_9","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_10","unstructured":"ultralytics (2020, May 18). yolov5. Available online: https:\/\/github.com\/ultralytics\/yolov5."},{"key":"ref_11","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding YOLO Series in 2021. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirllov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. 
Proceedings of the European Conference on Computer Vision (ECCV), Online.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_13","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, December 7\u201312). Faster R-CNN: Towards real-time object detection with region proposal networks. Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS), Montreal, QC, Canada."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1109\/TPAMI.2018.2844175","article-title":"Mask R-CNN","volume":"42","author":"He","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Qiao, S., Chen, L., and Yuille, A. (2021, June 20\u201325). DetectoRS: Detecting objects with recursive feature pyramid and switchable atrous convolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01008"},{"key":"ref_16","first-page":"98","article-title":"The pascal visual object classes challenge: A retrospective","volume":"111","author":"Everingham","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Lin, T., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, September 6\u201312). Microsoft COCO: Common objects in context. Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_18","unstructured":"Zhu, P., Wen, L., Bian, X., Ling, H., and Hu, Q. (2018). Vision meets drones: A challenge. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhang, P., Zhong, Y., and Li, X. (2019, October 27\u201328). SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops (ICCVW), Seoul, Korea.","DOI":"10.1109\/ICCVW.2019.00011"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Liu, Z., Gao, G., Sun, L., and Fang, Z. (2021, July 5\u20139). HRDNet: High-Resolution Detection Network for Small Objects. Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, China.","DOI":"10.1109\/ICME51207.2021.9428241"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"103058","DOI":"10.1016\/j.jvcir.2021.103058","article-title":"A lightweight multi-scale aggregated model for detecting aerial images captured by uavs","volume":"77","author":"Li","year":"2021","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yu, W., Yang, T., and Chen, C. (2021, January 3\u20138). Towards resolving the challenge of long-tail distribution in uav images for object detection. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA.","DOI":"10.1109\/WACV48630.2021.00330"},{"key":"ref_23","unstructured":"Li, C., Song, D., Tong, R., and Tang, M. (2018, September 3\u20136). Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation. 
Proceedings of the British Machine Vision Conference (BMVC), Newcastle, UK."},{"key":"ref_24","first-page":"509","article-title":"Multispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks","volume":"587","author":"Wagner","year":"2016","journal-title":"ESANN"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.patcog.2018.08.005","article-title":"Illumination-aware faster R-CNN for robust multispectral pedestrian detection","volume":"85","author":"Li","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.inffus.2018.09.015","article-title":"Cross-modality interactive attention network for multispectral pedestrian detection","volume":"50","author":"Zhang","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1016\/j.inffus.2018.11.017","article-title":"Fusion of multispectral data through illumination-aware deep neural networks for pedestrian detection","volume":"50","author":"Guan","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhou, K., Chen, L., and Cao, X. (2020, August 23\u201328). Improving multispectral pedestrian detection by addressing modality imbalance problems. Proceedings of the European Conference on Computer Vision (ECCV), Online.","DOI":"10.1007\/978-3-030-58523-5_46"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zhang, L., Zhu, X., Chen, X., Yang, X., Lei, Z., and Liu, Z. (2019, October 27\u201328). Weakly aligned cross-modal learning for multispectral pedestrian detection. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00523"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1109\/TCSVT.2021.3060162","article-title":"Deep Cross-Modal Representation Learning and Distillation for Illumination-Invariant Pedestrian Detection","volume":"32","author":"Liu","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2614","DOI":"10.1109\/TIP.2018.2887342","article-title":"DenseFuse: A Fusion Approach to Infrared and Visible Images","volume":"28","author":"Li","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zhao, Z., Xu, S., Zhang, C., Liu, J., Zhang, J., and Li, P. (2020, January 11\u201317). DIDFuse: Deep Image Decomposition for Infrared and Visible Image Fusion. Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), Yokohama, Japan.","DOI":"10.24963\/ijcai.2020\/135"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.inffus.2018.09.004","article-title":"FusionGAN: A generative adversarial network for infrared and visible image fusion","volume":"48","author":"Ma","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"9645","DOI":"10.1109\/TIM.2020.3005230","article-title":"NestFuse: An Infrared and Visible Image Fusion Architecture Based on Nest Connection and Spatial\/Channel Attention Models","volume":"69","author":"Li","year":"2020","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_35","unstructured":"Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015, December 7\u201312). Spatial Transformer Networks. 
Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS), Montreal, QC, Canada."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, June 27\u201330). Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"6700","DOI":"10.1109\/TCSVT.2022.3168279","article-title":"Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning","volume":"32","author":"Sun","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Hwang, S., Park, J., Kim, N., Choi, Y., and Kweon, I.S. (2015, June 7\u201312). Multispectral Pedestrian Detection: Benchmark Dataset and Baseline. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298706"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"743","DOI":"10.1109\/TPAMI.2011.155","article-title":"Pedestrian Detection: An Evaluation of the State of the Art","volume":"34","author":"Dollar","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"He, Y., Zhang, X., and Sun, J. (2017, October 22\u201329). Channel Pruning for Accelerating Very Deep Neural Networks. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.155"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., and Zhang, C. (2017, October 22\u201329). Learning Efficient Convolutional Networks through Network Slimming. 
Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.298"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wu, J., Leng, C., Wang, Y., Hu, Q., and Cheng, J. (2016, June 27\u201330). Quantized Convolutional Neural Networks for Mobile Devices. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.521"},{"key":"ref_43","unstructured":"Courbariaux, M., and Bengio, Y. (2016). BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. arXiv."},{"key":"ref_44","unstructured":"Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the Knowledge in a Neural Network. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Yim, J., Joo, D., Bae, J., and Kim, J. (2017, July 21\u201326). A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.754"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/3\/660\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:13:35Z","timestamp":1760120015000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/3\/660"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,22]]},"references-count":45,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["rs15030660"],"URL":"https:\/\/doi.org\/10.3390\/rs15030660","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,22]]}}}