{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T16:06:04Z","timestamp":1774541164421,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2023,9,6]],"date-time":"2023-09-06T00:00:00Z","timestamp":1693958400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Beijing Nature Science Foundation of China","award":["4232014"],"award-info":[{"award-number":["4232014"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Infrared small target detection for aerial remote sensing is crucial in both civil and military fields. For infrared targets with small sizes, low signal-to-noise ratio, and little detailed texture information, we propose a Res-SwinTransformer with a Local Contrast Attention Network (RSLCANet). Specifically, we first design a SwinTransformer-based backbone to improve the interaction capability of global information. On this basis, we introduce a residual structure to fully retain the shallow detail information of small infrared targets. Furthermore, we design a plug-and-play attention module named LCA Block (local contrast attention block) to enhance the target and suppress the background, which is based on local contrast calculation. In addition, we develop an air-to-ground multi-scene infrared vehicle dataset based on an unmanned aerial vehicle (UAV) platform, which can provide a database for infrared vehicle target detection algorithm testing and infrared target characterization studies. Experiments demonstrate that our method can achieve a low-miss detection rate, high detection accuracy, and high detection speed. 
In particular, on the DroneVehicle dataset, our designed RSLCANet increases by 4.3% in terms of mAP@0.5 compared to the base network You Only Look Once (YOLOX). In addition, our network has fewer parameters than the two-stage network and the Transformer-based network model, which helps the practical deployment and can be applied in fields such as car navigation, crop monitoring, and infrared warning.<\/jats:p>","DOI":"10.3390\/rs15184387","type":"journal-article","created":{"date-parts":[[2023,9,6]],"date-time":"2023-09-06T10:23:42Z","timestamp":1693995822000},"page":"4387","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Res-SwinTransformer with Local Contrast Attention for Infrared Small Target Detection"],"prefix":"10.3390","volume":"15","author":[{"given":"Tianhua","family":"Zhao","sequence":"first","affiliation":[{"name":"School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China"}]},{"given":"Jie","family":"Cao","sequence":"additional","affiliation":[{"name":"School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China"},{"name":"Yangtze Delta Region Academy, Beijing Institute of Technology, Jiaxing 314003, China"}]},{"given":"Qun","family":"Hao","sequence":"additional","affiliation":[{"name":"School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China"},{"name":"Yangtze Delta Region Academy, Beijing Institute of Technology, Jiaxing 314003, China"},{"name":"School of Opto-Electronic Engineering, Changchun University of Science and Technology, Changchun 130022, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5177-6808","authenticated-orcid":false,"given":"Chun","family":"Bao","sequence":"additional","affiliation":[{"name":"School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China"}]},{"given":"Moudan","family":"Shi","sequence":"additional","affiliation":[{"name":"School of 
Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,6]]},"reference":[{"key":"ref_1","first-page":"5519015","article-title":"A locally optimized model for hyperspectral and multispectral images fusion","volume":"60","author":"Ren","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_2","first-page":"5533216","article-title":"Generalized linear spectral mixing model for spatial\u2013temporal\u2013spectral fusion","volume":"60","author":"Zhou","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"5522914","DOI":"10.1109\/TGRS.2022.3146296","article-title":"MLR-DBPFN: A multi-scale low rank deep back projection fusion network for anti-noise hyperspectral and multispectral image fusion","volume":"60","author":"Sun","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","first-page":"102846","article-title":"Marine floating raft aquaculture extraction of hyperspectral remote sensing images based decision tree algorithm","volume":"111","author":"Hou","year":"2022","journal-title":"Int. J. Appl. Earth Obs. Geoinf."},{"key":"ref_5","first-page":"102572","article-title":"A simple and effective spectral-spatial method for mapping large-scale coastal wetlands using China ZY1-02D satellite hyperspectral images","volume":"104","author":"Sun","year":"2021","journal-title":"Int. J. Appl. Earth Obs. Geoinf."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ma, J., Guo, H., Rong, S., Feng, J., and He, B. (2023). Infrared Dim and Small Target Detection Based on Background Prediction. 
Remote Sens., 15.","DOI":"10.20944\/preprints202305.1075.v1"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1016\/j.isprsjprs.2015.10.004","article-title":"Remote sensing platforms and sensors: A survey","volume":"115","author":"Toth","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland. Proceedings, Part V 13.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_9","unstructured":"Henini, M., and Razeghi, M. (2002). Handbook of Infrared Detection Technologies, Elsevier."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"082401","DOI":"10.1088\/0034-4885\/77\/8\/082401","article-title":"Advances in mid-infrared detection and imaging: A key issues review","volume":"77","author":"Razeghi","year":"2014","journal-title":"Rep. Prog. Phys."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"103897","DOI":"10.1016\/j.infrared.2021.103897","article-title":"Infrared scene prediction of night unmanned vehicles based on multi-scale feature maps","volume":"118","author":"Li","year":"2021","journal-title":"Infrared Phys. Technol."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"125940","DOI":"10.1016\/j.jhydrol.2020.125940","article-title":"Estimation of the transpiration of urban shrubs using the modified three-dimensional three-temperature model and infrared remote sensing","volume":"594","author":"Qiu","year":"2021","journal-title":"J. 
Hydrol."},{"key":"ref_13","first-page":"4401015","article-title":"Retrieval of land surface temperature, emissivity, and atmospheric parameters from hyperspectral thermal infrared image using a feature-band linear-format hybrid algorithm","volume":"60","author":"Ren","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"107781","DOI":"10.1016\/j.compeleceng.2022.107781","article-title":"An infrared pedestrian detection method based on segmentation and domain adaptation learning","volume":"99","author":"Zhang","year":"2022","journal-title":"Comput. Electr. Eng."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Deshpande, S.D., Er, M.H., Venkateswarlu, R., and Chan, P. (1999, January 18\u201323). Max-mean and max-median filters for detection of small targets. Proceedings of the Signal and Data Processing of Small Targets 1999, Denver, CO, USA.","DOI":"10.1117\/12.364049"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/TGRS.2013.2242477","article-title":"A local contrast method for small infrared target detection","volume":"52","author":"Chen","year":"2013","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2168","DOI":"10.1109\/LGRS.2014.2323236","article-title":"A robust infrared small target detection algorithm based on human visual system","volume":"11","author":"Han","year":"2014","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"612","DOI":"10.1109\/LGRS.2018.2790909","article-title":"Infrared small target detection utilizing the multiscale relative local contrast measure","volume":"15","author":"Han","year":"2018","journal-title":"IEEE Geosci. Remote Sens. 
Lett."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"4996","DOI":"10.1109\/TIP.2013.2281420","article-title":"Infrared patch-image model for small target detection in a single image","volume":"22","author":"Gao","year":"2013","journal-title":"IEEE Trans. Image Process."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1007\/s11036-019-01377-6","article-title":"Infrared dim and small target detection based on denoising autoencoder network","volume":"25","author":"Shi","year":"2020","journal-title":"Mob. Netw. Appl."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zheng, G., Wu, X., Hu, Y., and Liu, X. (2019, January 27\u201330). Object detection for low-resolution infrared image in land battlefield based on deep learning. Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China.","DOI":"10.23919\/ChiCC.2019.8866344"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"25671","DOI":"10.1109\/ACCESS.2021.3057723","article-title":"Weak and occluded vehicle detection in complex infrared environment based on improved YOLOv4","volume":"9","author":"Du","year":"2021","journal-title":"IEEE Access"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"9813","DOI":"10.1109\/TGRS.2020.3044958","article-title":"Attentional local contrast networks for infrared small target detection","volume":"59","author":"Dai","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zhu, R., and Zhuang, L. (2022). Unsupervised Infrared Small-Object-Detection Approach of Spatial\u2013Temporal Patch Tensor and Object Selection. Remote Sens., 14.","DOI":"10.3390\/rs14071612"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Wang, Q., Chi, Y., Shen, T., Song, J., Zhang, Z., and Zhu, Y. (2022). Improving RGB-infrared object detection by reducing cross-modality redundancy. 
Remote Sens., 14.","DOI":"10.3390\/rs14092020"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Dang, L.M., Wang, H., Li, Y., Min, K., Kwak, J.T., Lee, O.N., Park, H., and Moon, H. (2020). Fusarium wilt of radish detection using RGB and near infrared images from Unmanned Aerial Vehicles. Remote Sens., 12.","DOI":"10.3390\/rs12172863"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wu, J., Shen, T., Wang, Q., Tao, Z., Zeng, K., and Song, J. (2023). Local Adaptive Illumination-Driven Input-Level Fusion for Infrared and Visible Object Detection. Remote Sens., 15.","DOI":"10.3390\/rs15030660"},{"key":"ref_28","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_29","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_30","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2017, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. Proceedings of the Computer Vision-ECCV 2020: 16th European Conference, Glasgow, UK. Proceedings, Part I 16.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_32","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). 
Deformable detr: Deformable transformers for end-to-end object detection. arXiv."},{"key":"ref_33","unstructured":"Wang, Y., Zhang, X., Yang, T., and Sun, J. (March, January 22). Anchor detr: Query design for transformer-based detector. Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), Virtual."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1016\/j.patcog.2016.04.002","article-title":"Multiscale patch-based contrast measure for small infrared target detection","volume":"58","author":"Wei","year":"2016","journal-title":"Pattern Recognit."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Chen, Y., Wang, H., Pang, Y., Han, J., Mou, E., and Cao, E. (2023). An Infrared Small Target Detection Method Based on a Weighted Human Visual Comparison Mechanism for Safety Monitoring. Remote Sens., 15.","DOI":"10.3390\/rs15112922"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_38","unstructured":"Jocher, G., Stoken, A., Borovec, J., Christopher, S., and Laughing, L.C. (2021). Ultralytics\/Yolov5: v6.0, Zenodo."},{"key":"ref_39","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). Yolox: Exceeding yolo series in 2021. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Braun, M., Krebs, S., Flohr, F., and Gavrila, D.M. (2018). 
The eurocity persons dataset: A novel benchmark for object detection. arXiv.","DOI":"10.1109\/TPAMI.2019.2897684"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Xia, G.-S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., and Zhang, L. (2018, January 18\u201323). DOTA: A large-scale dataset for object detection in aerial images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00418"},{"key":"ref_42","first-page":"291","article-title":"A dataset for infrared detection and tracking of dim-small aircraft targets under ground\/air background","volume":"5","author":"Hui","year":"2020","journal-title":"China Sci. Data"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"6700","DOI":"10.1109\/TCSVT.2022.3168279","article-title":"Drone-based RGB-infrared cross-modality vehicle detection via uncertainty-aware learning","volume":"32","author":"Sun","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1822","DOI":"10.1109\/LGRS.2019.2954578","article-title":"A local contrast method for infrared small-target detection utilizing a tri-layer window","volume":"17","author":"Han","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"107727","DOI":"10.1016\/j.sigpro.2020.107727","article-title":"Fast and robust small infrared target detection using absolute directional mean difference algorithm","volume":"177","author":"Moradi","year":"2020","journal-title":"Signal Process."},{"key":"ref_46","unstructured":"Dai, Y., Wu, Y., Zhou, F., and Barnard, K. (2018, January 18\u201323). Asymmetric contextual modulation for infrared small target detection. 
Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Salt Lake City, UT, USA."},{"key":"ref_47","unstructured":"Zhou, X., Wang, D., and Kr\u00e4henb\u00fchl, P. (2019). Objects as points. arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Devaguptapu, C., Akolekar, N., Sharma, M.M., and Balasubramanian, V.N. (2019, January 16\u201317). Borrow from anywhere: Pseudo multi-modal object detection in thermal imagery. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA.","DOI":"10.1109\/CVPRW.2019.00135"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/18\/4387\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:46:18Z","timestamp":1760129178000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/18\/4387"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,6]]},"references-count":48,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2023,9]]}},"alternative-id":["rs15184387"],"URL":"https:\/\/doi.org\/10.3390\/rs15184387","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,6]]}}}