{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T04:21:40Z","timestamp":1774585300778,"version":"3.50.1"},"reference-count":65,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2023,10,9]],"date-time":"2023-10-09T00:00:00Z","timestamp":1696809600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["61971095"],"award-info":[{"award-number":["61971095"]}]},{"name":"National Natural Science Foundation of China","award":["62271119"],"award-info":[{"award-number":["62271119"]}]},{"name":"National Natural Science Foundation of China","award":["62071086"],"award-info":[{"award-number":["62071086"]}]},{"name":"National Natural Science Foundation of China","award":["2023NSFSC1972"],"award-info":[{"award-number":["2023NSFSC1972"]}]},{"name":"Natural Science Foundation of Sichuan Province","award":["61971095"],"award-info":[{"award-number":["61971095"]}]},{"name":"Natural Science Foundation of Sichuan Province","award":["62271119"],"award-info":[{"award-number":["62271119"]}]},{"name":"Natural Science Foundation of Sichuan Province","award":["62071086"],"award-info":[{"award-number":["62071086"]}]},{"name":"Natural Science Foundation of Sichuan Province","award":["2023NSFSC1972"],"award-info":[{"award-number":["2023NSFSC1972"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Object detection based on RGB and infrared images has emerged as a crucial research area in computer vision, and the synergy of RGB-Infrared ensures the robustness of object-detection algorithms under varying lighting conditions. However, the RGB-IR image pairs captured typically exhibit spatial misalignment due to sensor discrepancies, leading to compromised localization performance. 
Furthermore, because the deep features of the two modalities are inconsistently distributed, directly fusing multi-modal features weakens the feature difference between the object and the background, thereby interfering with RGB-Infrared object-detection performance. To address these issues, we propose an adaptive dual-discrepancy calibration network (ADCNet) for misaligned RGB-Infrared object detection, comprising spatial-discrepancy and domain-discrepancy calibration. Specifically, the spatial-discrepancy calibration module applies an adaptive affine transformation to spatially align the features. Then, the domain-discrepancy calibration module separately aligns object and background features from the two modalities, making the object and background distributions of the fused feature easier to distinguish and thereby improving RGB-Infrared object detection. Our ADCNet outperforms the baseline by 3.3% and 2.5% in mAP50 on the FLIR and misaligned M3FD datasets, respectively. 
Experimental results demonstrate the superiority of our proposed method over state-of-the-art approaches.<\/jats:p>","DOI":"10.3390\/rs15194887","type":"journal-article","created":{"date-parts":[[2023,10,9]],"date-time":"2023-10-09T10:48:33Z","timestamp":1696848513000},"page":"4887","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Misaligned RGB-Infrared Object Detection via Adaptive Dual-Discrepancy Calibration"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-6555-6973","authenticated-orcid":false,"given":"Mingzhou","family":"He","sequence":"first","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2936-6340","authenticated-orcid":false,"given":"Qingbo","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"given":"King Ngi","family":"Ngan","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"given":"Feng","family":"Jiang","sequence":"additional","affiliation":[{"name":"Beijing Institute of Control and Electronics Technology, Beijing 100038, China"}]},{"given":"Fanman","family":"Meng","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"given":"Linfeng","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,9]]},"reference":[{"key":"ref_1","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_2","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst., 28, Available online: https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2015\/hash\/14bfa6bb14875e45bba028a21ed38046-Abstract.html."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Yang, X., Sun, H., Fu, K., Yang, J., Sun, X., Yan, M., and Guo, Z. (2018). Automatic Ship Detection in Remote Sensing Images from Google Earth of Complex Scenes Based on Multiscale Rotation Dense Feature Pyramid Networks. Remote Sens., 10.","DOI":"10.3390\/rs10010132"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yao, C., Xie, P., Zhang, L., and Fang, Y. (2022). ATSD: Anchor-Free Two-Stage Ship Detection Based on Feature Enhancement in SAR Images. Remote Sens., 14.","DOI":"10.3390\/rs14236058"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201323). Cascade r-cnn: Delving into high quality object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Zhang, H., Fromont, E., Lef\u00e8vre, S., and Avignon, B. (2021, January 3\u20138). Guided attentive feature fusion for multispectral pedestrian detection. 
Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV48630.2021.00012"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Chen, Y.T., Shi, J., Ye, Z., Mertz, C., Ramanan, D., and Kong, S. (2022, January 23\u201327). Multimodal object detection via probabilistic ensembling. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-20077-9_9"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"9813","DOI":"10.1109\/TGRS.2020.3044958","article-title":"Attentional local contrast networks for infrared small target detection","volume":"59","author":"Dai","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Kieu, M., Bagdanov, A.D., Bertini, M., and Bimbo, A.d. (2020, January 23\u201328). Task-conditioned domain adaptation for pedestrian detection in thermal imagery. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58542-6_33"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Devaguptapu, C., Akolekar, N., Sharma, M.M., and Balasubramanian, V.N. (2019, January 16\u201317). Borrow from anywhere: Pseudo multi-modal object detection in thermal imagery. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA.","DOI":"10.1109\/CVPRW.2019.00135"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhao, C., Wang, J., Su, N., Yan, Y., and Xing, X. (2022). Low Contrast Infrared Target Detection Method Based on Residual Thermal Backbone Network and Weighting Loss Function. Remote Sens., 14.","DOI":"10.3390\/rs14010177"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Hwang, S., Park, J., Kim, N., Choi, Y., and So Kweon, I. (2015, January 7\u201312). Multispectral pedestrian detection: Benchmark dataset and baseline. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298706"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"103752","DOI":"10.1016\/j.dsp.2022.103752","article-title":"Object detection in hyperspectral images","volume":"131","author":"Lone","year":"2022","journal-title":"Digit. Signal Process."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1109\/LSP.2021.3059204","article-title":"Object detection in hyperspectral images","volume":"28","author":"Yan","year":"2021","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2901","DOI":"10.1109\/TIP.2023.3263109","article-title":"Learning a Deep Ensemble Network with Band Importance for Hyperspectral Object Tracking","volume":"32","author":"Li","year":"2023","journal-title":"IEEE Trans. Image Process."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Wang, Q., Chi, Y., Shen, T., Song, J., Zhang, Z., and Zhu, Y. (2022). Improving RGB-Infrared Object Detection by Reducing Cross-Modality Redundancy. Remote Sens., 14.","DOI":"10.3390\/rs14092020"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Xu, D., Ouyang, W., Ricci, E., Wang, X., and Sebe, N. (2017, January 21\u201326). Learning cross-modal deep representations for robust pedestrian detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.451"},{"key":"ref_19","unstructured":"Qingyun, F., Dapeng, H., and Zhaokui, W. (2021). Cross-modality fusion transformer for multispectral object detection. arXiv."},{"key":"ref_20","unstructured":"Li, C., Song, D., Tong, R., and Tang, M. (2018). Multispectral pedestrian detection via simultaneous detection and segmentation. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yuan, M., Wang, Y., and Wei, X. (2022, January 23\u201327). 
Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-20077-9_30"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, J., Fan, X., Huang, Z., Wu, G., Liu, R., Zhong, W., and Luo, Z. (2022, January 18\u201324). Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00571"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Valverde, F.R., Hurtado, J.V., and Valada, A. (2021, January 20\u201325). There is more than meets the eye: Self-supervised multi-object detection and tracking with sound by distilling multimodal knowledge. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01144"},{"key":"ref_24","unstructured":"Team, F. (2023, October 05). Free Flir Thermal Dataset for Algorithm Training. Available online: https:\/\/www.flir.com\/oem\/adas\/adas-dataset-form\/."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, H., Fromont, E., Lefevre, S., and Avignon, B. (2020, January 25\u201328). Multispectral fusion for object detection with cyclic fuse-and-refine blocks. Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates.","DOI":"10.1109\/ICIP40778.2020.9191080"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Xu, C., Zheng, X., and Lu, X. (2022). Multi-Level Alignment Network for Cross-Domain Ship Detection. Remote Sens., 14.","DOI":"10.3390\/rs14102389"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liu, Z., Yang, X., Gao, R., Liu, S., Dou, H., He, S., Huang, Y., Huang, Y., Luo, H., and Zhang, Y. 
(2020, January 3\u20137). Remove appearance shift for ultrasound image segmentation via fast and universal style transfer. Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA.","DOI":"10.1109\/ISBI45749.2020.9098457"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Liu, M., Ren, D., Sun, H., and Yang, S.X. (2022). Multibranch Unsupervised Domain Adaptation Network for Cross Multidomain Orchard Area Segmentation. Remote Sens., 14.","DOI":"10.3390\/rs14194915"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Li, M., Li, R., Jia, K., and Zhang, L. (2022, January 18\u201324). Exact feature distribution matching for arbitrary style transfer and domain generalization. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00787"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Atapour-Abarghouei, A., and Breckon, T.P. (2018, January 18\u201323). Real-time monocular depth estimation using synthetic data with domain adaptation via image style transfer. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00296"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhang, J., Xu, S., Sun, J., Ou, D., Wu, X., and Wang, M. (2022). Unsupervised Adversarial Domain Adaptation for Agricultural Land Extraction of Remote Sensing Images. Remote Sens., 14.","DOI":"10.3390\/rs14246298"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"9984","DOI":"10.1109\/TITS.2023.3266487","article-title":"Multi-Modal Feature Pyramid Transformer for RGB-Infrared Object Detection","volume":"24","author":"Zhu","year":"2023","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Jia, X., Zhu, C., Li, M., Tang, W., and Zhou, W. (2021, January 11\u201317). 
LLVIP: A visible-infrared paired dataset for low-light vision. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00389"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.jvcir.2015.11.002","article-title":"Vehicle detection in aerial imagery: A small target detection benchmark","volume":"34","author":"Razakarivony","year":"2016","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_35","unstructured":"Choi, H., Kim, S., Park, K., and Sohn, K. (2016, January 4\u20138). Multi-spectral pedestrian detection based on accumulated object proposal with fully convolutional networks. Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.inffus.2018.09.015","article-title":"Cross-modality interactive attention network for multispectral pedestrian detection","volume":"50","author":"Zhang","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_37","unstructured":"Wagner, J., Fischer, V., Herman, M., and Behnke, S. (2016, January 27\u201329). Multispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks. Proceedings of the 24th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning ESANN, Bruges, Belgium."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Fu, Y., Wu, X.J., and Kittler, J. (2021). A deep decomposition network for image processing: A case study for visible and infrared image fusion. arXiv.","DOI":"10.2139\/ssrn.4178002"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhao, Z., Xu, S., Zhang, C., Liu, J., Li, P., and Zhang, J. (2020). DIDFuse: Deep image decomposition for infrared and visible image fusion. 
arXiv.","DOI":"10.24963\/ijcai.2020\/135"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Liu, J., Zhang, S., Wang, S., and Metaxas, D.N. (2016). Multispectral deep neural networks for pedestrian detection. arXiv.","DOI":"10.5244\/C.30.73"},{"key":"ref_41","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst., 30, Available online: https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Konig, D., Adam, M., Jarvers, C., Layher, G., Neumann, H., and Teutsch, M. (2017, January 21\u201326). Fully convolutional region proposal networks for multispectral person detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.36"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1023\/A:1007515423169","article-title":"An empirical comparison of voting classification algorithms: Bagging, boosting, and variants","volume":"36","author":"Bauer","year":"1999","journal-title":"Mach. Learn."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Dietterich, T.G. (2000, January 21\u201323). Ensemble methods in machine learning. Proceedings of the International Workshop on Multiple Classifier Systems, Cagliari, Italy.","DOI":"10.1007\/3-540-45014-9_1"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Bodla, N., Singh, B., Chellappa, R., and Davis, L.S. (2017, January 22\u201329). 
Soft-NMS\u2014improving object detection with one line of code. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.593"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"104117","DOI":"10.1016\/j.imavis.2021.104117","article-title":"Weighted boxes fusion: Ensembling boxes from different object detection models","volume":"107","author":"Solovyev","year":"2021","journal-title":"Image Vis. Comput."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.patcog.2018.08.005","article-title":"Illumination-aware faster R-CNN for robust multispectral pedestrian detection","volume":"85","author":"Li","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Doll\u00e1r, P., Wojek, C., Schiele, B., and Perona, P. (2009, January 20\u201325). Pedestrian detection: A benchmark. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206631"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Xu, P., Davoine, F., and Denoeux, T. (2014, January 1\u20135). Evidential combination of pedestrian detectors. Proceedings of the British Machine Vision Conference, Nottingham, UK.","DOI":"10.5244\/C.28.2"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Rostami, M., Kolouri, S., Eaton, E., and Kim, K. (2019). Deep transfer learning for few-shot SAR image classification. Remote Sens., 11.","DOI":"10.20944\/preprints201905.0030.v1"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Benjdira, B., Bazi, Y., Koubaa, A., and Ouni, K. (2019). Unsupervised domain adaptation using generative adversarial networks for semantic segmentation of aerial images. Remote Sens., 11.","DOI":"10.3390\/rs11111369"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Kan, M., Shan, S., and Chen, X. (2015, January 20\u201323). 
Bi-shifting auto-encoder for unsupervised domain adaptation. Proceedings of the IEEE International Conference on Computer Vision, Cambridge, MA, USA.","DOI":"10.1109\/ICCV.2015.438"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"1482","DOI":"10.1109\/LGRS.2019.2896948","article-title":"SAR image retrieval based on unsupervised domain adaptation and clustering","volume":"16","author":"Ye","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"7219","DOI":"10.1109\/TII.2022.3154789","article-title":"Adversarial regressive domain adaptation approach for infrared thermography-based unsupervised remaining useful life prediction","volume":"18","author":"Jiang","year":"2022","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_56","unstructured":"Ultralytics (2023, October 05). YOLOv5. Available online: https:\/\/github.com\/ultralytics\/yolov5."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_58","unstructured":"Jaderberg, M., Simonyan, K., and Zisserman, A. (2015). Spatial transformer networks. Adv. Neural Inf. Process. Syst., 28, Available online: https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2015\/hash\/33ceb07bf4eeb3da587e268d663aba1a-Abstract.html."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1109\/TMM.2019.2929949","article-title":"Oriented spatial transformer network for pedestrian detection using fish-eye camera","volume":"22","author":"Qian","year":"2019","journal-title":"IEEE Trans. Multimed."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Huang, X., and Belongie, S. (2017, January 22\u201329). 
Arbitrary style transfer in real-time with adaptive instance normalization. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.167"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Pan, X., Luo, P., Shi, J., and Tang, X. (2018, January 8\u201314). Two at once: Enhancing learning and generalization capacities via ibn-net. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01225-0_29"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Tang, Z., Gao, Y., Zhu, Y., Zhang, Z., Li, M., and Metaxas, D.N. (2021, January 10\u201317). Crossnorm and selfnorm for generalization under distribution shifts. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00012"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Karras, T., Laine, S., and Aila, T. (2019, January 15\u201320). A style-based generator architecture for generative adversarial networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00453"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., and Ren, D. (2020, January 7\u201312). Distance-IoU loss: Faster and better learning for bounding box regression. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6999"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Zhu, X., Lyu, S., Wang, X., and Zhao, Q. (2021, January 11\u201317). TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00312"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/19\/4887\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:03:31Z","timestamp":1760130211000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/19\/4887"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,9]]},"references-count":65,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["rs15194887"],"URL":"https:\/\/doi.org\/10.3390\/rs15194887","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,9]]}}}