{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T22:56:33Z","timestamp":1771973793545,"version":"3.50.1"},"reference-count":28,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,1,22]],"date-time":"2023-01-22T00:00:00Z","timestamp":1674345600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Automatically translating chromaticity-free thermal infrared (TIR) images into realistic color visible (CV) images is of great significance for autonomous vehicles, emergency rescue, robot navigation, nighttime video surveillance, and many other fields. Most recent designs use end-to-end neural networks to translate TIR directly to CV; however, compared to CV images, TIR images have low contrast and unclear textures, which hinders the translation. Thus, directly mapping the single-channel TIR temperature values to three-channel RGB color values without additional constraints or semantic information handles the one-to-three cross-domain mapping problem poorly, yielding translated CV images with both blurred edges and color confusion. Since the most important step in TIR-to-CV translation is mapping information from the temperature domain into the color domain, an improved CycleGAN (GMA-CycleGAN) is proposed in this work to first translate TIR images into grayscale visible (GV) images. Although the two domains have different properties, the numerical mapping between them is one-to-one, which reduces the color confusion caused by the one-to-three mapping of direct TIR-to-CV translation. A GV-CV translation network is then applied to obtain CV images. 
Since the decomposition of GV images into CV images is carried out within the same domain, edge blurring can be avoided. To enhance the boundary gradient between objects (pedestrians and vehicles) and the background, a mask attention module based on the TIR temperature mask and the CV semantic mask is designed without increasing the network parameters, and it is added to the feature encoding and decoding convolution layers of the CycleGAN generator. Moreover, a perceptual loss term is added to the original CycleGAN loss function to bring the translated images closer to the real images in feature space. To verify the effectiveness of the proposed method, experiments are conducted on the FLIR dataset, and the results show that, compared to the state-of-the-art model, the proposed method produces translated CV images of better subjective quality, reducing the objective evaluation metric FID (Fr\u00e9chet inception distance) by 2.42 and improving the PSNR (peak signal-to-noise ratio) by 1.43.<\/jats:p>","DOI":"10.3390\/rs15030663","type":"journal-article","created":{"date-parts":[[2023,1,23]],"date-time":"2023-01-23T04:19:22Z","timestamp":1674447562000},"page":"663","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["An Unpaired Thermal Infrared Image Translation Method Using GMA-CycleGAN"],"prefix":"10.3390","volume":"15","author":[{"given":"Shihao","family":"Yang","sequence":"first","affiliation":[{"name":"Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China"}]},{"given":"Min","family":"Sun","sequence":"additional","affiliation":[{"name":"Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China"}]},{"given":"Xiayin","family":"Lou","sequence":"additional","affiliation":[{"name":"Institute of Remote Sensing and GIS, Peking University, Beijing 100871, 
China"}]},{"given":"Hanjun","family":"Yang","sequence":"additional","affiliation":[{"name":"Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9177-3452","authenticated-orcid":false,"given":"Hang","family":"Zhou","sequence":"additional","affiliation":[{"name":"Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,1,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Hou, F., Zhang, Y., Zhou, Y., Zhang, M., Lv, B., and Wu, J. (2022). Review on Infrared Imaging Technology. Sustainability, 14.","DOI":"10.3390\/su141811161"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"116269","DOI":"10.1016\/j.eswa.2021.116269","article-title":"ClawGAN: Claw connection-based generative adversarial networks for facial image translation in thermal to RGB visible light","volume":"191","author":"Luo","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Hu, X., Zhou, X., Huang, Q., Shi, Z., Sun, L., and Li, Q. (2022, January 19\u201320). QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01775"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"105006","DOI":"10.1016\/j.engappai.2022.105006","article-title":"Deep learning for image colorization: Current and future prospects","volume":"114","author":"Huang","year":"2022","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"103764","DOI":"10.1016\/j.infrared.2021.103764","article-title":"An improved DualGAN for near-infrared image colorization","volume":"116","author":"Liang","year":"2021","journal-title":"Infrared Phys. 
Technol."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Toet, A., and Hogervorst, M.A. (2008, January 17\u201320). Portable real-time color night vision. Proceedings of the SPIE Defense and Security Symposium, Orlando, FL, USA.","DOI":"10.1117\/12.775405"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1016\/j.inffus.2009.06.005","article-title":"Fast natural color mapping for night-time imagery","volume":"11","author":"Hogervorst","year":"2010","journal-title":"Inf. Fusion"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Berg, A., Ahlberg, J., and Felsberg, M. (2018, January 18\u201322). Generating Visible Spectrum Images from Thermal Infrared. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00159"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1016\/j.neucom.2022.06.021","article-title":"Towards high-quality thermal infrared image colorization via attention-based hierarchical network","volume":"501","author":"Wang","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-Image Translation with Conditional Adversarial Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zhu, J.-Y., Park, T., Isola, P., and Efros, A.A. (2017, January 22\u201329). Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.244"},{"key":"ref_12","unstructured":"Kim, J., Kim, M., Kang, H., and Lee, K. (2020). 
U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation. arXiv, Available online: http:\/\/arxiv.org\/abs\/1907.10830."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Chen, R., Huang, W., Huang, B., Sun, F., and Fang, B. (2020, January 13\u201319). Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00819"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1007\/978-3-030-58545-7_19","article-title":"Contrastive Learning for Unpaired Image-to-Image Translation","volume":"Volume 12354","author":"Vedaldi","year":"2020","journal-title":"Computer Vision\u2014ECCV 2020"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"103338","DOI":"10.1016\/j.infrared.2020.103338","article-title":"Thermal infrared colorization via conditional generative adversarial network","volume":"107","author":"Kuang","year":"2020","journal-title":"Infrared Phys. Technol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"15808","DOI":"10.1109\/TITS.2022.3145476","article-title":"Thermal Infrared Image Colorization for Nighttime Driving Scenes With Top-Down Guided Attention","volume":"23","author":"Luo","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Tao, A., Kautz, J., and Catanzaro, B. (2018, January 18\u201323). High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00917"},{"key":"ref_18","first-page":"1","article-title":"AttentionGAN: Unpaired Image-to-Image Translation Using Attention-Guided Generative Adversarial Networks","volume":"11","author":"Tang","year":"2021","journal-title":"IEEE Trans. Neural. Networks Learn. Syst."},{"key":"ref_19","unstructured":"Simonyan, K., and Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv, Available online: http:\/\/arxiv.org\/abs\/1409.1556."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Cheng, B., Misra, I., Schwing, A.G., Kirillov, A., and Girdhar, R. (2022, January 18\u201324). Masked-attention Mask Transformer for Universal Image Segmentation. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00135"},{"key":"ref_21","unstructured":"Nikolov, I.A., Philipsen, M.P., Liu, J., Dueholm, J.V., Johansen, A.S., Nasrollahi, K., and Moeslund, T.B. Seasons in Drift: A Long Term Thermal Imaging Dataset for Studying Concept Drift. Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, Montreal, Canada. Available online: https:\/\/datasets-benchmarks-proceedings.neurips.cc\/paper\/2021\/file\/c45147dee729311ef5b5c3003946c48f-Paper-round2.pdf."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhou, H., Sun, M., Ren, X., and Wang, X. (2021). Visible-Thermal Image Object Detection via the Combination of Illumination Conditions and Temperature Information. 
Remote Sens., 13.","DOI":"10.3390\/rs13183656"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"694","DOI":"10.1007\/978-3-319-46475-6_43","article-title":"Perceptual Losses for Real-Time Style Transfer and Super-Resolution","volume":"Volume 9906","author":"Leibe","year":"2016","journal-title":"Computer Vision\u2014ECCV 2016"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"ImageNet Large Scale Visual Recognition Challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Hwang, S., Park, J., Kim, N., Choi, Y., and So Kweon, I. (2015, January 7\u201312). Multispectral pedestrian detection: Benchmark dataset and baseline. Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298706"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Zhang, H., Fromont, E., Lefevre, S., and Avignon, B. (2020, January 25\u201328). Multispectral Fusion for Object Detection with Cyclic Fuse-and-Refine Blocks. Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates.","DOI":"10.1109\/ICIP40778.2020.9191080"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the Inception Architecture for Computer Vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_28","unstructured":"Kingma, D.P., and Ba, J. (2023, January 22). Adam: A Method for Stochastic Optimization, in ICLR (Poster). 
Available online: http:\/\/arxiv.org\/abs\/1412.6980."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/3\/663\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:13:38Z","timestamp":1760120018000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/3\/663"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,22]]},"references-count":28,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["rs15030663"],"URL":"https:\/\/doi.org\/10.3390\/rs15030663","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,22]]}}}