{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:55:46Z","timestamp":1760151346284,"version":"build-2065373602"},"reference-count":40,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2022,3,9]],"date-time":"2022-03-09T00:00:00Z","timestamp":1646784000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61371143"],"award-info":[{"award-number":["61371143"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Key Research and Development Program Project","award":["2020YFC0811004"],"award-info":[{"award-number":["2020YFC0811004"]}]},{"name":"Beijing Science and Technology Innovation Service capacity-basic scientific research project","award":["110052971921\/002"],"award-info":[{"award-number":["110052971921\/002"]}]},{"name":"the Science and Technology Development Center for the Ministry of Education &quot;Tiancheng Huizhi&quot; Innovation and Education Promotion Fund","award":["2018A03029"],"award-info":[{"award-number":["2018A03029"]}]},{"name":"Cooperative Education Project of Higher Education Department of the Ministry of Education","award":["201902083001"],"award-info":[{"award-number":["201902083001"]}]},{"name":"Science and Technology Project of Beijing Education Commission","award":["No.KM202110009002"],"award-info":[{"award-number":["No.KM202110009002"]}]},{"name":"Hangzhou Innovation Institute of Beihang University","award":["No. 2020-Y3-A-014"],"award-info":[{"award-number":["No. 2020-Y3-A-014"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Infrared image simulation is challenging because it is complex to model. 
To estimate the corresponding infrared image directly from a visible-light image, we propose a three-level refined light-weight generative adversarial network with cascaded guidance (V2T-GAN), which improves the accuracy of the simulated infrared image. V2T-GAN is guided by cascaded auxiliary tasks and auxiliary information: the first-level adversarial network uses semantic segmentation as an auxiliary task, focusing on the structural information of the infrared image; the second-level adversarial network uses the grayscale-inverted visible image as an auxiliary task to supplement the texture details of the infrared image; and the third-level network obtains sharp and accurate edges by adding auxiliary edge-image information and a displacement network. Experiments on the public Multispectral Pedestrian Dataset demonstrate that the infrared images simulated by V2T-GAN have correct structure and texture features, and that V2T-GAN outperforms state-of-the-art methods in both objective metrics and subjective visual quality.<\/jats:p>","DOI":"10.3390\/s22062119","type":"journal-article","created":{"date-parts":[[2022,3,10]],"date-time":"2022-03-10T02:10:35Z","timestamp":1646878235000},"page":"2119","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["V2T-GAN: Three-Level Refined Light-Weight GAN with Cascaded Guidance for Visible-to-Thermal Translation"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2430-4183","authenticated-orcid":false,"given":"Ruiming","family":"Jia","sequence":"first","affiliation":[{"name":"School of Information Science and Technology, North China University of Technology, Beijing 100144, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8341-7041","authenticated-orcid":false,"given":"Xin","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, North China 
University of Technology, Beijing 100144, China"}]},{"given":"Tong","family":"Li","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, North China University of Technology, Beijing 100144, China"}]},{"given":"Jiali","family":"Cui","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, North China University of Technology, Beijing 100144, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,3,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"24487","DOI":"10.3390\/s150924487","article-title":"Sea-based infrared scene interpretation by background type classification and coastal region detection for small target detection","volume":"15","author":"Kim","year":"2015","journal-title":"Sensors"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"932","DOI":"10.4028\/www.scientific.net\/AMM.716-717.932","article-title":"Infrared Image Simulation of Ground Maneuver Target and Scene Based on OGRE","volume":"716\u2013717","author":"Mu","year":"2014","journal-title":"Appl. Mech. Mater."},{"key":"ref_3","first-page":"53","article-title":"Infrared simulation of ship target on the sea based on OGRE","volume":"47","author":"Yang","year":"2017","journal-title":"Laser Infrared"},{"key":"ref_4","unstructured":"Eigen, D., Puhrsch, C., and Fergus, R. (2014, January 8\u201313). Depth map prediction from a single image using a multi-scale deep network. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Laina, I., Rupprecht, C., Belagiannis, V., Tombari, F., and Navab, N. (2016, January 25\u201328). Deeper depth prediction with fully convolutional residual networks. 
Proceedings of the 2016 4th International Conference on 3D Vision, 3DV 2016, Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.32"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Noh, H., Hong, S., and Han, B. (2015, January 13\u201316). Learning deconvolution network for semantic segmentation. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.178"},{"key":"ref_7","unstructured":"Badrinarayanan, V., Handa, A., and Cipolla, R. (2015). SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Yin, Z., and Shi, J. (2018, January 18\u201322). GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00212"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-image translation with conditional adversarial networks. Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhu, J.Y., Park, T., Isola, P., and Efros, A.A. (2017, January 22\u201329). Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.244"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Hwang, S., Park, J., Kim, N., Choi, Y., and Kweon, I.S. (2015, January 7\u201312). Multispectral pedestrian detection: Benchmark dataset and baseline. 
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298706"},{"key":"ref_12","first-page":"11","article-title":"Near Infrared Scene Simulation Based on Visual Image","volume":"37","author":"Zhou","year":"2015","journal-title":"Infrared Technol."},{"key":"ref_13","first-page":"34","article-title":"Infrared Image Generation Method and Detail Modulation Based on Visible Light Images","volume":"40","author":"Li","year":"2018","journal-title":"Infrared Technol."},{"key":"ref_14","unstructured":"Wang, P., Shen, X., Lin, Z., Cohen, S., Price, B., and Yuille, A. (2015, January 7\u201312). Towards unified depth and semantic prediction from a single image. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Boston, MA, USA."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Xu, D., Ricci, E., Ouyang, W., Wang, X., and Sebe, N. (2017, January 21\u201326). Multi-scale continuous CRFs as sequential deep networks for monocular depth estimation. Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.25"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Xu, D., Wang, W., Tang, H., Liu, H., Sebe, N., and Ricci, E. (2018, January 18\u201322). Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00412"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Qi, X., Liao, R., Liu, Z., Urtasun, R., and Jia, J. (2018, January 18\u201322). GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation. 
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00037"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ranjan, A., Jampani, V., Balles, L., Kim, K., Sun, D., Wulff, J., and Black, M.J. (2019, January 16\u201320). Competitive collaboration: Joint unsupervised learning of depth, camera motion, optical flow and motion segmentation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01252"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Jiao, J., Cao, Y., Song, Y., and Lau, R. (2018, January 8\u201314). Look deeper into depth: Monocular depth estimation with semantic booster and attention-driven loss. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01267-0_4"},{"key":"ref_20","unstructured":"Mirza, M., and Osindero, S. (2014). Conditional Generative Adversarial Nets. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"0510001","DOI":"10.3788\/AOS202040.0510001","article-title":"Facial Image Translation in Short-Wavelength Infrared and Visible Light Based on Generative Adversarial Network","volume":"40","author":"Hu","year":"2020","journal-title":"Guangxue Xuebao\/Acta Opt. Sin."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Ma, S., Fu, J., Chen, C.W., and Mei, T. (2018, January 18\u201322). DA-GAN: Instance-Level Image Translation by Deep Attention Generative Adversarial Networks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00593"},{"key":"ref_23","unstructured":"Mejjati, Y.A., Richardt, C., Cosker, D., Tompkin, J., and Kim, K.I. (2018, January 3\u20138). Unsupervised Attention-guided Image-to-Image Translation. 
Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Tang, H., Xu, D., Sebe, N., Wang, Y., Corso, J.J., and Yan, Y. (2019, January 16\u201320). Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00252"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201322). ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_26","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201322). MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Mehta, S., Rastegari, M., Shapiro, L., and Hajishirzi, H. (2019, January 16\u201320). ESPNetv2: A light-weight, power efficient, and general purpose convolutional neural network. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00941"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Haase, D., and Amthor, M. 
(2020, January 14\u201319). Rethinking depthwise separable convolutions: How intra-kernel correlations lead to improved mobilenets. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR42600.2020.01461"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., and Xu, C. (2020, January 14\u201319). GhostNet: More features from cheap operations. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR42600.2020.00165"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). UNet: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_32","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_33","unstructured":"Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., and Hochreiter, S. (2017, January 4\u20139). GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Zhang, R., Isola, P., Efros, A.A., Shechtman, E., and Wang, O. (2018, January 18\u201322). The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. 
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00068"},{"key":"ref_35","first-page":"2552","article-title":"Deep multi-scale encoder-decoder convolutional network for blind deblurring","volume":"9081","author":"Jia","year":"2019","journal-title":"J. Comput. Appl."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Ramamonjisoa, M., Du, Y., and Lepetit, V. (2020, January 14\u201319). Predicting sharp and accurate occlusion boundaries in monocular depth estimation using displacement fields. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR42600.2020.01466"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Lin, G., Milan, A., Shen, C., and Reid, I. (2017, January 21\u201326). RefineNet: Multi-path refinement networks for high-resolution semantic segmentation. Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.549"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Regmi, K., and Borji, A. (2018, January 18\u201322). Cross-View Image Synthesis Using Conditional GANs. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00369"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhu, P., Abdal, R., Qin, Y., and Wonka, P. (2020, January 14\u201319). SEAN: Image synthesis with semantic region-adaptive normalization. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR42600.2020.00515"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Tang, H., Xu, D., Yan, Y., Torr, P.H.S., and Sebe, N. (2020, January 14\u201319). 
Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR42600.2020.00789"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/6\/2119\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:33:41Z","timestamp":1760135621000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/6\/2119"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,9]]},"references-count":40,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2022,3]]}},"alternative-id":["s22062119"],"URL":"https:\/\/doi.org\/10.3390\/s22062119","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,3,9]]}}}