{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T10:09:37Z","timestamp":1781086177205,"version":"3.54.1"},"reference-count":37,"publisher":"MDPI AG","issue":"13","license":[{"start":{"date-parts":[[2023,7,3]],"date-time":"2023-07-03T00:00:00Z","timestamp":1688342400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This paper proposes a real-time semantics-driven infrared and visible image fusion framework (RSDFusion). A novel semantics-driven image fusion strategy is introduced in image fusion to maximize the retention of significant information of the source image in the fusion image. First, a semantically segmented image of the source image is obtained using a pre-trained semantic segmentation model. Second, masks of significant targets are obtained from the semantically segmented image, and these masks are used to separate the targets in the source and fusion images. Finally, the local semantic loss of the separation target is designed and combined with the overall structural similarity loss of the image to instruct the network to extract appropriate features to reconstruct the fusion image. Experimental results show that the RSDFusion proposed in this paper outperformed other comparative methods on both subjective and objective evaluation of public datasets and that the main target of the source image is better preserved in the fusion image.<\/jats:p>","DOI":"10.3390\/s23136113","type":"journal-article","created":{"date-parts":[[2023,7,4]],"date-time":"2023-07-04T01:42:47Z","timestamp":1688434967000},"page":"6113","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Real-Time Semantics-Driven Infrared and Visible Image Fusion Network"],"prefix":"10.3390","volume":"23","author":[{"given":"Binhao","family":"Zheng","sequence":"first","affiliation":[{"name":"School of Electronic Engineering, Hangzhou Dianzi University, Hangzhou 310018, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Tieming","family":"Xiang","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Hangzhou Dianzi University, Hangzhou 310018, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Minghuang","family":"Lin","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Hangzhou Dianzi University, Hangzhou 310018, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Silin","family":"Cheng","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Hangzhou Dianzi University, Hangzhou 310018, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Pengquan","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Hangzhou Dianzi University, Hangzhou 310018, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1016\/j.inffus.2021.06.008","article-title":"Image fusion meets deep learning: A survey and perspective","volume":"76","author":"Zhang","year":"2021","journal-title":"Inf. Fusion"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1016\/j.inffus.2018.02.004","article-title":"Infrared and visible image fusion methods and applications: A survey","volume":"45","author":"Ma","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1016\/j.inffus.2018.06.005","article-title":"Pedestrian detection with unsupervised multispectral feature learning using deep neural networks","volume":"46","author":"Cao","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1014","DOI":"10.1109\/TMM.2013.2244870","article-title":"Directive contrast based multimodal medical image fusion in NSCT domain","volume":"15","author":"Bhatnagar","year":"2013","journal-title":"IEEE Trans. Multimed."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Ha, Q., Watanabe, K., Karasawa, T., Ushiku, Y., and Harada, T. (2017, January 24\u201328). MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes. Proceedings of the 2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8206396"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/S1566-2535(01)00056-2","article-title":"Image fusion techniques for remote sensing applications","volume":"3","author":"Simone","year":"2002","journal-title":"Inf. Fusion"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"135","DOI":"10.3233\/ICA-2005-12201","article-title":"A multiscale approach to pixel-level image fusion","volume":"12","author":"He","year":"2005","journal-title":"Integr. Comput.-Aided Eng."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1016\/j.ins.2019.08.066","article-title":"Infrared and visible image fusion based on target-enhanced multiscale transform decomposition","volume":"508","author":"Chen","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2368","DOI":"10.1109\/TCYB.2014.2307067","article-title":"Robust face recognition via adaptive sparse representation","volume":"44","author":"Wang","year":"2014","journal-title":"IEEE Trans. Cybern."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1882","DOI":"10.1109\/LSP.2016.2618776","article-title":"Image fusion with convolutional sparse representation","volume":"23","author":"Liu","year":"2016","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"4733","DOI":"10.1109\/TIP.2020.2975984","article-title":"MDLatLRR: A novel decomposition method for infrared and visible image fusion","volume":"29","author":"Li","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"743","DOI":"10.1109\/JSEN.2007.894926","article-title":"Region-based multimodal image fusion using ICA bases","volume":"7","author":"Cvejic","year":"2007","journal-title":"IEEE Sens. J."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.infrared.2017.02.005","article-title":"Infrared and visible image fusion based on visual saliency map and weighted least square optimization","volume":"82","author":"Ma","year":"2017","journal-title":"Infrared Phys. Technol."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2614","DOI":"10.1109\/TIP.2018.2887342","article-title":"DenseFuse: A fusion approach to infrared and visible images","volume":"28","author":"Li","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"9645","DOI":"10.1109\/TIM.2020.3005230","article-title":"NestFuse: An infrared and visible image fusion architecture based on nest connection and spatial\/channel attention models","volume":"69","author":"Li","year":"2020","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1850018","DOI":"10.1142\/S0219691318500182","article-title":"Infrared and visible image fusion with convolutional neural networks","volume":"16","author":"Liu","year":"2018","journal-title":"Int. J. Wavelets Multiresolut. Inf. Process."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1016\/j.inffus.2019.07.011","article-title":"IFCNN: A general image fusion framework based on convolutional neural network","volume":"54","author":"Zhang","year":"2020","journal-title":"Inf. Fusion"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1016\/j.inffus.2021.12.004","article-title":"Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network","volume":"82","author":"Tang","year":"2022","journal-title":"Inf. Fusion"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.inffus.2018.09.004","article-title":"FusionGAN: A generative adversarial network for infrared and visible image fusion","volume":"48","author":"Ma","year":"2019","journal-title":"Inf. Fusion"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"4980","DOI":"10.1109\/TIP.2020.2977573","article-title":"DDcGAN: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion","volume":"29","author":"Ma","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1383","DOI":"10.1109\/TMM.2020.2997127","article-title":"AttentionFGAN: Infrared and visible image fusion using attention-based generative adversarial networks","volume":"23","author":"Li","year":"2020","journal-title":"IEEE Trans. Multimed."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_23","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). Lecture Notes in Computer Science, Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2015: 18th International Conference, Munich, Germany, 5\u20139 October 2015, Springer. Proceedings, Part III 18."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_25","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hou, J., Zhang, D., Wu, W., Ma, J., and Zhou, H. (2021). A generative adversarial network for infrared and visible image fusion based on semantic segmentation. Entropy, 23.","DOI":"10.3390\/e23030376"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1016\/j.cviu.2018.09.001","article-title":"CNN-based features for retrieval and classification of food images","volume":"176","author":"Ciocca","year":"2018","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., and Torralba, A. (2017, January 21\u201326). Scene parsing through ade20k dataset. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.544"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3345","DOI":"10.1109\/TIP.2015.2442920","article-title":"Perceptual quality assessment for multi-exposure image fusion","volume":"24","author":"Ma","year":"2015","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"023522","DOI":"10.1117\/1.2945910","article-title":"Assessment of image fusion procedures using entropy, image quality, and multispectral classification","volume":"2","author":"Roberts","year":"2008","journal-title":"J. Appl. Remote Sens."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1088\/0957-0233\/8\/4\/002","article-title":"In-fibre Bragg grating sensors","volume":"8","author":"Rao","year":"1997","journal-title":"Meas. Sci. Technol."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1049\/el:20020212","article-title":"Information measure for performance of image fusion","volume":"38","author":"Qu","year":"2002","journal-title":"Electron. Lett."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/j.inffus.2011.08.002","article-title":"A new image fusion performance metric based on visual information fidelity","volume":"14","author":"Han","year":"2013","journal-title":"Inf. Fusion"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1890","DOI":"10.1016\/j.aeue.2015.09.004","article-title":"A new image quality metric for image fusion: The sum of the correlations of differences","volume":"69","author":"Aslantas","year":"2015","journal-title":"Aeu-Int. J. Electron. Commun."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/j.inffus.2016.02.001","article-title":"Infrared and visible image fusion via gradient transfer and total variation minimization","volume":"31","author":"Ma","year":"2016","journal-title":"Inf. Fusion"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1016\/j.inffus.2021.02.023","article-title":"RFN-Nest: An end-to-end residual fusion network for infrared and visible images","volume":"73","author":"Li","year":"2021","journal-title":"Inf. Fusion"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"502","DOI":"10.1109\/TPAMI.2020.3012548","article-title":"U2Fusion: A unified unsupervised image fusion network","volume":"44","author":"Xu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/13\/6113\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:05:05Z","timestamp":1760126705000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/13\/6113"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,3]]},"references-count":37,"journal-issue":{"issue":"13","published-online":{"date-parts":[[2023,7]]}},"alternative-id":["s23136113"],"URL":"https:\/\/doi.org\/10.3390\/s23136113","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,3]]}}}