{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T10:08:12Z","timestamp":1766138892182,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2024,9,26]],"date-time":"2024-09-26T00:00:00Z","timestamp":1727308800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Philosophy and Social Science Planning Cross-disciplinary Key Support Subjects of Zhejiang Province","award":["22JCXK08Z","2022J162","JD6-228"],"award-info":[{"award-number":["22JCXK08Z","2022J162","JD6-228"]}]},{"name":"Ningbo Natural Science Foundation","award":["22JCXK08Z","2022J162","JD6-228"],"award-info":[{"award-number":["22JCXK08Z","2022J162","JD6-228"]}]},{"name":"Ningbo Philosophy and Social Science Research Base Project","award":["22JCXK08Z","2022J162","JD6-228"],"award-info":[{"award-number":["22JCXK08Z","2022J162","JD6-228"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Diffusion models have attracted considerable scholarly interest for their outstanding performance in generative tasks. However, current style transfer techniques based on diffusion models still rely on fine-tuning during the inference phase to optimize the generated results. This approach is not merely laborious and resource-demanding but also fails to fully harness the creative potential of expansive diffusion models. To overcome this limitation, this paper introduces an innovative solution that utilizes a pretrained diffusion model, thereby obviating the necessity for additional training steps. The scheme proposes a Feature Normalization Mapping Module with Cross-Attention Mechanism (INN-FMM) based on the dual-path diffusion model. 
This module employs soft attention to extract style features and integrate them with content features. Additionally, a parameter-free Similarity Attention Mechanism (SimAM) is employed within the image feature space to facilitate the transfer of style image textures and colors, while simultaneously minimizing the loss of structural content information. The fusion of these dual attention mechanisms enables us to achieve style transfer in texture and color without sacrificing content integrity. The experimental results indicate that our approach exceeds existing methods in several evaluation metrics.<\/jats:p>","DOI":"10.3390\/info15100588","type":"journal-article","created":{"date-parts":[[2024,9,26]],"date-time":"2024-09-26T08:20:52Z","timestamp":1727338852000},"page":"588","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["A Training-Free Latent Diffusion Style Transfer Method"],"prefix":"10.3390","volume":"15","author":[{"given":"Zhengtao","family":"Xiang","sequence":"first","affiliation":[{"name":"School of Electrical and Information Engineering, Hubei University of Automotive Technology, Shiyan 442002, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xing","family":"Wan","sequence":"additional","affiliation":[{"name":"School of Electrical and Information Engineering, Hubei University of Automotive Technology, Shiyan 442002, China"},{"name":"School of Computer and Data Engineering, Ningbo Tech University, Ningbo 315100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7779-5011","authenticated-orcid":false,"given":"Libo","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Computer and Data Engineering, Ningbo Tech University, Ningbo 315100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xin","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Computer 
and Data Engineering, Ningbo Tech University, Ningbo 315100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuhan","family":"Mao","sequence":"additional","affiliation":[{"name":"School of Economics and Management, Zhejiang Sci-Tech University, Hangzhou 310018, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,9,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Gatys, L.A., Ecker, A.S., and Bethge, M. (2015). A Neural Algorithm of Artistic Style. arXiv.","DOI":"10.1167\/16.12.326"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Everaert, M.N., Bocchio, M., Arpa, S., S\u00fcsstrunk, S., and Achanta, R. (2023, January 2\u20136). Diffusion in style. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00214"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Wang, Z., Zhao, L., and Xing, W. (2023, January 2\u20136). Stylediffusion: Controllable disentangled style transfer via diffusion models. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00706"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Huang, N., Tang, F., Huang, H., Ma, C., Dong, W., and Xu, C. (2023, January 17\u201324). Inversion-based style transfer with diffusion models. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00978"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chung, J., Hyun, S., and Heo, J.P. (2024, January 16\u201322). Style injection in diffusion: A training-free approach for adapting large-scale diffusion models for style transfer. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00840"},{"key":"ref_6","unstructured":"Yang, L., Zhang, R.Y., Li, L., and Xie, X. (2021, January 18\u201324). Simam: A simple, parameter-free attention module for convolutional neural networks. Proceedings of the International Conference on Machine Learning, Online."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wright, M., and Ommer, B. (2022, January 27\u201330). Artfid: Quantitative evaluation of neural style transfer. Proceedings of the DAGM German Conference on Pattern Recognition, Konstanz, Germany.","DOI":"10.1007\/978-3-031-16788-1_34"},{"key":"ref_8","unstructured":"Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., and Hochreiter, S. (2017, January 4\u20139). Gans trained by a two time-scale update rule converge to a local nash equilibrium. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhang, R., Isola, P., Efros, A.A., Shechtman, E., and Wang, O. (2018, January 18\u201323). The unreasonable effectiveness of deep features as a perceptual metric. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00068"},{"key":"ref_10","unstructured":"Naeem, M.F., Oh, S.J., Uh, Y., Choi, Y., and Yoo, J. (2020, January 13\u201318). Reliable fidelity and diversity metrics for generative models. Proceedings of the 37th International Conference on Machine Learning, Online."},{"key":"ref_11","unstructured":"Banar, N., Sabatelli, M., Geurts, P., Daelemans, W., and Kestemont, M. (2021, January 11\u201328). Transfer learning with style transfer between the photorealistic and artistic domain. 
Proceedings of the IS&T International Symposium on Electronic Imaging 2021, Computer Vision and Image Analysis of Art 2021, Online."},{"key":"ref_12","unstructured":"Li, H., and Wan, X.X. (2020, January 18\u201320). Image style transfer algorithm under deep convolutional neural network. Proceedings of the Computer Engineering and Applications, Guangzhou, China."},{"key":"ref_13","unstructured":"Chen, C.J. (2021). Chinese Painting Style Transfer Based on Convolutional Neural Network, Hangzhou Dianzi University."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Li, S., Xu, X., Nie, L., and Chua, T.S. (2017, January 23\u201327). Laplacian-steered neural style transfer. Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA.","DOI":"10.1145\/3123266.3123425"},{"key":"ref_15","unstructured":"Risser, E., Wilmot, P., and Barnes, C. (2017). Stable and controllable neural texture synthesis and style transfer using histogram losses. arXiv."},{"key":"ref_16","unstructured":"Dumoulin, V., Shlens, J., and Kudlur, M. (2016). A learned representation for artistic style. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chen, D., Yuan, L., Liao, J., Yu, N., and Hua, G. (2017, January 21\u201326). Stylebank: An explicit representation for neural image style transfer. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.296"},{"key":"ref_18","unstructured":"Chen, T.Q., and Schmidt, M. (2016). Fast patch-based style transfer of arbitrary style. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Huang, X., and Belongie, S. (2017, January 22\u201329). Arbitrary style transfer in real-time with adaptive instance normalization. 
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.167"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Tang, F., Dong, W., Huang, H., Ma, C., Lee, T.Y., and Xu, C. (2022, January 7\u201311). Domain enhanced arbitrary image style transfer via contrastive learning. Proceedings of the ACM SIGGRAPH 2022 Conference Proceedings, Vancouver, BC, Canada.","DOI":"10.1145\/3528233.3530736"},{"key":"ref_21","unstructured":"Liu, S., Ye, J., and Wang, X. (2023). Any-to-any style transfer: Making picasso and da vinci collaborate. arXiv."},{"key":"ref_22","unstructured":"Li, Y., Fang, C., Yang, J., Wang, Z., Lu, X., and Yang, M.H. (2017, January 4\u20139). Universal style transfer via feature transforms. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Liu, S., Lin, T., He, D., Li, F., Wang, M., Li, X., Sun, Z., Li, Q., and Ding, E. (2021, January 11\u201317). Adaattn: Revisit attention mechanism in arbitrary neural style transfer. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00658"},{"key":"ref_24","unstructured":"Zhu, Z.X., Mao, Y.S., and Cai, K.W. (2023, January 7\u20139). Image style transfer method for industrial inspection. Proceedings of the Computer Engineering and Applications, Hangzhou, China."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Han, J., Shoeiby, M., Petersson, L., and Armin, M.A. (2021, January 20\u201325). Dual contrastive learning for unsupervised image-to-image translation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPRW53098.2021.00084"},{"key":"ref_26","unstructured":"Nichol, A., Dhariwal, P., Ramesh, A., Shyam, P., Mishkin, P., McGrew, B., Sutskever, I., and Chen, M. (2021). 
Glide: Towards photorealistic image generation and editing with text-guided diffusion models. arXiv."},{"key":"ref_27","unstructured":"Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., and Chen, M. (2022). Hierarchical text-conditional image generation with clip latents. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. (2022, January 18\u201324). High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Avrahami, O., Lischinski, D., and Fried, O. (2022, January 18\u201324). Blended diffusion for text-driven editing of natural images. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01767"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Brooks, T., Holynski, A., and Efros, A.A. (2023, January 17\u201324). Instructpix2pix: Learning to follow image editing instructions. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01764"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Cao, M., Wang, X., Qi, Z., Shan, Y., Qie, X., and Zheng, Y. (2023, January 2\u20136). Masactrl: Tuning-free mutual self-attention control for consistent image synthesis and editing. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.02062"},{"key":"ref_32","unstructured":"Couairon, G., Verbeek, J., Schwenk, H., and Cord, M. (2022). Diffedit: Diffusion-based semantic image editing with mask guidance. arXiv."},{"key":"ref_33","unstructured":"Hertz, A., Mokady, R., Tenenbaum, J., Aberman, K., Pritch, Y., and Cohen-Or, D. (2022). 
Prompt-to-prompt image editing with cross attention control. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wu, C.H., and De la Torre, F. (2023, January 2\u20136). A latent space of stochastic diffusion models for zero-shot image editing and guidance. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00678"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Han, L., Ghosh, A., Metaxas, D.N., and Ren, J. (2023, January 17\u201324). Sine: Single image editing with text-to-image diffusion models. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00584"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Qi, T., Fang, S., Wu, Y., Xie, H., Liu, J., Chen, L., and Zhang, Y. (2024, January 16\u201322). DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00830"},{"key":"ref_37","unstructured":"Jeong, J., Kwon, M., and Uh, Y. (2023). Training-free style transfer emerges from h-space in diffusion models. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Lin, H., Cheng, X., Wu, X., Shen, D., Wang, Z., Song, Q., and Yuan, W. (2022, January 18\u201322). Cat: Cross attention in vision transformer. Proceedings of the 2022 IEEE International Conference on Multimedia and Expo, Taipei, Taiwan.","DOI":"10.1109\/ICME52920.2022.9859720"},{"key":"ref_39","unstructured":"Ulyanov, D., Vedaldi, A., and Lempitsky, V. (2016). Instance normalization: The missing ingredient for fast stylization. arXiv."},{"key":"ref_40","unstructured":"Song, J., Meng, C., and Ermon, S. (2020). Denoising diffusion implicit models. 
arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Lawrence Zitnick, C., and Doll\u00e1r, P. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"394","DOI":"10.1109\/TIP.2018.2866698","article-title":"Improved ArtGAN for conditional synthesis of natural image and artwork","volume":"28","author":"Tan","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Deng, Y., Tang, F., Dong, W., Ma, C., Pan, X., Wang, L., and Xu, C. (2022, January 18\u201324). Stytr2: Image style transfer with transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01104"},{"key":"ref_44","unstructured":"Kwon, G., and Ye, J.C. (2022). Diffusion-based image translation using disentangled style and content representation. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Deng, Y., Tang, F., Dong, W., Sun, W., Huang, F., and Xu, C. (2020, January 12\u201316). Arbitrary style transfer via multi-adaptation network. 
Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA.","DOI":"10.1145\/3394171.3414015"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/10\/588\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:03:57Z","timestamp":1760112237000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/10\/588"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,26]]},"references-count":45,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2024,10]]}},"alternative-id":["info15100588"],"URL":"https:\/\/doi.org\/10.3390\/info15100588","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2024,9,26]]}}}