{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T22:09:39Z","timestamp":1740175779347,"version":"3.37.3"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T00:00:00Z","timestamp":1722470400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T00:00:00Z","timestamp":1722470400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62006240"],"award-info":[{"award-number":["62006240"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Score-based diffusion models have shown promising results in unpaired image-to-image translation (I2I). However, the existing methods only perform unpaired I2I in pixel space, which requires high computation costs. To this end, we propose guiding stochastic differential equations in latent space (Latent-SDE) that extracts domain-specific and domain-independent features of the image in the latent space to calculate the loss and guides the inference process of a pretrained SDE in the latent space for unpaired I2I. To refine the image in the latent space, we propose a latent time-travel strategy that increases the sampling timestep. Empirically, we compare Latent-SDE to the baseline of the score-based diffusion model on three widely adopted unpaired I2I tasks under two metrics. 
Latent-SDE achieves state-of-the-art on Cat <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\rightarrow $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mo>\u2192<\/mml:mo>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> Dog and is competitive on the other two tasks. Our code will be freely available for public use upon acceptance at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/zhangXJ147\/Latent-SDE\">https:\/\/github.com\/zhangXJ147\/Latent-SDE<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s40747-024-01566-1","type":"journal-article","created":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T09:12:56Z","timestamp":1722503576000},"page":"7765-7775","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Latent-SDE: guiding stochastic differential equations in latent space for unpaired image-to-image translation"],"prefix":"10.1007","volume":"10","author":[{"given":"Xianjie","family":"Zhang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Min","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yujie","family":"He","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yao","family":"Gou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yusen","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,8,1]]},"reference":[{"issue":"3","key":"1566_CR1","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1016\/0304-4149(82)90051-5","volume":"12","author":"BD 
Anderson","year":"1982","unstructured":"Anderson BD (1982) Reverse-time diffusion equation models. Stoch Process Appl 12(3):313\u2013326. https:\/\/doi.org\/10.1016\/0304-4149(82)90051-5","journal-title":"Stoch Process Appl"},{"key":"1566_CR2","doi-asserted-by":"publisher","unstructured":"Choi J, Kim S, Jeong Y, Gwon Y, Yoon S (2021) Ilvr: conditioning method for denoising diffusion probabilistic models. In: 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), pp 14347\u201314356. https:\/\/doi.org\/10.1109\/ICCV48922.2021.01410","DOI":"10.1109\/ICCV48922.2021.01410"},{"key":"1566_CR3","doi-asserted-by":"publisher","unstructured":"Choi Y, Uh Y, Yoo J, Ha JW (2020) Stargan v2: diverse image synthesis for multiple domains. In: 2020 IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 8185\u20138194. https:\/\/doi.org\/10.1109\/CVPR42600.2020.00821","DOI":"10.1109\/CVPR42600.2020.00821"},{"key":"1566_CR4","unstructured":"Dhariwal P, Nichol A (2021) Diffusion models beat GANS on image synthesis. In: Ranzato M, Beygelzimer A, Dauphin Y, Liang P, Vaughan JW (eds) Advances in neural information processing systems, vol 34. Curran Associates, pp 8780\u20138794"},{"key":"1566_CR5","doi-asserted-by":"publisher","unstructured":"Han J, Shoeiby M, Petersson L, Armin MA (2021) Dual contrastive learning for unsupervised image-to-image translation. In: 2021 IEEE\/CVF conference on computer vision and pattern recognition workshops (CVPRW), pp 746\u2013755. https:\/\/doi.org\/10.1109\/CVPRW53098.2021.00084","DOI":"10.1109\/CVPRW53098.2021.00084"},{"key":"1566_CR6","unstructured":"Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) Gans trained by a two time-scale update rule converge to a local Nash equilibrium. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30. 
Curran Associates"},{"key":"1566_CR7","unstructured":"Ho J, Jain A, Abbeel P (2020) Denoising diffusion probabilistic models. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in neural information processing systems, vol 33. Curran Associates, pp 6840\u20136851"},{"key":"1566_CR8","unstructured":"Karras T, Aila T, Laine S, Lehtinen J (2018) Progressive growing of GANs for improved quality, stability, and variation. In: International conference on learning representations"},{"key":"1566_CR9","unstructured":"Kim B, Kwon G, Kim K, Ye JC (2024) Unpaired image-to-image translation via neural Schr\u00f6dinger bridge. In: ICLR"},{"key":"1566_CR10","doi-asserted-by":"publisher","unstructured":"Kim K, Park S, Jeon E, Kim T, Kim D (2022) A style-aware discriminator for controllable image translation. IEEE Computer Society, pp 18218\u201318227. https:\/\/doi.org\/10.1109\/CVPR52688.2022.01770","DOI":"10.1109\/CVPR52688.2022.01770"},{"key":"1566_CR11","doi-asserted-by":"publisher","unstructured":"Li S, Van De\u00a0Weijer J, Wang Y, Khan FS, Liu M, Yang J (2023) 3D-aware multi-class image-to-image translation with NeRFs. IEEE Computer Society, pp 12652\u201312662. https:\/\/doi.org\/10.1109\/CVPR52729.2023.01217","DOI":"10.1109\/CVPR52729.2023.01217"},{"key":"1566_CR12","unstructured":"Meng C, He Y, Song Y, Song J, Wu J, Zhu JY, Ermon S (2022) SDEdit: guided image synthesis and editing with stochastic differential equations. In: International conference on learning representations"},{"key":"1566_CR13","unstructured":"Nichol AQ, Dhariwal P (2021) Improved denoising diffusion probabilistic models. In: Meila M, Zhang T (eds) Proceedings of the 38th international conference on machine learning, proceedings of machine learning research, vol 139. PMLR, pp 8162\u20138171"},{"key":"1566_CR14","doi-asserted-by":"crossref","unstructured":"Park T, Efros AA, Zhang R, Zhu JY (2020) Contrastive learning for unpaired image-to-image translation. 
In: Vedaldi A, Bischof H, Brox T, Frahm JM (eds) Computer Vision\u2014ECCV 2020. Springer International Publishing, Cham, pp 319\u2013345","DOI":"10.1007\/978-3-030-58545-7_19"},{"key":"1566_CR15","doi-asserted-by":"publisher","unstructured":"Rombach R, Blattmann A, Lorenz D, Esser P, Ommer B (2022) High-resolution image synthesis with latent diffusion models. In: 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 10674\u201310685. https:\/\/doi.org\/10.1109\/CVPR52688.2022.01042","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"1566_CR16","unstructured":"Song J, Meng C, Ermon S (2021) Denoising diffusion implicit models. In: International conference on learning representations"},{"key":"1566_CR17","unstructured":"Song Y, Sohl-Dickstein J, Kingma DP, Kumar A, Ermon S, Poole B (2021) Score-based generative modeling through stochastic differential equations. In: International conference on learning representations"},{"key":"1566_CR18","unstructured":"Sun S, Wei L, Xing J, Jia J, Tian Q (2023) SDDM: Score-decomposed diffusion models on manifolds for unpaired image-to-image translation. In: Krause A, Brunskill E, Cho K, Engelhardt B, Sabato S, Scarlett J (eds) Proceedings of the 40th international conference on machine learning, proceedings of machine learning research, vol 202. PMLR, pp 33115\u201333134"},{"key":"1566_CR19","doi-asserted-by":"publisher","unstructured":"Wang W, Zhou W, Bao J, Chen D, Li H (2021) Instance-wise hard negative example generation for contrastive learning in unpaired image-to-image translation. In: 2021 IEEE\/CVF international conference on computer vision (ICCV), pp 14000\u201314009. https:\/\/doi.org\/10.1109\/ICCV48922.2021.01376","DOI":"10.1109\/ICCV48922.2021.01376"},{"key":"1566_CR20","doi-asserted-by":"publisher","unstructured":"Wang Y, Gonzalez-Garcia A, Berga D, Herranz L, Khan FS, Van De\u00a0Weijer J (2020) Minegan: effective knowledge transfer from GANS to target domains with few images. 
IEEE Computer Society, pp 9329\u20139338. https:\/\/doi.org\/10.1109\/CVPR42600.2020.00935","DOI":"10.1109\/CVPR42600.2020.00935"},{"key":"1566_CR21","doi-asserted-by":"crossref","unstructured":"Wang Y, Gonzalez-Garcia A, Wu C, Herranz L, Khan FS, Jui S, Yang J, van\u00a0de Weijer J MineGAN++: mining generative models for efficient knowledge transfer to limited data domains. Int J Comput Vis 132(2):490\u2013514","DOI":"10.1007\/s11263-023-01882-y"},{"key":"1566_CR22","doi-asserted-by":"publisher","unstructured":"Wang Y, Wu C, Herranz L, van\u00a0de Weijer J, Gonzalez-Garcia A, Raducanu B (2018) Transferring GANS: generating images from limited data. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol 11210 LNCS, pp 220\u2013236. https:\/\/doi.org\/10.1007\/978-3-030-01231-1_14","DOI":"10.1007\/978-3-030-01231-1_14"},{"issue":"4","key":"1566_CR23","doi-asserted-by":"publisher","first-page":"600","DOI":"10.1109\/TIP.2003.819861","volume":"13","author":"Z Wang","year":"2004","unstructured":"Wang Z, Bovik A, Sheikh H, Simoncelli E (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600\u2013612. https:\/\/doi.org\/10.1109\/TIP.2003.819861","journal-title":"IEEE Trans Image Process"},{"key":"1566_CR24","doi-asserted-by":"publisher","unstructured":"Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: unsupervised dual learning for image-to-image translation. In: 2017 IEEE international conference on computer vision (ICCV), pp 2868\u20132876. https:\/\/doi.org\/10.1109\/ICCV.2017.310","DOI":"10.1109\/ICCV.2017.310"},{"key":"1566_CR25","doi-asserted-by":"crossref","unstructured":"Yu J, Wang Y, Zhao C, Ghanem B, Zhang J (2023) Freedom: training-free energy-guided conditional diffusion model. 
In: Proceedings of the IEEE\/CVF international conference on computer vision (ICCV), pp 23174\u201323184","DOI":"10.1109\/ICCV51070.2023.02118"},{"issue":"2","key":"1566_CR26","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1109\/JAS.2022.106004","volume":"10","author":"W Zhang","year":"2023","unstructured":"Zhang W, Deng L, Zhang L, Wu D (2023) A survey on negative transfer. IEEE\/CAA J Autom Sin 10(2):305\u2013329. https:\/\/doi.org\/10.1109\/JAS.2022.106004","journal-title":"IEEE\/CAA J Autom Sin"},{"key":"1566_CR27","unstructured":"Zhao M, Bao F, LI C, Zhu J (2022) Egsde: Unpaired image-to-image translation via energy-guided stochastic differential equations. In: Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A (eds) Advances in neural information processing systems, vol\u00a035. Curran Associates, pp 3609\u20133623"},{"key":"1566_CR28","doi-asserted-by":"publisher","unstructured":"Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE international conference on computer vision (ICCV), pp 2242\u20132251. 
https:\/\/doi.org\/10.1109\/ICCV.2017.244","DOI":"10.1109\/ICCV.2017.244"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01566-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01566-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01566-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T22:12:21Z","timestamp":1729116741000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01566-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,1]]},"references-count":28,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["1566"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01566-1","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"type":"print","value":"2199-4536"},{"type":"electronic","value":"2198-6053"}],"subject":[],"published":{"date-parts":[[2024,8,1]]},"assertion":[{"value":"11 December 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 July 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 August 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The research was conducted without any commercial or financial relationships that could be construed as a potential 
conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}