{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,20]],"date-time":"2026-03-20T15:47:42Z","timestamp":1774021662726,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":56,"publisher":"ACM","funder":[{"name":"Hong Kong Research Grants Council (RGC) General Research Fund","award":["CityU 11216122"],"award-info":[{"award-number":["CityU 11216122"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,8,10]]},"DOI":"10.1145\/3721238.3730707","type":"proceedings-article","created":{"date-parts":[[2025,7,23]],"date-time":"2025-07-23T08:40:47Z","timestamp":1753260047000},"page":"1-11","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Style Customization of Text-to-Vector Generation with Image Diffusion Priors"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3517-808X","authenticated-orcid":false,"given":"Peiying","family":"Zhang","sequence":"first","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4007-2776","authenticated-orcid":false,"given":"Nanxuan","family":"Zhao","sequence":"additional","affiliation":[{"name":"Adobe Research, San Jose, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7014-5377","authenticated-orcid":false,"given":"Jing","family":"Liao","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China"}]}],"member":"320","published-online":{"date-parts":[[2025,7,27]]},"reference":[{"key":"e_1_3_3_3_2_1","unstructured":"Josh Achiam Steven Adler Sandhini Agarwal Lama Ahmad Ilge Akkaya Florencia\u00a0Leoni Aleman Diogo Almeida Janko Altenschmidt Sam Altman Shyamal Anadkat et\u00a0al. 2023. Gpt-4 technical report. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2303.08774 (2023)."},{"key":"e_1_3_3_3_3_1","unstructured":"Alexandre Carlier Martin Danelljan Alexandre Alahi and Radu Timofte. 2020. Deepsvg: A hierarchical generative network for vector graphics animation. Advances in Neural Information Processing Systems 33 (2020) 16351\u201316361."},{"key":"e_1_3_3_3_4_1","unstructured":"Junsong Chen Jincheng Yu Chongjian Ge Lewei Yao Enze Xie Yue Wu Zhongdao Wang James Kwok Ping Luo Huchuan Lu et\u00a0al. 2023. Pixart-\u03b1 : Fast training of diffusion transformer for photorealistic text-to-image synthesis. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2310.00426 (2023)."},{"key":"e_1_3_3_3_5_1","unstructured":"Jian Chen Ruiyi Zhang Yufan Zhou Rajiv Jain Zhiqiang Xu Ryan Rossi and Changyou Chen. 2024. Towards aligned layout generation via diffusion model with aesthetic constraints. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2402.04754 (2024)."},{"key":"e_1_3_3_3_6_1","unstructured":"Louis Clou\u00e2tre and Marc Demers. 2019. Figr: Few-shot image generation with reptile. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/1901.02199 (2019)."},{"key":"e_1_3_3_3_7_1","volume-title":"Forty-first International Conference on Machine Learning","author":"Crowson Katherine","year":"2024","unstructured":"Katherine Crowson, Stefan\u00a0Andreas Baumann, Alex Birch, Tanishq\u00a0Mathew Abraham, Daniel\u00a0Z Kaplan, and Enrico Shippole. 2024. Scalable high-resolution pixel-space image synthesis with hourglass diffusion transformers. In Forty-first International Conference on Machine Learning."},{"key":"e_1_3_3_3_8_1","doi-asserted-by":"crossref","unstructured":"Edoardo\u00a0Alberto Dominici Nico Schertler Jonathan Griffin Shayan Hoshyari Leonid Sigal and Alla Sheffer. 2020. Polyfit: Perception-aligned vectorization of raster clip-art via intermediate polygonal fitting. ACM Transactions on Graphics (TOG) 39 4 (2020) 77\u20131.","DOI":"10.1145\/3386569.3392401"},{"key":"e_1_3_3_3_9_1","unstructured":"Valeria Efimova Artyom Chebykin Ivan Jarsky Evgenii Prosvirnin and Andrey Filchenkov. 2023. Neural Style Transfer for Vector Graphics. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2303.03405 (2023)."},{"key":"e_1_3_3_3_10_1","doi-asserted-by":"crossref","unstructured":"Jean-Dominique Favreau Florent Lafarge and Adrien Bousseau. 2017. Photo2clipart: Image abstraction and vectorization using layered linear gradients. ACM Transactions on Graphics (TOG) 36 6 (2017) 1\u201311.","DOI":"10.1145\/3130800.3130888"},{"key":"e_1_3_3_3_11_1","doi-asserted-by":"crossref","unstructured":"Kevin Frans Lisa Soros and Olaf Witkowski. 2022. Clipdraw: Exploring text-to-drawing synthesis through language-image encoders. Advances in Neural Information Processing Systems 35 (2022) 5207\u20135218.","DOI":"10.52202\/068431-0376"},{"key":"e_1_3_3_3_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-72684-2_11"},{"key":"e_1_3_3_3_13_1","unstructured":"Rinon Gal Yuval Alaluf Yuval Atzmon Or Patashnik Amit\u00a0H Bermano Gal Chechik and Daniel Cohen-Or. 2022. An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2208.01618 (2022)."},{"key":"e_1_3_3_3_14_1","unstructured":"Rinon Gal Yael Vinker Yuval Alaluf Amit\u00a0H Bermano Daniel Cohen-Or Ariel Shamir and Gal Chechik. 2023. Breathing Life Into Sketches Using Text-to-Video Priors. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2311.13608 (2023)."},{"key":"e_1_3_3_3_15_1","unstructured":"Jonathan Ho Ajay Jain and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in neural information processing systems 33 (2020) 6840\u20136851."},{"key":"e_1_3_3_3_16_1","unstructured":"Jonathan Ho and Tim Salimans. 2022. Classifier-free diffusion guidance. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2207.12598 (2022)."},{"key":"e_1_3_3_3_17_1","doi-asserted-by":"crossref","unstructured":"Shayan Hoshyari Edoardo\u00a0Alberto Dominici Alla Sheffer Nathan Carr Zhaowen Wang Duygu Ceylan and I-Chao Shen. 2018. Perception-driven semi-structured boundary vectorization. ACM Transactions on Graphics (TOG) 37 4 (2018) 1\u201314.","DOI":"10.1145\/3197517.3201312"},{"key":"e_1_3_3_3_18_1","unstructured":"Edward\u00a0J Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2021. Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2106.09685 (2021)."},{"key":"e_1_3_3_3_19_1","unstructured":"Adobe Illustrator. 2023. Turn ideas into illustrations with Text to Vector Graphic. https:\/\/www.adobe.com\/products\/illustrator\/text-to-vector-graphic.html."},{"key":"e_1_3_3_3_20_1","unstructured":"Illustroke. 2024. Stunning vector illustrations from text prompts. https:\/\/illustroke.com\/."},{"key":"e_1_3_3_3_21_1","unstructured":"Shir Iluz Yael Vinker Amir Hertz Daniel Berio Daniel Cohen-Or and Ariel Shamir. 2023. Word-as-image for semantic typography. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2303.01818 (2023)."},{"key":"e_1_3_3_3_22_1","unstructured":"Ajay Jain Amber Xie and Pieter Abbeel. 2022. VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2211.11319 (2022)."},{"key":"e_1_3_3_3_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1964921.1964994"},{"key":"e_1_3_3_3_24_1","unstructured":"Nupur Kumari Bingliang Zhang Richard Zhang Eli Shechtman and Jun-Yan Zhu. 2022. Multi-Concept Customization of Text-to-Image Diffusion. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2212.04488 (2022)."},{"key":"e_1_3_3_3_25_1","doi-asserted-by":"crossref","unstructured":"Tzu-Mao Li Michal Luk\u00e1\u010d Micha\u00ebl Gharbi and Jonathan Ragan-Kelley. 2020. Differentiable vector graphics rasterization for editing and learning. ACM Transactions on Graphics (TOG) 39 6 (2020) 1\u201315.","DOI":"10.1145\/3414685.3417871"},{"key":"e_1_3_3_3_26_1","unstructured":"Eric Luhman and Troy Luhman. 2021. Knowledge distillation in iterative generative models for improved sampling speed. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2101.02388 (2021)."},{"key":"e_1_3_3_3_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01583"},{"key":"e_1_3_3_3_28_1","unstructured":"Sourab Mangrulkar Sylvain Gugger Lysandre Debut Younes Belkada Sayak Paul and Benjamin Bossan. 2022. PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods. https:\/\/github.com\/huggingface\/peft."},{"key":"e_1_3_3_3_29_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v38i5.28226"},{"key":"e_1_3_3_3_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00387"},{"key":"e_1_3_3_3_31_1","unstructured":"Ben Poole Ajay Jain Jonathan\u00a0T Barron and Ben Mildenhall. 2022. Dreamfusion: Text-to-3d using 2d diffusion. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2209.14988 (2022)."},{"key":"e_1_3_3_3_32_1","first-page":"8748","volume-title":"International conference on machine learning","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong\u00a0Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et\u00a0al. 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748\u20138763."},{"key":"e_1_3_3_3_33_1","unstructured":"Juan\u00a0A Rodriguez Shubham Agarwal Issam\u00a0H Laradji Pau Rodriguez David Vazquez Christopher Pal and Marco Pedersoli. 2023. StarVector: Generating Scalable Vector Graphics Code from Images. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2312.11556 (2023)."},{"key":"e_1_3_3_3_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_3_3_3_35_1","unstructured":"Nataniel Ruiz Yuanzhen Li Varun Jampani Yael Pritch Michael Rubinstein and Kfir Aberman. 2022. Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2208.12242 (2022)."},{"key":"e_1_3_3_3_36_1","doi-asserted-by":"crossref","unstructured":"Peter Schaldenbrand Zhixuan Liu and Jean Oh. 2022. Styleclipdraw: Coupling content and style in text-to-drawing translation. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2202.12362 (2022).","DOI":"10.24963\/ijcai.2022\/688"},{"key":"e_1_3_3_3_37_1","unstructured":"Christoph Schuhmann. 2022. Improved Aesthetic Predictor. https:\/\/github.com\/christophschuhmann\/improved-aesthetic-predictor."},{"key":"e_1_3_3_3_38_1","unstructured":"Peter Selinger. 2003. Potrace: a polygon-based tracing algorithm."},{"key":"e_1_3_3_3_39_1","unstructured":"Kihyuk Sohn Nataniel Ruiz Kimin Lee Daniel\u00a0Castro Chin Irina Blok Huiwen Chang Jarred Barber Lu Jiang Glenn Entis Yuanzhen Li et\u00a0al. 2023. Styledrop: Text-to-image generation in any style. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2306.00983 (2023)."},{"key":"e_1_3_3_3_40_1","unstructured":"Yiren Song Xning Shao Kang Chen Weidong Zhang Minzhe Li and Zhongliang Jing. 2022. CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2212.02122 (2022)."},{"key":"e_1_3_3_3_41_1","unstructured":"Yang Song Jascha Sohl-Dickstein Diederik\u00a0P Kingma Abhishek Kumar Stefano Ermon and Ben Poole. 2020. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2011.13456 (2020)."},{"key":"e_1_3_3_3_42_1","unstructured":"Zecheng Tang Chenfei Wu Zekai Zhang Mingheng Ni Shengming Yin Yu Liu Zhengyuan Yang Lijuan Wang Zicheng Liu Juntao Li et\u00a0al. 2024. StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2401.17093 (2024)."},{"key":"e_1_3_3_3_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00759"},{"key":"e_1_3_3_3_44_1","doi-asserted-by":"crossref","unstructured":"Yael Vinker Ehsan Pajouheshgar Jessica\u00a0Y Bo Roman\u00a0Christian Bachmann Amit\u00a0Haim Bermano Daniel Cohen-Or Amir Zamir and Ariel Shamir. 2022. Clipasso: Semantically-aware object sketching. ACM Transactions on Graphics (TOG) 41 4 (2022) 1\u201311.","DOI":"10.1145\/3528223.3530068"},{"key":"e_1_3_3_3_45_1","volume-title":"The Eleventh International Conference on Learning Representations","author":"Wang Qiang","year":"2023","unstructured":"Qiang Wang, Haoge Deng, Yonggang Qi, Da Li, and Yi-Zhe Song. 2023a. Sketchknitter: Vectorized sketch generation with diffusion models. In The Eleventh International Conference on Learning Representations."},{"key":"e_1_3_3_3_46_1","unstructured":"Zhengyi Wang Cheng Lu Yikai Wang Fan Bao Chongxuan Li Hang Su and Jun Zhu. 2023b. ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2305.16213 (2023)."},{"key":"e_1_3_3_3_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3586183.3606751"},{"key":"e_1_3_3_3_48_1","doi-asserted-by":"crossref","unstructured":"Ronghuan Wu Wanchao Su Kede Ma and Jing Liao. 2023. IconShop: Text-Guided Vector Icon Synthesis with Autoregressive Transformers. ACM Transactions on Graphics (TOG) 42 6 (2023) 1\u201314.","DOI":"10.1145\/3618364"},{"key":"e_1_3_3_3_49_1","unstructured":"Ronghuan Wu Wanchao Su Kede Ma and Jing Liao. 2024. AniClipart: Clipart animation with text-to-video priors. International Journal of Computer Vision (2024) 1\u201317."},{"key":"e_1_3_3_3_50_1","unstructured":"Ximing Xing Juncheng Hu Jing Zhang Dong Xu and Qian Yu. 2024. SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2412.10437 (2024)."},{"key":"e_1_3_3_3_51_1","unstructured":"Ximing Xing Chuang Wang Haitao Zhou Jing Zhang Qian Yu and Dong Xu. 2023a. DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2306.14685 (2023)."},{"key":"e_1_3_3_3_52_1","unstructured":"Ximing Xing Haitao Zhou Chuang Wang Jing Zhang Dong Xu and Qian Yu. 2023b. SVGDreamer: Text Guided SVG Generation with Diffusion Model. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2312.16476 (2023)."},{"key":"e_1_3_3_3_53_1","unstructured":"Hu Ye Jun Zhang Sibo Liu Xiao Han and Wei Yang. 2023. Ip-adapter: Text compatible image prompt adapter for text-to-image diffusion models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2308.06721 (2023)."},{"key":"e_1_3_3_3_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00632"},{"key":"e_1_3_3_3_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00355"},{"key":"e_1_3_3_3_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3610548.3618232"},{"key":"e_1_3_3_3_57_1","doi-asserted-by":"crossref","unstructured":"Peiying Zhang Nanxuan Zhao and Jing Liao. 2024. Text-to-vector generation with neural path representation. ACM Transactions on Graphics (TOG) 43 4 (2024) 1\u201313.","DOI":"10.1145\/3658204"}],"event":{"name":"SIGGRAPH Conference Papers '25: Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers","location":"Vancouver BC Canada","acronym":"SIGGRAPH Conference Papers '25","sponsor":["SIGGRAPH ACM Special Interest Group on Computer Graphics and Interactive Techniques"]},"container-title":["Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3721238.3730707","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,20]],"date-time":"2026-03-20T14:53:44Z","timestamp":1774018424000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3721238.3730707"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,27]]},"references-count":56,"alternative-id":["10.1145\/3721238.3730707","10.1145\/3721238"],"URL":"https:\/\/doi.org\/10.1145\/3721238.3730707","relation":{},"subject":[],"published":{"date-parts":[[2025,7,27]]},"assertion":[{"value":"2025-07-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}