{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T15:52:59Z","timestamp":1776095579065,"version":"3.50.1"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T00:00:00Z","timestamp":1721347200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NSF IIS","award":["2144117"],"award-info":[{"award-number":["2144117"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2024,7,19]]},"abstract":"<jats:p>Recent advances in generative imagery have brought forth outpainting and inpainting models that can produce high-quality, plausible image content in unknown regions. However, the content these models hallucinate is necessarily inauthentic, since they are unaware of the true scene. In this work, we propose RealFill, a novel generative approach for image completion that fills in missing regions of an image with the content that should have been there. RealFill is a generative inpainting model that is personalized using only a few reference images of a scene. These reference images do not have to be aligned with the target image, and can be taken with drastically varying viewpoints, lighting conditions, camera apertures, or image styles. Once personalized, RealFill is able to complete a target image with visually compelling contents that are faithful to the original scene. We evaluate RealFill on a new image completion benchmark that covers a set of diverse and challenging scenarios, and find that it outperforms existing approaches by a large margin. Project page: https:\/\/realfill.github.io.<\/jats:p>","DOI":"10.1145\/3658237","type":"journal-article","created":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T14:47:57Z","timestamp":1721400477000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":24,"title":["RealFill: Reference-Driven Generation for Authentic Image Completion"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6944-1101","authenticated-orcid":false,"given":"Luming","family":"Tang","sequence":"first","affiliation":[{"name":"Cornell University, Ithaca, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9966-6456","authenticated-orcid":false,"given":"Nataniel","family":"Ruiz","sequence":"additional","affiliation":[{"name":"Google Research, Boston, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4309-0969","authenticated-orcid":false,"given":"Qinghao","family":"Chu","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9831-8249","authenticated-orcid":false,"given":"Yuanzhen","family":"Li","sequence":"additional","affiliation":[{"name":"Google Research, Boston, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-6915-0126","authenticated-orcid":false,"given":"Aleksander","family":"Holynski","sequence":"additional","affiliation":[{"name":"Google Research, San Francisco, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-8710-6065","authenticated-orcid":false,"given":"David E.","family":"Jacobs","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2309-4703","authenticated-orcid":false,"given":"Bharath","family":"Hariharan","sequence":"additional","affiliation":[{"name":"Cornell University, Ithaca, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5419-3915","authenticated-orcid":false,"given":"Yael","family":"Pritch","sequence":"additional","affiliation":[{"name":"Google Research, Tel Aviv, Israel"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-1874-563X","authenticated-orcid":false,"given":"Neal","family":"Wadhwa","sequence":"additional","affiliation":[{"name":"Google Research, New York, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4958-601X","authenticated-orcid":false,"given":"Kfir","family":"Aberman","sequence":"additional","affiliation":[{"name":"Snap Research, Palo Alto, United States of America"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3707-3807","authenticated-orcid":false,"given":"Michael","family":"Rubinstein","sequence":"additional","affiliation":[{"name":"Google Research, Boston, United States of America"}]}],"member":"320","published-online":{"date-parts":[[2024,7,19]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"Adobe Inc. 2023. Adobe Photoshop. https:\/\/www.adobe.com\/products\/photoshop.html"},{"key":"e_1_2_2_2_1","volume-title":"Break-A-Scene: Extracting Multiple Concepts from a Single Image. ArXiv preprint abs\/2305.16311","author":"Avrahami Omri","year":"2023","unstructured":"Omri Avrahami, Kfir Aberman, Ohad Fried, Daniel Cohen-Or, and Dani Lischinski. 2023. Break-A-Scene: Extracting Multiple Concepts from a Single Image. ArXiv preprint abs\/2305.16311 (2023). https:\/\/arxiv.org\/abs\/2305.16311"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1531326.1531330"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/344779.344972"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01764"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00951"},{"key":"e_1_2_2_7_1","volume-title":"Muse: Text-to-image generation via masked generative transformers. ArXiv preprint abs\/2301.00704","author":"Chang Huiwen","year":"2023","unstructured":"Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T Freeman, Michael Rubinstein, et al. 2023. Muse: Text-to-image generation via masked generative transformers. ArXiv preprint abs\/2301.00704 (2023). https:\/\/arxiv.org\/abs\/2301.00704"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01103"},{"key":"e_1_2_2_9_1","volume-title":"Subject-driven text-to-image generation via apprenticeship learning. ArXiv preprint abs\/2304.00186","author":"Chen Wenhu","year":"2023","unstructured":"Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, and William W Cohen. 2023. Subject-driven text-to-image generation via apprenticeship learning. ArXiv preprint abs\/2304.00186 (2023). https:\/\/arxiv.org\/abs\/2304.00186"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2003.1211538"},{"key":"e_1_2_2_11_1","volume-title":"Diffusion Models Beat GANs on Image Synthesis. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021","author":"Dhariwal Prafulla","year":"2021","unstructured":"Prafulla Dhariwal and Alexander Quinn Nichol. 2021. Diffusion Models Beat GANs on Image Synthesis. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6--14, 2021, virtual, Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.). 8780--8794. https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/49ad23d1ec9fa4bd8d77d02681df5cfa-Abstract.html"},{"key":"e_1_2_2_12_1","volume-title":"DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data. ArXiv preprint abs\/2306.09344","author":"Fu Stephanie","year":"2023","unstructured":"Stephanie Fu, Netanel Tamir, Shobhita Sundaram, Lucy Chai, Richard Zhang, Tali Dekel, and Phillip Isola. 2023. DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data. ArXiv preprint abs\/2306.09344 (2023). https:\/\/arxiv.org\/abs\/2306.09344"},{"key":"e_1_2_2_13_1","volume-title":"An image is worth one word: Personalizing text-to-image generation using textual inversion. ArXiv preprint abs\/2208.01618","author":"Gal Rinon","year":"2022","unstructured":"Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H Bermano, Gal Chechik, and Daniel Cohen-Or. 2022. An image is worth one word: Personalizing text-to-image generation using textual inversion. ArXiv preprint abs\/2208.01618 (2022). https:\/\/arxiv.org\/abs\/2208.01618"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276382"},{"key":"e_1_2_2_15_1","volume-title":"Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020","author":"Ho Jonathan","year":"2020","unstructured":"Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6--12, 2020, virtual, Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/4c5bcfec8584af0d967f1ab10179ca4b-Abstract.html"},{"key":"e_1_2_2_16_1","volume-title":"LoRA: Low-Rank Adaptation of Large Language Models. In The Tenth International Conference on Learning Representations, ICLR 2022","author":"Hu Edward J.","year":"2022","unstructured":"Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25--29, 2022. OpenReview.net. https:\/\/openreview.net\/forum?id=nZeVKeeFYf9"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3073659"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00582"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW56347.2022.00063"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01252-6_6"},{"key":"e_1_2_2_21_1","volume-title":"Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick.","author":"Liu Ruoshi","year":"2023","unstructured":"Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick. 2023. Zero-1-to-3: Zero-shot One Image to 3D Object. arXiv:2303.11328 [cs.CV]"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01117"},{"key":"e_1_2_2_23_1","volume-title":"Aleksander Holynski, and Trevor Darrell.","author":"Luo Grace","year":"2023","unstructured":"Grace Luo, Lisa Dunlap, Dong Huk Park, Aleksander Holynski, and Trevor Darrell. 2023. Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence. ArXiv preprint abs\/2305.14334 (2023). https:\/\/arxiv.org\/abs\/2305.14334"},{"key":"e_1_2_2_24_1","doi-asserted-by":"crossref","unstructured":"Ben Mildenhall Pratul P. Srinivasan Matthew Tancik Jonathan T. Barron Ravi Ramamoorthi and Ren Ng. 2020. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In ECCV.","DOI":"10.1007\/978-3-030-58452-8_24"},{"key":"e_1_2_2_25_1","volume-title":"T2i-adapter: Learning adapters to dig out more controllable ability for text-to-image diffusion models. ArXiv preprint abs\/2302.08453","author":"Mou Chong","year":"2023","unstructured":"Chong Mou, Xintao Wang, Liangbin Xie, Yanze Wu, Jian Zhang, Zhongang Qi, Ying Shan, and Xiaohu Qie. 2023. T2i-adapter: Learning adapters to dig out more controllable ability for text-to-image diffusion models. ArXiv preprint abs\/2302.08453 (2023). https:\/\/arxiv.org\/abs\/2302.08453"},{"key":"e_1_2_2_26_1","volume-title":"Accessed","author":"AI.","year":"2023","unstructured":"OpenAI. 2023. ChatGPT: Optimizing Language Models for Dialogue. https:\/\/openai.com\/blog\/chatgpt. Accessed: November 17, 2023."},{"key":"e_1_2_2_27_1","volume-title":"Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18--24","volume":"8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18--24 July 2021, Virtual Event (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 8748--8763. http:\/\/proceedings.mlr.press\/v139\/radford21a.html"},{"key":"e_1_2_2_28_1","volume-title":"DreamBooth3D: Subject-Driven Text-to-3D Generation. ArXiv preprint abs\/2303.13508","author":"Raj Amit","year":"2023","unstructured":"Amit Raj, Srinivas Kaza, Ben Poole, Michael Niemeyer, Nataniel Ruiz, Ben Mildenhall, Shiran Zada, Kfir Aberman, Michael Rubinstein, Jonathan Barron, Yuanzhen Li, and Varun Jampani. 2023. DreamBooth3D: Subject-Driven Text-to-3D Generation. ArXiv preprint abs\/2303.13508 (2023). https:\/\/arxiv.org\/abs\/2303.13508"},{"key":"e_1_2_2_29_1","volume-title":"Hierarchical text-conditional image generation with clip latents. ArXiv preprint abs\/2204.06125","author":"Ramesh Aditya","year":"2022","unstructured":"Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. 2022. Hierarchical text-conditional image generation with clip latents. ArXiv preprint abs\/2204.06125 (2022). https:\/\/arxiv.org\/abs\/2204.06125"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.02155"},{"key":"e_1_2_2_32_1","volume-title":"HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models. ArXiv preprint abs\/2307.06949","author":"Ruiz Nataniel","year":"2023","unstructured":"Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, and Kfir Aberman. 2023b. HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models. ArXiv preprint abs\/2307.06949 (2023). https:\/\/arxiv.org\/abs\/2307.06949"},{"key":"e_1_2_2_33_1","unstructured":"Simo Ryu. 2023. Low-rank Adaptation for Fast Text-to-Image Diffusion Fine-tuning. https:\/\/github.com\/cloneofsimo\/lora."},{"key":"e_1_2_2_34_1","first-page":"36479","article-title":"Photorealistic text-to-image diffusion models with deep language understanding","volume":"35","author":"Saharia Chitwan","year":"2022","unstructured":"Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Salimans, et al. 2022. Photorealistic text-to-image diffusion models with deep language understanding. Advances in Neural Information Processing Systems 35 (2022), 36479--36494.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_35_1","volume-title":"It is all about where you start: Text-to-image generation with seed selection. ArXiv preprint abs\/2304.14530","author":"Samuel Dvir","year":"2023","unstructured":"Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, and Gal Chechik. 2023. It is all about where you start: Text-to-image generation with seed selection. ArXiv preprint abs\/2304.14530 (2023). https:\/\/arxiv.org\/abs\/2304.14530"},{"key":"e_1_2_2_36_1","volume-title":"Proceedings, Part VI 13","author":"Shan Qi","year":"2014","unstructured":"Qi Shan, Brian Curless, Yasutaka Furukawa, Carlos Hernandez, and Steven M Seitz. 2014. Photo uncrop. In Computer Vision-ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6--12, 2014, Proceedings, Part VI 13. Springer, 16--31."},{"key":"e_1_2_2_37_1","volume-title":"Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, et al.","author":"Sohn Kihyuk","year":"2023","unstructured":"Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, et al. 2023. StyleDrop: Text-to-Image Generation in Any Style. ArXiv preprint abs\/2306.00983 (2023). https:\/\/arxiv.org\/abs\/2306.00983"},{"key":"e_1_2_2_38_1","volume-title":"9th International Conference on Learning Representations, ICLR 2021","author":"Song Yang","year":"2021","unstructured":"Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. 2021. Score-Based Generative Modeling through Stochastic Differential Equations. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3--7, 2021. OpenReview.net. https:\/\/openreview.net\/forum?id=PxTIG12RRHS"},{"key":"e_1_2_2_39_1","unstructured":"Stability AI. 2022. Stable-Diffusion-2-Inpainting. https:\/\/huggingface.co\/stabilityai\/stable-diffusion-2-inpainting."},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00881"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV51458.2022.00323"},{"key":"e_1_2_2_42_1","volume-title":"Thirty-seventh Conference on Neural Information Processing Systems. https:\/\/openreview.net\/forum?id=ypOiXjdfnU","author":"Tang Luming","year":"2023","unstructured":"Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, and Bharath Hariharan. 2023. Emergent Correspondence from Image Diffusion. In Thirty-seventh Conference on Neural Information Processing Systems. https:\/\/openreview.net\/forum?id=ypOiXjdfnU"},{"key":"e_1_2_2_43_1","volume-title":"Diffusers: State-of-the-art diffusion models. https:\/\/github.com\/huggingface\/diffusers.","author":"von Platen Patrick","year":"2022","unstructured":"Patrick von Platen, Suraj Patil, Anton Lozhkov, Pedro Cuenca, Nathan Lambert, Kashif Rasul, Mishig Davaadorj, and Thomas Wolf. 2022. Diffusers: State-of-the-art diffusion models. https:\/\/github.com\/huggingface\/diffusers."},{"key":"e_1_2_2_44_1","volume-title":"Extended Textual Conditioning in Text-to-Image Generation. ArXiv preprint abs\/2303.09522","author":"Voynov Andrey","year":"2023","unstructured":"Andrey Voynov, Qinghao Chu, Daniel Cohen-Or, and Kfir Aberman. 2023. P+: Extended Textual Conditioning in Text-to-Image Generation. ArXiv preprint abs\/2303.09522 (2023). https:\/\/arxiv.org\/abs\/2303.09522"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01761"},{"key":"e_1_2_2_46_1","volume-title":"Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, and Mike Zheng Shou.","author":"Wu Jay Zhangjie","year":"2022","unstructured":"Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Stan Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, and Mike Zheng Shou. 2022. Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. ArXiv preprint abs\/2212.11565 (2022). https:\/\/arxiv.org\/abs\/2212.11565"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01763"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.728"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00577"},{"key":"e_1_2_2_50_1","volume-title":"Varun Jampani, Deqing Sun, and Ming-Hsuan Yang.","author":"Zhang Junyi","year":"2023","unstructured":"Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Polania Cabrera, Varun Jampani, Deqing Sun, and Ming-Hsuan Yang. 2023a. A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence. ArXiv preprint abs\/2305.15347 (2023). https:\/\/arxiv.org\/abs\/2305.15347"},{"key":"e_1_2_2_51_1","doi-asserted-by":"crossref","unstructured":"Lvmin Zhang Anyi Rao and Maneesh Agrawala. 2023b. Adding Conditional Control to Text-to-Image Diffusion Models.","DOI":"10.1109\/ICCV51070.2023.00355"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_2_2_53_1","volume-title":"Unleashing Text-to-Image Diffusion Models for Visual Perception. ICCV","author":"Zhao Wenliang","year":"2023","unstructured":"Wenliang Zhao, Yongming Rao, Zuyan Liu, Benlin Liu, Jie Zhou, and Jiwen Lu. 2023b. Unleashing Text-to-Image Diffusion Models for Visual Perception. ICCV (2023)."},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV56688.2023.00182"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00153"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00230"},{"key":"e_1_2_2_57_1","volume-title":"Designing a Better Asymmetric VQGAN for StableDiffusion. ArXiv preprint abs\/2306.04632","author":"Zhu Zixin","year":"2023","unstructured":"Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, and Gang Hua. 2023. Designing a Better Asymmetric VQGAN for StableDiffusion. ArXiv preprint abs\/2306.04632 (2023). https:\/\/arxiv.org\/abs\/2306.04632"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658237","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3658237","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:04:16Z","timestamp":1750291456000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658237"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,19]]},"references-count":57,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,7,19]]}},"alternative-id":["10.1145\/3658237"],"URL":"https:\/\/doi.org\/10.1145\/3658237","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,19]]},"assertion":[{"value":"2024-07-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}