{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,17]],"date-time":"2026-07-17T06:10:00Z","timestamp":1784268600203,"version":"3.55.0"},"publisher-location":"New York, NY, USA","reference-count":47,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,7,13]],"date-time":"2024-07-13T00:00:00Z","timestamp":1720828800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,7,13]]},"DOI":"10.1145\/3641519.3657445","type":"proceedings-article","created":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T10:39:28Z","timestamp":1720780768000},"page":"1-11","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":64,"title":["RGB\u2194X: Image decomposition and synthesis using material- and lighting-aware diffusion models"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9025-9427","authenticated-orcid":false,"given":"Zheng","family":"Zeng","sequence":"first","affiliation":[{"name":"University of California, Santa Barbara, United States of America"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6219-3747","authenticated-orcid":false,"given":"Valentin","family":"Deschaintre","sequence":"additional","affiliation":[{"name":"Adobe Research, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9655-2138","authenticated-orcid":false,"given":"Iliyan","family":"Georgiev","sequence":"additional","affiliation":[{"name":"Adobe Research, United 
Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1060-6941","authenticated-orcid":false,"given":"Yannick","family":"Hold-Geoffroy","sequence":"additional","affiliation":[{"name":"Adobe Research, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3674-295X","authenticated-orcid":false,"given":"Yiwei","family":"Hu","sequence":"additional","affiliation":[{"name":"Adobe Research, United States of America"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5926-6266","authenticated-orcid":false,"given":"Fujun","family":"Luan","sequence":"additional","affiliation":[{"name":"Adobe Research, United States of America"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9379-094X","authenticated-orcid":false,"given":"Ling-Qi","family":"Yan","sequence":"additional","affiliation":[{"name":"University of California, Santa Barbara, United States of America"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3808-6092","authenticated-orcid":false,"given":"Milo\u0161","family":"Ha\u0161an","sequence":"additional","affiliation":[{"name":"Adobe Research, United States of America"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,7,13]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Recovering intrinsic scene characteristics. Comput. vis. syst 2, 3-26","author":"Barrow Harry","year":"1978","unstructured":"Harry Barrow, J Tenenbaum, A Hanson, and E Riseman. 1978. Recovering intrinsic scene characteristics. Comput. vis. syst 2, 3-26 (1978), 2."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601206"},{"key":"e_1_3_2_2_3_1","volume-title":"Stylegan knows normal, depth, albedo, and more. 
Advances in Neural Information Processing Systems 36","author":"Bhattad Anand","year":"2024","unstructured":"Anand Bhattad, Daniel McKee, Derek Hoiem, and David Forsyth. 2024. Stylegan knows normal, depth, albedo, and more. Advances in Neural Information Processing Systems 36 (2024)."},{"key":"e_1_3_2_2_4_1","volume-title":"MiDaS v3.1 \u2013 A Model Zoo for Robust Monocular Relative Depth Estimation. arXiv preprint arXiv:2307.14460","author":"Birkl Reiner","year":"2023","unstructured":"Reiner Birkl, Diana Wofk, and Matthias M\u00fcller. 2023. MiDaS v3.1 \u2013 A Model Zoo for Robust Monocular Relative Depth Estimation. arXiv preprint arXiv:2307.14460 (2023)."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"crossref","unstructured":"Tim Brooks, Aleksander Holynski, and Alexei\u00a0A. Efros. 2023. InstructPix2Pix: Learning to Follow Image Editing Instructions. In CVPR.","DOI":"10.1109\/CVPR52729.2023.01764"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"crossref","unstructured":"Chris Careaga and Ya\u011f\u0131z Aksoy. 2023. Intrinsic Image Decomposition via Ordinal Shading. ACM Trans. Graph. (2023).","DOI":"10.1145\/3630750"},{"key":"e_1_3_2_2_7_1","volume-title":"Generative Models: What do they know? Do they know things? Let\u2019s find out! arXiv preprint arXiv:2311.17137","author":"Du Xiaodan","year":"2023","unstructured":"Xiaodan Du, Nicholas Kolkin, Greg Shakhnarovich, and Anand Bhattad. 2023. Generative Models: What do they know? Do they know things? Let\u2019s find out! arXiv preprint arXiv:2311.17137 (2023)."},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-021-01563-8"},{"key":"e_1_3_2_2_9_1","volume-title":"Generative adversarial nets. Advances in neural information processing systems 27","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. 
Advances in neural information processing systems 27 (2014)."},{"key":"e_1_3_2_2_10_1","volume-title":"OutCast: Single Image Relighting with Cast Shadows. Computer Graphics Forum 43","author":"Griffiths David","year":"2022","unstructured":"David Griffiths, Tobias Ritschel, and Julien Philip. 2022. OutCast: Single Image Relighting with Cast Shadows. Computer Graphics Forum 43 (2022)."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2009.5459428"},{"key":"e_1_3_2_2_12_1","volume-title":"A review on generative adversarial networks: Algorithms, theory, and applications","author":"Gui Jie","year":"2021","unstructured":"Jie Gui, Zhenan Sun, Yonggang Wen, Dacheng Tao, and Jieping Ye. 2021. A review on generative adversarial networks: Algorithms, theory, and applications. IEEE transactions on knowledge and data engineering 35, 4 (2021), 3313\u20133332."},{"key":"e_1_3_2_2_13_1","volume-title":"Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598","author":"Ho Jonathan","year":"2022","unstructured":"Jonathan Ho and Tim Salimans. 2022. Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598 (2022)."},{"key":"e_1_3_2_2_14_1","volume-title":"Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685","author":"Hu J","year":"2021","unstructured":"Edward\u00a0J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685 (2021)."},{"key":"e_1_3_2_2_15_1","volume-title":"Composer: Creative and controllable image synthesis with composable conditions. arXiv preprint arXiv:2302.09778","author":"Huang Lianghua","year":"2023","unstructured":"Lianghua Huang, Di Chen, Yu Liu, Yujun Shen, Deli Zhao, and Jingren Zhou. 2023. Composer: Creative and controllable image synthesis with composable conditions. 
arXiv preprint arXiv:2302.09778 (2023)."},{"key":"e_1_3_2_2_16_1","unstructured":"Jay-Artist. 2012. Country-Kitchen Cycles. https:\/\/blendswap.com\/blend\/5156"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00453"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00813"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"crossref","unstructured":"Peter Kocsis, Vincent Sitzmann, and Matthias Nie\u00dfner. 2023. Intrinsic Image Diffusion for Single-view Material Estimation. In arxiv.","DOI":"10.1109\/CVPR52733.2024.00497"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1364\/JOSA.61.000001"},{"key":"e_1_3_2_2_21_1","volume-title":"Exploiting Diffusion Prior for Generalizable Pixel-Level Semantic Prediction. arXiv preprint arXiv:2311.18832","author":"Lee Hsin-Ying","year":"2023","unstructured":"Hsin-Ying Lee, Hung-Yu Tseng, and Ming-Hsuan Yang. 2023. Exploiting Diffusion Prior for Generalizable Pixel-Level Semantic Prediction. arXiv preprint arXiv:2311.18832 (2023)."},{"key":"e_1_3_2_2_22_1","unstructured":"Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. 2023. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. arxiv:2301.12597\u00a0[cs.CV]"},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00255"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-20068-7_32"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00711"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13225"},{"key":"e_1_3_2_2_27_1","unstructured":"NVIDIA. 2020. NVIDIA OptiX\u2122 AI-Accelerated Denoiser. 
https:\/\/developer.nvidia.com\/optix-denoiser"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2905015"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3450626.3459872"},{"key":"e_1_3_2_2_30_1","volume-title":"Physically Based Rendering: From Theory to Implementation","author":"Pharr Matt","unstructured":"Matt Pharr and Greg Humphreys. 2004. Physically Based Rendering: From Theory to Implementation. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA."},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206537"},{"key":"e_1_3_2_2_32_1","volume-title":"International conference on machine learning. PMLR, 8748\u20138763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong\u00a0Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748\u20138763."},{"key":"e_1_3_2_2_33_1","unstructured":"Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. 2022. Hierarchical Text-Conditional Image Generation with CLIP Latents. arxiv:2204.06125\u00a0[cs.CV]"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2020.3019967"},{"key":"e_1_3_2_2_35_1","volume-title":"Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding. In International Conference on Computer Vision (ICCV)","author":"Roberts Mike","year":"2021","unstructured":"Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel\u00a0Angel Bautista, Nathan Paczan, Russ Webb, and Joshua\u00a0M. Susskind. 2021. Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding. 
In International Conference on Computer Vision (ICCV) 2021."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19787-1_35"},{"key":"e_1_3_2_2_38_1","volume-title":"Progressive distillation for fast sampling of diffusion models. arXiv preprint arXiv:2202.00512","author":"Salimans Tim","year":"2022","unstructured":"Tim Salimans and Jonathan Ho. 2022. Progressive distillation for fast sampling of diffusion models. arXiv preprint arXiv:2202.00512 (2022)."},{"key":"e_1_3_2_2_39_1","volume-title":"Alchemist: Parametric Control of Material Properties with Diffusion Models. arxiv:2312.02970\u00a0[cs.CV]","author":"Sharma Prafull","year":"2023","unstructured":"Prafull Sharma, Varun Jampani, Yuanzhen Li, Xuhui Jia, Dmitry Lagun, Fredo Durand, William\u00a0T. Freeman, and Mark Matthews. 2023. Alchemist: Parametric Control of Material Properties with Diffusion Models. arxiv:2312.02970\u00a0[cs.CV]"},{"key":"e_1_3_2_2_40_1","volume-title":"Deep Illumination: Approximating Dynamic Global Illumination with Generative Adversarial Network. arxiv:1710.09834\u00a0[cs.GR]","author":"Thomas Manu\u00a0Mathew","year":"2018","unstructured":"Manu\u00a0Mathew Thomas and Angus\u00a0G. Forbes. 2018. Deep Illumination: Approximating Dynamic Global Illumination with Generative Adversarial Network. arxiv:1710.09834\u00a0[cs.GR]"},{"key":"e_1_3_2_2_41_1","unstructured":"Bruce Walter, Stephen\u00a0R. Marschner, Hongsong Li, and Kenneth\u00a0E. Torrance. 2007. Microfacet models for refraction through rough surfaces (EGSR\u201907). 195\u2013206."},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-022-0274-8"},{"key":"e_1_3_2_2_43_1","volume-title":"Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes. 
In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Wang Zian","year":"2023","unstructured":"Zian Wang, Tianchang Shen, Jun Gao, Shengyu Huang, Jacob Munkberg, Jon Hasselgren, Zan Gojcic, Wenzheng Chen, and Sanja Fidler. 2023. Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_2_44_1","volume-title":"Proceedings, Part XXII 16","author":"Yu Ye","year":"2020","unstructured":"Ye Yu, Abhimitra Meka, Mohamed Elgharib, Hans-Peter Seidel, Christian Theobalt, and William\u00a0AP Smith. 2020. Self-supervised outdoor scene relighting. In Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXII 16. Springer, 84\u2013101."},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"crossref","unstructured":"Lvmin Zhang, Anyi Rao, and Maneesh Agrawala. 2023. Adding Conditional Control to Text-to-Image Diffusion Models. 
arxiv:2302.05543\u00a0[cs.CV]","DOI":"10.1109\/ICCV51070.2023.00355"},{"key":"e_1_3_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3550469.3555407"},{"key":"e_1_3_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00284"}],"event":{"name":"SIGGRAPH '24: Special Interest Group on Computer Graphics and Interactive Techniques Conference","location":"Denver CO USA","acronym":"SIGGRAPH '24","sponsor":["SIGGRAPH ACM Special Interest Group on Computer Graphics and Interactive Techniques"]},"container-title":["Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3641519.3657445","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3641519.3657445","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:09:36Z","timestamp":1750295376000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3641519.3657445"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,13]]},"references-count":47,"alternative-id":["10.1145\/3641519.3657445","10.1145\/3641519"],"URL":"https:\/\/doi.org\/10.1145\/3641519.3657445","relation":{},"subject":[],"published":{"date-parts":[[2024,7,13]]},"assertion":[{"value":"2024-07-13","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}