{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T11:37:20Z","timestamp":1777981040384,"version":"3.51.4"},"reference-count":60,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T00:00:00Z","timestamp":1731974400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2024,12,19]]},"abstract":"<jats:p>Modern 3D content creation heavily relies on procedural assets. In particular, procedural materials are ubiquitous in the industry, but their manipulation remains challenging. Previous work [Hu et al. 2023] conditionally generates procedural graphs that match a given input image. However, the parameter generation step limits how accurately the generated graph matches the input image, due to a reliance on supervision with scarcely available procedural data. We propose to improve parameter prediction accuracy for image-conditioned procedural material generation by leveraging reinforcement learning (RL) and present the first RL approach for procedural materials. RL circumvents the limited availability of procedural data, the domain gap between real and synthetic materials, and the need for end-to-end differentiable loss functions. Given a target image, we retrieve a procedural material and use an RL-trained transformer model to predict a set of parameters that reconstruct the target image as closely as possible. We show that using RL significantly improves parameter prediction to match a given target image compared to supervised methods on both synthetic and real target images.<\/jats:p>","DOI":"10.1145\/3687979","type":"journal-article","created":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T15:46:04Z","timestamp":1732031164000},"page":"1-14","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Procedural Material Generation with Reinforcement Learning"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9271-0055","authenticated-orcid":false,"given":"Beichen","family":"Li","sequence":"first","affiliation":[{"name":"MIT CSAIL, Cambridge, United States of America"},{"name":"Adobe Research, Cambridge, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3674-295X","authenticated-orcid":false,"given":"Yiwei","family":"Hu","sequence":"additional","affiliation":[{"name":"Adobe Research, San Jose, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7568-2849","authenticated-orcid":false,"given":"Paul","family":"Guerrero","sequence":"additional","affiliation":[{"name":"Adobe Research, London, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3808-6092","authenticated-orcid":false,"given":"Milos","family":"Hasan","sequence":"additional","affiliation":[{"name":"Adobe Research, San Jose, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4442-4679","authenticated-orcid":false,"given":"Liang","family":"Shi","sequence":"additional","affiliation":[{"name":"MIT CSAIL, Cambridge, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6219-3747","authenticated-orcid":false,"given":"Valentin","family":"Deschaintre","sequence":"additional","affiliation":[{"name":"Adobe Research, London, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0212-5643","authenticated-orcid":false,"given":"Wojciech","family":"Matusik","sequence":"additional","affiliation":[{"name":"MIT CSAIL, Cambridge, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,11,19]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cag.2023.04.004"},{"key":"e_1_2_1_2_1","volume-title":"Training diffusion models with reinforcement learning. arXiv preprint arXiv:2305.13301","author":"Black Kevin","year":"2023","unstructured":"Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, and Sergey Levine. 2023. Training diffusion models with reinforcement learning. arXiv preprint arXiv:2305.13301 (2023)."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201378"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13765"},{"key":"e_1_2_1_5_1","unstructured":"Shukai Duan Nikos Kanakaris Xiongye Xiao Heng Ping Chenyu Zhou Nesreen K Ahmed Guixiang Ma Mihai Capota Theodore L Willke Shahin Nazarian et al. 2023. Leveraging Reinforcement Learning and Large Language Models for Code Optimization. arXiv preprint arXiv:2312.05657 (2023)."},{"key":"e_1_2_1_6_1","volume-title":"Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models. In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS)","author":"Fan Ying","year":"2023","unstructured":"Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, and Kimin Lee. 2023. Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models. In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS) 2023. Neural Information Processing Systems Foundation."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3306346.3323042"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Leon A. Gatys Alexander S. Ecker and Matthias Bethge. 2015. A Neural Algorithm of Artistic Style. arXiv:1508.06576 [cs.CV]","DOI":"10.1167\/16.12.326"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CoG52621.2021.9619053"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12867"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14061"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530173"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3450626.3459854"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414685.3417779"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00929"},{"key":"e_1_2_1_16_1","article-title":"Generative Modelling of BRDF Textures from Flash Images","volume":"40","author":"Henzler Philipp","year":"2021","unstructured":"Philipp Henzler, Valentin Deschaintre, Niloy J Mitra, and Tobias Ritschel. 2021. Generative Modelling of BRDF Textures from Flash Images. ACM Trans Graph (Proc. SIGGRAPH Asia) 40, 6 (2021).","journal-title":"ACM Trans Graph (Proc. SIGGRAPH Asia)"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356516"},{"key":"e_1_2_1_18_1","volume-title":"Node Graph Optimization Using Differentiable Proxies. In ACM SIGGRAPH 2022 Conference Proceedings","author":"Hu Yiwei","year":"2022","unstructured":"Yiwei Hu, Paul Guerrero, Milos Hasan, Holly Rushmeier, and Valentin Deschaintre. 2022a. Node Graph Optimization Using Differentiable Proxies. In ACM SIGGRAPH 2022 Conference Proceedings (Vancouver, BC, Canada). Article 5, 9 pages."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588432.3591520"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14591"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3502431"},{"key":"e_1_2_1_22_1","first-page":"1","article-title":"Exposure: A white-box photo post-processing framework","volume":"37","author":"Hu Yuanming","year":"2018","unstructured":"Yuanming Hu, Hao He, Chenxi Xu, Baoyuan Wang, and Stephen Lin. 2018. Exposure: A white-box photo post-processing framework. ACM Transactions on Graphics (TOG) 37, 2 (2018), 1--17.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00964"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1609\/aiide.v16i1.7416"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/15922.15904"},{"key":"e_1_2_1_26_1","first-page":"21314","article-title":"Coderl: Mastering code generation through pretrained models and deep reinforcement learning","volume":"35","author":"Le Hung","year":"2022","unstructured":"Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, and Steven Chu Hong Hoi. 2022. Coderl: Mastering code generation through pretrained models and deep reinforcement learning. Advances in Neural Information Processing Systems 35 (2022), 21314--21328.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_27_1","volume-title":"Rlaif: Scaling reinforcement learning from human feedback with ai feedback. arXiv preprint arXiv:2309.00267","author":"Lee Harrison","year":"2023","unstructured":"Harrison Lee, Samrat Phatale, Hassan Mansoor, Kellie Lu, Thomas Mesnard, Colton Bishop, Victor Carbune, and Abhinav Rastogi. 2023b. Rlaif: Scaling reinforcement learning from human feedback with ai feedback. arXiv preprint arXiv:2309.00267 (2023)."},{"key":"e_1_2_1_28_1","volume-title":"Aligning text-to-image models using human feedback. arXiv preprint arXiv:2302.12192","author":"Lee Kimin","year":"2023","unstructured":"Kimin Lee, Hao Liu, Moonkyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, and Shixiang Shane Gu. 2023a. Aligning text-to-image models using human feedback. arXiv preprint arXiv:2302.12192 (2023)."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592132"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3073641"},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","first-page":"051005","DOI":"10.1115\/1.4046293","article-title":"Deep reinforcement learning for procedural content generation of 3d virtual environments","volume":"20","author":"L\u00f3pez Christian E","year":"2020","unstructured":"Christian E L\u00f3pez, James Cunningham, Omar Ashour, and Conrad S Tucker. 2020. Deep reinforcement learning for procedural content generation of 3d virtual environments. Journal of Computing and Information Science in Engineering 20, 5 (2020), 051005.","journal-title":"Journal of Computing and Information Science in Engineering"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14466"},{"key":"e_1_2_1_33_1","volume-title":"Golnoosh Abdollahinejad, and Matin Hashemi.","author":"Mohaghegh Sajad","year":"2023","unstructured":"Sajad Mohaghegh, Mohammad Amin Ramezan Dehnavi, Golnoosh Abdollahinejad, and Matin Hashemi. 2023. PCGPT: Procedural Content Generation via Transformers. arXiv preprint arXiv:2310.02405 (2023)."},{"key":"e_1_2_1_34_1","unstructured":"OpenAI. 2022. ChatGPT: Optimizing Language Models for Dialogue."},{"key":"e_1_2_1_35_1","volume-title":"Oh (Eds.)","volume":"35","author":"Ouyang Long","year":"2022","unstructured":"Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human feedback. In Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (Eds.), Vol. 35. Curran Associates, Inc., 27730--27744. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2022\/file\/b1efde53be364a73914f58805a001731-Paper-Conference.pdf"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/3618408.3619789"},{"key":"e_1_2_1_37_1","volume-title":"International Conference on Machine Learning. PMLR, 8748--8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning. PMLR, 8748--8763."},{"key":"e_1_2_1_38_1","unstructured":"Rajkumar Ramamurthy Prithviraj Ammanabrolu Kiant\u00e9 Brantley Jack Hessel Rafet Sifa Christian Bauckhage Hannaneh Hajishirzi and Yejin Choi. 2022. Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks Baselines and Building Blocks for Natural Language Policy Optimization. arXiv preprint arXiv:2210.01241. https:\/\/arxiv.org\/abs\/2210.01241"},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Axel Sauer Dominik Lorenz Andreas Blattmann and Robin Rombach. 2023. Adversarial Diffusion Distillation. arXiv:2311.17042 [cs.CV]","DOI":"10.1007\/978-3-031-73016-0_6"},{"key":"e_1_2_1_40_1","volume-title":"High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438","author":"Schulman John","year":"2015","unstructured":"John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, and Pieter Abbeel. 2015. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438 (2015)."},{"key":"e_1_2_1_41_1","volume-title":"High-Dimensional Continuous Control Using Generalized Advantage Estimation. In 4th International Conference on Learning Representations, ICLR","author":"Schulman John","year":"2016","unstructured":"John Schulman, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel. 2016. High-Dimensional Continuous Control Using Generalized Advantage Estimation. In 4th International Conference on Learning Representations, ICLR 2016, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1506.02438"},{"key":"e_1_2_1_42_1","volume-title":"Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347","author":"Schulman John","year":"2017","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)."},{"key":"e_1_2_1_43_1","volume-title":"CSGNet: Neural Shape Parser for Constructive Solid Geometry. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Sharma Gopal","year":"2018","unstructured":"Gopal Sharma, Rishabh Goyal, Difan Liu, Evangelos Kalogerakis, and Subhransu Maji. 2018. CSGNet: Neural Shape Parser for Constructive Solid Geometry. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414685.3417781"},{"key":"e_1_2_1_45_1","volume-title":"Execution-based code generation using deep reinforcement learning. arXiv preprint arXiv:2301.13816","author":"Shojaee Parshin","year":"2023","unstructured":"Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, and Chandan K Reddy. 2023. Execution-based code generation using deep reinforcement learning. arXiv preprint arXiv:2301.13816 (2023)."},{"key":"e_1_2_1_46_1","unstructured":"David Silver Thomas Hubert Julian Schrittwieser Ioannis Antonoglou Matthew Lai Arthur Guez Marc Lanctot Laurent Sifre Dharshan Kumaran Thore Graepel et al. 2017. Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:1712.01815 (2017)."},{"key":"e_1_2_1_47_1","unstructured":"Hugo Touvron Louis Martin Kevin Stone Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra Prajjwal Bhargava Shruti Bhosale et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)."},{"key":"e_1_2_1_48_1","volume-title":"\u0141 ukasz Kaiser, and Illia Polosukhin","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141 ukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), Vol. 30. Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper\/2017\/file\/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf"},{"key":"e_1_2_1_49_1","volume-title":"ControlMat: Controlled Generative Approach to Material Capture. arXiv preprint arXiv:2309.01700","author":"Vecchio Giuseppe","year":"2023","unstructured":"Giuseppe Vecchio, Rosalie Martin, Arthur Roullier, Adrien Kaiser, Romain Rouffet, Valentin Deschaintre, and Tamy Boubekeur. 2023a. ControlMat: Controlled Generative Approach to Material Capture. arXiv preprint arXiv:2309.01700 (2023)."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01260"},{"key":"e_1_2_1_51_1","volume-title":"MatFuse: Controllable Material Generation with Diffusion Models. arXiv preprint arXiv:2308.11408","author":"Vecchio Giuseppe","year":"2023","unstructured":"Giuseppe Vecchio, Renato Sortino, Simone Palazzo, and Concetto Spampinato. 2023b. MatFuse: Controllable Material Generation with Diffusion Models. arXiv preprint arXiv:2308.11408 (2023)."},{"key":"e_1_2_1_52_1","volume-title":"Recursively summarizing books with human feedback. arXiv preprint arXiv:2109.10862","author":"Wu Jeff","year":"2021","unstructured":"Jeff Wu, Long Ouyang, Daniel M Ziegler, Nisan Stiennon, Ryan Lowe, Jan Leike, and Paul Christiano. 2021. Recursively summarizing books with human feedback. arXiv preprint arXiv:2109.10862 (2021)."},{"key":"e_1_2_1_53_1","volume-title":"Kaufman","author":"Xie Desai","year":"2023","unstructured":"Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, S\u00f6ren Pirk, and Arie E. Kaufman. 2023. Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning. arXiv:2312.13980 [cs.CV]"},{"key":"e_1_2_1_54_1","volume-title":"ACM SIGGRAPH Asia 2023 Conference Proceedings.","author":"Yan K.","unstructured":"K. Yan, F. Luan, M. Ha\u0161an, T. Groueix, V. Deschaintre, and S. Zhao. 2023. PSDR-Room: Single Photo to Scene using Differentiable Rendering. In ACM SIGGRAPH Asia 2023 Conference Proceedings."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14387"},{"key":"e_1_2_1_56_1","unstructured":"Tianwei Yin Micha\u00ebl Gharbi Richard Zhang Eli Shechtman Fr\u00e9do Durand William T Freeman and Taesung Park. 2024. One-step Diffusion with Distribution Matching Distillation. In CVPR."},{"key":"e_1_2_1_57_1","doi-asserted-by":"crossref","unstructured":"Richard Zhang Phillip Isola Alexei A Efros Eli Shechtman and Oliver Wang. 2018. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. In CVPR.","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2206.05649"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588432.3591535"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.142635"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687979","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3687979","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:09:58Z","timestamp":1750295398000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687979"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,19]]},"references-count":60,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12,19]]}},"alternative-id":["10.1145\/3687979"],"URL":"https:\/\/doi.org\/10.1145\/3687979","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,19]]},"assertion":[{"value":"2024-11-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}