{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,24]],"date-time":"2025-11-24T21:47:43Z","timestamp":1764020863306,"version":"3.41.0"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2025,5,22]],"date-time":"2025-05-22T00:00:00Z","timestamp":1747872000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Comput. Graph. Interact. Tech."],"published-print":{"date-parts":[[2025,5,22]]},"abstract":"<jats:p>In recent years, significant advancements have been made in text-driven 3D content generation. However, several challenges remain. In practical applications, users often provide extremely simple text inputs while expecting high-quality 3D content. Generating optimal results from such minimal text is a difficult task due to the strong dependency of text-to-3D models on the quality of input prompts. Moreover, the generation process exhibits high variability, making it difficult to control. Consequently, multiple iterations are typically required to produce content that meets user expectations, reducing generation efficiency. To address this issue, we propose GPT-4V for self-optimization, which significantly enhances the efficiency of generating satisfactory content in a single attempt. Furthermore, the controllability of text-to-3D generation methods has not been fully explored. Our approach enables users to not only provide textual descriptions but also specify additional conditions, such as style, edges, scribbles, poses, or combinations of multiple conditions, allowing for more precise control over the generated 3D content. Additionally, during training, we effectively integrate multi-view information, including multi-view depth, masks, features, and images, to address the common Janus problem in 3D content generation. Extensive experiments demonstrate that our method achieves robust generalization, facilitating the efficient and controllable generation of high-quality 3D content.<\/jats:p>","DOI":"10.1145\/3728305","type":"journal-article","created":{"date-parts":[[2025,5,23]],"date-time":"2025-05-23T03:52:27Z","timestamp":1747972347000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["ESCT3D: Efficient and Selectively Controllable Text-Driven 3D Content Generation with Gaussian Splatting"],"prefix":"10.1145","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-1736-0434","authenticated-orcid":false,"given":"Huiqi","family":"Wu","sequence":"first","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2930-8407","authenticated-orcid":false,"given":"Li","family":"Yao","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4229-9795","authenticated-orcid":false,"given":"Jianbo","family":"Mei","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-3785-593X","authenticated-orcid":false,"given":"Yingjie","family":"Huang","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-2472-3460","authenticated-orcid":false,"given":"Yining","family":"Xu","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-5094-9942","authenticated-orcid":false,"given":"Jingjiao","family":"You","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-6927-8134","authenticated-orcid":false,"given":"Yilong","family":"Liu","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,5,22]]},"reference":[{"key":"e_1_3_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00423"},{"key":"e_1_3_2_3_1","unstructured":"Open AI. 2023. ChatGPT can now see hear and speak."},{"key":"e_1_3_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592450"},{"key":"e_1_3_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00580"},{"key":"e_1_3_2_6_1","article-title":"Blender\u2014A 3D modelling and rendering package","volume":"4","author":"Blender OC","year":"2018","unstructured":"OC Blender. 2018. Blender\u2014A 3D modelling and rendering package. Retrieved. represents the sequence of Constructs1 to 4 (2018).","journal-title":"Retrieved. represents the sequence of Constructs1 to"},{"key":"e_1_3_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3586183.3606725"},{"key":"e_1_3_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.02033"},{"key":"e_1_3_2_9_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v38i2.27886"},{"key":"e_1_3_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.02022"},{"key":"e_1_3_2_11_1","article-title":"Luciddreamer: Domain-free generation of 3d gaussian splatting scenes","author":"Chung Jaeyoung","year":"2023","unstructured":"Jaeyoung Chung, Suyoung Lee, Hyeongjin Nam, Jaerin Lee, and Kyoung\u00a0Mu Lee. 2023. Luciddreamer: Domain-free generation of 3d gaussian splatting scenes. arXiv preprint arXiv: 2311.13384 (2023).","journal-title":"arXiv preprint arXiv: 2311.13384"},{"key":"e_1_3_2_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00011"},{"key":"e_1_3_2_13_1","article-title":"Depth map prediction from a single image using a multi-scale deep network","volume":"27","author":"Eigen David","year":"2014","unstructured":"David Eigen, Christian Puhrsch, and Rob Fergus. 2014. Depth map prediction from a single image using a multi-scale deep network. Advances in neural information processing systems 27 (2014).","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_2_14_1","article-title":"Extended Bayesian information criteria for Gaussian graphical models","volume":"23","author":"Foygel Rina","year":"2010","unstructured":"Rina Foygel and Mathias Drton. 2010. Extended Bayesian information criteria for Gaussian graphical models. Advances in neural information processing systems 23 (2010).","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_2_15_1","first-page":"31841","article-title":"Get3d: A generative model of high quality 3d textured shapes learned from images","volume":"35","author":"Gao Jun","year":"2022","unstructured":"Jun Gao, Tianchang Shen, Zian Wang, Wenzheng Chen, Kangxue Yin, Daiqing Li, Or Litany, Zan Gojcic, and Sanja Fidler. 2022. Get3d: A generative model of high quality 3d textured shapes learned from images. Advances In Neural Information Processing Systems 35 (2022), 31841\u201331854.","journal-title":"Advances In Neural Information Processing Systems"},{"key":"e_1_3_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00725"},{"key":"e_1_3_2_17_1","article-title":"Generative adversarial nets","volume":"27","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. Advances in neural information processing systems 27 (2014).","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_2_18_1","article-title":"3dgen: Triplane latent diffusion for textured mesh generation","author":"Gupta Anchit","year":"2023","unstructured":"Anchit Gupta, Wenhan Xiong, Yixin Nie, Ian Jones, and Barlas O\u011fuz. 2023. 3dgen: Triplane latent diffusion for textured mesh generation. arXiv preprint arXiv: 2303.05371 (2023).","journal-title":"arXiv preprint arXiv: 2303.05371"},{"key":"e_1_3_2_19_1","first-page":"66923","article-title":"Optimizing prompts for text-to-image generation","volume":"36","author":"Hao Yaru","year":"2023","unstructured":"Yaru Hao, Zewen Chi, Li Dong, and Furu Wei. 2023. Optimizing prompts for text-to-image generation. Advances in Neural Information Processing Systems 36 (2023), 66923\u201366939.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19812-0_8"},{"key":"e_1_3_2_21_1","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho Jonathan","year":"2020","unstructured":"Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in neural information processing systems 33 (2020), 6840\u20136851.","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_2_22_1","article-title":"Classifier-free diffusion guidance","author":"Ho Jonathan","year":"2022","unstructured":"Jonathan Ho and Tim Salimans. 2022. Classifier-free diffusion guidance. arXiv preprint arXiv: 2207.12598 (2022).","journal-title":"arXiv preprint arXiv: 2207.12598"},{"key":"e_1_3_2_23_1","article-title":"Dreamtime: An improved optimization strategy for text-to-3d content creation","author":"Huang Yukun","year":"2023","unstructured":"Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, and Lei Zhang. 2023. Dreamtime: An improved optimization strategy for text-to-3d content creation. arXiv preprint arXiv: 2306.12422 (2023).","journal-title":"arXiv preprint arXiv: 2306.12422"},{"key":"e_1_3_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00094"},{"key":"e_1_3_2_25_1","article-title":"Shap-e: Generating conditional 3d implicit functions","author":"Jun Heewoo","year":"2023","unstructured":"Heewoo Jun and Alex Nichol. 2023. Shap-e: Generating conditional 3d implicit functions. arXiv preprint arXiv: 2305.02463 (2023).","journal-title":"arXiv preprint arXiv: 2305.02463"},{"key":"e_1_3_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592433"},{"key":"e_1_3_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3550454.3555497"},{"key":"e_1_3_2_28_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14339"},{"key":"e_1_3_2_29_1","article-title":"Sweetdreamer: Aligning geometric priors in 2d diffusion for consistent text-to-3d","author":"Li Weiyu","year":"2023","unstructured":"Weiyu Li, Rui Chen, Xuelin Chen, and Ping Tan. 2023. Sweetdreamer: Aligning geometric priors in 2d diffusion for consistent text-to-3d. arXiv preprint arXiv: 2310.02596 (2023).","journal-title":"arXiv preprint arXiv: 2310.02596"},{"key":"e_1_3_2_30_1","article-title":"Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting","author":"Li Zhiqi","year":"2024","unstructured":"Zhiqi Li, Yiming Chen, Lingzhe Zhao, and Peidong Liu. 2024. Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting. arXiv preprint arXiv: 2403.09981 (2024).","journal-title":"arXiv preprint arXiv: 2403.09981"},{"key":"e_1_3_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00037"},{"key":"e_1_3_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00853"},{"key":"e_1_3_2_33_1","article-title":"Meshdiffusion: Score-based generative 3d mesh modeling","author":"Liu Zhen","year":"2023","unstructured":"Zhen Liu, Yao Feng, Michael\u00a0J Black, Derek Nowrouzezahrai, Liam Paull, and Weiyang Liu. 2023a.Meshdiffusion: Score-based generative 3d mesh modeling. arXiv preprint arXiv: 2303.08133 (2023).","journal-title":"arXiv preprint arXiv: 2303.08133"},{"key":"e_1_3_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.01645"},{"key":"e_1_3_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01218"},{"key":"e_1_3_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01313"},{"key":"e_1_3_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503250"},{"key":"e_1_3_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3550469.3555392"},{"key":"e_1_3_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530127"},{"key":"e_1_3_2_40_1","article-title":"Point-e: A system for generating 3d point clouds from complex prompts","author":"Nichol Alex","year":"2022","unstructured":"Alex Nichol, Heewoo Jun, Prafulla Dhariwal, Pamela Mishkin, and Mark Chen. 2022. Point-e: A system for generating 3d point clouds from complex prompts. arXiv preprint arXiv: 2212.08751 (2022).","journal-title":"arXiv preprint arXiv: 2212.08751"},{"key":"e_1_3_2_41_1","article-title":"4V (ision) system card","author":"OpenAI GPT","year":"2023","unstructured":"GPT OpenAI. 2023a.4V (ision) system card. preprint (2023).","journal-title":"preprint"},{"issue":"5","key":"e_1_3_2_42_1","article-title":"Gpt-4 technical report. arxiv 2303.08774","volume":"2","author":"OpenAI R","year":"2023","unstructured":"R OpenAI. 2023b.Gpt-4 technical report. arxiv 2303.08774. View in Article 2, 5 (2023).","journal-title":"View in Article"},{"key":"e_1_3_2_43_1","article-title":"Dinov2: Learning robust visual features without supervision","author":"Oquab Maxime","year":"2023","unstructured":"Maxime Oquab, Timoth\u00e9e Darcet, Th\u00e9o Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et\u00a0al. 2023. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv: 2304.07193 (2023).","journal-title":"arXiv preprint arXiv: 2304.07193"},{"key":"e_1_3_2_44_1","article-title":"ED-NeRF: Efficient Text-Guided Editing of 3D Scene using Latent Space NeRF","author":"Park Jangho","year":"2023","unstructured":"Jangho Park, Gihyun Kwon, and Jong\u00a0Chul Ye. 2023. ED-NeRF: Efficient Text-Guided Editing of 3D Scene using Latent Space NeRF. arXiv preprint arXiv: 2310.02712 (2023).","journal-title":"arXiv preprint arXiv: 2310.02712"},{"key":"e_1_3_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00025"},{"key":"e_1_3_2_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01059"},{"key":"e_1_3_2_47_1","article-title":"Dreamfusion: Text-to-3d using 2d diffusion","author":"Poole Ben","year":"2022","unstructured":"Ben Poole, Ajay Jain, Jonathan\u00a0T Barron, and Ben Mildenhall. 2022. Dreamfusion: Text-to-3d using 2d diffusion. arXiv preprint arXiv: 2209.14988 (2022).","journal-title":"arXiv preprint arXiv: 2209.14988"},{"key":"e_1_3_2_48_1","article-title":"Magic123: One image to high-quality 3d object generation using both 2d and 3d diffusion priors","author":"Qian Guocheng","year":"2023","unstructured":"Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, et\u00a0al. 2023. Magic123: One image to high-quality 3d object generation using both 2d and 3d diffusion priors. arXiv preprint arXiv: 2306.17843 (2023).","journal-title":"arXiv preprint arXiv: 2306.17843"},{"key":"e_1_3_2_49_1","first-page":"8748","volume-title":"International conference on machine learning","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong\u00a0Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et\u00a0al. 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748\u20138763."},{"key":"e_1_3_2_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00223"},{"key":"e_1_3_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588432.3591503"},{"key":"e_1_3_2_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_3_2_53_1","article-title":"G Schindler K Deep object co-segmentation","volume":"2019","author":"Rother Li\u00a0W Hosseini Jafari\u00a0O","year":"2018","unstructured":"Li\u00a0W Hosseini Jafari\u00a0O Rother and C\u00a0Jawahar CV Li\u00a0H Mori. 2018. G Schindler K Deep object co-segmentation. Computer Vision\u2013ACCV 2019 (2018).","journal-title":"Computer Vision\u2013ACCV"},{"key":"e_1_3_2_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.02155"},{"key":"e_1_3_2_55_1","first-page":"36479","article-title":"Photorealistic text-to-image diffusion models with deep language understanding","volume":"35","author":"Saharia Chitwan","year":"2022","unstructured":"Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily\u00a0L Denton, Kamyar Ghasemipour, Raphael Gontijo\u00a0Lopes, Burcu Karagol\u00a0Ayan, Tim Salimans, et\u00a0al. 2022. Photorealistic text-to-image diffusion models with deep language understanding. Advances in neural information processing systems 35 (2022), 36479\u201336494.","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01805"},{"key":"e_1_3_2_57_1","article-title":"Let 2d diffusion model know 3d-consistency for robust text-to-3d generation","author":"Seo Junyoung","year":"2023","unstructured":"Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Hyeonsu Kim, Jaehoon Ko, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, and Seungryong Kim. 2023. Let 2d diffusion model know 3d-consistency for robust text-to-3d generation. arXiv preprint arXiv: 2303.07937 (2023).","journal-title":"arXiv preprint arXiv: 2303.07937"},{"key":"e_1_3_2_58_1","first-page":"6087","article-title":"Deep marching tetrahedra: a hybrid representation for high-resolution 3d shape synthesis","volume":"34","author":"Shen Tianchang","year":"2021","unstructured":"Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, and Sanja Fidler. 2021. Deep marching tetrahedra: a hybrid representation for high-resolution 3d shape synthesis. Advances in Neural Information Processing Systems 34 (2021), 6087\u20136101.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_59_1","article-title":"Mvdream: Multi-view diffusion for 3d generation","author":"Shi Yichun","year":"2023","unstructured":"Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, and Xiao Yang. 2023. Mvdream: Multi-view diffusion for 3d generation. arXiv preprint arXiv: 2308.16512 (2023).","journal-title":"arXiv preprint arXiv: 2308.16512"},{"key":"e_1_3_2_60_1","first-page":"7462","article-title":"Implicit neural representations with periodic activation functions","volume":"33","author":"Sitzmann Vincent","year":"2020","unstructured":"Vincent Sitzmann, Julien Martel, Alexander Bergman, David Lindell, and Gordon Wetzstein. 2020. Implicit neural representations with periodic activation functions. Advances in neural information processing systems 33 (2020), 7462\u20137473.","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_2_61_1","article-title":"Roomdreamer: Text-driven 3d indoor scene synthesis with coherent geometry and texture","author":"Song Liangchen","year":"2023","unstructured":"Liangchen Song, Liangliang Cao, Hongyu Xu, Kai Kang, Feng Tang, Junsong Yuan, and Yang Zhao. 2023. Roomdreamer: Text-driven 3d indoor scene synthesis with coherent geometry and texture. arXiv preprint arXiv: 2305.11337 (2023).","journal-title":"arXiv preprint arXiv: 2305.11337"},{"key":"e_1_3_2_62_1","article-title":"Dreamcraft3d: Hierarchical 3d generation with bootstrapped diffusion prior","author":"Sun Jingxiang","year":"2023","unstructured":"Jingxiang Sun, Bo Zhang, Ruizhi Shao, Lizhen Wang, Wen Liu, Zhenda Xie, and Yebin Liu. 2023. Dreamcraft3d: Hierarchical 3d generation with bootstrapped diffusion prior. arXiv preprint arXiv: 2310.16818 (2023).","journal-title":"arXiv preprint arXiv: 2310.16818"},{"key":"e_1_3_2_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00807"},{"key":"e_1_3_2_64_1","article-title":"Dreamgaussian: Generative gaussian splatting for efficient 3d content creation","author":"Tang Jiaxiang","year":"2023","unstructured":"Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, and Gang Zeng. 2023a.Dreamgaussian: Generative gaussian splatting for efficient 3d content creation. arXiv preprint arXiv: 2309.16653 (2023).","journal-title":"arXiv preprint arXiv: 2309.16653"},{"key":"e_1_3_2_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.02086"},{"key":"e_1_3_2_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.160"},{"key":"e_1_3_2_67_1","first-page":"10021","article-title":"Lion: Latent point diffusion models for 3d shape generation","volume":"35","author":"Vahdat Arash","year":"2022","unstructured":"Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis, et\u00a0al. 2022. Lion: Latent point diffusion models for 3d shape generation. Advances in Neural Information Processing Systems 35 (2022), 10021\u201310039.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00381"},{"key":"e_1_3_2_69_1","article-title":"Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction","author":"Wang Peng","year":"2021","unstructured":"Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. 2021. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv: 2106.10689 (2021).","journal-title":"arXiv preprint arXiv: 2106.10689"},{"key":"e_1_3_2_70_1","article-title":"Imagedream: Image-prompt multi-view diffusion for 3d generation","author":"Wang Peng","year":"2023","unstructured":"Peng Wang and Yichun Shi. 2023. Imagedream: Image-prompt multi-view diffusion for 3d generation. arXiv preprint arXiv: 2312.02201 (2023).","journal-title":"arXiv preprint arXiv: 2312.02201"},{"key":"e_1_3_2_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2003.819861"},{"key":"e_1_3_2_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/3613904.3642803"},{"key":"e_1_3_2_73_1","article-title":"Prolificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation","volume":"36","author":"Wang Zhengyi","year":"2024","unstructured":"Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, and Jun Zhu. 2024b.Prolificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation. Advances in Neural Information Processing Systems 36 (2024).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV57701.2024.00317"},{"key":"e_1_3_2_75_1","article-title":"Idea2img: Iterative self-refinement with gpt-4v (ision) for automatic image design and generation","author":"Yang Zhengyuan","year":"2023","unstructured":"Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zicheng Liu, and Lijuan Wang. 2023. Idea2img: Iterative self-refinement with gpt-4v (ision) for automatic image design and generation. arXiv preprint arXiv: 2310.08541 (2023).","journal-title":"arXiv preprint arXiv: 2310.08541"},{"key":"e_1_3_2_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01222"},{"key":"e_1_3_2_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00649"},{"key":"e_1_3_2_78_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00416"},{"key":"e_1_3_2_79_1","article-title":"A large-scale study of representation learning with the visual task adaptation benchmark","author":"Zhai Xiaohua","year":"2019","unstructured":"Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre\u00a0Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, et\u00a0al. 2019. A large-scale study of representation learning with the visual task adaptation benchmark. arXiv preprint arXiv: 1910.04867 (2019).","journal-title":"arXiv preprint arXiv: 1910.04867"},{"key":"e_1_3_2_80_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"}],"container-title":["Proceedings of the ACM on Computer Graphics and Interactive Techniques"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3728305","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3728305","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:35Z","timestamp":1750295915000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3728305"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,22]]},"references-count":79,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,5,22]]}},"alternative-id":["10.1145\/3728305"],"URL":"https:\/\/doi.org\/10.1145\/3728305","relation":{},"ISSN":["2577-6193"],"issn-type":[{"type":"electronic","value":"2577-6193"}],"subject":[],"published":{"date-parts":[[2025,5,22]]},"assertion":[{"value":"2025-05-22","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}