{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T17:21:28Z","timestamp":1777569688747,"version":"3.51.4"},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T00:00:00Z","timestamp":1721347200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Key R&D Program of Zhejiang","award":["2023C01047"],"award-info":[{"award-number":["2023C01047"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62036010"],"award-info":[{"award-number":["62036010"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006469","name":"FDCT","doi-asserted-by":"crossref","award":["0002\/2023\/AKP"],"award-info":[{"award-number":["0002\/2023\/AKP"]}],"id":[{"id":"10.13039\/501100006469","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2024,7,19]]},"abstract":"<jats:p>\n            Existing neural rendering-based text-to-3D-portrait generation methods typically make use of human geometry prior and diffusion models to obtain guidance. However, relying solely on geometry information introduces issues such as the Janus problem, over-saturation, and over-smoothing. We present\n            <jats:italic>Portrait3D<\/jats:italic>\n            , a novel neural rendering-based framework with a novel joint geometry-appearance prior to achieve text-to-3D-portrait generation that overcomes the aforementioned issues. To accomplish this, we train a 3D portrait generator, 3DPortraitGAN, as a robust prior. 
This generator is capable of producing 360\u00b0 canonical 3D portraits, serving as a starting point for the subsequent diffusion-based generation process. To mitigate the \"grid-like\" artifact caused by the high-frequency information in the feature-map-based 3D representation commonly used by most 3D-aware GANs, we integrate a novel\n            <jats:italic>pyramid tri-grid<\/jats:italic>\n            3D representation into 3DPortraitGAN. To generate 3D portraits from text, we first project a randomly generated image aligned with the given prompt into the pre-trained 3DPortraitGAN's latent space. The resulting latent code is then used to synthesize a\n            <jats:italic>pyramid tri-grid.<\/jats:italic>\n            Beginning with the obtained\n            <jats:italic>pyramid tri-grid<\/jats:italic>\n            , we use score distillation sampling to distill the diffusion model's knowledge into the\n            <jats:italic>pyramid tri-grid.<\/jats:italic>\n            Following that, we utilize the diffusion model to refine the rendered images of the 3D portrait and then use these refined images as training data to further optimize the\n            <jats:italic>pyramid tri-grid<\/jats:italic>\n            , effectively eliminating issues with unrealistic color and unnatural artifacts. 
Our experimental results show that Portrait3D can produce realistic, high-quality, and canonical 3D portraits that align with the prompt.\n          <\/jats:p>","DOI":"10.1145\/3658162","type":"journal-article","created":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T14:47:57Z","timestamp":1721400477000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2432-809X","authenticated-orcid":false,"given":"Yiqian","family":"Wu","sequence":"first","affiliation":[{"name":"State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5690-367X","authenticated-orcid":false,"given":"Hao","family":"Xu","sequence":"additional","affiliation":[{"name":"State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7441-0086","authenticated-orcid":false,"given":"Xiangjun","family":"Tang","sequence":"additional","affiliation":[{"name":"State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-7397-4955","authenticated-orcid":false,"given":"Xien","family":"Chen","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1015-4770","authenticated-orcid":false,"given":"Siyu","family":"Tang","sequence":"additional","affiliation":[{"name":"ETH Z\u00fcrich, Z\u00fcrich, 
Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-2860-5112","authenticated-orcid":false,"given":"Zhebin","family":"Zhang","sequence":"additional","affiliation":[{"name":"OPPO US Research Center, Bellevue, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-6140-9216","authenticated-orcid":false,"given":"Chen","family":"Li","sequence":"additional","affiliation":[{"name":"OPPO US Research Center, Bellevue, United States of America"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7339-2920","authenticated-orcid":false,"given":"Xiaogang","family":"Jin","sequence":"additional","affiliation":[{"name":"State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,7,19]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3610548.3618153"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00541"},{"key":"e_1_2_2_3_1","volume-title":"IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR. 20950--20959","author":"An Sizhe","year":"2023","unstructured":"Sizhe An, Hongyi Xu, Yichun Shi, Guoxian Song, Umit Y. Ogras, and Linjie Luo. 2023. PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360\u00b0. In IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR. 20950--20959."},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01565"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00574"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00222"},{"key":"e_1_2_2_7_1","volume-title":"MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar. 
CoRR abs\/2312.04558","author":"Chen Yufan","year":"2023","unstructured":"Yufan Chen, Lizhen Wang, Qijing Li, Hongjiang Xiao, Shengping Zhang, Hongxun Yao, and Yebin Liu. 2023b. MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar. CoRR abs\/2312.04558 (2023)."},{"key":"e_1_2_2_8_1","first-page":"31841","article-title":"GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images","volume":"35","author":"Gao Jun","year":"2022","unstructured":"Jun Gao, Tianchang Shen, Zian Wang, Wenzheng Chen, Kangxue Yin, Daiqing Li, Or Litany, Zan Gojcic, and Sanja Fidler. 2022. GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images. In Advances in Neural Information Processing Systems, Vol. 35. 31841--31854.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_9_1","first-page":"2672","article-title":"Generative Adversarial Nets","volume":"27","author":"Goodfellow Ian J.","year":"2014","unstructured":"Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems, Vol. 27. 2672--2680.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_10_1","volume-title":"The 10th International Conference on Learning Representations, ICLR.","author":"Gu Jiatao","year":"2022","unstructured":"Jiatao Gu, Lingjie Liu, Peng Wang, and Christian Theobalt. 2022. StyleNeRF: A Style-based 3D Aware Generator for High-resolution Image Synthesis. In The 10th International Conference on Learning Representations, ICLR."},{"key":"e_1_2_2_11_1","volume-title":"DensePose: Dense Human Pose Estimation in the Wild. In IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR. 7297--7306","author":"G\u00fcler Riza Alp","year":"2018","unstructured":"Riza Alp G\u00fcler, Natalia Neverova, and Iasonas Kokkinos. 2018. 
DensePose: Dense Human Pose Estimation in the Wild. In IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR. 7297--7306."},{"key":"e_1_2_2_12_1","first-page":"5767","article-title":"Improved Training of Wasserstein GANs","volume":"30","author":"Gulrajani Ishaan","year":"2017","unstructured":"Ishaan Gulrajani, Faruk Ahmed, Mart\u00edn Arjovsky, Vincent Dumoulin, and Aaron C. Courville. 2017. Improved Training of Wasserstein GANs. In Advances in Neural Information Processing Systems, Vol. 30. 5767--5777.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_13_1","volume-title":"HeadSculpt: Crafting 3D Head Avatars with Text. CoRR abs\/2306.03038","author":"Han Xiao","year":"2023","unstructured":"Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, and Kwan-Yee K. Wong. 2023. HeadSculpt: Crafting 3D Head Avatars with Text. CoRR abs\/2306.03038 (2023)."},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.01808"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.595"},{"key":"e_1_2_2_16_1","first-page":"6626","article-title":"GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium","volume":"30","author":"Heusel Martin","year":"2017","unstructured":"Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. In Advances in Neural Information Processing Systems, Vol. 30. 6626--6637.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_17_1","unstructured":"Hsuan-I Ho Jie Song and Otmar Hilliges. 2023. SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion. 
arXiv:2311.15855 [cs.CV]"},{"key":"e_1_2_2_18_1","first-page":"6840","article-title":"Denoising Diffusion Probabilistic Models","volume":"33","author":"Ho Jonathan","year":"2020","unstructured":"Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems, Vol. 33. 6840--6851.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_19_1","volume-title":"HumanLiff: Layer-wise 3D Human Generation with Diffusion Model. CoRR abs\/2308.09712","author":"Hu Shoukang","year":"2023","unstructured":"Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, and Ziwei Liu. 2023. HumanLiff: Layer-wise 3D Human Generation with Diffusion Model. CoRR abs\/2308.09712 (2023)."},{"key":"e_1_2_2_20_1","volume-title":"HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation. CoRR abs\/2310.01406","author":"Huang Xin","year":"2023","unstructured":"Xin Huang, Ruizhi Shao, Qi Zhang, Hongwen Zhang, Ying Feng, Yebin Liu, and Qing Wang. 2023a. HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation. CoRR abs\/2310.01406 (2023)."},{"key":"e_1_2_2_21_1","unstructured":"Yangyi Huang Hongwei Yi Yuliang Xiu Tingting Liao Jiaxiang Tang Deng Cai and Justus Thies. 2023b. arXiv:2308.08545 [cs.CV]"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.01322"},{"key":"e_1_2_2_23_1","volume-title":"6th International Conference on Learning Representations, ICLR.","author":"Karras Tero","year":"2018","unstructured":"Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2018. Progressive Growing of GANs for Improved Quality, Stability, and Variation. 
In 6th International Conference on Learning Representations, ICLR."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592433"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592433"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV56688.2023.00298"},{"key":"e_1_2_2_27_1","volume-title":"Mihai Fieraru, and Cristian Sminchisescu.","author":"Kolotouros Nikos","year":"2023","unstructured":"Nikos Kolotouros, Thiemo Alldieck, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Fieraru, and Cristian Sminchisescu. 2023. DreamHuman: Animatable 3D Avatars from Text. CoRR abs\/2306.09329 (2023)."},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130813"},{"key":"e_1_2_2_29_1","volume-title":"LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching. CoRR abs\/2311.11284","author":"Liang Yixun","year":"2023","unstructured":"Yixun Liang, Xin Yang, Jiantao Lin, Haodong Li, Xiaogang Xu, and Yingcong Chen. 2023. LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching. CoRR abs\/2311.11284 (2023)."},{"key":"e_1_2_2_30_1","volume-title":"Black","author":"Liao Tingting","year":"2023","unstructured":"Tingting Liao, Hongwei Yi, Yuliang Xiu, Jiaxiang Tang, Yangyi Huang, Justus Thies, and Michael J. Black. 2023. TADA! Text to Animatable Digital Avatars. CoRR abs\/2308.10899 (2023)."},{"key":"e_1_2_2_31_1","volume-title":"HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion. CoRR abs\/2310.08579","author":"Liu Xian","year":"2023","unstructured":"Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, and Sergey Tulyakov. 2023a. HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion. CoRR abs\/2310.08579 (2023)."},{"key":"e_1_2_2_32_1","volume-title":"HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting. 
CoRR abs\/2311.17061","author":"Liu Xian","year":"2023","unstructured":"Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, and Ziwei Liu. 2023b. HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting. CoRR abs\/2311.17061 (2023)."},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818013"},{"key":"e_1_2_2_34_1","volume-title":"Proceedings, Part I (Lecture Notes in Computer Science","volume":"421","author":"Mildenhall Ben","year":"2020","unstructured":"Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK, August 23--28, 2020, Proceedings, Part I (Lecture Notes in Computer Science, Vol. 12346). 405--421."},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530127"},{"key":"e_1_2_2_36_1","first-page":"6767","article-title":"BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images","volume":"33","author":"Nguyen-Phuoc Thu H","year":"2020","unstructured":"Thu H Nguyen-Phuoc, Christian Richardt, Long Mai, Yongliang Yang, and Niloy Mitra. 2020. BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images. In Advances in Neural Information Processing Systems, Vol. 33. 6767--6778.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_37_1","volume-title":"The 11th International Conference on Learning Representations, ICLR.","author":"Poole Ben","year":"2023","unstructured":"Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. 2023. DreamFusion: Text-to-3D using 2D Diffusion. In The 11th International Conference on Learning Representations, ICLR."},{"key":"e_1_2_2_38_1","volume-title":"Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. 
In 4th International Conference on Learning Representations, ICLR.","author":"Radford Alec","year":"2016","unstructured":"Alec Radford, Luke Metz, and Soumith Chintala. 2016. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In 4th International Conference on Learning Representations, ICLR."},{"key":"e_1_2_2_39_1","volume-title":"Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos. CoRR abs\/2402.03723","author":"Rivero Alfredo","year":"2024","unstructured":"Alfredo Rivero, ShahRukh Athar, Zhixin Shu, and Dimitris Samaras. 2024. Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos. CoRR abs\/2402.03723 (2024)."},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_2_2_41_1","first-page":"33999","article-title":"VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids","volume":"35","author":"Schwarz Katja","year":"2022","unstructured":"Katja Schwarz, Axel Sauer, Michael Niemeyer, Yiyi Liao, and Andreas Geiger. 2022. VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids. In Advances in Neural Information Processing Systems, Vol. 35. 33999--34011.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2206.10535"},{"key":"e_1_2_2_43_1","volume-title":"Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars. CoRR abs\/2211.11208","author":"Sun Jingxiang","year":"2022","unstructured":"Jingxiang Sun, Xuan Wang, Lizhen Wang, Xiaoyu Li, Yong Zhang, Hongwen Zhang, and Yebin Liu. 2022. Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars. 
CoRR abs\/2211.11208 (2022)."},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592460"},{"key":"e_1_2_2_45_1","volume-title":"Manmohan Chandraker, Ravi Ramamoorthi, and Koki Nagano.","author":"Trevithick Alex","year":"2024","unstructured":"Alex Trevithick, Matthew A. Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello, Manmohan Chandraker, Ravi Ramamoorthi, and Koki Nagano. 2024. What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs. CoRR abs\/2401.02411 (2024)."},{"key":"e_1_2_2_46_1","unstructured":"Jie Wang Jiu-Cheng Xie Xianyan Li Feng Xu Chi-Man Pun and Hao Gao. 2024. GaussianHead: High-fidelity Head Avatars with Learnable Gaussian Derivation. arXiv:2312.01632 [cs.CV]"},{"key":"e_1_2_2_47_1","first-page":"27171","article-title":"NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction","volume":"34","author":"Wang Peng","year":"2021","unstructured":"Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. 2021. NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction. In Advances in Neural Information Processing Systems, Vol. 34. 27171--27183.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00443"},{"key":"e_1_2_2_49_1","volume-title":"Advances in Neural Information Processing Systems","volume":"34","author":"Wang Zhengyi","year":"2023","unstructured":"Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, and Jun Zhu. 2023a. ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. In Advances in Neural Information Processing Systems, Vol. 34."},{"key":"e_1_2_2_50_1","unstructured":"Yiqian Wu Hao Xu Xiangjun Tang Hongbo Fu and Xiaogang Jin. 2023a. 
3DPortraitGAN: Learning One-Quarter Headshot 3D GANs from a Single-View Portrait Dataset with Diverse Body Poses. arXiv:2307.14770 [cs.CV]"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3610548.3618164"},{"key":"e_1_2_2_52_1","volume-title":"IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023","author":"Xie Jiaxin","year":"2023","unstructured":"Jiaxin Xie, Hao Ouyang, Jingtan Piao, Chenyang Lei, and Qifeng Chen. 2023. High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization. In IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17--24, 2023. IEEE, 321--331."},{"key":"e_1_2_2_53_1","volume-title":"Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians. CoRR abs\/2312.03029","author":"Xu Yuelang","year":"2023","unstructured":"Yuelang Xu, Benwang Chen, Zhe Li, Hongwen Zhang, Lizhen Wang, Zerong Zheng, and Yebin Liu. 2023a. Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians. CoRR abs\/2312.03029 (2023)."},{"key":"e_1_2_2_54_1","unstructured":"Yuanyou Xu Zongxin Yang and Yi Yang. 2023b. SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance. arXiv:2312.08889 [cs.CV]"},{"key":"e_1_2_2_55_1","volume-title":"Progressive Text-to-3D Generation for Automatic 3D Prototyping. CoRR abs\/2309.14600","author":"Yi Han","year":"2023","unstructured":"Han Yi, Zhedong Zheng, Xiangyu Xu, and Tat-Seng Chua. 2023. Progressive Text-to-3D Generation for Automatic 3D Prototyping. CoRR abs\/2309.14600 (2023)."},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00041"},{"key":"e_1_2_2_57_1","volume-title":"High-quality & Stable 3D Avatar Creation from Text and Pose. CoRR abs\/2308.03610","author":"Zhang Huichao","year":"2023","unstructured":"Huichao Zhang, Bowen Chen, Hao Yang, Liao Qu, Xu Wang, Li Chen, Chao Long, Feida Zhu, Kang Du, and Min Zheng. 2023a. 
AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose. CoRR abs\/2308.03610 (2023)."},{"key":"e_1_2_2_58_1","volume-title":"Black","author":"Zhang Hao","year":"2024","unstructured":"Hao Zhang, Yao Feng, Peter Kulits, Yandong Wen, Justus Thies, and Michael J. Black. 2024. TECA: Text-Guided Generation and Editing of Compositional 3D Avatars."},{"key":"e_1_2_2_59_1","volume-title":"Chenxu Zhang, Yi Yang, and Jiashi Feng.","author":"Zhang Jianfeng","year":"2023","unstructured":"Jianfeng Zhang, Xuanmeng Zhang, Huichao Zhang, Jun Hao Liew, Chenxu Zhang, Yi Yang, and Jiashi Feng. 2023c. AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text. CoRR abs\/2311.17917 (2023)."},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00216"},{"key":"e_1_2_2_61_1","volume-title":"HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting. CoRR abs\/2402.06149","author":"Zhou Zhenglin","year":"2024","unstructured":"Zhenglin Zhou, Fan Ma, Hehe Fan, and Yi Yang. 2024. HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting. CoRR abs\/2402.06149 (2024)."},{"key":"e_1_2_2_62_1","first-page":"118","article-title":"Visual Object Networks: Image Generation with Disentangled 3D Representations","volume":"31","author":"Zhu Jun-Yan","year":"2018","unstructured":"Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Josh Tenenbaum, and Bill Freeman. 2018. Visual Object Networks: Image Generation with Disentangled 3D Representations. In Advances in Neural Information Processing Systems, Vol. 31. 
118--129.","journal-title":"Advances in Neural Information Processing Systems"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658162","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3658162","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:05:54Z","timestamp":1750291554000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658162"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,19]]},"references-count":62,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,7,19]]}},"alternative-id":["10.1145\/3658162"],"URL":"https:\/\/doi.org\/10.1145\/3658162","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,19]]},"assertion":[{"value":"2024-07-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}