{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T04:02:46Z","timestamp":1775016166422,"version":"3.50.1"},"reference-count":76,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T00:00:00Z","timestamp":1721347200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100018537","name":"National Science and Technology Major Project","doi-asserted-by":"publisher","award":["2021ZD0112902"],"award-info":[{"award-number":["2021ZD0112902"]}],"id":[{"id":"10.13039\/501100018537","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["Grant No. 62220106003"],"award-info":[{"award-number":["Grant No. 62220106003"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2024,7,19]]},"abstract":"<jats:p>In the field of digital content creation, generating high-quality 3D characters from single images is challenging, especially given the complexities of various body poses and the issues of self-occlusion and pose ambiguity. In this paper, we present CharacterGen, a framework developed to efficiently generate 3D characters. CharacterGen introduces a streamlined generation pipeline along with an image-conditioned multi-view diffusion model. This model effectively calibrates input poses to a canonical form while retaining key attributes of the input image, thereby addressing the challenges posed by diverse poses. 
A transformer-based, generalizable sparse-view reconstruction model is the other core component of our approach, facilitating the creation of detailed 3D models from multi-view images. We also adopt a texture-back-projection strategy to produce high-quality texture maps. Additionally, we have curated a dataset of anime characters, rendered in multiple poses and views, to train and evaluate our model. Our approach has been thoroughly evaluated through quantitative and qualitative experiments, showing its proficiency in generating 3D characters with high-quality shapes and textures, ready for downstream applications such as rigging and animation.<\/jats:p>","DOI":"10.1145\/3658217","type":"journal-article","created":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T14:47:57Z","timestamp":1721400477000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8797-1755","authenticated-orcid":false,"given":"Hao-Yang","family":"Peng","sequence":"first","affiliation":[{"name":"BNRist, Department of Computer Science and Technology, Tsinghua University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-9502-9484","authenticated-orcid":false,"given":"Jia-Peng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Zhili College, Tsinghua University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4128-4594","authenticated-orcid":false,"given":"Meng-Hao","family":"Guo","sequence":"additional","affiliation":[{"name":"BNRist, Department of Computer Science and Technology, Tsinghua University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0416-4374","authenticated-orcid":false,"given":"Yan-Pei","family":"Cao","sequence":"additional","affiliation":[{"name":"VAST, 
Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7507-6542","authenticated-orcid":false,"given":"Shi-Min","family":"Hu","sequence":"additional","affiliation":[{"name":"BNRist, Department of Computer Science and Technology, Tsinghua University, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2024,7,19]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"actorcore. 2023. accurig a software for automatic character rigging. https:\/\/actorcore.reallusion.com\/auto-rig\/accurig"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00541"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00375"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2304.00916"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2929257"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1111\/CGF.14769"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2303.13873"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.02018"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01263"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01977"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2303.17015"},{"key":"e_1_2_2_12_1","unstructured":"HakuyaLabs. 2023. warudo a 3D virtual image live broadcast software. https:\/\/warudo.app\/"},{"key":"e_1_2_2_13_1","volume-title":"The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Hong Fangzhou","year":"2023","unstructured":"Fangzhou Hong, Zhaoxi Chen, Yushi Lan, Liang Pan, and Ziwei Liu. 2023a. EVA3D: Compositional 3D Human Generation from 2D Image Collections. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1--5, 2023. OpenReview.net. 
https:\/\/openreview.net\/pdf?id=g7U9jD_2CUr"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530094"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2311.04400"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2311.17117"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2305.12529"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2308.08545"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3550469.3555394"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2303.17606"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2305.02463"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2306.09329"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414685.3417861"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2311.06214"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130813"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2308.10899"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00037"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2305.08891"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2311.07885"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2306.16928"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2303.11328"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2309.03453"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2310.15008"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818013"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00816"},{"key":"e_1_2_2_36_1"
,"doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01218"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503250"},{"key":"e_1_2_2_38_1","unstructured":"MixamoInc. 2009. Mixamo's online services. https:\/\/www.mixamo.com\/"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00421"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2212.08751"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01123"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/882262.882269"},{"key":"e_1_2_2_43_1","unstructured":"Pixiv. 2019. VRM tools of three.js. https:\/\/github.com\/pixiv\/three-vrm"},{"key":"e_1_2_2_44_1","volume-title":"The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Poole Ben","year":"2023","unstructured":"Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. 2023. DreamFusion: Text-to-3D using 2D Diffusion. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1--5, 2023. OpenReview.net. https:\/\/openreview.net\/pdf?id=FjNys5c7VyY"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2306.17843"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00223"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.02155"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00239"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00016"},{"key":"e_1_2_2_50_1","volume-title":"Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021","author":"Shen Tianchang","year":"2021","unstructured":"Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, and Sanja Fidler. 2021. 
Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6--14, 2021, virtual, Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.). 6087--6101. https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/30a237d18c50f563cba4531f1db44acf-Abstract.html"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2310.15110"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2308.16512"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.02086"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2307.01097"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/3DV53792.2021.00019"},{"key":"e_1_2_2_56_1","unstructured":"VRoid. 2022. VRoid Hub. https:\/\/vroid.com\/"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01214"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2020.2984232"},{"key":"e_1_2_2_59_1","volume-title":"Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021","author":"Wang Peng","year":"2021","unstructured":"Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. 2021a. NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6--14, 2021, virtual, Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.). 27171--27183. 
https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/e41e164f7485ec4a28741a2d0ea41c74-Abstract.html"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2312.02201"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00443"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2305.16213"},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2022.3169735"},{"key":"e_1_2_2_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00875"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00084"},{"key":"e_1_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00057"},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01294"},{"key":"e_1_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2308.06721"},{"key":"e_1_2_2_69_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2306.03414"},{"key":"e_1_2_2_70_1","volume-title":"Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022","author":"Zeng Xiaohui","year":"2022","unstructured":"Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, and Karsten Kreis. 2022. LION: Latent Point Diffusion Models for 3D Shape Generation. In Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, and A. Oh (Eds.). 
http:\/\/papers.nips.cc\/paper_files\/paper\/2022\/hash\/40e56dabe12095a5fc44a6e4c3835948-Abstract-Conference.html"},{"key":"e_1_2_2_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592442"},{"key":"e_1_2_2_72_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2308.03610"},{"key":"e_1_2_2_73_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2311.17917"},{"key":"e_1_2_2_74_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2302.05543"},{"key":"e_1_2_2_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_2_2_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3050505"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658217","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3658217","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:04:16Z","timestamp":1750291456000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658217"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,19]]},"references-count":76,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,7,19]]}},"alternative-id":["10.1145\/3658217"],"URL":"https:\/\/doi.org\/10.1145\/3658217","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,19]]},"assertion":[{"value":"2024-07-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}