{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T20:39:14Z","timestamp":1775939954653,"version":"3.50.1"},"reference-count":176,"publisher":"Association for Computing Machinery (ACM)","issue":"7","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2025,7,31]]},"abstract":"<jats:p>The metaverse presents an emerging creative expression and collaboration frontier where generative artificial intelligence (GenAI) can play a pivotal role with its ability to generate multimodal content from simple prompts. These prompts allow the metaverse to interact with GenAI, where context information, instructions, input data, or even output indications constituting the prompt can come from within the metaverse. However, their integration poses challenges regarding interoperability, lack of standards, scalability, and maintaining a high-quality user experience. This article explores how GenAI can productively assist in enhancing creativity within the contexts of the metaverse and unlock new opportunities. We provide a technical, in-depth overview of the different generative models for image, video, audio, and 3D content within the metaverse environments. We also explore the bottlenecks, opportunities, and innovative applications of GenAI from the perspectives of end users, developers, service providers, and AI researchers. This survey commences by highlighting the potential of GenAI for enhancing the metaverse experience through dynamic content generation to populate massive virtual worlds. Subsequently, we shed light on the ongoing research practices and trends in multimodal content generation, enhancing realism and creativity and alleviating bottlenecks related to standardization, computational cost, privacy, and safety. 
Last, we share insights into promising research directions toward the integration of GenAI with the metaverse for creative enhancement, improved immersion, and innovative interactive applications.<\/jats:p>","DOI":"10.1145\/3713075","type":"journal-article","created":{"date-parts":[[2025,1,21]],"date-time":"2025-01-21T07:09:22Z","timestamp":1737443362000},"page":"1-43","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["Unleashing Creativity in the Metaverse: Generative AI and Multimodal Content"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7690-8547","authenticated-orcid":false,"given":"Abdulmotaleb","family":"El Saddik","sequence":"first","affiliation":[{"name":"Multimedia Communication Research Laboratory (MCRLab), University of Ottawa, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8407-5971","authenticated-orcid":false,"given":"Jamil","family":"Ahmad","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Software Engineering, United Arab Emirates University, Al Ain, United Arab Emirates"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8020-3590","authenticated-orcid":false,"given":"Mustaqeem","family":"Khan","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Software Engineering, United Arab Emirates University, Al Ain, United Arab Emirates"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4555-4181","authenticated-orcid":false,"given":"Saad","family":"Abouzahir","sequence":"additional","affiliation":[{"name":"Royal School of Aeronautics, Marrakesh, Morocco"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6490-4648","authenticated-orcid":false,"given":"Wail","family":"Gueaieb","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering and Computer Science, University of Ottawa, 
Canada"}]}],"member":"320","published-online":{"date-parts":[[2025,7,19]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"3dfy. 2023. 3DFY Prompt Playground. Retrieved from https:\/\/3dfy.ai\/services#3DFYMegapacs"},{"key":"e_1_3_2_3_2","unstructured":"Andrea Agostinelli Timo I. Denk Zal\u00e1n Borsos Jesse Engel Mauro Verzetti Antoine Caillon Qingqing Huang Aren Jansen Adam Roberts Marco Tagliasacchi et al. 2023. MusicLM: Generating music from text. arXiv:2301.11325. Retrieved from https:\/\/arxiv.org\/abs\/2301.11325"},{"key":"e_1_3_2_4_2","unstructured":"Google AI. 2023. Magenta. Retrieved February 10 2023 from https:\/\/magenta.tensorflow.org\/"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3487891"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.bja.2020.06.049"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.7861\/fhj.2022-0013"},{"key":"e_1_3_2_8_2","first-page":"18","article-title":"Is ChatGPT leading generative AI? What is beyond expectations?","author":"Ayd\u0131n \u00d6mer","year":"2023","unstructured":"\u00d6mer Ayd\u0131n and Enis Karaarslan. 2023. Is ChatGPT leading generative AI? What is beyond expectations? Academic Platform Journal of Engineering and Smart Systems 11, 3 (2023), 18\u2013134.","journal-title":"Academic Platform Journal of Engineering and Smart Systems"},{"key":"e_1_3_2_9_2","first-page":"8309","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR \u201917)","author":"Baumgartner Christian F.","year":"2017","unstructured":"Christian F. Baumgartner, Kerem Can Tezcan, Krishna Chaitanya, Andreas M. H\u00f6tker, Urs J. Muehlematter, Khoschy Schawkat, Anton S. Becker, Olivio Donati, and Ender Konukoglu. 2017. Visual feature attribution using Wasserstein GANs. 
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR \u201917), 8309\u20138319."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275052"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.52842\/conf.caadria.2022.1.353"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2023.3288409"},{"key":"e_1_3_2_13_2","unstructured":"Louis Bouchard. 2023. MusicLM\u2014Pytorch. Retrieved March 15 2023 from https:\/\/github.com\/lucidrains\/musiclm-pytorch"},{"key":"e_1_3_2_14_2","unstructured":"Jean-Pierre Briot Ga\u00ebtan Hadjeres and Fran\u00e7ois-David Pachet. 2017. Deep learning techniques for music generation\u2014A survey. arXiv:1709.01620. Retrieved from https:\/\/arxiv.org\/abs\/1709.01620"},{"key":"e_1_3_2_15_2","unstructured":"Jean-Pierre Briot and Fran\u00e7ois Pachet. 2017. Music generation by deep learning-challenges and directions. arXiv:1712.04371. Retrieved from https:\/\/arxiv.org\/abs\/1712.04371"},{"key":"e_1_3_2_16_2","volume-title":"International Conference on Learning Representations (ICLR)","author":"Brock Andrew","year":"2018","unstructured":"Andrew Brock, Jeff Donahue, and Karen Simonyan. 2018. Large scale GAN training for high fidelity natural image synthesis. International Conference on Learning Representations (ICLR) 2019."},{"key":"e_1_3_2_17_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. 
Advances in Neural Information Processing Systems 33 (2020), 1877\u20131901.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV56688.2023.00440"},{"key":"e_1_3_2_19_2","unstructured":"Yukang Cao Yan-Pei Cao Kai Han Ying Shan and Kwan-Yee K Wong. 2023. Dreamavatar: Text-and-shape guided 3d human avatar generation via diffusion models. arXiv:2304.00916. Retrieved from https:\/\/arxiv.org\/abs\/2304.00916"},{"key":"e_1_3_2_20_2","unstructured":"Yihan Cao Siyu Li Yixin Liu Zhiling Yan Yutong Dai Philip S Yu and Lichao Sun. 2023. A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT. arXiv:2303.04226. Retrieved from https:\/\/arxiv.org\/abs\/2303.04226"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3475799"},{"key":"e_1_3_2_22_2","unstructured":"Vinay Chamola Gaurang Bansal Tridib Kumar Das Vikas Hassija Naga Siva Sai Reddy Jiacheng Wang Sherali Zeadally Amir Hussain F Richard Yu Mohsen Guizani et al. 2023. Beyond reality: The pivotal role of generative AI in the Metaverse. arXiv:2308.06272. Retrieved from https:\/\/arxiv.org\/abs\/2308.06272"},{"key":"e_1_3_2_23_2","unstructured":"Angel X Chang Thomas Funkhouser Leonidas Guibas Pat Hanrahan Qixing Huang Zimo Li Silvio Savarese Manolis Savva Shuran Song Hao Su et al. 2015. Shapenet: An information-rich 3d model repository. arXiv:1512.03012. Retrieved from https:\/\/arxiv.org\/abs\/1512.03012"},{"key":"e_1_3_2_24_2","article-title":"Infogan: Interpretable representation learning by information maximizing generative adversarial nets","volume":"29","author":"Chen Xi","year":"2016","unstructured":"Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. 2016. Infogan: Interpretable representation learning by information maximizing generative adversarial nets. 
Advances in Neural Information Processing Systems 29 (2016).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01978"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TLT.2023.3277952"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00609"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3241628"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1080\/10494820.2023.2172044"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.caeai.2023.100197"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01118"},{"key":"e_1_3_2_32_2","first-page":"7183","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Choi Yunjey","year":"2020","unstructured":"Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. 2020. HiDT: High-resolution daytime translation without domain labels. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 7183\u20137192."},{"key":"e_1_3_2_33_2","doi-asserted-by":"crossref","unstructured":"Yu-An Chung Yu Zhang Wei Han Chung-Cheng Chiu James Qin Ruoming Pang and Yonghui Wu. 2021. W2v-BERT: Combining contrastive learning and masked language modeling for self-supervised speech pre-training. arXiv:2108.06209. Retrieved from https:\/\/arxiv.org\/abs\/2108.06209","DOI":"10.1109\/ASRU51503.2021.9688253"},{"key":"e_1_3_2_34_2","first-page":"18444","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","year":"2023","unstructured":"Haomiao Ni, Changhao Shi, Kai Li, Sharon X. Huang, and Martin Renqiang Min. 2023. Conditional image-to-video generation with latent flow diffusion models. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 18444\u201318455."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00581"},{"key":"e_1_3_2_36_2","unstructured":"Jade Copet Felix Kreuk Itai Gat Tal Remez David Kant Gabriel Synnaeve Yossi Adi and Alexandre D\u00e9fossez. 2023. Simple and controllable music generation. arXiv:2306.05284. Retrieved from https:\/\/arxiv.org\/abs\/2306.05284"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1201\/9781315381411"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.3390\/su16052032"},{"key":"e_1_3_2_39_2","unstructured":"Sumanth Dathathri Andrea Madotto Janice Lan Jane Hung Eric Frank Piero Molino Jason Yosinski and Rosanne Liu. 2019. Plug and play language models: A simple approach to controlled text generation. arXiv:1912.02164. Retrieved from https:\/\/arxiv.org\/abs\/1912.02164"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cag.2023.05.010"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01015"},{"key":"e_1_3_2_42_2","first-page":"8780","article-title":"Diffusion models beat GANs on image synthesis","volume":"34","author":"Dhariwal Prafulla","year":"2021","unstructured":"Prafulla Dhariwal and Alexander Nichol. 2021. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems 34 (2021), 8780\u20138794.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1162\/COMJ_r_00008"},{"key":"e_1_3_2_44_2","unstructured":"B. Duan W. Wang H. Tang H. Latapie and Y. Yan. 2019. Cascade attention guided residue learning GAN for cross-modal translation. arXiv:1907.01826. 
Retrieved from https:\/\/arxiv.org\/abs\/1907.01826"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCE.2023.3324978"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-022-10777-x"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3265191"},{"key":"e_1_3_2_48_2","doi-asserted-by":"crossref","unstructured":"Sam Ellis Octavio E Martinez Manzanera Vasileios Baltatzis Ibrahim Nawaz Arjun Nair Lo\u00efc Le Folgoc Sujal Desai Ben Glocker and Julia A. Schnabel. 2022. Evaluation of 3D GANs for lung tissue modelling in pulmonary CT. arXiv:2208.08184. Retrieved from https:\/\/arxiv.org\/abs\/2208.08184","DOI":"10.59275\/j.melba.2022-9e4b"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1613\/jair.3908"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1080\/1600910X.2022.2137546"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-35939-2_15"},{"key":"e_1_3_2_52_2","unstructured":"Prashnna Ghimire Kyungki Kim and Manoj Acharya. 2023. Generative AI in the construction industry: Opportunities and challenges. arXiv:2310.04427. Retrieved from https:\/\/arxiv.org\/abs\/2310.04427"},{"key":"e_1_3_2_53_2","article-title":"Generative adversarial nets","volume":"27","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. 
Advances in Neural Information Processing Systems 27 (2014).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00511"},{"key":"e_1_3_2_55_2","first-page":"191","volume-title":"Proceedings of the International Conference on Medical Imaging with Deep Learning","author":"Han Cong","year":"2019","unstructured":"Cong Han, Kazuya Murao, Takashi Noguchi, Ken\u2019ichi Morooka, Yuki Sanomura, Shigeto Nakamura, and Makoto Hashizume. 2019. Learning more with less: GAN-based medical image augmentation for increased dataset diversity in deep learning segmentation models. In Proceedings of the International Conference on Medical Imaging with Deep Learning, 191\u2013202."},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.3390\/s22176425"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11747-022-00908-0"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.chb.2022.107502"},{"key":"e_1_3_2_59_2","unstructured":"Jonathan Ho William Chan Chitwan Saharia Jay Whang Ruiqi Gao Alexey Gritsenko Diederik P. Kingma Ben Poole Mohammad Norouzi David J. Fleet et al. 2022. Imagen video: High definition video generation with diffusion models. arXiv:2210.02303. Retrieved from https:\/\/arxiv.org\/abs\/2210.02303"},{"key":"e_1_3_2_60_2","first-page":"2722","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Ho Jonathan","year":"2019","unstructured":"Jonathan Ho, Xi Chen, Aravind Srinivas, Yan Duan, and Pieter Abbeel. 2019. Flow++: Improving flow-based generative models with variational dequantization and architecture design. In Proceedings of the International Conference on Machine Learning. 
PMLR, 2722\u20132730."},{"key":"e_1_3_2_61_2","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho Jonathan","year":"2020","unstructured":"Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems 33 (2020), 6840\u20136851.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_62_2","unstructured":"J. Ho T. Salimans A. Gritsenko W. Chan M. Norouzi and D. J. Fleet. 2022. Video diffusion models."},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-022-04512-5"},{"key":"e_1_3_2_64_2","volume-title":"Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR)","author":"Huang Qingqing","year":"2022","unstructured":"Qingqing Huang, Aren Jansen, Joonseok Lee, Ravi Ganti, Judith Yue Li, and Daniel P. W. Ellis. 2022. MuLan: A joint embedding of music audio and natural language. In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR)."},{"key":"e_1_3_2_65_2","unstructured":"Qingqing Huang Daniel S. Park Tao Wang Timo I. Denk Andy Ly Nanxin Chen Zhengdong Zhang Zhishuai Zhang Jiahui Yu Christian Frank Jesse Engel Quoc V. Le William Chan Zhifeng Chen and Wei Han. 2023. Noise2Music: Text-conditioned music generation with diffusion models. arXiv:2302.03917. Retrieved from https:\/\/arxiv.org\/abs\/2302.03917"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58555-6_24"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.54517\/m.v4i1.2164"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-14961-0_15"},{"key":"e_1_3_2_69_2","unstructured":"Tero Karras Timo Aila Samuli Laine and Jaakko Lehtinen. 2017. Progressive growing of GANs for improved quality stability and variation. arXiv:1710.10196. 
Retrieved from https:\/\/arxiv.org\/abs\/1710.10196"},{"key":"e_1_3_2_70_2","first-page":"852","article-title":"Alias-free generative adversarial networks","volume":"34","author":"Karras Tero","year":"2021","unstructured":"Tero Karras, Miika Aittala, Samuli Laine, Erik H\u00e4rk\u00f6nen, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. 2021. Alias-free generative adversarial networks. Advances in Neural Information Processing Systems 34 (2021), 852\u2013863.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00453"},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00813"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2023.102846"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbusres.2022.113378"},{"key":"e_1_3_2_75_2","first-page":"3581","volume-title":"Proceedings of the 28th International Conference on Neural Information Processing Systems","author":"Kingma D. P.","year":"2014","unstructured":"D. P. Kingma, S. Mohamed, D. J. Rezende, and M. Welling. 2014. Semi-supervised learning with deep generative models. In Proceedings of the 28th International Conference on Neural Information Processing Systems, 3581\u20133589."},{"key":"e_1_3_2_76_2","article-title":"Glow: Generative flow with invertible 1x1 convolutions","volume":"31","author":"Kingma Durk P.","year":"2018","unstructured":"Durk P. Kingma and Prafulla Dhariwal. 2018. Glow: Generative flow with invertible 1x1 convolutions. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_77_2","unstructured":"Diederik P. Kingma and Max Welling. 2014. Auto-encoding variational bayes. arXiv:1312.6114. 
Retrieved from https:\/\/arxiv.org\/abs\/1312.6114"},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ifacol.2018.08.474"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.1186\/s12911-024-02495-2"},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-45121-8_9"},{"issue":"3","key":"e_1_3_2_81_2","first-page":"334","article-title":"The role of perceived value of avatar\u2019s virtual fashion in metaverse on the impact of sense of presence on purchase intention","volume":"16","author":"Lee Eun-Jung","year":"2024","unstructured":"Eun-Jung Lee and Ji-Hye Jeon. 2024. The role of perceived value of avatar\u2019s virtual fashion in metaverse on the impact of sense of presence on purchase intention. International Journal of Internet, Broadcasting and Communication 16, 3 (2024), 334\u2013345.","journal-title":"International Journal of Internet, Broadcasting and Communication"},{"key":"e_1_3_2_82_2","unstructured":"Lik-Hang Lee Tristan Braud Pengyuan Zhou Lin Wang Dianlei Xu Zijun Lin Abhishek Kumar Carlos Bermejo and Pan Hui. 2021. All one needs to know about metaverse: A complete survey on technological singularity virtual ecosystem and research agenda. arXiv:2110.05352. Retrieved from https:\/\/arxiv.org\/abs\/2110.05352"},{"key":"e_1_3_2_83_2","doi-asserted-by":"crossref","unstructured":"Lik-Hang Lee Zijun Lin Rui Hu Zhengya Gong Abhishek Kumar Tangyao Li Sijia Li and Pan Hui. 2021. When creators meet the metaverse: A survey on computational arts. arXiv:2111.13486. 
Retrieved from https:\/\/arxiv.org\/abs\/2111.13486","DOI":"10.1007\/s00180-021-01068-5"},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVIDL58838.2023.10167336"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.2514\/1.J059254"},{"key":"e_1_3_2_86_2","first-page":"31199","article-title":"Pre-trained language models for interactive decision-making","volume":"35","author":"Li Shuang","year":"2022","unstructured":"Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Aky\u00fcrek, Anima Anandkumar, et al. 2022. Pre-trained language models for interactive decision-making. Advances in Neural Information Processing Systems 35 (2022), 31199\u201331212.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_87_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00649"},{"key":"e_1_3_2_88_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.12233"},{"key":"e_1_3_2_89_2","doi-asserted-by":"publisher","DOI":"10.1109\/TETCI.2016.2642200"},{"key":"e_1_3_2_90_2","unstructured":"Jian Liu Xiaoshui Huang Tianyu Huang Lu Chen Yuenan Hou Shixiang Tang Ziwei Liu Wanli Ouyang Wangmeng Zuo Junjun Jiang et al. 2024. A comprehensive survey on 3D content generation. arXiv:2402.01166. Retrieved from https:\/\/arxiv.org\/abs\/2402.01166"},{"key":"e_1_3_2_91_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00572"},{"key":"e_1_3_2_92_2","first-page":"118","volume-title":"Proceedings of the International Conference on AI-Generated Content","author":"Liu Yuxin","year":"2023","unstructured":"Yuxin Liu and Keng L Siau. 2023. Generative artificial intelligence and metaverse: Future of work, future of society, and future of humanity. In Proceedings of the International Conference on AI-Generated Content. 
Springer, 118\u2013127."},{"key":"e_1_3_2_93_2","doi-asserted-by":"publisher","DOI":"10.1109\/CONIELECOMP.2018.8327197"},{"key":"e_1_3_2_94_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.autcon.2023.105073"},{"key":"e_1_3_2_95_2","unstructured":"lumalabs. 2025. Dream Machine. Retrieved from https:\/\/lumalabs.ai\/dream-machine"},{"key":"e_1_3_2_96_2","doi-asserted-by":"publisher","DOI":"10.14254\/1795-6889.2023.19-2.2"},{"key":"e_1_3_2_97_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00286"},{"key":"e_1_3_2_98_2","doi-asserted-by":"publisher","DOI":"10.3390\/s23031583"},{"key":"e_1_3_2_99_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cogr.2023.06.001"},{"key":"e_1_3_2_100_2","unstructured":"MagiScan. [n.\u2009d.]. 3d Scanner App Powered by AI. Retrieved from https:\/\/magiscan.app"},{"key":"e_1_3_2_101_2","first-page":"183","volume-title":"Proceedings of the International Conference on Innovation in Urban and Regional Planning","author":"Martone Angela","year":"2023","unstructured":"Angela Martone and Monica Buonocore. 2023. Digital twin for urban development. In Proceedings of the International Conference on Innovation in Urban and Regional Planning. Springer, 183\u2013191."},{"key":"e_1_3_2_102_2","doi-asserted-by":"crossref","unstructured":"A. Mittal S. Kumar and K. Singh. 2020. Disentangled audio-visual representation learning.","DOI":"10.1109\/WACV45572.2020.9093527"},{"key":"e_1_3_2_103_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00040"},{"key":"e_1_3_2_104_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-024-10703-8"},{"key":"e_1_3_2_105_2","doi-asserted-by":"crossref","unstructured":"Lea M\u00fcller Vickie Ye Georgios Pavlakos Michael Black and Angjoo Kanazawa. 2023. Generative proxemics: A prior for 3D social interaction from images. arXiv:2306.09337. 
Retrieved from https:\/\/arxiv.org\/abs\/2306.09337","DOI":"10.1109\/CVPR52733.2024.00925"},{"key":"e_1_3_2_106_2","unstructured":"Alex Nichol Prafulla Dhariwal Aditya Ramesh Pranav Shyam Pamela Mishkin Bob McGrew Ilya Sutskever and Mark Chen. 2021. Glide: Towards photorealistic image generation and editing with text-guided diffusion models. arXiv:2112.10741. Retrieved from https:\/\/arxiv.org\/abs\/2112.10741"},{"key":"e_1_3_2_107_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01129"},{"key":"e_1_3_2_108_2","first-page":"1","article-title":"The effect of avatar self-integration on consumers\u2019 behavioral intention in the metaverse","author":"Oh Sumin","year":"2023","unstructured":"Sumin Oh, Woo Bin Kim, and Ho Jung Choo. 2023. The effect of avatar self-integration on consumers\u2019 behavioral intention in the metaverse. International Journal of Human\u2013Computer Interaction (2023), 1\u201314.","journal-title":"International Journal of Human\u2013Computer Interaction"},{"key":"e_1_3_2_109_2","doi-asserted-by":"publisher","DOI":"10.3390\/app13158573"},{"key":"e_1_3_2_110_2","first-page":"147807712312227","article-title":"Using text-to-image generation for architectural design ideation","author":"Paananen Ville","year":"2023","unstructured":"Ville Paananen, Jonas Oppenlaender, and Aku Visuri. 2023. Using text-to-image generation for architectural design ideation. International Journal of Architectural Computing (2023), 14780771231222783.","journal-title":"International Journal of Architectural Computing"},{"key":"e_1_3_2_111_2","first-page":"65","volume-title":"Artificial Intelligence for Business Creativity","author":"Pagani Margherita","unstructured":"Margherita Pagani and Renaud Champion. 2023. How AI can foster business creativity. In Artificial Intelligence for Business Creativity. 
Routledge, 65\u201381."},{"key":"e_1_3_2_112_2","doi-asserted-by":"publisher","DOI":"10.1145\/3588432.3591500"},{"key":"e_1_3_2_113_2","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3127905"},{"key":"e_1_3_2_114_2","unstructured":"Emmanouil Panagiotou and Eleni Charou. 2020. Procedural 3D terrain generation using generative adversarial networks. arXiv:2010.06411. Retrieved from https:\/\/arxiv.org\/abs\/2010.06411"},{"key":"e_1_3_2_115_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00581"},{"key":"e_1_3_2_116_2","unstructured":"Christine Payne. 2019. MuseNet. Retrieved May 15 2023 from https:\/\/openai.com\/blog\/musenet\/"},{"key":"e_1_3_2_117_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00894"},{"key":"e_1_3_2_118_2","unstructured":"Ben Poole Ajay Jain Jonathan T. Barron and Ben Mildenhall. 2022. Dreamfusion: Text-to-3d using 2d diffusion. arXiv:2209.14988. Retrieved from https:\/\/arxiv.org\/abs\/2209.14988"},{"key":"e_1_3_2_119_2","doi-asserted-by":"crossref","unstructured":"Hua Xuan Qin and Pan Hui. 2023. Empowering the metaverse with generative AI: Survey and future directions. In Proceedings of the 2023 IEEE 43rd International Conference on Distributed Computing Systems Workshops (ICDCSW). IEEE 85\u201390.","DOI":"10.1109\/ICDCSW60045.2023.00022"},{"key":"e_1_3_2_120_2","unstructured":"Alec Radford Jong Wook Kim Chris Hallacy Aditya Ramesh Gabriel Goh Sandhini Agarwal Girish Sastry Amanda Askell Pamela Mishkin Jack Clark et al. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning. PMLR 8748\u20138763."},{"key":"e_1_3_2_121_2","doi-asserted-by":"publisher","DOI":"10.1039\/D1SC05976A"},{"key":"e_1_3_2_122_2","unstructured":"Aditya Ramesh Prafulla Dhariwal Alex Nichol Casey Chu and Mark Chen. 2022. Hierarchical text-conditional image generation with clip latents. arXiv:2204.06125. 
Retrieved from https:\/\/arxiv.org\/abs\/2204.06125"},{"key":"e_1_3_2_123_2","first-page":"8821","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Ramesh Aditya","year":"2021","unstructured":"Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. 2021. Zero-shot text-to-image generation. In Proceedings of the International Conference on Machine Learning. PMLR, 8821\u20138831."},{"key":"e_1_3_2_124_2","article-title":"Generating diverse high-fidelity images with vq-vae-2","volume":"32","author":"Razavi Ali","year":"2019","unstructured":"Ali Razavi, Aaron Van den Oord, and Oriol Vinyals. 2019. Generating diverse high-fidelity images with vq-vae-2. Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_125_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01259"},{"key":"e_1_3_2_126_2","first-page":"1530","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Rezende Danilo","year":"2015","unstructured":"Danilo Rezende and Shakir Mohamed. 2015. Variational inference with normalizing flows. In Proceedings of the International Conference on Machine Learning. PMLR, 1530\u20131538."},{"key":"e_1_3_2_127_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3241809"},{"key":"e_1_3_2_128_2","doi-asserted-by":"publisher","DOI":"10.1145\/4468.4469"},{"key":"e_1_3_2_129_2","first-page":"387","volume-title":"International Fashion and Design Congress","author":"Sanchez Mercedes Rodriguez","year":"2022","unstructured":"Mercedes Rodriguez Sanchez and Guillermo Garcia-Badell. 2022. Dressing the metaverse. The digital strategies of fashion brands in the virtual universe. In International Fashion and Design Congress. 
Springer, 387\u2013397."},{"key":"e_1_3_2_130_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_3_2_131_2","first-page":"1","article-title":"Exploring use cases of generative AI and metaverse in financial analytics: Unveiling the synergies of advanced technologies","author":"Saivasan Rangapriya","year":"2023","unstructured":"Rangapriya Saivasan and Madhavi Lokhande. 2023. Exploring use cases of generative AI and metaverse in financial analytics: Unveiling the synergies of advanced technologies. International Journal of Global Business and Competitiveness (2023), 1\u201310.","journal-title":"International Journal of Global Business and Competitiveness"},{"key":"e_1_3_2_132_2","volume-title":"Diffusion Augmented Flows: Combining Normalizing Flows and Diffusion Models for Accurate Latent Space Mapping","author":"Soham Sajekar","year":"2023","unstructured":"Soham Sajekar. 2023. Diffusion Augmented Flows: Combining Normalizing Flows and Diffusion Models for Accurate Latent Space Mapping. Ph.D. Dissertation. University of Georgia."},{"key":"e_1_3_2_133_2","unstructured":"Tim Salimans Andrej Karpathy Xi Chen and Diederik P Kingma. 2017. Pixelcnn++: Improving the pixelcnn with discretized logistic mixture likelihood and other modifications. arXiv:1701.05517. Retrieved from https:\/\/arxiv.org\/abs\/1701.05517"},{"key":"e_1_3_2_134_2","doi-asserted-by":"publisher","DOI":"10.1080\/1362704X.2021.1981657"},{"key":"e_1_3_2_135_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2019.01.010"},{"key":"e_1_3_2_136_2","first-page":"20154","article-title":"Graf: Generative radiance fields for 3d-aware image synthesis","volume":"33","author":"Schwarz Katja","year":"2020","unstructured":"Katja Schwarz, Yiyi Liao, Michael Niemeyer, and Andreas Geiger. 2020. Graf: Generative radiance fields for 3d-aware image synthesis. 
Advances in Neural Information Processing Systems 33 (2020), 20154\u201320166.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_137_2","doi-asserted-by":"publisher","DOI":"10.1109\/DICTA56598.2022.10034603"},{"key":"e_1_3_2_138_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW59228.2023.00487"},{"key":"e_1_3_2_139_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00926"},{"key":"e_1_3_2_140_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19787-1_23"},{"key":"e_1_3_2_141_2","doi-asserted-by":"crossref","unstructured":"Taylor Shin Yasaman Razeghi Robert L Logan IV Eric Wallace and Sameer Singh. 2020. Autoprompt: Eliciting knowledge from language models with automatically generated prompts. arXiv:2010.15980. Retrieved from https:\/\/arxiv.org\/abs\/2010.15980","DOI":"10.18653\/v1\/2020.emnlp-main.346"},{"key":"e_1_3_2_142_2","unstructured":"Gautam Singh Fei Deng and Sungjin Ahn. 2021. Illiterate dall-e learns to compose. arXiv:2110.11405. Retrieved from https:\/\/arxiv.org\/abs\/2110.11405"},{"key":"e_1_3_2_143_2","first-page":"2256","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Sohl-Dickstein Jascha","year":"2015","unstructured":"Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015. Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of the International Conference on Machine Learning. PMLR, 2256\u20132265."},{"key":"e_1_3_2_144_2","unstructured":"Irene Solaiman Miles Brundage Jack Clark Amanda Askell Ariel Herbert-Voss Jeff Wu Alec Radford Gretchen Krueger Jong Wook Kim Sarah Kreps et al. 2019. Release strategies and the social impacts of language models. arXiv:1908.09203. 
Retrieved from https:\/\/arxiv.org\/abs\/1908.09203"},{"key":"e_1_3_2_145_2","article-title":"Generative modeling by estimating gradients of the data distribution","volume":"32","author":"Song Yang","year":"2019","unstructured":"Yang Song and Stefano Ermon. 2019. Generative modeling by estimating gradients of the data distribution. Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_146_2","first-page":"133","volume-title":"Proceedings of the International Conference on Human-Computer Interaction","author":"Spadoni Elena","year":"2023","unstructured":"Elena Spadoni, Marina Carulli, Maura Mengoni, Marco Luciani, and Monica Bordegoni. 2023. Empowering virtual humans\u2019 emotional expression in the metaverse. In Proceedings of the International Conference on Human-Computer Interaction. Springer, 133\u2013143."},{"key":"e_1_3_2_147_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.culher.2023.11.002"},{"key":"e_1_3_2_148_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00807"},{"key":"e_1_3_2_149_2","doi-asserted-by":"publisher","DOI":"10.1145\/3588432.3591516"},{"key":"e_1_3_2_150_2","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2018.2873186"},{"key":"e_1_3_2_151_2","doi-asserted-by":"publisher","DOI":"10.1109\/CoG47356.2020.9231576"},{"key":"e_1_3_2_152_2","article-title":"Deep generative models for distribution-preserving lossy compression","volume":"31","author":"Tschannen Michael","year":"2018","unstructured":"Michael Tschannen, Eirikur Agustsson, and Mario Lucic. 2018. Deep generative models for distribution-preserving lossy compression. 
Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_153_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00165"},{"key":"e_1_3_2_154_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01258"},{"key":"e_1_3_2_155_2","article-title":"Conditional image generation with pixelcnn decoders","volume":"29","author":"Van den Oord Aaron","year":"2016","unstructured":"Aaron Van den Oord, Nal Kalchbrenner, Lasse Espeholt, Oriol Vinyals, Alex Graves, et al. 2016. Conditional image generation with pixelcnn decoders. Advances in Neural Information Processing Systems 29 (2016).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_156_2","first-page":"1747","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Van Den Oord A\u00e4ron","year":"2016","unstructured":"A\u00e4ron Van Den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. 2016. Pixel recurrent neural networks. In Proceedings of the International Conference on Machine Learning. PMLR, 1747\u20131756."},{"key":"e_1_3_2_157_2","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N. Gomez Lukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. arXiv:1706.03762. Retrieved from https:\/\/arxiv.org\/abs\/1706.03762"},{"key":"e_1_3_2_158_2","doi-asserted-by":"publisher","DOI":"10.1080\/20932685.2023.2234918"},{"key":"e_1_3_2_159_2","article-title":"Generating videos with scene dynamics","volume":"29","author":"Vondrick Carl","year":"2016","unstructured":"Carl Vondrick, Hamed Pirsiavash, and Antonio Torralba. 2016. Generating videos with scene dynamics. Advances in Neural Information Processing Systems 29 (2016).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_160_2","unstructured":"K. Vougioukas N. Tsiknkas and T. Karras. 2018. 
Speech-driven animation using a generative adversarial network. (2018)."},{"key":"e_1_3_2_161_2","doi-asserted-by":"publisher","DOI":"10.1108\/APJML-10-2023-994"},{"key":"e_1_3_2_162_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2020.10.015"},{"key":"e_1_3_2_163_2","doi-asserted-by":"crossref","unstructured":"T. Wang J. Liu and Z. Wang. 2018. Video-to-video synthesis with conditional generative adversarial networks. (2018).","DOI":"10.1145\/3206025.3206068"},{"key":"e_1_3_2_164_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00443"},{"key":"e_1_3_2_165_2","first-page":"14","volume-title":"Proceedings of the International Workshop on Simulation and Synthesis in Medical Imaging","author":"Wolterink Jelmer M.","year":"2018","unstructured":"Jelmer M. Wolterink, Anna M. Dinkla, Mark H. F. Savenije, Peter R. Seevinck, Cornelis A. T. van den Berg, and Ivana I\u0161gum. 2018. Deep MR to CT synthesis using unpaired data. In Proceedings of the International Workshop on Simulation and Synthesis in Medical Imaging. Springer, 14\u201323."},{"key":"e_1_3_2_166_2","article-title":"Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling","volume":"29","author":"Wu Jiajun","year":"2016","unstructured":"Jiajun Wu, Chengkai Zhang, Tianfan Xue, Bill Freeman, and Josh Tenenbaum. 2016. Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling. Advances in Neural Information Processing Systems 29 (2016).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_167_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00091"},{"key":"e_1_3_2_168_2","first-page":"1912","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Wu Zhirong","year":"2015","unstructured":"Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 2015. 
3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1912\u20131920."},{"key":"e_1_3_2_169_2","unstructured":"Zhen Xing Qijun Feng Haoran Chen Qi Dai Han Hu Hang Xu Zuxuan Wu and Yu-Gang Jiang. 2023. A survey on video diffusion models. arXiv:2310.10647. Retrieved from https:\/\/arxiv.org\/abs\/2310.10647"},{"key":"e_1_3_2_170_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2023.3268730"},{"key":"e_1_3_2_171_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00464"},{"key":"e_1_3_2_172_2","article-title":"3d-aware scene manipulation via inverse graphics","volume":"31","author":"Yao Shunyu","year":"2018","unstructured":"Shunyu Yao, Tzu Ming Hsu, Jun-Yan Zhu, Jiajun Wu, Antonio Torralba, Bill Freeman, and Josh Tenenbaum. 2018. 3d-aware scene manipulation via inverse graphics. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_173_2","doi-asserted-by":"crossref","unstructured":"Neil Zeghidour Alejandro Luebs Ahmed Omran Jan Skoglund and Marco Tagliasacchi. 2021. SoundStream: An end-to-end neural audio codec. arXiv:2107.03312. Retrieved from https:\/\/arxiv.org\/abs\/2107.03312","DOI":"10.1109\/TASLP.2021.3129994"},{"key":"e_1_3_2_174_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.xcrm.2022.100794"},{"key":"e_1_3_2_175_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.629"},{"key":"e_1_3_2_176_2","first-page":"668","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Zhang Jianfeng","year":"2022","unstructured":"Jianfeng Zhang, Zihang Jiang, Dingdong Yang, Hongyi Xu, Yichun Shi, Guoxian Song, Zhongcong Xu, Xinchao Wang, and Jiashi Feng. 2022. Avatargen: A 3d generative model for animatable human avatars. In Proceedings of the European Conference on Computer Vision. 
Springer, 668\u2013685."},{"key":"e_1_3_2_177_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3713075","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T19:56:33Z","timestamp":1772222193000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3713075"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,19]]},"references-count":176,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2025,7,31]]}},"alternative-id":["10.1145\/3713075"],"URL":"https:\/\/doi.org\/10.1145\/3713075","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,19]]},"assertion":[{"value":"2024-04-16","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-27","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-07-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}