{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T15:06:04Z","timestamp":1775228764139,"version":"3.50.1"},"reference-count":89,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T00:00:00Z","timestamp":1731974400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62322604"],"award-info":[{"award-number":["62322604"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62176159"],"award-info":[{"award-number":["62176159"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Shanghai Municipal Science and Technology Major Project","award":["2021SHZDZX0102"],"award-info":[{"award-number":["2021SHZDZX0102"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2024,12,19]]},"abstract":"<jats:p>Reconstructing and rendering 3D objects from highly sparse views is of critical importance for promoting applications of 3D vision techniques and improving user experience. However, images from sparse views only contain very limited 3D information, leading to two significant challenges: 1) Difficulty in building multi-view consistency as images for matching are too few; 2) Partially omitted or highly compressed object information as view coverage is insufficient. To tackle these challenges, we propose GaussianObject, a framework to represent and render the 3D object with Gaussian splatting that achieves high rendering quality with only 4 input images. We first introduce techniques of visual hull and floater elimination, which explicitly inject structure priors into the initial optimization process to help build multi-view consistency, yielding a coarse 3D Gaussian representation. Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined. We design a self-generating strategy to obtain image pairs for training the repair model. We further design a COLMAP-free variant, where pre-given accurate camera poses are not required, which achieves competitive quality and facilitates wider applications. 
GaussianObject is evaluated on several challenging datasets, including MipNeRF360, OmniObject3D, OpenIllumination, and our-collected unposed images, achieving superior performance from only four views and significantly outperforming previous SOTA methods.<\/jats:p>","DOI":"10.1145\/3687759","type":"journal-article","created":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T15:46:04Z","timestamp":1732031164000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":33,"title":["GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4496-7849","authenticated-orcid":false,"given":"Chen","family":"Yang","sequence":"first","affiliation":[{"name":"MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4080-7454","authenticated-orcid":false,"given":"Sikuang","family":"Li","sequence":"additional","affiliation":[{"name":"MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0322-4582","authenticated-orcid":false,"given":"Jiemin","family":"Fang","sequence":"additional","affiliation":[{"name":"Huawei, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-7667-1809","authenticated-orcid":false,"given":"Ruofan","family":"Liang","sequence":"additional","affiliation":[{"name":"University of Toronto, Toronto, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4831-9451","authenticated-orcid":false,"given":"Lingxi","family":"Xie","sequence":"additional","affiliation":[{"name":"Huawei, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6337-5748","authenticated-orcid":false,"given":"Xiaopeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Huawei, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1235-598X","authenticated-orcid":false,"given":"Wei","family":"Shen","sequence":"additional","affiliation":[{"name":"MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7252-5047","authenticated-orcid":false,"given":"Qi","family":"Tian","sequence":"additional","affiliation":[{"name":"Huawei, Shenzhen, China"}]}],"member":"320","published-online":{"date-parts":[[2024,11,19]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Barron Jonathan T.","year":"2021","unstructured":"Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. 2021. Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields. 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021), 5460--5469."},{"key":"e_1_2_1_2_1","volume-title":"Zoedepth: Zero-shot transfer by combining relative and metric depth. arXiv preprint arXiv:2302.12288","author":"Bhat Shariq Farooq","year":"2023","unstructured":"Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, and Matthias M\u00fcller. 2023. Zoedepth: Zero-shot transfer by combining relative and metric depth. arXiv preprint arXiv:2302.12288 (2023)."},{"key":"e_1_2_1_3_1","volume-title":"Viewpoint Textual Inversion: Unleashing Novel View Synthesis with Pretrained 2D Diffusion Models. 
ECCV","author":"Burgess James","year":"2024","unstructured":"James Burgess, Kuan-Chieh Wang, and Serena Yeung. 2024. Viewpoint Textual Inversion: Unleashing Novel View Synthesis with Pretrained 2D Diffusion Models. ECCV (2024)."},{"key":"e_1_2_1_4_1","volume-title":"Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, and Gordon Wetzstein.","author":"Chan Eric R","year":"2023","unstructured":"Eric R Chan, Koki Nagano, Matthew A Chan, Alexander W Bergman, Jeong Joon Park, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, and Gordon Wetzstein. 2023. GeNVS: Generative novel view synthesis with 3D-aware diffusion models."},{"key":"e_1_2_1_5_1","unstructured":"Jonathan Chang. 2023. minLoRA. https:\/\/github.com\/cccntu\/minLoRA."},{"key":"e_1_2_1_6_1","volume-title":"Andrea Tagliasacchi, and Vincent Sitzmann.","author":"Charatan David","year":"2024","unstructured":"David Charatan, Sizhe Lester Li, Andrea Tagliasacchi, and Vincent Sitzmann. 2024. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction. In CVPR. 19457--19467."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.02033"},{"key":"e_1_2_1_8_1","volume-title":"Depth-regularized optimization for 3d gaussian splatting in few-shot images. arXiv preprint arXiv:2311.13398","author":"Chung Jaeyoung","year":"2023","unstructured":"Jaeyoung Chung, Jeongtaek Oh, and Kyoung Mu Lee. 2023. Depth-regularized optimization for 3d gaussian splatting in few-shot images. arXiv preprint arXiv:2311.13398 (2023)."},{"key":"e_1_2_1_9_1","volume-title":"Eli VanderBilt, Aniruddha Kembhavi","author":"Deitke Matt","year":"2023","unstructured":"Matt Deitke, Ruoshi Liu, Matthew Wallingford, Huong Ngo, Oscar Michel, Aditya Kusupati, Alan Fan, Christian Laforte, Vikram Voleti, Samir Yitzhak Gadre, Eli VanderBilt, Aniruddha Kembhavi, Carl Vondrick, Georgia Gkioxari, Kiana Ehsani, Ludwig Schmidt, and Ali Farhadi. 2023. Objaverse-XL: A Universe of 10M+ 3D Objects. arXiv preprint arXiv:2307.05663 (2023)."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01254"},{"key":"e_1_2_1_11_1","volume-title":"International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=YicbFdNTTy","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=YicbFdNTTy"},{"key":"e_1_2_1_12_1","volume-title":"Colmap-free 3d gaussian splatting. CVPR","author":"Fu Yang","year":"2024","unstructured":"Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A Efros, and Xiaolong Wang. 2024. Colmap-free 3d gaussian splatting. CVPR (2024)."},{"key":"e_1_2_1_13_1","volume-title":"CAT3D: Create Anything in 3D with Multi-View Diffusion Models. arXiv preprint arXiv:2405.10314","author":"Gao Ruiqi","year":"2024","unstructured":"Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan T Barron, and Ben Poole. 2024. CAT3D: Create Anything in 3D with Multi-View Diffusion Models. arXiv preprint arXiv:2405.10314 (2024)."},{"key":"e_1_2_1_14_1","volume-title":"SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis. 
IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Chen Zhaoxi","year":"2023","unstructured":"Guangcong, Zhaoxi Chen, Chen Change Loy, and Ziwei Liu. 2023. SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis. IEEE\/CVF International Conference on Computer Vision (ICCV) (2023)."},{"key":"e_1_2_1_15_1","unstructured":"Yuan-Chen Guo Ying-Tian Liu Ruizhi Shao Christian Laforte Vikram Voleti Guan Luo Chia-Hao Chen Zi-Xin Zou Chen Wang Yan-Pei Cao and Song-Hai Zhang. 2023. threestudio: A unified framework for 3D content generation. https:\/\/github.com\/threestudio-project\/threestudio."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.01808"},{"key":"e_1_2_1_17_1","volume-title":"Lrm: Large reconstruction model for single image to 3d. ICLR","author":"Hong Yicong","year":"2024","unstructured":"Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, and Hao Tan. 2024. Lrm: Large reconstruction model for single image to 3d. ICLR (2024)."},{"key":"e_1_2_1_18_1","volume-title":"Lora: Low-rank adaptation of large language models. ICLR","author":"Hu Edward J","year":"2022","unstructured":"Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. Lora: Low-rank adaptation of large language models. ICLR (2022)."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3641519.3657428"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CIBCB48159.2020.9277638"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00583"},{"key":"e_1_2_1_22_1","volume-title":"NViST: In the Wild New View Synthesis from a Single Image with Transformers. CVPR","author":"Jang Wonbong","year":"2024","unstructured":"Wonbong Jang and Lourdes Agapito. 2024. NViST: In the Wild New View Synthesis from a Single Image with Transformers. CVPR (2024)."},{"key":"e_1_2_1_23_1","volume-title":"LEAP: Liberate Sparse-view 3D Modeling from Camera Poses. ICLR","author":"Jiang Hanwen","year":"2024","unstructured":"Hanwen Jiang, Zhenyu Jiang, Yue Zhao, and Qixing Huang. 2024. LEAP: Liberate Sparse-view 3D Modeling from Camera Poses. ICLR (2024)."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3592433"},{"key":"e_1_2_1_25_1","volume-title":"Infonerf: Ray entropy minimization for few-shot neural","author":"Kim Mijeong","year":"2022","unstructured":"Mijeong Kim, Seonguk Seo, and Bohyung Han. 2022. Infonerf: Ray entropy minimization for few-shot neural volume rendering. In CVPR. 12912--12921."},{"key":"e_1_2_1_26_1","volume-title":"Kingma and Max Welling","author":"Diederik","year":"2014","unstructured":"Diederik P. Kingma and Max Welling. 2014. Auto-Encoding Variational Bayes. In 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14--16, 2014, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1312.6114"},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Alexander Kirillov Eric Mintun Nikhila Ravi Hanzi Mao Chloe Rolland Laura Gustafson Tete Xiao Spencer Whitehead Alexander C Berg Wan-Yen Lo et al. 2023. Segment anything. ICCV (2023).","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.273735"},{"key":"e_1_2_1_29_1","volume-title":"Instant3d: Fast text-to-3d with sparse-view generation and large reconstruction model. 
ICLR","author":"Li Jiahao","year":"2024","unstructured":"Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, and Sai Bi. 2024. Instant3d: Fast text-to-3d with sparse-view generation and large reconstruction model. ICLR (2024)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00037"},{"key":"e_1_2_1_31_1","volume-title":"Yi Xu, Ravi Ramamoorthi, Zexiang Xu, and Hao Su.","author":"Liu Isabella","year":"2023","unstructured":"Isabella Liu, Linghao Chen, Ziyang Fu, Liwen Wu, Haian Jin, Zhong Li, Chin Ming Ryan Wong, Yi Xu, Ravi Ramamoorthi, Zexiang Xu, and Hao Su. 2023a. OpenIllumination: A Multi-Illumination Dataset for Inverse Rendering Evaluation on Real Objects. NeuRIPS 2023."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00853"},{"key":"e_1_2_1_33_1","volume-title":"Deceptive-NeRF: Enhancing NeRF Reconstruction using Pseudo-Observations from Diffusion Models. arXiv preprint arXiv:2305.15171","author":"Liu Xinhang","year":"2023","unstructured":"Xinhang Liu, Shiu-hong Kao, Jiaben Chen, Yu-Wing Tai, and Chi-Keung Tang. 2023b. Deceptive-NeRF: Enhancing NeRF Reconstruction using Pseudo-Observations from Diffusion Models. arXiv preprint arXiv:2305.15171 (2023)."},{"key":"e_1_2_1_34_1","volume-title":"Sdedit: Guided image synthesis and editing with stochastic differential equations. ICLR","author":"Meng Chenlin","year":"2022","unstructured":"Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, and Stefano Ermon. 2022. Sdedit: Guided image synthesis and editing with stochastic differential equations. ICLR (2022)."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01218"},{"key":"e_1_2_1_36_1","volume-title":"Matthias Nie\u00dfner, and Peter Kontschieder.","author":"M\u00fcller Norman","year":"2024","unstructured":"Norman M\u00fcller, Katja Schwarz, Barbara R\u00f6ssle, Lorenzo Porzi, Samuel Rota Bul\u00f2, Matthias Nie\u00dfner, and Peter Kontschieder. 2024. MultiDiff: Consistent Novel View Synthesis from a Single Image. In CVPR. 10258--10268."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00540"},{"key":"e_1_2_1_38_1","volume-title":"CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians. ECCV","author":"Paliwal Avinash","year":"2024","unstructured":"Avinash Paliwal, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, and Nima Khademi Kalantari. 2024. CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians. ECCV (2024)."},{"key":"e_1_2_1_39_1","volume-title":"Fast Dynamic 3D Object Generation from a Single-view Video. arXiv preprint arXiv 2401.08742","author":"Pan Zijie","year":"2024","unstructured":"Zijie Pan, Zeyu Yang, Xiatian Zhu, and Li Zhang. 2024. Fast Dynamic 3D Object Generation from a Single-view Video. arXiv preprint arXiv 2401.08742 (2024)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3478513.3480487"},{"key":"e_1_2_1_41_1","volume-title":"Dreamfusion: Text-to-3d using 2d diffusion. ICLR","author":"Poole Ben","year":"2023","unstructured":"Ben Poole, Ajay Jain, Jonathan T Barron, and Ben Mildenhall. 2023. Dreamfusion: Text-to-3d using 2d diffusion. ICLR (2023)."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 38th International Conference on Machine Learning. 
PMLR, 8748--8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning. PMLR, 8748--8763. https:\/\/proceedings.mlr.press\/v139\/radford21a.html"},{"key":"e_1_2_1_43_1","doi-asserted-by":"crossref","unstructured":"Amit Raj Srinivas Kaza Ben Poole Michael Niemeyer Nataniel Ruiz Ben Mildenhall Shiran Zada Kfir Aberman Michael Rubinstein Jonathan Barron et al. 2023. Dreambooth3d: Subject-driven text-to-3d generation. ICCV (2023).","DOI":"10.1109\/ICCV51070.2023.00223"},{"key":"e_1_2_1_44_1","volume-title":"Vision Transformers for Dense Prediction. ICCV","author":"Ranftl Ren\u00e9","year":"2021","unstructured":"Ren\u00e9 Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. 2021. Vision Transformers for Dense Prediction. ICCV (2021)."},{"key":"e_1_2_1_45_1","volume-title":"Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer","author":"Ranftl Ren\u00e9","year":"2020","unstructured":"Ren\u00e9 Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. 2020. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE transactions on pattern analysis and machine intelligence 44, 3 (2020), 1623--1637."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2020.3019967"},{"key":"e_1_2_1_47_1","doi-asserted-by":"crossref","unstructured":"Barbara Roessle Jonathan T Barron Ben Mildenhall Pratul P Srinivasan and Matthias Nie\u00dfner. 2022. Dense depth priors for neural radiance fields from sparse input views. In CVPR. 12892--12901.","DOI":"10.1109\/CVPR52688.2022.01255"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_2_1_49_1","volume-title":"Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. In CVPR. 22500--22510.","author":"Ruiz Nataniel","year":"2023","unstructured":"Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman. 2023. Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. In CVPR. 22500--22510."},{"key":"e_1_2_1_50_1","volume-title":"Structure-from-Motion Revisited. In Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Sch\u00f6nberger Johannes Lutz","year":"2016","unstructured":"Johannes Lutz Sch\u00f6nberger and Jan-Michael Frahm. 2016. Structure-from-Motion Revisited. In Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision. 22883--22893","author":"Seo Seunghyeon","year":"2023","unstructured":"Seunghyeon Seo, Yeonjin Chang, and Nojun Kwak. 2023. Flipnerf: Flipped reflection rays for few-shot novel view synthesis. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 22883--22893."},{"key":"e_1_2_1_52_1","volume-title":"Control4D: Efficient 4D Portrait Editing with Text. CVPR","author":"Shao Ruizhi","year":"2024","unstructured":"Ruizhi Shao, Jingxiang Sun, Cheng Peng, Zerong Zheng, Boyao Zhou, Hongwen Zhang, and Yebin Liu. 2024. Control4D: Efficient 4D Portrait Editing with Text. 
CVPR (2024)."},{"key":"e_1_2_1_53_1","volume-title":"ZeroRF: Fast Sparse View 360\u00b0 Reconstruction with Zero Pretraining. CVPR","author":"Shi Ruoxi","year":"2024","unstructured":"Ruoxi Shi, Xinyue Wei, Cheng Wang, and Hao Su. 2024b. ZeroRF: Fast Sparse View 360\u00b0 Reconstruction with Zero Pretraining. CVPR (2024)."},{"key":"e_1_2_1_54_1","volume-title":"MV-Dream: Multi-view Diffusion for 3D Generation. ICLR","author":"Shi Yichun","year":"2024","unstructured":"Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, and Xiao Yang. 2024a. MV-Dream: Multi-view Diffusion for 3D Generation. ICLR (2024)."},{"key":"e_1_2_1_55_1","volume-title":"Sai Harsha Mupparaju, and Rajiv Soundararajan","author":"Somraj Nagabhushan","year":"2024","unstructured":"Nagabhushan Somraj, Adithyan Karanayil, Sai Harsha Mupparaju, and Rajiv Soundararajan. 2024. Simple-RF: Regularizing Sparse Input Radiance Fields with Simpler Solutions. arXiv preprint arXiv:2404.19015 (2024)."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3610548.3618188"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588432.3591539"},{"key":"e_1_2_1_58_1","volume-title":"Denoising diffusion implicit models. ICLR","author":"Song Jiaming","year":"2021","unstructured":"Jiaming Song, Chenlin Meng, and Stefano Ermon. 2021. Denoising diffusion implicit models. ICLR (2021)."},{"key":"e_1_2_1_59_1","volume-title":"DaRF: Boosting Radiance Fields from Sparse Inputs with Monocular Depth Adaptation. 2023 NIPS","author":"Song Jiuhn","year":"2023","unstructured":"Jiuhn Song, Seonghoon Park, Honggyu An, Seokju Cho, Min-Seop Kwak, Sungjin Cho, and Seungryong Kim. 2023b. DaRF: Boosting Radiance Fields from Sparse Inputs with Monocular Depth Adaptation. 2023 NIPS (2023)."},{"key":"e_1_2_1_60_1","volume-title":"Harnessing Low-Frequency Neural Fields for Few-Shot View Synthesis. arXiv preprint arXiv:2303.08370","author":"Song Liangchen","year":"2023","unstructured":"Liangchen Song, Zhong Li, Xuan Gong, Lele Chen, Zhang Chen, Yi Xu, and Junsong Yuan. 2023a. Harnessing Low-Frequency Neural Fields for Few-Shot View Synthesis. arXiv preprint arXiv:2303.08370 (2023)."},{"key":"e_1_2_1_61_1","doi-asserted-by":"crossref","unstructured":"Cheng Sun Min Sun and Hwann-Tzong Chen. 2022. Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction. In CVPR.","DOI":"10.1109\/CVPR52688.2022.00538"},{"key":"e_1_2_1_62_1","volume-title":"LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation. ECCV","author":"Tang Jiaxiang","year":"2024","unstructured":"Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, and Ziwei Liu. 2024a. LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation. ECCV (2024)."},{"key":"e_1_2_1_63_1","volume-title":"Dream-Gaussian: Generative Gaussian Splatting for Efficient 3D Content Creation. ICLR","author":"Tang Jiaxiang","year":"2024","unstructured":"Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, and Gang Zeng. 2024b. Dream-Gaussian: Generative Gaussian Splatting for Efficient 3D Content Creation. ICLR (2024)."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01214"},{"key":"e_1_2_1_65_1","volume-title":"PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction. ICLR","author":"Wang Peng","year":"2024","unstructured":"Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, and Kai Zhang. 2024b. 
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction. ICLR (2024)."},{"key":"e_1_2_1_66_1","volume-title":"DUSt3R: Geometric 3D Vision Made Easy. CVPR","author":"Wang Shuzhe","year":"2024","unstructured":"Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. 2024a. DUSt3R: Geometric 3D Vision Made Easy. CVPR (2024)."},{"key":"e_1_2_1_67_1","unstructured":"Zhengyi Wang Cheng Lu Yikai Wang Fan Bao Chongxuan Li Hang Su and Jun Zhu. 2023b. ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. In Advances in Neural Information Processing Systems (NeurIPS)."},{"key":"e_1_2_1_68_1","volume-title":"NeRF-: Neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064","author":"Wang Zirui","year":"2021","unstructured":"Zirui Wang, Shangzhe Wu, Weidi Xie, Min Chen, and Victor Adrian Prisacariu. 2021. NeRF-: Neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064 (2021)."},{"key":"e_1_2_1_69_1","volume-title":"MeshLRM: Large Reconstruction Model for High-Quality Mesh. arXiv preprint arXiv:2404.12385","author":"Wei Xinyue","year":"2024","unstructured":"Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, and Zexiang Xu. 2024. MeshLRM: Large Reconstruction Model for High-Quality Mesh. arXiv preprint arXiv:2404.12385 (2024)."},{"key":"e_1_2_1_70_1","volume-title":"Template-Free Single-View 3D Human Digitalization with Diffusion-Guided LRM. Preprint","author":"Weng Zhenzhen","year":"2023","unstructured":"Zhenzhen Weng, Jingyuan Liu, Hao Tan, Zhan Xu, Yang Zhou, Serena Yeung-Levy, and Jimei Yang. 2023. Template-Free Single-View 3D Human Digitalization with Diffusion-Guided LRM. Preprint (2023)."},{"key":"e_1_2_1_71_1","unstructured":"Rundi Wu Ben Mildenhall Philipp Henzler Keunhong Park Ruiqi Gao Daniel Watson Pratul P Srinivasan Dor Verbin Jonathan T Barron Ben Poole et al. 2024. ReconFusion: 3D Reconstruction with Diffusion Priors. CVPR (2024)."},{"key":"e_1_2_1_72_1","volume-title":"Reconstruction and Generation. 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Wu Tong","year":"2023","unstructured":"Tong Wu, Jiarui Zhang, Xiao Fu, Yuxin Wang, Jiawei Ren, Liang Pan, Wayne Wu, Lei Yang, Jiaqi Wang, Chen Qian, Dahua Lin, and Ziwei Liu. 2023. OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation. 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), 803--814. https:\/\/api.semanticscholar.org\/CorpusID:255998491"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00407"},{"key":"e_1_2_1_74_1","volume-title":"SparseGS: Real-Time 360\u00b0 Sparse View Synthesis using Gaussian Splatting. Arxiv","author":"Xiong Haolin","year":"2023","unstructured":"Haolin Xiong, Sairisheek Muttukuru, Rishi Upadhyay, Pradyumna Chari, and Achuta Kadambi. 2023. SparseGS: Real-Time 360\u00b0 Sparse View Synthesis using Gaussian Splatting. Arxiv (2023)."},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-20047-2_42"},{"key":"e_1_2_1_76_1","volume-title":"AGG: Amortized Generative 3D Gaussians for Single Image to 3D. arXiv preprint 2401.04099","author":"Xu Dejia","year":"2024","unstructured":"Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, and Arash Vahdat. 2024c. 
AGG: Amortized Generative 3D Gaussians for Single Image to 3D. arXiv preprint 2401.04099 (2024)."},{"key":"e_1_2_1_77_1","volume-title":"Grm: Large gaussian reconstruction model for efficient 3d reconstruction and generation. ECCV","author":"Xu Yinghao","year":"2024","unstructured":"Yinghao Xu, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, and Gordon Wetzstein. 2024a. Grm: Large gaussian reconstruction model for efficient 3d reconstruction and generation. ECCV (2024)."},{"key":"e_1_2_1_78_1","volume-title":"DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model. ICLR","author":"Xu Yinghao","year":"2024","unstructured":"Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, and Kai Zhang. 2024b. DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model. ICLR (2024)."},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00798"},{"key":"e_1_2_1_80_1","volume-title":"GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models. CVPR","author":"Yi Taoran","year":"2024","unstructured":"Taoran Yi, Jiemin Fang, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Qi Tian, and Xinggang Wang. 2024. GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models. CVPR (2024)."},{"key":"e_1_2_1_81_1","volume-title":"Gaussian Opacity Fields: Efficient High-quality Compact Surface Reconstruction in Unbounded Scenes. arXiv:2404.10772","author":"Yu Zehao","year":"2024","unstructured":"Zehao Yu, Torsten Sattler, and Andreas Geiger. 2024. Gaussian Opacity Fields: Efficient High-quality Compact Surface Reconstruction in Unbounded Scenes. arXiv:2404.10772 (2024)."},{"key":"e_1_2_1_82_1","volume-title":"GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting. arXiv","author":"Zhang Kai","year":"2024","unstructured":"Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, and Zexiang Xu. 2024. GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting. arXiv (2024)."},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00355"},{"key":"e_1_2_1_84_1","unstructured":"Lvmin Zhang Anyi Rao and Maneesh Agrawala. 2023b. ControlNet-v1-1-nightly. https:\/\/github.com\/lllyasviel\/ControlNet-v1-1-nightly."},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01211"},{"key":"e_1_2_1_87_1","volume-title":"HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance. ICLR","author":"Zhu Junzhe","year":"2024","unstructured":"Junzhe Zhu and Peiye Zhuang. 2024. HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance. ICLR (2024)."},{"key":"e_1_2_1_88_1","volume-title":"FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting. ECCV","author":"Zhu Zehao","year":"2024","unstructured":"Zehao Zhu, Zhiwen Fan, Yifan Jiang, and Zhangyang Wang. 2024. FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting. ECCV (2024)."},{"key":"e_1_2_1_89_1","volume-title":"Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers. CVPR","author":"Zou Zi-Xin","year":"2024","unstructured":"Zi-Xin Zou, Zhipeng Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Yan-Pei Cao, and Song-Hai Zhang. 2024. 
Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers. CVPR (2024)."}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687759","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3687759","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:17:45Z","timestamp":1750295865000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687759"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,19]]},"references-count":89,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12,19]]}},"alternative-id":["10.1145\/3687759"],"URL":"https:\/\/doi.org\/10.1145\/3687759","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,19]]},"assertion":[{"value":"2024-11-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
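The record above is a standard Crossref works-API response: an envelope with "status", "message-type", and a "message" object carrying the bibliographic fields. A minimal sketch of fetching and parsing such a record via the public REST API at api.crossref.org follows; the function name fetch_crossref_work, the script name, and the mailto address in the User-Agent are illustrative placeholders, not part of the record or of any official client.

```python
import json
import urllib.request

DOI = "10.1145/3687759"  # the DOI of the record shown above

def fetch_crossref_work(doi: str) -> dict:
    """Fetch one work record from the public Crossref REST API."""
    url = f"https://api.crossref.org/works/{doi}"
    req = urllib.request.Request(
        url,
        # Crossref asks clients to identify themselves; a mailto in the
        # User-Agent routes requests to its "polite" pool. Placeholder address.
        headers={"User-Agent": "metadata-example/0.1 (mailto:you@example.org)"},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # Envelope shape matches the record above: status / message-type / message.
    assert payload["status"] == "ok" and payload["message-type"] == "work"
    return payload["message"]

if __name__ == "__main__":
    work = fetch_crossref_work(DOI)
    # "title" is a list; "author" entries carry "given" and "family" fields
    # (both present for every author in this particular record).
    print(work["title"][0])
    print(", ".join(f'{a["given"]} {a["family"]}' for a in work["author"]))
    print(f'{work["reference-count"]} references, cited by {work["is-referenced-by-count"]}')
```

Note that "abstract", when present, contains JATS markup (e.g. <jats:p> as in the record above), so it should be stripped or parsed as XML rather than printed raw, and optional fields such as "reference" or "abstract" are best accessed with dict.get since not every work record includes them.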