{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T21:21:58Z","timestamp":1764969718980,"version":"3.46.0"},"reference-count":65,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"name":"Research Grants Council of the Hong Kong Special Administrative Region","award":["14201921"],"award-info":[{"award-number":["14201921"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:p>\n                    This paper presents LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    -Maker, a new learning-based generative model that can effectively consider over 100 unique brick types and rapidly generate hundreds of bricks to create LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    models conditioned on images. This work makes three major technical contributions that enable capabilities beyond those of existing generative approaches. First, we design a compact LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    tokenization scheme to serialize LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    models and bricks into tokens for autoregressive learning. Second, we build LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    -Maker, an autoregressive image-conditioned architecture, with a multi-token prediction strategy to encourage pre-considering multiple brick attributes and a rollback mechanism for collision-free generation. 
Third, we propose an effective data preparation pipeline with a procedural generator to synthesize LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    models and a LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    -to-real image translator distilled from a large vision-language model to translate LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    renderings into associated photorealistic images, leveraging rich priors to address the scarcity of image-to-LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    data. Extensive evaluations and comparisons are conducted on two object categories, facade and portrait, over metrics in four aspects: geometry, color, semantics, and structural integrity, together with a user study. Experimental results demonstrate the versatility and compelling strengths of LEGO\n                    <jats:sup>\u00ae<\/jats:sup>\n                    -Maker in producing structures and details given by the reference image. 
Also, the evaluation scores show that our method clearly surpasses the baselines, consistently across all evaluation metrics.\n                  <\/jats:p>","DOI":"10.1145\/3763285","type":"journal-article","created":{"date-parts":[[2025,12,4]],"date-time":"2025-12-04T17:15:39Z","timestamp":1764868539000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["LEGO\u00ae-Maker: Autoregressive Image-Conditioned LEGO\u00ae Model Creation"],"prefix":"10.1145","volume":"44","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-7318-2887","authenticated-orcid":false,"given":"Jiahao","family":"Ge","sequence":"first","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8261-6949","authenticated-orcid":false,"given":"Mingjun","family":"Zhou","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-6753-8605","authenticated-orcid":false,"given":"Hanyou","family":"Zheng","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4011-5745","authenticated-orcid":false,"given":"Hao","family":"Xu","sequence":"additional","affiliation":[{"name":"Unicus Research, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5238-593X","authenticated-orcid":false,"given":"Chi-Wing","family":"Fu","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]}],"member":"320","published-online":{"date-parts":[[2025,12,4]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"BrickLink. 2024. BrickLink - Studio. 
https:\/\/www.bricklink.com\/v2\/build\/studio.page"},{"key":"e_1_2_2_2_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Chen Yiwen","year":"2025","unstructured":"Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, and Chi Zhang. 2025a. MeshAnything: Artist-created mesh generation with autoregressive transformers. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_3_1","volume-title":"Conference on Computer Vision and Pattern Recognition (CVPR). 28371\u201328382","author":"Chen Yongwei","year":"2025","unstructured":"Yongwei Chen, Yushi Lan, Shangchen Zhou, Tengfei Wang, and Xingang Pan. 2025b. SAR3D: autoregressive 3D object generation and understanding via multi-scale 3D VQVAE. In Conference on Computer Vision and Pattern Recognition (CVPR). 28371\u201328382."},{"key":"e_1_2_2_4_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV).","author":"Chen Yiwen","year":"2025","unstructured":"Yiwen Chen, Yikai Wang, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang, and Guosheng Lin. 2025c. MeshAnything V2: Artist-created mesh generation with adjacent mesh tokenization. In Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV)."},{"key":"e_1_2_2_5_1","volume-title":"Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track.","author":"Deitke Matt","year":"2023","unstructured":"Matt Deitke, Ruoshi Liu, Matthew Wallingford, Huong Ngo, Oscar Michel, Aditya Kusupati, Alan Fan, Christian Laforte, Vikram Voleti, Samir Yitzhak Gadre, Eli VanderBilt, Aniruddha Kembhavi, Carl Vondrick, Georgia Gkioxari, Kiana Ehsani, Ludwig Schmidt, and Ali Farhadi. 2023a. Objaverse-XL: A universe of 10m+ 3D objects. 
In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track."},{"key":"e_1_2_2_6_1","volume-title":"Conference on Computer Vision and Pattern Recognition (CVPR). 13142\u201313153","author":"Deitke Matt","year":"2023","unstructured":"Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, and Ali Farhadi. 2023b. Objaverse: A universe of annotated 3D objects. In Conference on Computer Vision and Pattern Recognition (CVPR). 13142\u201313153."},{"key":"e_1_2_2_7_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). 12606\u201312633","author":"Esser Patrick","year":"2024","unstructured":"Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas M\u00fcller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, and Robin Rombach. 2024. Scaling rectified flow Transformers for high-resolution image synthesis. In Proceedings of the International Conference on Machine Learning (ICML). 12606\u201312633."},{"key":"e_1_2_2_8_1","volume-title":"Creating LEGO\u00ae figurines from single images. ACM Trans. Graph. (SIGGRAPH 2024) 43, 4","author":"Ge Jiahao","year":"2024","unstructured":"Jiahao Ge, Mingjun Zhou, Wenrui Bao, Hao Xu, and Chi-Wing Fu. 2024b. Creating LEGO\u00ae figurines from single images. ACM Trans. Graph. (SIGGRAPH 2024) 43, 4 (2024), 1\u201316."},{"key":"e_1_2_2_9_1","volume-title":"ACM Trans. Graph. (SIGGRAPH ASIA 2024)","author":"Ge Jiahao","year":"2024","unstructured":"Jiahao Ge, Mingjun Zhou, and Chi-Wing Fu. 2024a. Learn to create simple LEGO\u00ae micro buildings. ACM Trans. Graph. (SIGGRAPH ASIA 2024) 43, 6 (2024), 1\u201313."},{"key":"e_1_2_2_10_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). 
15706\u201315734","author":"Gloeckle Fabian","year":"2024","unstructured":"Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozi\u00e8re, David Lopez-Paz, and Gabriel Synnaeve. 2024. Better & faster large language models via multi-token prediction. In Proceedings of the International Conference on Machine Learning (ICML). 15706\u201315734."},{"key":"e_1_2_2_11_1","volume-title":"Petersen","author":"Gower Rebecca A. H.","year":"1998","unstructured":"Rebecca A. H. Gower, Agnes E. Heydtmann, and Henrik G. Petersen. 1998. LEGO\u00ae: Automated model construction. In European Study Group with Industry. 81\u201394."},{"key":"e_1_2_2_12_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Hong Yicong","year":"2024","unstructured":"Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, and Hao Tan. 2024. LRM: Large reconstruction model for single image to 3D. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_13_1","volume-title":"Conference on Computer Vision and Pattern Recognition (CVPR). 8153\u20138163","author":"Hu Li","year":"2024","unstructured":"Li Hu. 2024. Animate anyone: Consistent and controllable image-to-video synthesis for character animation. In Conference on Computer Vision and Pattern Recognition (CVPR). 8153\u20138163."},{"key":"e_1_2_2_14_1","volume-title":"SIGGRAPH 2025 Conference Papers","author":"Huang Junming","year":"2025","unstructured":"Junming Huang, Chi Wang, Letian Li, Changxin Huang, Qiang Dai, and Weiwei Xu. 2025. BuildingBlock: A hybrid approach for structured building generation. In SIGGRAPH 2025 Conference Papers (Vancouver, BC, Canada)."},{"key":"e_1_2_2_15_1","volume-title":"Proceedings of the 41st International Conference on Machine Learning (ICML). 
20660\u201320681","author":"Hui Ka-Hei","year":"2024","unstructured":"Ka-Hei Hui, Aditya Sanghi, Arianna Rampini, Kamal Rahimi Malekshan, Zhengzhe Liu, Hooman Shayani, and Chi-Wing Fu. 2024. Make-A-Shape: A ten-million-scale 3D shape model. In Proceedings of the 41st International Conference on Machine Learning (ICML). 20660\u201320681."},{"key":"e_1_2_2_16_1","unstructured":"James Jessiman. 1995. LDraw.Org - LDraw file format specification. https:\/\/ldraw.org\/article\/218.html."},{"key":"e_1_2_2_17_1","unstructured":"Black Forest Labs. 2024. FLUX. https:\/\/github.com\/black-forest-labs\/flux."},{"key":"e_1_2_2_18_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR).","author":"Lan Yushi","year":"2025","unstructured":"Yushi Lan, Shangchen Zhou, Zhaoyang Lyu, Fangzhou Hong, Shuai Yang, Bo Dai, Xingang Pan, and Chen Change Loy. 2025. GaussianAnything: Interactive point cloud flow matching for 3D object generation. In Proceedings of the International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_19_1","article-title":"Latent L-Systems: Transformer-based tree generator","volume":"43","author":"Lee Jae Joong","year":"2023","unstructured":"Jae Joong Lee, Bosheng Li, and Bedrich Benes. 2023. Latent L-Systems: Transformer-based tree generator. ACM Trans. Graph. 43, 1, Article 7 (2023).","journal-title":"ACM Trans. Graph."},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2859039"},{"key":"e_1_2_2_21_1","volume-title":"Image2Lego: Customized LEGO\u00ae set generation from images. arXiv:2108.08477","author":"Lennon Kyle","year":"2021","unstructured":"Kyle Lennon, Katharina Fransen, Alexander O'Brien, Yumeng Cao, Yamin Beveridge, Matthew Arefeen, Nikhil Singh, and Iddo Drori. 2021. Image2Lego: Customized LEGO\u00ae set generation from images. 
arXiv:2108.08477 (2021)."},{"key":"e_1_2_2_22_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Li Jiahao","year":"2024","unstructured":"Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, and Sai Bi. 2024b. Instant3D: Fast text-to-3D with sparse-view generation and large reconstruction model. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_23_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Li Weiyu","year":"2024","unstructured":"Weiyu Li, Rui Chen, Xuelin Chen, and Ping Tan. 2024a. SweetDreamer: Aligning geometric priors in 2D diffusion for consistent text-to-3D. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_24_1","unstructured":"Aixin Liu Bei Feng Bing Xue Bingxuan Wang Bochao Wu Chengda Lu Chenggang Zhao Chengqi Deng Chenyu Zhang Chong Ruan et al. 2024a. Deepseek-v3 technical report. arXiv:2412.19437 (2024)."},{"key":"e_1_2_2_25_1","volume-title":"International Conference on Computer Vision (ICCV). 9298\u20139309","author":"Liu Ruoshi","year":"2023","unstructured":"Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick. 2023. Zero-1-to-3: Zero-shot one image to 3D object. In International Conference on Computer Vision (ICCV). 9298\u20139309."},{"key":"e_1_2_2_26_1","volume-title":"European Conference on Computer Vision (ECCV). 38\u201355","author":"Liu Shilong","year":"2024","unstructured":"Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Qing Jiang, Chunyuan Li, Jianwei Yang, Hang Su, et al. 2024b. Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection. In European Conference on Computer Vision (ECCV). 38\u201355."},{"key":"e_1_2_2_27_1","volume-title":"ACM Trans. Graph. 
(SIGGRAPH ASIA 2015)","author":"Luo Sheng-Jie","year":"2015","unstructured":"Sheng-Jie Luo, Yonghao Yue, Chun-Kai Huang, Yu-Huan Chung, Sei Imai, Tomoyuki Nishita, and Bing-Yu Chen. 2015. Legolization: optimizing LEGO\u00ae designs. ACM Trans. Graph. (SIGGRAPH ASIA 2015) 34, 6 (2015), 1\u201318."},{"key":"e_1_2_2_28_1","volume-title":"European Conference on Computer Vision (ECCV). 405\u2013421","author":"Mildenhall Ben","year":"2020","unstructured":"Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision (ECCV). 405\u2013421."},{"key":"e_1_2_2_29_1","volume-title":"ShapeShift: Towards text-to-shape arrangement synthesis with content-aware geometric constraints. arXiv:2503.14720","author":"Misra Vihaan","year":"2025","unstructured":"Vihaan Misra, Peter Schaldenbrand, and Jean Oh. 2025. ShapeShift: Towards text-to-shape arrangement synthesis with content-aware geometric constraints. arXiv:2503.14720 (2025)."},{"key":"e_1_2_2_30_1","unstructured":"OpenAI. 2024. GPT-4o system card. https:\/\/arxiv.org\/abs\/2410.21276. Accessed: 2025-05-12."},{"key":"e_1_2_2_31_1","volume-title":"DINOv2: Learning robust visual features without supervision. Transactions on Machine Learning Research (TMLR)","author":"Oquab Maxime","year":"2024","unstructured":"Maxime Oquab, Timoth\u00e9e Darcet, Th\u00e9o Moutakanni, Huy V. Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mido Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Herve Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, and Piotr Bojanowski. 2024. DINOv2: Learning robust visual features without supervision. 
Transactions on Machine Learning Research (TMLR) (2024)."},{"key":"e_1_2_2_32_1","volume-title":"Proc. NIK (Norsk Informatikkonferanse). 87\u201397","author":"Petrovi\u010d Pavel","year":"2001","unstructured":"Pavel Petrovi\u010d. 2001. Solving LEGO\u00ae brick layout problem using evolutionary algorithms. In Proc. NIK (Norsk Informatikkonferanse). 87\u201397."},{"key":"e_1_2_2_33_1","volume-title":"Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence. 1089","author":"Peysakhov Maxim","year":"2000","unstructured":"Maxim Peysakhov, Vlada Galinskaya, and William C Regli. 2000. Representation and evolution of LEGO\u00ae-based assemblies. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence. 1089."},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3680528.3687657"},{"key":"e_1_2_2_35_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Poole Ben","year":"2023","unstructured":"Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. 2023. DreamFusion: Text-to-3D using 2D diffusion. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_36_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV).","author":"Pun Ava","year":"2025","unstructured":"Ava Pun, Kangle Deng, Ruixuan Liu, Deva Ramanan, Changliu Liu, and Jun-Yan Zhu. 2025. Generating physically stable and buildable brick structures from text. In Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV)."},{"key":"e_1_2_2_37_1","volume-title":"Conference on Computer Vision and Pattern Recognition (CVPR). 
9914\u20139925","author":"Qiu Lingteng","year":"2024","unstructured":"Lingteng Qiu, Guanying Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, and Xiaoguang Han. 2024. RichDreamer: A generalizable normal-depth diffusion model for detail richness in text-to-3D. In Conference on Computer Vision and Pattern Recognition (CVPR). 9914\u20139925."},{"key":"e_1_2_2_38_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). 8748\u20138763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning (ICML). 8748\u20138763."},{"key":"e_1_2_2_39_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML). 8821\u20138831","author":"Ramesh Aditya","year":"2021","unstructured":"Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. 2021. Zero-shot text-to-image generation. In Proceedings of the International Conference on Machine Learning (ICML). 8821\u20138831."},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_2_2_41_1","volume-title":"Edgebreaker: Connectivity compression for triangle meshes","author":"Rossignac Jarek","year":"2002","unstructured":"Jarek Rossignac. 2002. Edgebreaker: Connectivity compression for triangle meshes. IEEE transactions on visualization and computer graphics (TVCG) 5, 1 (2002), 47\u201361."},{"key":"e_1_2_2_42_1","volume-title":"International Conference on Neural Information Processing Systems (NeurIPS). 
36479\u201336494","author":"Saharia Chitwan","year":"2022","unstructured":"Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L. Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Raphael Gontijo-Lopes, Tim Salimans, Jonathan Ho, David J. Fleet, and Mohammad Norouzi. 2022. Photorealistic text-to-image diffusion models with deep language understanding. In International Conference on Neural Information Processing Systems (NeurIPS). 36479\u201336494."},{"key":"e_1_2_2_43_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Seo Junyoung","year":"2024","unstructured":"Junyoung Seo, Wooseok Jang, Min-Seop Kwak, In\u00e8s Hyeonsu Kim, Jaehoon Ko, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, and Seungryong Kim. 2024. Let 2D diffusion model know 3D-consistency for robust text-to-3D generation. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_44_1","unstructured":"Ruoxi Shi Hansheng Chen Zhuoyang Zhang Minghua Liu Chao Xu Xinyue Wei Linghao Chen Chong Zeng and Hao Su. 2023. Zero123++: A single image to consistent multi-view diffusion base model. arXiv:2310.15110 [cs.CV]"},{"key":"e_1_2_2_45_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Shi Yichun","year":"2024","unstructured":"Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, and Xiao Yang. 2024. MV-Dream: Multi-view diffusion for 3D generation. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_46_1","volume-title":"Conference on Computer Vision and Pattern Recognition (CVPR). 19615\u201319625","author":"Siddiqui Yawar","year":"2024","unstructured":"Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, and Matthias Nie\u00dfner. 2024. MeshGPT: Generating triangle meshes with decoder-only transformers. In Conference on Computer Vision and Pattern Recognition (CVPR). 
19615\u201319625."},{"volume-title":"Automated Brick Sculpture Construction. Thesis","author":"Smal Eugene","key":"e_1_2_2_47_1","unstructured":"Eugene Smal. 2008. Automated Brick Sculpture Construction. Thesis. Stellenbosch: Stellenbosch University."},{"key":"e_1_2_2_48_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Tang Jiaxiang","year":"2025","unstructured":"Jiaxiang Tang, Zhaoshuo Li, Zekun Hao, Xian Liu, Gang Zeng, Ming-Yu Liu, and Qinsheng Zhang. 2025. EdgeRunner: Auto-regressive auto-encoder for artistic mesh generation. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_49_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Tang Jiaxiang","year":"2024","unstructured":"Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, and Gang Zeng. 2024. DreamGaussian: Generative gaussian splatting for efficient 3D content creation. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_50_1","unstructured":"Romain Testuz Yuliy Schwartzburg and Mark Pauly. 2013. Automatic generation of constructible brick sculptures. In Eurographics (short paper). 81\u201384."},{"key":"e_1_2_2_51_1","volume-title":"Taylor","author":"Thompson Rylee","year":"2020","unstructured":"Rylee Thompson, Ghalebi Elahe, Terrance DeVries, and Graham W. Taylor. 2020. Building LEGO\u00ae using deep generative models of graphs. Machine Learning for Engineering Modeling, Simulation, and Design Workshop at Neural Information Processing Systems (2020)."},{"key":"e_1_2_2_52_1","volume-title":"SIGGRAPH 2025 Conference Papers.","author":"Wei Si-Tong","year":"2025","unstructured":"Si-Tong Wei, Rui-Huan Wang, Chuan-Zhi Zhou, Baoquan Chen, and Peng-Shuai Wang. 2025. OctGPT: Octree-based multiscale autoregressive models for 3D shape generation. In SIGGRAPH 2025 Conference Papers."},{"key":"e_1_2_2_53_1","volume-title":"Proc. BrickFest. 
145\u2013166","author":"Winkler David V.","year":"2005","unstructured":"David V. Winkler. 2005. Automated brick layout. In Proc. BrickFest. 145\u2013166."},{"key":"e_1_2_2_54_1","volume-title":"International Conference on Neural Information Processing Systems (NeurIPS). 121859\u2013121881","author":"Wu Shuang","year":"2024","unstructured":"Shuang Wu, Youtian Lin, Feihu Zhang, Yifei Zeng, Jingxi Xu, Philip Torr, Xun Cao, and Yao Yao. 2024. Direct3D: Scalable image-to-3D generation via 3D latent diffusion Transformer. In International Conference on Neural Information Processing Systems (NeurIPS). 121859\u2013121881."},{"key":"e_1_2_2_55_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Xiang Jianfeng","year":"2025","unstructured":"Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, and Jiaolong Yang. 2025. Structured 3D latents for scalable and versatile 3D generation. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_56_1","volume-title":"Article 196","author":"Xu Hao","year":"2019","unstructured":"Hao Xu, Ka-Hei Hui, Chi-Wing Fu, and Hao Zhang. 2019. Computational LEGO\u00ae technic design. ACM Trans. Graph. (SIGGRAPH Asia 2019) 38, 6, Article 196 (2019), 14 pages."},{"key":"e_1_2_2_57_1","volume-title":"Instantmesh: Efficient 3D mesh generation from a single image with sparse-view large reconstruction models. arXiv:2404.07191","author":"Xu Jiale","year":"2024","unstructured":"Jiale Xu, Weihao Cheng, Yiming Gao, Xintao Wang, Shenghua Gao, and Ying Shan. 2024. Instantmesh: Efficient 3D mesh generation from a single image with sparse-view large reconstruction models. 
arXiv:2404.07191 (2024)."},{"key":"e_1_2_2_58_1","volume-title":"SIGGRAPH 2025 Conference Papers.","author":"Ye Jingwen","year":"2025","unstructured":"Jingwen Ye, Yuze He, Yanning Zhou, Yiqin Zhu, Kaiwen Xiao, Yong-Jin Liu, Wei Yang, and Xiao Han. 2025. PrimitiveAnything: Human-crafted 3D primitive assembly generation with auto-regressive Transformer. In SIGGRAPH 2025 Conference Papers."},{"key":"e_1_2_2_59_1","volume-title":"International Conference on Neural Information Processing Systems (NeurIPS)","author":"Zhang Chubin","year":"2024","unstructured":"Chubin Zhang, Hongliang Song, Yi Wei, Chen Yu, Jiwen Lu, and Yansong Tang. 2024a. GeoLRM: Geometry-aware large reconstruction model for high-quality 3D gaussian generation. International Conference on Neural Information Processing Systems (NeurIPS) (2024), 55761\u201355784."},{"key":"e_1_2_2_60_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV). 3813\u20133824","author":"Zhang Lvmin","year":"2023","unstructured":"Lvmin Zhang, Anyi Rao, and Maneesh Agrawala. 2023. Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV). 3813\u20133824."},{"key":"e_1_2_2_61_1","volume-title":"CLAY: A controllable large-scale generative model for creating high-quality 3D assets. ACM Trans. Graph. (SIGGRAPH 2024) 43, 4","author":"Zhang Longwen","year":"2024","unstructured":"Longwen Zhang, Ziyu Wang, Qixuan Zhang, Qiwei Qiu, Anqi Pang, Haoran Jiang, Wei Yang, Lan Xu, and Jingyi Yu. 2024b. CLAY: A controllable large-scale generative model for creating high-quality 3D assets. ACM Trans. Graph. 
(SIGGRAPH 2024) 43, 4 (2024), 1\u201320."},{"key":"e_1_2_2_62_1","volume-title":"Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, and Luke Zettlemoyer.","author":"Zhang Susan","year":"2022","unstructured":"Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, and Luke Zettlemoyer. 2022. OPT: Open pre-trained transformer language models. arXiv:2205.01068 (2022)."},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13603"},{"key":"e_1_2_2_64_1","volume-title":"Computational design of LEGO\u00ae sketch art. ACM Trans. Graph. (SIGGRAPH Asia 2023) 42, 6","author":"Zhou Mingjun","year":"2023","unstructured":"Mingjun Zhou, Jiahao Ge, Hao Xu, and Chi-Wing Fu. 2023. Computational design of LEGO\u00ae sketch art. ACM Trans. Graph. (SIGGRAPH Asia 2023) 42, 6 (2023), 1\u201315."},{"key":"e_1_2_2_65_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Zhu Junzhe","year":"2024","unstructured":"Junzhe Zhu, Peiye Zhuang, and Sanmi Koyejo. 2024. HiFA: High-fidelity text-to-3D generation with advanced diffusion guidance. 
In International Conference on Learning Representations (ICLR)."}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3763285","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T21:19:50Z","timestamp":1764969590000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3763285"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12]]},"references-count":65,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["10.1145\/3763285"],"URL":"https:\/\/doi.org\/10.1145\/3763285","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"type":"print","value":"0730-0301"},{"type":"electronic","value":"1557-7368"}],"subject":[],"published":{"date-parts":[[2025,12]]},"assertion":[{"value":"2025-05-24","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-09","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}