{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T16:42:08Z","timestamp":1777567328882,"version":"3.51.4"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T00:00:00Z","timestamp":1721347200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/"}],"funder":[{"name":"The Research Grants Council of the Hong Kong Special Administrative Region","award":["14201921"],"award-info":[{"award-number":["14201921"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2024,7,19]]},"abstract":"<jats:p>\n            This paper presents a computational pipeline for creating personalized, physical LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            figurines from user-input portrait photos. The generated figurine is an assembly of coherently-connected LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            bricks detailed with uv-printed decals, capturing prominent features such as hairstyle, clothing style, and garment color, and also intricate details such as logos, text, and patterns. This task is non-trivial, due to the substantial domain gap between unconstrained user photos and the stylistically-consistent LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            figurine models. 
To ensure assemble-ability by LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            bricks while capturing prominent features and intricate details, we design a three-stage pipeline: (i) we formulate a CLIP-guided retrieval approach to connect the domains of user photos and LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            figurines, then output physically-assemble-able LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            figurines with decals excluded; (ii) we then synthesize decals on the figurines via a symmetric U-Nets architecture conditioned on appearance features extracted from user photos; and (iii) we next reproject and uv-print the decals on associated LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            bricks for physical model production. We evaluate the effectiveness of our method against eight hundred expert-designed figurines, using a comprehensive set of metrics, which include a novel GPT-4V-based evaluation metric, demonstrating superior performance of our method in visual quality and resemblance to input photos. 
Also, we show our method's robustness by generating LEGO\n            <jats:sup>\u00ae<\/jats:sup>\n            figurines from diverse inputs and physically fabricating and assembling several of them.\n          <\/jats:p>","DOI":"10.1145\/3658167","type":"journal-article","created":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T14:47:57Z","timestamp":1721400477000},"page":"1-16","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Creating LEGO Figurines from Single Images"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-7318-2887","authenticated-orcid":false,"given":"Jiahao","family":"Ge","sequence":"first","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8261-6949","authenticated-orcid":false,"given":"Mingjun","family":"Zhou","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-7830-1684","authenticated-orcid":false,"given":"Wenrui","family":"Bao","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4011-5745","authenticated-orcid":false,"given":"Hao","family":"Xu","sequence":"additional","affiliation":[{"name":"Qianzhi Technology Inc., Shenzhen, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5238-593X","authenticated-orcid":false,"given":"Chi-Wing","family":"Fu","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]}],"member":"320","published-online":{"date-parts":[[2024,7,19]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1561\/1100000055"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3618366"},{"key":"e_1_2_2_3_1","unstructured":"BrickLink. 2024. Bricklink Color Guide. 
https:\/\/www.bricklink.com\/catalogColors.asp"},{"key":"e_1_2_2_4_1","volume-title":"Carigans: Unpaired photo-to-caricature translation. arXiv preprint arXiv:1811.00222","author":"Cao Kaidi","year":"2018","unstructured":"Kaidi Cao, Jing Liao, and Lu Yuan. 2018. Carigans: Unpaired photo-to-caricature translation. arXiv preprint arXiv:1811.00222 (2018)."},{"key":"e_1_2_2_5_1","volume-title":"Subject-driven text-to-image generation via apprenticeship learning. arXiv preprint arXiv:2304.00186","author":"Chen Wenhu","year":"2023","unstructured":"Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Rui, Xuhui Jia, Ming-Wei Chang, and William W Cohen. 2023a. Subject-driven text-to-image generation via apprenticeship learning. arXiv preprint arXiv:2304.00186 (2023)."},{"key":"e_1_2_2_6_1","volume-title":"Anydoor: Zero-shot object-level image customization. arXiv preprint arXiv:2307.09481","author":"Chen Xi","year":"2023","unstructured":"Xi Chen, Lianghua Huang, Yu Liu, Yujun Shen, Deli Zhao, and Hengshuang Zhao. 2023b. Anydoor: Zero-shot object-level image customization. arXiv preprint arXiv:2307.09481 (2023)."},{"key":"e_1_2_2_7_1","volume-title":"Blender - a 3D modelling and rendering package","author":"Community Blender Online","unstructured":"Blender Online Community. 2024. Blender - a 3D modelling and rendering package. Blender Foundation, Stichting Blender Foundation, Amsterdam. http:\/\/www.blender.org"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130831"},{"key":"e_1_2_2_9_1","volume-title":"An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv preprint arXiv:2208.01618","author":"Gal Rinon","year":"2022","unstructured":"Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H Bermano, Gal Chechik, and Daniel Cohen-Or. 2022a. An image is worth one word: Personalizing text-to-image generation using textual inversion. 
arXiv preprint arXiv:2208.01618 (2022)."},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530164"},{"key":"e_1_2_2_11_1","volume-title":"Pose Estimation, Segmentation and Re-Identification of Clothing Images. CVPR","author":"Ge Yuying","year":"2019","unstructured":"Yuying Ge, Ruimao Zhang, Lingyun Wu, Xiaogang Wang, Xiaoou Tang, and Ping Luo. 2019. A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images. CVPR (2019)."},{"key":"e_1_2_2_12_1","volume-title":"Petersen","author":"Gower Rebecca A. H.","year":"1998","unstructured":"Rebecca A. H. Gower, Agnes E. Heydtmann, and Henrik G. Petersen. 1998. LEGO: Automated Model Construction. In European Study Group with Industry. 81--94."},{"key":"e_1_2_2_13_1","volume-title":"Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation. arXiv preprint arXiv:2311.17117","author":"Hu Li","year":"2023","unstructured":"Li Hu, Xin Gao, Peng Zhang, Ke Sun, Bang Zhang, and Liefeng Bo. 2023. Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation. arXiv preprint arXiv:2311.17117 (2023)."},{"key":"e_1_2_2_14_1","volume-title":"Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, and Yu-Chuan Su.","author":"Jia Xuhui","year":"2023","unstructured":"Xuhui Jia, Yang Zhao, Kelvin CK Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, and Yu-Chuan Su. 2023. Taming encoder for zero fine-tuning image customization with text-to-image diffusion models. arXiv preprint arXiv:2304.02642 (2023)."},{"key":"e_1_2_2_15_1","unstructured":"Glenn Jocher Ayush Chaurasia and Jing Qiu. 2023. YOLO by Ultralytics. https:\/\/github.com\/ultralytics\/ultralytics"},{"key":"e_1_2_2_16_1","volume-title":"Training generative adversarial networks with limited data. 
Advances in neural information processing systems 33","author":"Karras Tero","year":"2020","unstructured":"Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila. 2020. Training generative adversarial networks with limited data. Advances in neural information processing systems 33 (2020), 12104--12114."},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00453"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3618351"},{"key":"e_1_2_2_19_1","volume-title":"The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Kwon Gihyun","year":"2023","unstructured":"Gihyun Kwon and Jong Chul Ye. 2023. Diffusion-based Image Translation using disentangled style and content representation. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1--5, 2023. OpenReview.net. https:\/\/openreview.net\/pdf?id=Nayau9fwXU"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2859039"},{"key":"e_1_2_2_21_1","volume-title":"Image2lego: Customized lego set generation from images. arXiv preprint arXiv:2108.08477","author":"Lennon Kyle","year":"2021","unstructured":"Kyle Lennon, Katharina Fransen, Alexander O'Brien, Yumeng Cao, Matthew Beveridge, Yamin Arefeen, Nikhil Singh, and Iddo Drori. 2021. Image2lego: Customized lego set generation from images. arXiv preprint arXiv:2108.08477 (2021)."},{"key":"e_1_2_2_22_1","volume-title":"Blip-diffusion: Pre-trained subject representation for controllable text-to-image generation and editing. arXiv preprint arXiv:2305.14720","author":"Li Dongxu","year":"2023","unstructured":"Dongxu Li, Junnan Li, and Steven CH Hoi. 2023. Blip-diffusion: Pre-trained subject representation for controllable text-to-image generation and editing. 
arXiv preprint arXiv:2305.14720 (2023)."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2408360"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.163"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818091"},{"key":"e_1_2_2_26_1","volume-title":"Making papercraft toys from meshes using strip-based approximate unfolding. ACM transactions on graphics (TOG) 23, 3","author":"Mitani Jun","year":"2004","unstructured":"Jun Mitani and Hiromasa Suzuki. 2004. Making papercraft toys from meshes using strip-based approximate unfolding. ACM transactions on graphics (TOG) 23, 3 (2004), 259--263."},{"key":"e_1_2_2_28_1","unstructured":"Maxime Oquab Timoth\u00e9e Darcet Th\u00e9o Moutakanni Huy Vo Marc Szafraniec Vasil Khalidov Pierre Fernandez Daniel Haziza Francisco Massa Alaaeldin El-Nouby et al. 2023. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193 (2023)."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3306346.3323040"},{"key":"e_1_2_2_30_1","volume-title":"Proc. NIK (Norsk Informatikkonferanse). 87--97","author":"Petrovi\u010d Pavel","year":"2001","unstructured":"Pavel Petrovi\u010d. 2001. Solving LEGO Brick Layout Problem using Evolutionary Algorithms. In Proc. NIK (Norsk Informatikkonferanse). 87--97."},{"key":"e_1_2_2_31_1","volume-title":"International conference on machine learning. PMLR, 8748--8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. 
PMLR, 8748--8763."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.02155"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528233.3530757"},{"key":"e_1_2_2_35_1","volume-title":"Instantbooth: Personalized text-to-image generation without test-time finetuning. arXiv preprint arXiv:2304.03411","author":"Shi Jing","year":"2023","unstructured":"Jing Shi, Wei Xiong, Zhe Lin, and Hyun Joon Jung. 2023. Instantbooth: Personalized text-to-image generation without test-time finetuning. arXiv preprint arXiv:2304.03411 (2023)."},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3450626.3459771"},{"key":"e_1_2_2_37_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3130800.3130808","article-title":"Computational design of wind-up toys","volume":"36","author":"Song Peng","year":"2017","unstructured":"Peng Song, Xiaofei Wang, Xiao Tang, Chi-Wing Fu, Hongfei Xu, Ligang Liu, and Niloy J Mitra. 2017. Computational design of wind-up toys. ACM Transactions on Graphics (TOG) 36, 6 (2017), 1--13.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_2_38_1","volume-title":"Proceedings of IASS annual symposia","volume":"2019","author":"Soriano Enrique","year":"2019","unstructured":"Enrique Soriano, Ramon Sastre, and Dionis Boixader. 2019. G-shells: Flat collapsible geodesic mechanisms for gridshells. In Proceedings of IASS annual symposia, Vol. 2019. International Association for Shell and Spatial Structures (IASS), 1--8."},{"key":"e_1_2_2_39_1","volume-title":"Proc. Symposium on Combinatorial Search (SoCS). 89--97","author":"Stephenson Ben","year":"2016","unstructured":"Ben Stephenson. 2016. A Multi-Phase Search Approach to the LEGO Construction Problem. In Proc. Symposium on Combinatorial Search (SoCS). 
89--97."},{"key":"e_1_2_2_40_1","unstructured":"Romain Testuz Yuliy Schwartzburg and Mark Pauly. 2013. Automatic Generation of Constructible Brick Sculptures. In Eurographics (short paper). 81--84."},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01048"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00191"},{"key":"e_1_2_2_43_1","volume-title":"Pretraining is all you need for image-to-image translation. arXiv preprint arXiv:2205.12952","author":"Wang Tengfei","year":"2022","unstructured":"Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, and Fang Wen. 2022. Pretraining is all you need for image-to-image translation. arXiv preprint arXiv:2205.12952 (2022)."},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2003.819861"},{"key":"e_1_2_2_45_1","volume-title":"Proc. BrickFest. 145--166","author":"Winkler David V.","year":"2005","unstructured":"David V. Winkler. 2005. Automated Brick Layout. In Proc. BrickFest. 145--166."},{"key":"e_1_2_2_46_1","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie Enze","year":"2021","unstructured":"Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. 2021. SegFormer: Simple and efficient design for semantic segmentation with transformers. Advances in Neural Information Processing Systems 34 (2021), 12077--12090.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356504"},{"key":"e_1_2_2_48_1","volume-title":"Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, and Mike Zheng Shou.","author":"Xu Zhongcong","year":"2023","unstructured":"Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, and Mike Zheng Shou. 2023. 
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model. arXiv preprint arXiv:2311.16498 (2023)."},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3450626.3459796"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00355"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3453477"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3618342"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00978"},{"key":"e_1_2_2_54_1","volume-title":"Computer Graphics Forum","author":"Zhou Jie","unstructured":"Jie Zhou, Xuejin Chen, and Y Xu. 2019. Automatic generation of vivid LEGO architectural sculptures. In Computer Graphics Forum, Vol. 38. Wiley Online Library, 31--42."},{"key":"e_1_2_2_55_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3618306","article-title":"Computational Design of LEGO\u00ae Sketch Art","volume":"42","author":"Zhou Mingjun","year":"2023","unstructured":"Mingjun Zhou, Jiahao Ge, Hao Xu, and Chi-Wing Fu. 2023. Computational Design of LEGO\u00ae Sketch Art. 
ACM Transactions on Graphics (TOG) 42, 6 (2023), 1--15.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658167","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3658167","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:05:54Z","timestamp":1750291554000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3658167"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,19]]},"references-count":55,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,7,19]]}},"alternative-id":["10.1145\/3658167"],"URL":"https:\/\/doi.org\/10.1145\/3658167","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,19]]},"assertion":[{"value":"2024-07-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}