{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:16:39Z","timestamp":1750220199755,"version":"3.41.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,9,26]],"date-time":"2023-09-26T00:00:00Z","timestamp":1695686400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Zhejiang Provincial Natural Science Foundation of China","award":["LY21F020019"],"award-info":[{"award-number":["LY21F020019"]}]},{"DOI":"10.13039\/501100001809","name":"National Science Foundation of China","doi-asserted-by":"crossref","award":["62125201, 62020106007, U21B2040, 61802100, 61972119"],"award-info":[{"award-number":["62125201, 62020106007, U21B2040, 61802100, 61972119"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001381","name":"National Research Foundation, Singapore","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001381","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,2,29]]},"abstract":"<jats:p>\n            In this article, we investigate a new problem of generating a variety of multi-view fashion designs conditioned on a human pose and texture examples of arbitrary sizes, which can replace the repetitive and low-level design work for fashion designers. To solve this challenging multi-modal image translation problem, we propose a novel Photo-reAlistic fashIon desigN synThesis (PAINT) framework, which decomposes the framework into three manageable stages. In the first stage, we employ a\n            <jats:italic>Layout Generative Network<\/jats:italic>\n            (LGN) to transform an input human pose into a series of person semantic layouts. In the second stage, we propose a\n            <jats:italic>Texture Synthesis Network<\/jats:italic>\n            (TSN) to synthesize textures on all transformed semantic layouts. Specifically, we design a novel attentive texture transfer mechanism for precisely expanding texture patches to the irregular clothing regions of the target fashion designs. In the third stage, we leverage an\n            <jats:italic>Appearance Flow Network<\/jats:italic>\n            (AFN) to generate the fashion design images of other viewpoints from a single-view observation by learning 2D multi-scale appearance flow fields. Experimental results demonstrate that our method is capable of generating diverse photo-realistic multi-view fashion design images with fine-grained appearance details conditioned on the provided multiple inputs. The source code and trained models are available at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/gxl-groups\/PAINT\">https:\/\/github.com\/gxl-groups\/PAINT<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3545610","type":"journal-article","created":{"date-parts":[[2022,6,30]],"date-time":"2022-06-30T10:21:03Z","timestamp":1656584463000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["PAINT: Photo-realistic Fashion Design Synthesis"],"prefix":"10.1145","volume":"20","author":[{"given":"Xiaoling","family":"Gu","sequence":"first","affiliation":[{"name":"Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4433-2499","authenticated-orcid":false,"given":"Jie","family":"Huang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1239-4428","authenticated-orcid":false,"given":"Yongkang","family":"Wong","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1922-7283","authenticated-orcid":false,"given":"Jun","family":"Yu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4923-0910","authenticated-orcid":false,"given":"Jianping","family":"Fan","sequence":"additional","affiliation":[{"name":"Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7129-6174","authenticated-orcid":false,"given":"Pai","family":"Peng","sequence":"additional","affiliation":[{"name":"YoutuLab, Tencent Technology Co., Ltd, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4846-2015","authenticated-orcid":false,"given":"Mohan S.","family":"Kankanhalli","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,9,26]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"9015","volume-title":"ICCV","author":"Albahar Badour","year":"2019","unstructured":"Badour Albahar and Jia-Bin Huang. 2019. Guided image-to-image translation with bi-directional feature transformation. In ICCV. 9015\u20139024."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1986.4767851"},{"key":"e_1_3_2_4_2","first-page":"3640","volume-title":"CVPR","author":"Chen Liang-Chieh","year":"2016","unstructured":"Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu, and Alan L. Yuille. 2016. Attention to scale: Scale-aware semantic image segmentation. In CVPR. 3640\u20133649."},{"key":"e_1_3_2_5_2","first-page":"7103","volume-title":"CVPR","author":"Chen Yilun","year":"2018","unstructured":"Yilun Chen, Zhicheng Wang, Yuxiang Peng, Zhiqiang Zhang, Gang Yu, and Jian Sun. 2018. Cascaded pyramid network for multi-person pose estimation. In CVPR. 7103\u20137112."},{"key":"e_1_3_2_6_2","first-page":"8117","volume-title":"CVPR","author":"Dong Haoye","year":"2020","unstructured":"Haoye Dong, Xiaodan Liang, Yixuan Zhang, Xujie Zhang, Xiaohui Shen, Zhenyu Xie, Bowen Wu, and Jian Yin. 2020. Fashion editing with adversarial parsing learning. In CVPR. 8117\u20138125."},{"key":"e_1_3_2_7_2","article-title":"AI assisted apparel design","volume":"2007","author":"Dubey Alpana","year":"2020","unstructured":"Alpana Dubey, Nitish Bhardwaj, Kumar Abhinav, Mani Suma Kuriakose, Sakshi Jain, and Veenu Arora. 2020. AI assisted apparel design. CoRR abs\/2007.04950 (2020).","journal-title":"CoRR"},{"key":"e_1_3_2_8_2","first-page":"8857","volume-title":"CVPR","author":"Esser Patrick","year":"2018","unstructured":"Patrick Esser, Ekaterina Sutter, and Bjorn Ommer. 2018. A variational U-Net for conditional appearance and shape generation. In CVPR. 8857\u20138866."},{"key":"e_1_3_2_9_2","first-page":"2414","volume-title":"CVPR","author":"Gatys Leon A.","year":"2016","unstructured":"Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In CVPR. 2414\u20132423."},{"key":"e_1_3_2_10_2","first-page":"805","volume-title":"ECCV","author":"Gong Ke","year":"2018","unstructured":"Ke Gong, Xiaodan Liang, Yicheng Li, Yimin Chen, Ming Yang, and Liang Lin. 2018. Instance-level human parsing via part grouping network. In ECCV. 805\u2013822."},{"issue":"5","key":"e_1_3_2_11_2","doi-asserted-by":"crossref","first-page":"102276","DOI":"10.1016\/j.ipm.2020.102276","article-title":"Fashion analysis and understanding with artificial intelligence","volume":"57","author":"Gu Xiaoling","year":"2020","unstructured":"Xiaoling Gu, Fei Gao, Min Tan, and Pai Peng. 2020. Fashion analysis and understanding with artificial intelligence. Inf. Process. Manag. 57, 5 (2020), 102276.","journal-title":"Inf. Process. Manag."},{"key":"e_1_3_2_12_2","first-page":"7543","volume-title":"CVPR","author":"Han Xintong","year":"2018","unstructured":"Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu, and Larry S. Davis. 2018. VITON: An image-based virtual try-on network. In CVPR. 7543\u20137552."},{"key":"e_1_3_2_13_2","first-page":"6626","volume-title":"NIPS","author":"Heusel Martin","year":"2017","unstructured":"Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In NIPS. 6626\u20136637."},{"key":"e_1_3_2_14_2","first-page":"1510","volume-title":"ICCV","author":"Huang Xun","year":"2017","unstructured":"Xun Huang and Serge J. Belongie. 2017. Arbitrary style transfer in real-time with adaptive instance normalization. In ICCV. 1510\u20131519."},{"key":"e_1_3_2_15_2","first-page":"8981","volume-title":"CVPR","author":"Hui Tak-Wai","year":"2018","unstructured":"Tak-Wai Hui, Xiaoou Tang, and Chen Change Loy. 2018. LiteFlowNet: A lightweight convolutional neural network for optical flow estimation. In CVPR. 8981\u20138989."},{"key":"e_1_3_2_16_2","first-page":"5967","volume-title":"CVPR","author":"Isola Phillip","year":"2017","unstructured":"Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-image translation with conditional adversarial networks. In CVPR. 5967\u20135976."},{"key":"e_1_3_2_17_2","first-page":"2017","volume-title":"NIPS","author":"Jaderberg Max","year":"2015","unstructured":"Max Jaderberg, Karen Simonyan, Andrew Zisserman, and Koray Kavukcuoglu. 2015. Spatial transformer networks. In NIPS. 2017\u20132025."},{"key":"e_1_3_2_18_2","first-page":"316","volume-title":"ECCV","author":"Jia Menglin","year":"2020","unstructured":"Menglin Jia, Mengyun Shi, Mikhail Sirotenko, Yin Cui, Claire Cardie, Bharath Hariharan, Hartwig Adam, and Serge J. Belongie. 2020. Fashionpedia: Ontology, segmentation, and an attribute localization dataset. In ECCV. 316\u2013332."},{"key":"e_1_3_2_19_2","first-page":"3721","volume-title":"IJCAI","author":"Jiang Shuhui","year":"2017","unstructured":"Shuhui Jiang and Yun Fu. 2017. Fashion style generator. In IJCAI. 3721\u20133727."},{"key":"e_1_3_2_20_2","first-page":"694","volume-title":"ECCV","author":"Johnson Justin","year":"2016","unstructured":"Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual losses for real-time style transfer and super-resolution. In ECCV. 694\u2013711."},{"key":"e_1_3_2_21_2","first-page":"21:1\u201321:7","volume-title":"AH","author":"Kato Natsumi","year":"2019","unstructured":"Natsumi Kato, Hiroyuki Osone, Kotaro Oomori, Chun Wei Ooi, and Yoichi Ochiai. 2019. GANs-based clothes design: Pattern maker is all you need to design clothing. In AH. 21:1\u201321:7."},{"key":"e_1_3_2_22_2","first-page":"9260","volume-title":"CVPR","author":"Li Tao","year":"2020","unstructured":"Tao Li, Zhiyuan Liang, Sanyuan Zhao, Jiahao Gong, and Jianbing Shen. 2020. Self-learning with rectification strategy for human parsing. In CVPR. 9260\u20139269."},{"key":"e_1_3_2_23_2","first-page":"3693","volume-title":"CVPR","author":"Li Yining","year":"2019","unstructured":"Yining Li, Chen Huang, and Chen Change Loy. 2019. Dense intrinsic appearance flow for human pose transfer. In CVPR. 3693\u20133702."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.08.085"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.3015015"},{"key":"e_1_3_2_26_2","first-page":"936","volume-title":"CVPR","author":"Lin Tsung-Yi","year":"2017","unstructured":"Tsung-Yi Lin, Piotr Dollar, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie. 2017. Feature pyramid networks for object detection. In CVPR. 936\u2013944."},{"key":"e_1_3_2_27_2","first-page":"89","volume-title":"ECCV","author":"Liu Guilin","year":"2018","unstructured":"Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, and Bryan Catanzaro. 2018. Image inpainting for irregular holes using partial convolutions. In ECCV. 89\u2013105."},{"key":"e_1_3_2_28_2","first-page":"1096","volume-title":"CVPR","author":"Liu Ziwei","year":"2016","unstructured":"Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, and Xiaoou Tang. 2016. DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In CVPR. 1096\u20131104."},{"key":"e_1_3_2_29_2","first-page":"405","volume-title":"NIPS","author":"Ma Liqian","year":"2017","unstructured":"Liqian Ma, Xu Jia, Qianru Sun, Bernt Schiele, Tinne Tuytelaars, and Luc Van Gool. 2017. Pose-guided person image generation. In NIPS. 405\u2013415."},{"key":"e_1_3_2_30_2","first-page":"2337","volume-title":"CVPR","author":"Park Taesung","year":"2019","unstructured":"Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. 2019. Semantic image synthesis with spatially-adaptive normalization. In CVPR. 2337\u20132346."},{"key":"e_1_3_2_31_2","article-title":"Fashion-Gen: The generative fashion dataset and challenge","volume":"1806","author":"Rostamzadeh Negar","year":"2018","unstructured":"Negar Rostamzadeh, Seyedarian Hosseini, Thomas Boquet, Wojciech Stokowiec, Ying Zhang, Christian Jauvin, and Chris Pal. 2018. Fashion-Gen: The generative fashion dataset and challenge. CoRR abs\/1806.08317 (2018).","journal-title":"CoRR"},{"key":"e_1_3_2_32_2","first-page":"2226","volume-title":"NIPS","author":"Salimans Tim","year":"2016","unstructured":"Tim Salimans, Ian J. Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved techniques for training GANs. In NIPS. 2226\u20132234."},{"key":"e_1_3_2_33_2","first-page":"37","volume-title":"ECCV Workshops","author":"Sbai Othman","year":"2018","unstructured":"Othman Sbai, Mohamed Elhoseiny, Antoine Bordes, Yann LeCun, and Camille Couprie. 2018. DesIGN: Design inspiration from generative networks. In ECCV Workshops. 37\u201344."},{"key":"e_1_3_2_34_2","first-page":"3408","volume-title":"CVPR","author":"Siarohin Aliaksandr","year":"2018","unstructured":"Aliaksandr Siarohin, Enver Sangineto, St\u00e9phane Lathuili\u00e8re, and Nicu Sebe. 2018. Deformable GANs for pose-based human image generation. In CVPR. 3408\u20133416."},{"key":"e_1_3_2_35_2","first-page":"6450","volume-title":"CVPR","author":"Wang Fei","year":"2017","unstructured":"Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Cheng Li, Honggang Zhang, Xiaogang Wang, and Xiaoou Tang. 2017. Residual attention network for image classification. In CVPR. 6450\u20136458."},{"key":"e_1_3_2_36_2","first-page":"5702","volume-title":"ICCV","author":"Wang Wenguan","year":"2019","unstructured":"Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, and Ling Shao. 2019. Learning compositional neural information fusion for human parsing. In ICCV. 5702\u20135712."},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3055780"},{"key":"e_1_3_2_38_2","first-page":"8926","volume-title":"CVPR","author":"Wang Wenguan","year":"2020","unstructured":"Wenguan Wang, Hailong Zhu, Jifeng Dai, Yanwei Pang, Jianbing Shen, and Ling Shao. 2020. Hierarchical human parsing with typed part-relation reasoning. In CVPR. 8926\u20138936."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2003.819861"},{"key":"e_1_3_2_40_2","first-page":"8456","volume-title":"CVPR","author":"Xian Wenqi","year":"2018","unstructured":"Wenqi Xian, Patsorn Sangkloy, Varun Agrawal, Amit Raj, Jingwan Lu, Chen Fang, Fisher Yu, and James Hays. 2018. TextureGAN: Controlling deep image synthesis with texture patches. In CVPR. 8456\u20138465."},{"key":"e_1_3_2_41_2","first-page":"1316","volume-title":"CVPR","author":"Xu Tao","year":"2018","unstructured":"Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, and Xiaodong He. 2018. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks. In CVPR. 1316\u20131324."},{"key":"e_1_3_2_42_2","first-page":"3570","volume-title":"CVPR","author":"Yamaguchi Kota","year":"2012","unstructured":"Kota Yamaguchi, M. Hadi Kiapour, Luis E. Ortiz, and Tamara L. Berg. 2012. Parsing clothing in fashion photographs. In CVPR. 3570\u20133577."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3055062"},{"key":"e_1_3_2_44_2","first-page":"9045","volume-title":"ICCV","author":"Yu Cong","year":"2019","unstructured":"Cong Yu, Yang Hu, Yan Chen, and Bing Zeng. 2019. Personalized fashion design. In ICCV. 9045\u20139054."},{"key":"e_1_3_2_45_2","first-page":"5505","volume-title":"CVPR","author":"Yu Jiahui","year":"2018","unstructured":"Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S. Huang. 2018. Generative image inpainting with contextual attention. In CVPR. 5505\u20135514."},{"key":"e_1_3_2_46_2","first-page":"1486","volume-title":"CVPR","author":"Zeng Yanhong","year":"2019","unstructured":"Yanhong Zeng, Jianlong Fu, Hongyang Chao, and Baining Guo. 2019. Learning pyramid-context encoder network for high-quality image inpainting. In CVPR. 1486\u20131494."},{"key":"e_1_3_2_47_2","first-page":"7354","volume-title":"ICML","author":"Zhang Han","year":"2019","unstructured":"Han Zhang, Ian J. Goodfellow, Dimitris N. Metaxas, and Augustus Odena. 2019. Self-attention generative adversarial networks. In ICML. 7354\u20137363."},{"key":"e_1_3_2_48_2","first-page":"586","volume-title":"CVPR","author":"Zhang Richard","year":"2018","unstructured":"Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. 2018. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR. 586\u2013595."},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3049156"},{"key":"e_1_3_2_50_2","first-page":"286","volume-title":"ECCV","author":"Zhou Tinghui","year":"2016","unstructured":"Tinghui Zhou, Shubham Tulsiani, Weilun Sun, Jitendra Malik, and Alexei A. Efros. 2016. View synthesis by appearance flow. In ECCV. 286\u2013301."},{"key":"e_1_3_2_51_2","first-page":"465","volume-title":"NIPS","author":"Zhu Jun-Yan","year":"2017","unstructured":"Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei A. Efros, Oliver Wang, and Eli Shechtman. 2017. Toward multimodal image-to-image translation. In NIPS. 465\u2013476."},{"key":"e_1_3_2_52_2","first-page":"1689","volume-title":"ICCV","author":"Zhu Shizhan","year":"2017","unstructured":"Shizhan Zhu, Sanja Fidler, Raquel Urtasun, Dahua Lin, and Chen Change Loy. 2017. Be your own Prada: Fashion synthesis with structural coherence. In ICCV. 1689\u20131697."},{"key":"e_1_3_2_53_2","first-page":"2347","volume-title":"CVPR","author":"Zhu Zhen","year":"2019","unstructured":"Zhen Zhu, Tengteng Huang, Baoguang Shi, Miao Yu, Bofei Wang, and Xiang Bai. 2019. Progressive pose attention transfer for person image generation. In CVPR. 2347\u20132356."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545610","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3545610","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:02:45Z","timestamp":1750186965000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545610"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,26]]},"references-count":52,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,2,29]]}},"alternative-id":["10.1145\/3545610"],"URL":"https:\/\/doi.org\/10.1145\/3545610","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2023,9,26]]},"assertion":[{"value":"2021-12-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-06-23","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-09-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}