{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T15:55:37Z","timestamp":1774626937307,"version":"3.50.1"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"10","license":[{"start":{"date-parts":[[2024,10,29]],"date-time":"2024-10-29T00:00:00Z","timestamp":1730160000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62072347, 62371350"],"award-info":[{"award-number":["62072347, 62371350"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,10,31]]},"abstract":"<jats:p>\n            Face sketch-to-photo synthesis is widely used in law enforcement and digital entertainment, which can be achieved by\n            <jats:bold>Image-to-Image (I2I)<\/jats:bold>\n            translation. Traditional I2I translation algorithms usually regard the bidirectional translation of two image domains as two symmetric processes, so the two translation networks adopt the same structure. However, due to the scarcity of face sketches and the abundance of face photos, the sketch-to-photo and photo-to-sketch processes are asymmetric. Considering this issue, we propose a few-shot face sketch-to-photo synthesis model based on asymmetric I2I translation, where the sketch-to-photo process uses a feature-embedded generating network, while the photo-to-sketch process uses a style transfer network. On this basis, a three-stage asymmetric training strategy with style transfer as the trigger is proposed to optimize the proposed model by utilizing the advantage that the style transfer network only needs few-shot face sketches for training. Additionally, we discover that stylistic differences between the global and local sketch faces lead to inconsistencies between the global and local sketch-to-photo processes. Thus, a dual branch of the global face and local face is adopted in the sketch-to-photo synthesis model to learn the specific transformation processes for global structure and local details. Finally, the high-quality synthetic face photo can be generated through the global-local face fusion sub-network. Extensive experimental results demonstrate that the proposed\n            <jats:bold>Global-Local Asymmetric (GLAS)<\/jats:bold>\n            I2I translation algorithm compared to SOTA methods, at least improves FSIM by 0.0126, and reduces LPIPS (alex), LPIPS (squeeze), and LPIPS (vgg) by 0.0610, 0.0883, and 0.0719, respectively.\n          <\/jats:p>","DOI":"10.1145\/3672400","type":"journal-article","created":{"date-parts":[[2024,7,20]],"date-time":"2024-07-20T15:14:41Z","timestamp":1721488481000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Few-Shot Face Sketch-to-Photo Synthesis via Global-Local Asymmetric Image-to-Image Translation"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-1033-0890","authenticated-orcid":false,"given":"Yongkang","family":"Li","sequence":"first","affiliation":[{"name":"National Engineering Research Center for Multimedia Software, and Hubei Key Laboratory of Multimedia and Network Communication Engineering, School of Computer Science, Wuhan University, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-3459-4659","authenticated-orcid":false,"given":"Qifan","family":"Liang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Multimedia Software, and Hubei Key Laboratory of Multimedia and Network Communication Engineering, School of Computer Science, Wuhan University, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1862-4781","authenticated-orcid":false,"given":"Zhen","family":"Han","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Multimedia Software, and Hubei Key Laboratory of Multimedia and Network Communication Engineering, School of Computer Science, Wuhan University, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-9709-5679","authenticated-orcid":false,"given":"Wenjun","family":"Mai","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Multimedia Software, and Hubei Key Laboratory of Multimedia and Network Communication Engineering, School of Computer Science, Wuhan University, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9796-488X","authenticated-orcid":false,"given":"Zhongyuan","family":"Wang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Multimedia Software, and Hubei Key Laboratory of Multimedia and Network Communication Engineering, School of Computer Science, Wuhan University, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,10,29]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01477"},{"key":"e_1_3_1_3_2","first-page":"0262","article-title":"Zero-shot unsupervised image-to-image translation via exploiting semantic attributes","volume":"124","author":"Chen Yuanqi","year":"2022","unstructured":"Yuanqi Chen, Xiaoming Yu, Shan Liu, Wei Gao, and Ge Li. 2022. Zero-shot unsupervised image-to-image translation via exploiting semantic attributes. Image Vision Computing 124, C (2022), 10, 0262\u20138856.","journal-title":"Image Vision Computing"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3611834"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00916"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00482"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2020.3031386"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.2975961"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW53098.2021.00084"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3396237"},{"key":"e_1_3_1_11_2","unstructured":"Forrest N. Iandola Matthew W. Moskewicz Khalid Ashraf Song Han William J. Dally and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and \\({\\lt}\\) 1MB model size. arXiv:1602.07360. Retrieved from https:\/\/arxiv.org\/abs\/1602.07360"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.632"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46475-6_43"},{"key":"e_1_3_1_14_2","first-page":"1857","volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML \u201917)","author":"Kim Taeksoo","year":"2017","unstructured":"Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, and Jiwon Kim. 2017. Learning to discover cross-domain relations with generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning (ICML \u201917), 1857\u20131865."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3065386"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58548-8_2"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6816"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.07.037"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2022.04.077"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2020.3005039"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3547802"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.01065"},{"key":"e_1_3_1_23_2","first-page":"1005","volume-title":"Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR \u201905)","volume":"1","author":"Liu Qingshan","year":"2005","unstructured":"Qingshan Liu, Xiaoou Tang, Hongliang Jin, Hanqing Lu, and Songde Ma. 2005. A nonlinear approach for face sketch synthesis and recognition. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR \u201905), Vol. 1, 1005\u20131010."},{"key":"e_1_3_1_24_2","first-page":"24","author":"Martinez Aleix","year":"1998","unstructured":"Aleix Martinez and Robert Benavente. 1998. The AR Face Database: CVC Technical Report, 24.","journal-title":"The AR Face Database:"},{"key":"e_1_3_1_25_2","first-page":"965","volume-title":"Proceedings of the 2nd International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA \u201999)","author":"Messer K.","year":"1999","unstructured":"K. Messer, J. Matas, J. Kittler, Juergen Luettin, and Gilbert Ma\u00eetre. 1999. XM2VTSDB: The Extended M2VTS Database. Proceedings of the 2nd International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA \u201999), 965\u2013966."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58545-7_19"},{"key":"e_1_3_1_27_2","first-page":"0950","article-title":"Face photo\u2013sketch synthesis via intra-domain enhancement","volume":"259","author":"Peng Chunlei","year":"2023","unstructured":"Chunlei Peng, Congyu Zhang, Decheng Liu, Nannan Wang, and Xinbo Gao. 2023. Face photo\u2013sketch synthesis via intra-domain enhancement. Knowledge-Based Systems. 259, C (Jan 2023), 12, 0950\u20137051","journal-title":"Knowledge-Based Systems"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2023.3326680"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.879790"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19790-1_27"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2023.3341246"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.115980"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-05916-9"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.290.5500.2323"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58580-8_23"},{"key":"e_1_3_1_36_2","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arxiv.org\/abs\/1409.1556. Retrieved from https:\/\/arxiv.org\/abs\/1409.1556"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2021.3105725"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00252"},{"key":"e_1_3_1_39_2","first-page":"687","volume-title":"Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV \u201909)","author":"Tang Xiaoou","year":"2003","unstructured":"Xiaoou Tang and Xiaogang Wang. 2003. Face sketch synthesis and recognition. In Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV \u201909), 687\u2013694."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV56688.2023.00077"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIG.2011.112"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00917"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00451"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3547789"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3120669"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01100"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.310"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2020.2972944"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2022.3226413"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2022.3229614"},{"issue":"2","key":"e_1_3_1_51_2","first-page":"1551","article-title":"Equivariant adversarial network for image-to-image translation","volume":"17","author":"Zareapoor Masoumeh","year":"2021","unstructured":"Masoumeh Zareapoor and Jie Yang. 2021. Equivariant adversarial network for image-to-image translation. ACM Transactions on Multimedia Computing Communications, and Applications 17, 2 (2021), 14. 1551\u20136857","journal-title":"ACM Transactions on Multimedia Computing Communications, and Applications"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/2671188.2749321"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2011.2109730"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2017.2664499"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2942514"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2890017"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2019.2924589"},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2019.2933590"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2869688"},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2021.3053775"},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/187"},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2023.3253773"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3672400","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3672400","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:58:00Z","timestamp":1750294680000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3672400"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,29]]},"references-count":63,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2024,10,31]]}},"alternative-id":["10.1145\/3672400"],"URL":"https:\/\/doi.org\/10.1145\/3672400","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,29]]},"assertion":[{"value":"2024-01-03","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-05-20","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}