{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T08:14:19Z","timestamp":1770538459130,"version":"3.49.0"},"reference-count":58,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2018,12,4]],"date-time":"2018-12-04T00:00:00Z","timestamp":1543881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2018,12,31]]},"abstract":"<jats:p>This paper introduces a novel method for realtime portrait animation in a single photo. Our method requires only a single portrait photo and a set of facial landmarks derived from a driving source (e.g., a photo or a video sequence), and generates an animated image with rich facial details. The core of our method is a warp-guided generative model that instantly fuses various fine facial details (e.g., creases and wrinkles), which are necessary to generate a high-fidelity facial expression, onto a pre-warped image. Our method factorizes out the nonlinear geometric transformations exhibited in facial expressions by lightweight 2D warps and leaves the appearance detail synthesis to conditional generative neural networks for high-fidelity facial animation generation. We show such a factorization of geometric transformation and appearance synthesis largely helps the network better learn the high nonlinearity of the facial expression functions and also facilitates the design of the network architecture. Through extensive experiments on various portrait photos from the Internet, we show the significant efficacy of our method compared with prior arts.<\/jats:p>","DOI":"10.1145\/3272127.3275043","type":"journal-article","created":{"date-parts":[[2018,11,28]],"date-time":"2018-11-28T19:16:10Z","timestamp":1543432570000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":92,"title":["Warp-guided GANs for single-photo facial animation"],"prefix":"10.1145","volume":"37","author":[{"given":"Jiahao","family":"Geng","sequence":"first","affiliation":[{"name":"Zhejiang University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tianjia","family":"Shao","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Youyi","family":"Zheng","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanlin","family":"Weng","sequence":"additional","affiliation":[{"name":"Zhejiang University and ZJU-FaceUnity Joint Lab of Intelligent Graphics, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kun","family":"Zhou","sequence":"additional","affiliation":[{"name":"Zhejiang University and ZJU-FaceUnity Joint Lab of Intelligent Graphics, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,12,4]]},"reference":[{"key":"e_1_2_2_1_1","first-page":"265","article-title":"TensorFlow: A System for Large-Scale Machine Learning","volume":"16","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , 2016 . TensorFlow: A System for Large-Scale Machine Learning .. In OSDI , Vol. 16. 265 -- 283 . Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning.. In OSDI, Vol. 16. 265--283.","journal-title":"OSDI"},{"key":"e_1_2_2_2_1","volume-title":"Image Analysis for Multimedia Interactive Services (WIAMIS), 2010 11th international workshop on. IEEE, 1--4.","author":"Aifanti Niki","year":"2010","unstructured":"Niki Aifanti , Christos Papachristou , and Anastasios Delopoulos . 2010 . The MUG facial expression database . In Image Analysis for Multimedia Interactive Services (WIAMIS), 2010 11th international workshop on. IEEE, 1--4. Niki Aifanti, Christos Papachristou, and Anastasios Delopoulos. 2010. The MUG facial expression database. In Image Analysis for Multimedia Interactive Services (WIAMIS), 2010 11th international workshop on. IEEE, 1--4."},{"key":"e_1_2_2_3_1","volume-title":"Wasserstein gan. arXiv preprint arXiv:1701.07875","author":"Arjovsky Martin","year":"2017","unstructured":"Martin Arjovsky , Soumith Chintala , and L\u00e9on Bottou . 2017. Wasserstein gan. arXiv preprint arXiv:1701.07875 ( 2017 ). Martin Arjovsky, Soumith Chintala, and L\u00e9on Bottou. 2017. Wasserstein gan. arXiv preprint arXiv:1701.07875 (2017)."},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130818"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12147"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1111\/1467-8659.t01-1-00712"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/311535.311556"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461912.2461976"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/AFGR.2008.4813339"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766943"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601204"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461912.2462012"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.249"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925873"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2070781.2024164"},{"key":"e_1_2_2_16_1","doi-asserted-by":"crossref","unstructured":"Hui Ding Kumar Sricharan and Rama Chellappa. 2018. ExprGAN: Facial Expression Editing with Controllable Expression Intensity. In AAAI.  Hui Ding Kumar Sricharan and Rama Chellappa. 2018. ExprGAN: Facial Expression Editing with Controllable Expression Intensity. In AAAI.","DOI":"10.1609\/aaai.v32i1.12277"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925933"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46475-6_20"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.537"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12552"},{"key":"e_1_2_2_21_1","volume-title":"Winter semester","author":"Gauthier Jon","year":"2014","unstructured":"Jon Gauthier . 2014. Conditional generative adversarial nets for convolutional face generation. Class Project for Stanford CS231N: Convolutional Neural Networks for Visual Recognition , Winter semester 2014 , 5 (2014), 2. Jon Gauthier. 2014. Conditional generative adversarial nets for convolutional face generation. Class Project for Stanford CS231N: Convolutional Neural Networks for Visual Recognition, Winter semester 2014, 5 (2014), 2."},{"key":"e_1_2_2_22_1","unstructured":"Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems (NIPS). 2672--2680.   Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems (NIPS). 2672--2680."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299058"},{"key":"e_1_2_2_24_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770--778. Pei-Lun Hsieh, Chongyang Ma, Jihun Yu, and Hao Li.","author":"He Kaiming","year":"2015","unstructured":"Kaiming He , Xiangyu Zhang , Shaoqing Ren , and Jian Sun . 2016. Deep residual learning for image recognition . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770--778. Pei-Lun Hsieh, Chongyang Ma, Jihun Yu, and Hao Li. 2015 . Unconstrained realtime facial performance capture. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . 1675--1683. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770--778. Pei-Lun Hsieh, Chongyang Ma, Jihun Yu, and Hao Li. 2015. Unconstrained realtime facial performance capture. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1675--1683."},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3073659"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.632"},{"key":"e_1_2_2_27_1","volume-title":"Progressive growing of gans for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196","author":"Karras Tero","year":"2017","unstructured":"Tero Karras , Timo Aila , Samuli Laine , and Jaakko Lehtinen . 2017. Progressive growing of gans for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196 ( 2017 ). Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2017. Progressive growing of gans for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196 (2017)."},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503385.2503395"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.2197\/ipsjjip.22.401"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201283"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.397"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366193"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.272"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461912.2462019"},{"key":"e_1_2_2_35_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 57--64","author":"Li Kai","year":"2012","unstructured":"Kai Li , Feng Xu , Jue Wang , Qionghai Dai , and Yebin Liu . 2012 . A data-driven approach for facial expression synthesis in video . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 57--64 . Kai Li, Feng Xu, Jue Wang, Qionghai Dai, and Yebin Liu. 2012. A data-driven approach for facial expression synthesis in video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 57--64."},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818122"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/383259.383289"},{"key":"e_1_2_2_38_1","volume-title":"The Chicago face database: A free stimulus set of faces and norming data. Behavior research methods 47, 4","author":"Ma Debbie S","year":"2015","unstructured":"Debbie S Ma , Joshua Correll , and Bernd Wittenbrink . 2015. The Chicago face database: A free stimulus set of faces and norming data. Behavior research methods 47, 4 ( 2015 ), 1122--1135. Debbie S Ma, Joshua Correll, and Bernd Wittenbrink. 2015. The Chicago face database: A free stimulus set of faces and norming data. Behavior research methods 47, 4 (2015), 1122--1135."},{"key":"e_1_2_2_39_1","first-page":"3","article-title":"Rectifier nonlinearities improve neural network acoustic models","volume":"30","author":"Maas Andrew L","year":"2013","unstructured":"Andrew L Maas , Awni Y Hannun , and Andrew Y Ng . 2013 . Rectifier nonlinearities improve neural network acoustic models . In Proc. ICML , Vol. 30. 3 . Andrew L Maas, Awni Y Hannun, and Andrew Y Ng. 2013. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, Vol. 30. 3.","journal-title":"Proc. ICML"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46454-1_35"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1531326.1531363"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.580"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2005.1521424"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.372"},{"key":"e_1_2_2_45_1","volume-title":"Geometry-Contrastive Generative Adversarial Network for Facial Expression Synthesis. arXiv preprint arXiv:1802.01822","author":"Qiao Fengchun","year":"2018","unstructured":"Fengchun Qiao , Naiming Yao , Zirui Jiao , Zhihao Li , Hui Chen , and Hongan Wang . 2018. Geometry-Contrastive Generative Adversarial Network for Facial Expression Synthesis. arXiv preprint arXiv:1802.01822 ( 2018 ). Fengchun Qiao, Naiming Yao, Zirui Jiao, Zhihao Li, Hui Chen, and Hongan Wang. 2018. Geometry-Contrastive Generative Adversarial Network for Facial Expression Synthesis. arXiv preprint arXiv:1802.01822 (2018)."},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661290"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.241"},{"key":"e_1_2_2_48_1","volume-title":"Geometry Guided Adversarial Facial Expression Synthesis. arXiv preprint arXiv:1712.03474","author":"Song Lingxiao","year":"2017","unstructured":"Lingxiao Song , Zhihe Lu , Ran He , Zhenan Sun , and Tieniu Tan . 2017. Geometry Guided Adversarial Facial Expression Synthesis. arXiv preprint arXiv:1712.03474 ( 2017 ). Lingxiao Song, Zhihe Lu, Ran He, Zhenan Sun, and Tieniu Tan. 2017. Geometry Guided Adversarial Facial Expression Synthesis. arXiv preprint arXiv:1712.03474 (2017)."},{"key":"e_1_2_2_49_1","unstructured":"Joshua M Susskind Geoffrey E Hinton Javier R Movellan and Adam K Anderson. 2008. Generating facial expressions with deep belief nets. In Affective Computing. InTech.  Joshua M Susskind Geoffrey E Hinton Javier R Movellan and Adam K Anderson. 2008. Generating facial expressions with deep belief nets. In Affective Computing. InTech."},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818056"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2929464.2929475"},{"key":"e_1_2_2_52_1","volume-title":"Proc. Int'l Conf. Language Resources and Evaluation, Workshop EMOTION. 65--70","author":"Valstar Michel","year":"2010","unstructured":"Michel Valstar and M Pantic . 2010 . Induced disgust, happiness and surprise: An addition to the mmi facial expression database . In Proc. Int'l Conf. Language Resources and Evaluation, Workshop EMOTION. 65--70 . Michel Valstar and M Pantic. 2010. Induced disgust, happiness and surprise: An addition to the mmi facial expression database. In Proc. Int'l Conf. Language Resources and Evaluation, Workshop EMOTION. 65--70."},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1073204.1073209"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925947"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964972"},{"key":"e_1_2_2_56_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 861--868","author":"Yang Fei","year":"2012","unstructured":"Fei Yang , Lubomir Bourdev , Eli Shechtman , Jue Wang , and Dimitris Metaxas . 2012 . Facial expression editing in video using a temporally-smooth factorization . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 861--868 . Fei Yang, Lubomir Bourdev, Eli Shechtman, Jue Wang, and Dimitris Metaxas. 2012. Facial expression editing in video using a temporally-smooth factorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 861--868."},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964955"},{"key":"e_1_2_2_58_1","volume-title":"Semantic facial expression editing using autoencoded flow. arXiv preprint arXiv:1611.09961","author":"Yeh Raymond","year":"2016","unstructured":"Raymond Yeh , Ziwei Liu , Dan B Goldman , and Aseem Agarwala . 2016. Semantic facial expression editing using autoencoded flow. arXiv preprint arXiv:1611.09961 ( 2016 ). Raymond Yeh, Ziwei Liu, Dan B Goldman, and Aseem Agarwala. 2016. Semantic facial expression editing using autoencoded flow. arXiv preprint arXiv:1611.09961 (2016)."}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275043","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3272127.3275043","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:44:04Z","timestamp":1750207444000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275043"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,4]]},"references-count":58,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2018,12,31]]}},"alternative-id":["10.1145\/3272127.3275043"],"URL":"https:\/\/doi.org\/10.1145\/3272127.3275043","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,12,4]]},"assertion":[{"value":"2018-12-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}