{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:11:11Z","timestamp":1766067071733,"version":"3.41.0"},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2019,11,8]],"date-time":"2019-11-08T00:00:00Z","timestamp":1573171200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2019,12,31]]},"abstract":"<jats:p>From angling smiles to duck faces, all kinds of facial expressions can be seen in selfies, portraits, and Internet pictures. These photos are taken from various camera types, and under a vast range of angles and lighting conditions. We present a deep learning framework that can fully normalize unconstrained face images, i.e., remove perspective distortions, relight to an evenly lit environment, and predict a frontal and neutral face. Our method can produce a high resolution image while preserving important facial details and the likeness of the subject, along with the original background. We divide this ill-posed problem into three consecutive normalization steps, each using a different generative adversarial network that acts as an image generator. Perspective distortion removal is performed using a dense flow field predictor. A uniformly illuminated face is obtained using a lighting translation network, and the facial expression is neutralized using a generalized facial expression synthesis framework combined with a regression network based on deep features for facial recognition. 
We introduce new data representations for conditional inference, as well as training methods for supervised learning to ensure that different expressions of the same person yield not only a plausible but also a similar neutral face. We demonstrate our results on a wide range of challenging images collected in the wild. Key applications of our method range from robust image-based 3D avatar creation and portrait manipulation to facial enhancement and reconstruction tasks for crime investigation. We also found, through an extensive user study, that our normalization results can hardly be distinguished from ground-truth ones if the person is not familiar to the viewer.<\/jats:p>","DOI":"10.1145\/3355089.3356568","type":"journal-article","created":{"date-parts":[[2019,11,8]],"date-time":"2019-11-08T20:27:58Z","timestamp":1573244878000},"page":"1-16","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":29,"title":["Deep face normalization"],"prefix":"10.1145","volume":"38","author":[{"given":"Koki","family":"Nagano","sequence":"first","affiliation":[{"name":"Pinscreen"}]},{"given":"Huiwen","family":"Luo","sequence":"additional","affiliation":[{"name":"Pinscreen"}]},{"given":"Zejian","family":"Wang","sequence":"additional","affiliation":[{"name":"Pinscreen"}]},{"given":"Jaewoo","family":"Seo","sequence":"additional","affiliation":[{"name":"Pinscreen"}]},{"given":"Jun","family":"Xing","sequence":"additional","affiliation":[{"name":"Mihoyo"}]},{"given":"Liwen","family":"Hu","sequence":"additional","affiliation":[{"name":"Pinscreen"}]},{"given":"Lingyu","family":"Wei","sequence":"additional","affiliation":[{"name":"Pinscreen"}]},{"given":"Hao","family":"Li","sequence":"additional","affiliation":[{"name":"Pinscreen, University of Southern California, USC Institute for Creative 
Technologies"}]}],"member":"320","published-online":{"date-parts":[[2019,11,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.598229"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130818"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.51"},{"key":"e_1_2_1_4_1","volume-title":"Smith","author":"Bas Anil","year":"2018","unstructured":"Anil Bas and William A. P. Smith. 2018. Statistical transformer networks: learning shape and appearance models via self supervision. CoRR abs\/1804.02541 (2018). arXiv:1804.02541 http:\/\/arxiv.org\/abs\/1804.02541"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/311535.311556"},{"key":"e_1_2_1_6_1","volume-title":"Matteo Ruggero Ronchi, and Pietro Perona","author":"Burgos-Artizzu Xavier P.","year":"2014","unstructured":"Xavier P. Burgos-Artizzu, Matteo Ruggero Ronchi, and Pietro Perona. 2014. Distance Estimation of an Unknown Person from a Portrait. In ECCV. Springer International Publishing, Cham, 313--327."},{"key":"e_1_2_1_7_1","first-page":"413","article-title":"Facewarehouse: A 3d facial expression database for visual computing","volume":"20","author":"Cao Chen","year":"2014","unstructured":"Chen Cao, Yanlin Weng, Shun Zhou, Yiying Tong, and Kun Zhou. 2014. Facewarehouse: A 3d facial expression database for visual computing. IEEE TVCG 20, 3 (2014), 413--425.","journal-title":"IEEE TVCG"},{"volume-title":"StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation","author":"Choi Yunjey","key":"e_1_2_1_8_1","unstructured":"Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. 2018. StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. 
In IEEE CVPR."},{"key":"e_1_2_1_9_1","volume-title":"Freeman","author":"Cole Forrester","year":"2017","unstructured":"Forrester Cole, David Belanger, Dilip Krishnan, Aaron Sarna, Inbar Mosseri, and William T. Freeman. 2017. Synthesizing Normalized Faces From Facial Identity Features. In IEEE CVPR."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/357290.357293"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1322355111"},{"key":"e_1_2_1_12_1","unstructured":"Federal Bureau of Investigation. 2019. FBI Most Wanted. https:\/\/www.fbi.gov\/wanted."},{"volume-title":"Advances in Visual Computing","author":"Flores Arturo","key":"e_1_2_1_13_1","unstructured":"Arturo Flores, Eric Christiansen, David Kriegman, and Serge Belongie. 2013. Camera Distance from Face Images. In Advances in Visual Computing. Springer Berlin Heidelberg, Berlin, Heidelberg, 513--522."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925933"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275043"},{"key":"e_1_2_1_16_1","volume-title":"Freeman","author":"Genova Kyle","year":"2018","unstructured":"Kyle Genova, Forrester Cole, Aaron Maschinot, Aaron Sarna, Daniel Vlasic, and William T. Freeman. 2018. Unsupervised Training for 3D Morphable Model Regression. In IEEE CVPR."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.927464"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2070781.2024163"},{"volume-title":"Multi-PIE. In 2008 8th IEEE International Conference on Automatic Face Gesture Recognition. 1--8.","author":"Gross R.","key":"e_1_2_1_19_1","unstructured":"R. Gross, I. Matthews, J. Cohn, T. Kanade, and S. Baker. 2008. Multi-PIE. In 2008 8th IEEE International Conference on Automatic Face Gesture Recognition. 
1--8."},{"volume-title":"Effective Face Frontalization in Unconstrained Images","author":"Hassner Tal","key":"e_1_2_1_20_1","unstructured":"Tal Hassner, Shai Harel, Eran Paz, and Roee Enbar. 2015. Effective Face Frontalization in Unconstrained Images. In IEEE CVPR."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.31310887"},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Y. Hu B. Wang and S. Lin. 2017b. FC4: Fully Convolutional Color Constancy with Confidence-Weighted Pooling. In IEEE CVPR. 330--339.","DOI":"10.1109\/CVPR.2017.43"},{"volume-title":"Pose-Guided Photorealistic Face Rotation","author":"Hu Yibo","key":"e_1_2_1_23_1","unstructured":"Yibo Hu, Xiang Wu, Bing Yu, Ran He, and Zhenan Sun. 2018a. Pose-Guided Photorealistic Face Rotation. In IEEE CVPR."},{"volume-title":"Pose-Guided Photorealistic Face Rotation","author":"Hu Yibo","key":"e_1_2_1_24_1","unstructured":"Yibo Hu, Xiang Wu, Bing Yu, Ran He, and Zhenan Sun. 2018b. Pose-Guided Photorealistic Face Rotation. In IEEE CVPR."},{"volume-title":"Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis","author":"Huang Rui","key":"e_1_2_1_25_1","unstructured":"Rui Huang, Shu Zhang, Tianyu Li, and Ran He. 2017. Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis. In IEEE ICCV."},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"P. Isola J. Zhu T. Zhou and A. A. Efros. 2017. Image-to-Image Translation with Conditional Adversarial Networks. In IEEE CVPR. 5967--5976.","DOI":"10.1109\/CVPR.2017.632"},{"volume-title":"Avatar SDK","year":"2019","key":"e_1_2_1_27_1","unstructured":"itSeez3D: Avatar SDK. 2019. https:\/\/avatarsdk.com."},{"key":"e_1_2_1_28_1","volume-title":"Perceptual Losses for Real-Time Style Transfer and Super-Resolution. 
CoRR abs\/1603.08155","author":"Johnson Justin","year":"2016","unstructured":"Justin Johnson, Alexandre Alahi, and Fei-Fei Li. 2016. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. CoRR abs\/1603.08155 (2016). http:\/\/arxiv.org\/abs\/1603.08155"},{"key":"e_1_2_1_29_1","volume-title":"A Style-Based Generator Architecture for Generative Adversarial Networks. CoRR abs\/1812.04948","author":"Karras Tero","year":"2018","unstructured":"Tero Karras, Samuli Laine, and Timo Aila. 2018. A Style-Based Generator Architecture for Generative Adversarial Networks. CoRR abs\/1812.04948 (2018). http:\/\/arxiv.org\/abs\/1812.04948"},{"key":"e_1_2_1_30_1","volume-title":"One millisecond face alignment with an ensemble of regression trees","author":"Kazemi Vahid","year":"2014","unstructured":"Vahid Kazemi and Josephine Sullivan. 2014. One millisecond face alignment with an ensemble of regression trees. In IEEE CVPR. 1867--1874."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201283"},{"key":"e_1_2_1_32_1","volume-title":"Skyler T Hawk, and AD Van Knippenberg.","author":"Langner Oliver","year":"2010","unstructured":"Oliver Langner, Ron Dotsch, Gijsbert Bijlstra, Daniel HJ Wigboldus, Skyler T Hawk, and AD Van Knippenberg. 2010. Presentation and validation of the Radboud Faces Database. Cognition and emotion 24, 8 (2010), 1377--1388."},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Chen Li Kun Zhou and Stephen Lin. 2014. Intrinsic Face Image Decomposition with Human Face Priors. In ECCV. 218--233.","DOI":"10.1007\/978-3-319-10602-1_15"},{"key":"e_1_2_1_34_1","volume-title":"IEEE CVPR","volume":"1","author":"Liu Ce","year":"2001","unstructured":"Ce Liu, Heung-Yeung Shum, and Chang-Shui Zhang. 2001. A two-step approach to hallucinating faces: global parametric model and local nonparametric model. In IEEE CVPR, Vol. 1. I--I."},{"key":"e_1_2_1_35_1","unstructured":"Loom.ai. 2019. 
http:\/\/www.loom.ai."},{"key":"e_1_2_1_36_1","volume-title":"The Chicago face database: A free stimulus set of faces and norming data. Behavior research methods 47, 4","author":"Ma Debbie S","year":"2015","unstructured":"Debbie S Ma, Joshua Correll, and Bernd Wittenbrink. 2015. The Chicago face database: A free stimulus set of faces and norming data. Behavior research methods 47, 4 (2015), 1122--1135."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275075"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/882262.882269"},{"key":"e_1_2_1_39_1","unstructured":"Pinscreen. 2019. http:\/\/www.pinscreen.com."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/383259.383317"},{"volume-title":"Photorealistic Facial Texture Inference Using Deep Neural Networks","author":"Saito Shunsuke","key":"e_1_2_1_41_1","unstructured":"Shunsuke Saito, Lingyu Wei, Liwen Hu, Koki Nagano, and Hao Li. 2017. Photorealistic Facial Texture Inference Using Deep Neural Networks. In IEEE CVPR."},{"volume-title":"FaceNet: A Unified Embedding for Face Recognition and Clustering","author":"Schroff Florian","key":"e_1_2_1_42_1","unstructured":"Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. FaceNet: A Unified Embedding for Face Recognition and Clustering. In IEEE CVPR."},{"key":"e_1_2_1_43_1","volume-title":"Jacobs","author":"Sengupta Soumyadip","year":"2018","unstructured":"Soumyadip Sengupta, Angjoo Kanazawa, Carlos D. Castillo, and David W. Jacobs. 2018. SfSNet: Learning Shape, Reflectance and Illuminance of Faces in the Wild. In IEEE CVPR."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.908964"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3306346.3322948"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601137"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3095816"},{"key":"e_1_2_1_48_1","unstructured":"K. 
Simonyan and A. Zisserman. 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR abs\/1409.1556 (2014)."},{"key":"e_1_2_1_49_1","volume-title":"Geometry Guided Adversarial Facial Expression Synthesis. arXiv preprint arXiv:1712.03474","author":"Song Lingxiao","year":"2017","unstructured":"Lingxiao Song, Zhihe Lu, Ran He, Zhenan Sun, and Tieniu Tan. 2017. Geometry Guided Adversarial Facial Expression Synthesis. arXiv preprint arXiv:1712.03474 (2017)."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3306346.3323008"},{"key":"e_1_2_1_51_1","volume-title":"Inception-ResNet and the Impact of Residual Connections on Learning. In ICLR Workshop.","author":"Szegedy Christian","year":"2016","unstructured":"Christian Szegedy, Sergey Ioffe, and Vincent Vanhoucke. 2016. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. In ICLR Workshop."},{"volume-title":"Face2face: Real-time face capture and reenactment of rgb videos","author":"Thies Justus","key":"e_1_2_1_52_1","unstructured":"Justus Thies, Michael Zollhofer, Marc Stamminger, Christian Theobalt, and Matthias Nie\u00dfner. 2016. Face2face: Real-time face capture and reenactment of rgb videos. In IEEE CVPR. 2387--2395."},{"volume-title":"High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs","author":"Wang Ting-Chun","key":"e_1_2_1_53_1","unstructured":"Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. 2018. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. In IEEE CVPR."},{"key":"e_1_2_1_54_1","first-page":"11","article-title":"Face Relighting from a Single Image under Arbitrary Unknown Lighting Conditions","volume":"31","author":"Wang Y.","year":"2009","unstructured":"Y. Wang, L. Zhang, Z. Liu, G. Hua, Z. Wen, Z. Zhang, and D. Samaras. 2009. Face Relighting from a Single Image under Arbitrary Unknown Lighting Conditions. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 31, 11 (Nov 2009), 1968--1984.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1001\/jamafacial.2018.0009"},{"volume-title":"Convolutional pose machines","author":"Wei Shih-En","key":"e_1_2_1_56_1","unstructured":"Shih-En Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh. 2016. Convolutional pose machines. In IEEE CVPR."},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275101"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2018.2833032"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298679"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201364"},{"key":"e_1_2_1_61_1","volume-title":"Learning Perspective Undistortion of Portraits. arXiv preprint arXiv:1905.07515","author":"Zhao Yajie","year":"2019","unstructured":"Yajie Zhao, Zeng Huang, Tianye Li, Weikai Chen, Chloe LeGendre, Xinglei Ren, Jun Xing, Ari Shapiro, and Hao Li. 2019. Learning Perspective Undistortion of Portraits. arXiv preprint arXiv:1905.07515 (2019)."},{"key":"e_1_2_1_62_1","unstructured":"Andrey Zhmoginov and Mark Sandler. 2016. Inverting Face Embeddings with Convolutional Neural Networks. 
https:\/\/arxiv.org\/abs\/1606.04189"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3355089.3356568","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3355089.3356568","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:44:41Z","timestamp":1750203881000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3355089.3356568"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,8]]},"references-count":62,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2019,12,31]]}},"alternative-id":["10.1145\/3355089.3356568"],"URL":"https:\/\/doi.org\/10.1145\/3355089.3356568","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"type":"print","value":"0730-0301"},{"type":"electronic","value":"1557-7368"}],"subject":[],"published":{"date-parts":[[2019,11,8]]},"assertion":[{"value":"2019-11-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}