{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T17:19:33Z","timestamp":1777569573145,"version":"3.51.4"},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2016,11,11]],"date-time":"2016-11-11T00:00:00Z","timestamp":1478822400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2016,11,11]]},"abstract":"<jats:p>In facial animation, the accurate shape and motion of the lips of virtual humans is of paramount importance, since subtle nuances in mouth expression strongly influence the interpretation of speech and the conveyed emotion. Unfortunately, passive photometric reconstruction of expressive lip motions, such as a kiss or rolling lips, is fundamentally hard even with multi-view methods in controlled studios. To alleviate this problem, we present a novel approach for fully automatic reconstruction of detailed and expressive lip shapes along with the dense geometry of the entire face, from just monocular RGB video. To this end, we learn the difference between inaccurate lip shapes found by a state-of-the-art monocular facial performance capture approach, and the true 3D lip shapes reconstructed using a high-quality multi-view system in combination with applied lip tattoos that are easy to track. A robust gradient domain regressor is trained to infer accurate lip shapes from coarse monocular reconstructions, with the additional help of automatically extracted inner and outer 2D lip contours. 
We quantitatively and qualitatively show that our monocular approach reconstructs higher quality lip shapes, even for complex shapes like a kiss or lip rolling, than previous monocular approaches. Furthermore, we compare the performance of person-specific and multi-person generic regression strategies and show that our approach generalizes to new individuals and general scenes, enabling high-fidelity reconstruction even from commodity video footage.<\/jats:p>","DOI":"10.1145\/2980179.2982419","type":"journal-article","created":{"date-parts":[[2016,11,11]],"date-time":"2016-11-11T17:02:54Z","timestamp":1478883774000},"page":"1-11","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["Corrective 3D reconstruction of lips from monocular video"],"prefix":"10.1145","volume":"35","author":[{"given":"Pablo","family":"Garrido","sequence":"first","affiliation":[{"name":"Max Planck Institute for Informatics"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Zollh\u00f6fer","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Informatics"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chenglei","family":"Wu","sequence":"additional","affiliation":[{"name":"ETH Zurich"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Derek","family":"Bradley","sequence":"additional","affiliation":[{"name":"Disney Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Patrick","family":"P\u00e9rez","sequence":"additional","affiliation":[{"name":"Technicolor"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thabo","family":"Beeler","sequence":"additional","affiliation":[{"name":"Disney Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christian","family":"Theobalt","sequence":"additional","affiliation":[{"name":"Max Planck Institute for 
Informatics"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,12,5]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/566654.566592"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCG.2010.65"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503385.2503387"},
{"key":"e_1_2_2_4_1","volume-title":"Proc. MVA, 145--148","author":"Anderson R.","unstructured":"Anderson, R., Stenger, B., and Cipolla, R. 2013. Lip tracking for 3D face registration. In Proc. MVA, 145--148."},
{"key":"e_1_2_2_5_1","volume-title":"Proc. ACCV, 1--6.","author":"Barnard M.","unstructured":"Barnard, M., Holden, E. J., and Owens, R. 2002. Lip tracking using pattern matching snakes. In Proc. ACCV, 1--6."},
{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1778777"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964970"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185613"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661285"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766924"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485895.2485915"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276419"},
{"key":"e_1_2_2_13_1","volume-title":"Pattern Recognition and Machine Learning (Information Science and Statistics)","author":"Bishop C. M.","unstructured":"Bishop, C. M. 2006. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag New York, Inc., Secaucus, NJ, USA."},
{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/311535.311556"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/965400.965469"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461912.2461976"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1778778"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601204"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766943"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.449"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.927467"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2070781.2024164"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2006.298"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601133"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2004.826754"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2638549"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2508363.2508380"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12552"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2890493"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2070781.2024163"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12053"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/280814.280822"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1137\/0907079"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.2307\/1271436"},
{"key":"e_1_2_2_35_1","volume-title":"Proc. CVPR, 1675--1683","author":"Hsieh P.-L.","unstructured":"Hsieh, P.-L., Ma, C., Yu, J., and Li, H. 2015. Unconstrained realtime facial performance capture. In Proc. CVPR, 1675--1683."},
{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766931"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964969"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766974"},
{"key":"e_1_2_2_39_1","volume-title":"Proc. ICCV, 370--375","author":"Kaucic R.","unstructured":"Kaucic, R., and Blake, A. 1998. Accurate, real-time, unadorned lip tracking. In Proc. ICCV, 370--375."},
{"key":"e_1_2_2_40_1","volume-title":"Proc. ISVC, 51--62","author":"Kawai M.","unstructured":"Kawai, M., Iwao, T., Maejima, A., and Morishima, S. 2014. Automatic photorealistic 3D inner mouth restoration from frontal images. In Proc. ISVC, 51--62."},
{"key":"e_1_2_2_41_1","volume-title":"Proc. ECCV, 341--353","author":"Kemelmacher-Shlizerman I.","unstructured":"Kemelmacher-Shlizerman, I., Sankar, A., Shechtman, E., and Seitz, S. M. 2010. Being John Malkovich. In Proc. ECCV, 341--353."},
{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/3DIMPVT.2012.67"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCG.2010.41"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461912.2462019"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818122"},
{"key":"e_1_2_2_46_1","volume-title":"Proc. CVPR, 1490--1497","author":"Luo L.","unstructured":"Luo, L., Li, H., Paris, S., Weise, T., Pauly, M., and Rusinkiewicz, S. 2012. Multi-view hair capture using orientation fields. In Proc. CVPR, 1490--1497."},
{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766894"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuroimage.2011.07.024"},
{"key":"e_1_2_2_49_1","volume-title":"Proc. ICIP, 2437--2440","author":"Nguyen Q. D.","unstructured":"Nguyen, Q. D., and Milgram, M. 2009. Semi adaptive appearance models for lip tracking. In Proc. ICIP, 2437--2440."},
{"key":"e_1_2_2_50_1","unstructured":"Pighin F. and Lewis J. 2006. Performance-driven facial animation. In ACM Siggraph Courses."},
{"key":"e_1_2_2_51_1","volume-title":"Proc. ICCV, 1034--1041","author":"Saragih J. M.","unstructured":"Saragih, J. M., Lucey, S., and Cohn, J. F. 2009. Face alignment through subspace constrained mean-shifts. In Proc. ICCV, 1034--1041."},
{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-010-0380-4"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661290"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/1073204.1073208"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015706.1015736"},
{"key":"e_1_2_2_56_1","volume-title":"Proc. ECCV, 796--812","author":"Suwajanakorn S.","unstructured":"Suwajanakorn, S., Kemelmacher-Shlizerman, I., and Seitz, S. M. 2014. Total moving face reconstruction. In Proc. ECCV, 796--812."},
{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.450"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818056"},
{"key":"e_1_2_2_59_1","volume-title":"Proc. CVPR.","author":"Thies J.","unstructured":"Thies, J., Zollh\u00f6fer, M., Stamminger, M., Theobalt, C., and Niessner, M. 2016. Face2Face: Real-time face capture and reenactment of RGB videos. In Proc. CVPR."},
{"key":"e_1_2_2_60_1","volume-title":"Proc. ACCV, 1--6.","author":"Tian Y.-L.","unstructured":"Tian, Y.-L., Kanade, T., and Cohn, J. F. 2000. Robust lip tracking by combining shape, color and motion. In Proc. ACCV, 1--6."},
{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366206"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/1073204.1073209"},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2004.04.016"},{"key":"e_1_2_2_64_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8659.2004.00800.x"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/1599470.1599472"},{"key":"e_1_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964972"},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/1073204.1073258"},{"key":"e_1_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/1141911.1141987"},{"key":"e_1_2_2_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/97879.97906"}],
"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2980179.2982419","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2980179.2982419","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:49:57Z","timestamp":1750218597000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2980179.2982419"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,11,11]]},"references-count":69,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2016,11,11]]}},"alternative-id":["10.1145\/2980179.2982419"],"URL":"https:\/\/doi.org\/10.1145\/2980179.2982419","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,11,11]]},"assertion":[{"value":"2016-12-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label
":"Publication History"}}]}}