{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,9]],"date-time":"2025-12-09T04:21:36Z","timestamp":1765254096261,"version":"3.41.0"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,8,12]],"date-time":"2020-08-12T00:00:00Z","timestamp":1597190400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2020,8,31]]},"abstract":"<jats:p>\n            3D photography is a new medium that allows viewers to more fully experience a captured moment. In this work, we refer to a\n            <jats:italic toggle=\"yes\">3D photo<\/jats:italic>\n            as one that displays parallax induced by moving the viewpoint (as opposed to a stereo pair with a fixed viewpoint). 3D photos are static in time, like traditional photos, but are displayed with interactive parallax on mobile or desktop screens, as well as on Virtual Reality devices, where viewing it\n            <jats:italic toggle=\"yes\">also<\/jats:italic>\n            includes stereo. We present an end-to-end system for creating and viewing 3D photos, and the algorithmic and design choices therein. Our 3D photos are captured in a single shot and processed directly on a mobile device. The method starts by estimating depth from the 2D input image using a new monocular depth estimation network that is optimized for mobile devices. It performs competitively to the state-of-the-art, but has lower latency and peak memory consumption and uses an order of magnitude fewer parameters. The resulting depth is lifted to a layered depth image, and new geometry is synthesized in parallax regions. We synthesize color texture and structures in the parallax regions as well, using an inpainting network, also optimized for mobile devices, on the LDI directly. Finally, we convert the result into a mesh-based representation that can be efficiently transmitted and rendered even on low-end devices and over poor network connections. Altogether, the processing takes just a few seconds on a mobile device, and the result can be instantly viewed and shared. We perform extensive quantitative evaluation to validate our system and compare its new components against the current state-of-the-art.\n          <\/jats:p>","DOI":"10.1145\/3386569.3392420","type":"journal-article","created":{"date-parts":[[2020,8,12]],"date-time":"2020-08-12T11:44:27Z","timestamp":1597232667000},"update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":59,"title":["One shot 3D photography"],"prefix":"10.1145","volume":"39","author":[{"given":"Johannes","family":"Kopf","sequence":"first","affiliation":[{"name":"Facebook"}]},{"given":"Kevin","family":"Matzen","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Suhib","family":"Alsisan","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Ocean","family":"Quigley","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Francis","family":"Ge","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Yangming","family":"Chong","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Josh","family":"Patterson","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Jan-Michael","family":"Frahm","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Shu","family":"Wu","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Matthew","family":"Yu","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Peizhao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Zijian","family":"He","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Peter","family":"Vajda","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Ayush","family":"Saraf","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Michael","family":"Cohen","sequence":"additional","affiliation":[{"name":"Facebook"}]}],"member":"320","published-online":{"date-parts":[[2020,8,12]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"Weifeng Chen Zhao Fu Dawei Yang and Jia Deng. 2016. Single-image depth perception in the wild. In Advances in Neural Information Processing Systems. 730--738."},{"key":"e_1_2_2_2_1","volume-title":"Vijayalakshmi Srinivasan, and Kailash Gopalakrishnan.","author":"Choi Jungwook","year":"2018","unstructured":"Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, and Kailash Gopalakrishnan. 2018. Pact: Parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085 (2018)."},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.350"},{"volume-title":"ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Dai Xiaoliang","key":"e_1_2_2_4_1","unstructured":"Xiaoliang Dai, Yangqing Jia, Peter Vajda, Matt Uyttendaele, Niraj K. Jha, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, and et al. 2019. ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/1370949"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.3138\/FM57-6770-U75U-7727"},{"key":"e_1_2_2_7_1","volume-title":"DeepView: View Synthesis With Learned Gradient Descent. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Flynn John","year":"2019","unstructured":"John Flynn, Michael Broxton, Paul Debevec, Matthew DuVall, Graham Fyffe, Ryan Overbeck, Noah Snavely, and Richard Tucker. 2019. DeepView: View Synthesis With Learned Gradient Descent. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_8_1","first-page":"740","article-title":"Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue","volume":"2016","author":"Garg Ravi","year":"2016","unstructured":"Ravi Garg, Vijay Kumar B.G., Gustavo Carneiro, and Ian Reid. 2016. Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue. In Computer Vision - ECCV 2016. 740--756.","journal-title":"Computer Vision - ECCV"},{"volume-title":"Unsupervised Monocular Depth Estimation with Left-Right Consistency. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Godard Cl\u00e9ment","key":"e_1_2_2_9_1","unstructured":"Cl\u00e9ment Godard, Oisin Mac Aodha, and Gabriel J. Brostow. 2017. Unsupervised Monocular Depth Estimation with Left-Right Consistency. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"volume-title":"International Conference on Computer Vision (ICCV).","author":"Godard Cl\u00e9ment","key":"e_1_2_2_10_1","unstructured":"Cl\u00e9ment Godard, Oisin Mac Aodha, Michael Firman, and Gabriel J. Brostow. 2019. Digging into Self-Supervised Monocular Depth Prediction. In International Conference on Computer Vision (ICCV)."},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130828"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201384"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/258734.258854"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00286"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2836318"},{"key":"e_1_2_2_16_1","volume-title":"Semi-Supervised Deep Learning for Monocular Depth Map Prediction. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Kuznietsov Yevhen","year":"2017","unstructured":"Yevhen Kuznietsov, Jorg Stuckler, and Bastian Leibe. 2017. Semi-Supervised Deep Learning for Monocular Depth Map Prediction. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_17_1","unstructured":"Zhengqi Li and Noah Snavely. 2018. MegaDepth: Learning Single-View Depth Prediction from Internet Photos. In Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2964733"},{"key":"e_1_2_2_19_1","volume-title":"Image Inpainting for Irregular Holes Using Partial Convolutions. In The European Conference on Computer Vision (ECCV).","author":"Liu Guilin","year":"2018","unstructured":"Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, and Bryan Catanzaro. 2018b. Image Inpainting for Irregular Holes Using Partial Convolutions. In The European Conference on Computer Vision (ECCV)."},{"key":"e_1_2_2_20_1","unstructured":"Guilin Liu Kevin J. Shih Ting-Chun Wang Fitsum A. Reda Karan Sapra Zhiding Yu Andrew Tao and Bryan Catanzaro. 2018c. Partial Convolution based Padding. In arXiv preprint arXiv:1811.11718."},{"key":"e_1_2_2_21_1","volume-title":"Geometry-Aware Deep Network for Single-Image Novel View Synthesis. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR. 4616--4624","author":"Liu Miaomiao","year":"2018","unstructured":"Miaomiao Liu, Xuming He, and Mathieu Salzmann. 2018a. Geometry-Aware Deep Network for Single-Image Novel View Synthesis. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR. 4616--4624."},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275099"},{"key":"e_1_2_2_23_1","volume-title":"Object Scene Flow for Autonomous Vehicles. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)","author":"Menze Moritz","year":"2015","unstructured":"Moritz Menze and Andreas Geiger. 2015. Object Scene Flow for Autonomous Vehicles. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015. 3061--3070."},{"key":"e_1_2_2_24_1","volume-title":"Neural Rerendering in the Wild. arXiv preprint","author":"Meshry Moustafa","year":"2019","unstructured":"Moustafa Meshry, Dan B Goldman, Sameh Khamis, Hugues Hoppe, Rohit Pandey, Noah Snavely, and Ricardo Martin-Brualla. 2019. Neural Rerendering in the Wild. arXiv preprint (2019)."},{"key":"e_1_2_2_25_1","volume-title":"Ravi Ramamoorthi, Ren Ng, and Abhishek Kar.","author":"Mildenhall Ben","year":"2019","unstructured":"Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. 2019. Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines. ACM Transactions on Graphics (TOG) (2019)."},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356528"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/383259.383310"},{"volume-title":"Automation Test in Europe Conference Exhibition (DATE). 1703--1708","author":"Peluso V.","key":"e_1_2_2_28_1","unstructured":"V. Peluso, A. Cipolletta, A. Calimera, M. Poggi, F. Tosi, and S. Mattoccia. 2019. Enabling Energy-Efficient Unsupervised Monocular Depth Estimation on ARMv7-Based Platforms. In 2019 Design, Automation Test in Europe Conference Exhibition (DATE). 1703--1708."},{"key":"e_1_2_2_29_1","volume-title":"IEEE\/JRS Conference on Intelligent Robots and Systems (IROS).","author":"Poggi Matteo","year":"2018","unstructured":"Matteo Poggi, Filippo Aleotti, Fabio Tosi, and Stefano Mattoccia. 2018. Towards realtime unsupervised monocular depth estimation on CPU. In IEEE\/JRS Conference on Intelligent Robots and Systems (IROS)."},{"key":"e_1_2_2_30_1","volume-title":"SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation. The IEEE International Conference on Computer Vision (ICCV) Workshops","author":"Ramamonjisoa Michael","year":"2019","unstructured":"Michael Ramamonjisoa and Vincent Lepetit. 2019. SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation. The IEEE International Conference on Computer Vision (ICCV) Workshops (2019)."},{"key":"e_1_2_2_31_1","volume-title":"Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer. arXiv:1907.01341","author":"Ranftl Rene","year":"2019","unstructured":"Rene Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. 2019. Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer. arXiv:1907.01341 (2019)."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_2_2_34_1","first-page":"1161","article-title":"Learning Depth from Single Monocular Images","volume":"18","author":"Saxena Ashutosh","year":"2006","unstructured":"Ashutosh Saxena, Sung H. Chung, and Andrew Y. Ng. 2006. Learning Depth from Single Monocular Images. Advances in Neural Information Processing Systems 18 (2006), 1161--1168.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/280814.280882"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00254"},{"key":"e_1_2_2_38_1","volume-title":"Pushing the Boundaries of View Extrapolation with Multiplane Images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Srinivasan Pratul P.","year":"2019","unstructured":"Pratul P. Srinivasan, Richard Tucker, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng, and Noah Snavely. 2019. Pushing the Boundaries of View Extrapolation with Multiplane Images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.246"},{"volume-title":"Real-Time Self-Adaptive Deep Stereo. In 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 195--204","author":"Tonioni A.","key":"e_1_2_2_40_1","unstructured":"A. Tonioni, F. Tosi, M. Poggi, S. Mattoccia, and L. D. Stefano. 2019. Real-Time Self-Adaptive Deep Stereo. In 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 195--204."},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2019.00046"},{"key":"e_1_2_2_42_1","volume-title":"FastDepth: Fast Monocular Depth Estimation on Embedded Systems. In IEEE International Conference on Robotics and Automation (ICRA).","author":"Wofk Diana","year":"2019","unstructured":"Diana Wofk, Fangchang Ma, Tien-Ju Yang, Sertac Karaman, and Vivienne Sze. 2019. FastDepth: Fast Monocular Depth Estimation on Embedded Systems. In IEEE International Conference on Robotics and Automation (ICRA)."},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01099"},{"key":"e_1_2_2_44_1","volume-title":"Monocular Relative Depth Perception With Web Stereo Data Supervision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Xian Ke","year":"2018","unstructured":"Ke Xian, Chunhua Shen, Zhiguo Cao, Hao Lu, Yang Xiao, Ruibo Li, and Zhenbo Luo. 2018. Monocular Relative Depth Perception With Web Stereo Data Supervision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_2_2_46_1","volume-title":"Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Zhang Yinda","year":"2017","unstructured":"Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, Joon-Young Lee, Hailin Jin, and Thomas Funkhouser. 2017. Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.660"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201323"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015706.1015766"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3386569.3392420","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3386569.3392420","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,25]],"date-time":"2025-06-25T05:37:16Z","timestamp":1750829836000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3386569.3392420"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,12]]},"references-count":49,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,8,31]]}},"alternative-id":["10.1145\/3386569.3392420"],"URL":"https:\/\/doi.org\/10.1145\/3386569.3392420","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"type":"print","value":"0730-0301"},{"type":"electronic","value":"1557-7368"}],"subject":[],"published":{"date-parts":[[2020,8,12]]},"assertion":[{"value":"2020-08-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}