{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T15:57:58Z","timestamp":1781539078363,"version":"3.54.5"},"publisher-location":"New York, NY, USA","reference-count":53,"publisher":"ACM","license":[{"start":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T00:00:00Z","timestamp":1781481600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/legalcode"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2026,6,16]]},"DOI":"10.1145\/3805622.3810840","type":"proceedings-article","created":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T14:42:57Z","timestamp":1781534577000},"page":"996-1005","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["GeoPro-Depth: Geometrically Consistent Prompting for Robust Metric Depth Completion"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-1615-1723","authenticated-orcid":false,"given":"Xun","family":"Fang","sequence":"first","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-2603-2797","authenticated-orcid":false,"given":"Zixuan","family":"Hua","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0467-4347","authenticated-orcid":false,"given":"Lihua","family":"Zhang","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2026,6,15]]},"reference":[{"key":"e_1_3_3_1_2_2","unstructured":"Josh Achiam Steven Adler Sandhini Agarwal Lama Ahmad Ilge Akkaya Florencia\u00a0Leoni 
Aleman Diogo Almeida Janko Altenschmidt Sam Altman Shyamal Anadkat et\u00a0al. 2023. Gpt-4 technical report. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2303.08774 (2023)."},{"key":"e_1_3_3_1_3_2","unstructured":"Shariq\u00a0Farooq Bhat Reiner Birkl Diana Wofk Peter Wonka and Matthias M\u00fcller. 2023. Zoedepth: Zero-shot transfer by combining relative and metric depth. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2302.12288 (2023)."},{"key":"e_1_3_3_1_4_2","unstructured":"Aleksei Bochkovskii Ama\u00ebl Delaunoy Hugo Germain Marcel Santos Yichao Zhou Stephan\u00a0R Richter and Vladlen Koltun. 2024. Depth pro: Sharp monocular metric depth in less than a second. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2410.02073 (2024)."},{"key":"e_1_3_3_1_5_2","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared\u00a0D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et\u00a0al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020) 1877\u20131901."},{"key":"e_1_3_3_1_6_2","unstructured":"Nicolas Carion Laura Gustafson Yuan-Ting Hu Shoubhik Debnath Ronghang Hu Didac Suris Chaitanya Ryali Kalyan\u00a0Vasudev Alwala Haitham Khedr Andrew Huang Jie Lei Tengyu Ma Baishan Guo Arpit Kalla Markus Marks Joseph Greer Meng Wang Peize Sun Roman R\u00e4dle Triantafyllos Afouras Effrosyni Mavroudi Katherine Xu Tsung-Han Wu Yu Zhou Liliane Momeni Rishi Hazra Shuangrui Ding Sagar Vaze Francois Porcher Feng Li Siyuan Li Aishwarya Kamath Ho\u00a0Kei Cheng Piotr Doll\u00e1r Nikhila Ravi Kate Saenko Pengchuan Zhang and Christoph Feichtenhofer. 2025. SAM 3: Segment Anything with Concepts. 
arxiv:https:\/\/arXiv.org\/abs\/2511.16719\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/2511.16719"},{"key":"e_1_3_3_1_7_2","first-page":"54","volume-title":"European Conference on Computer Vision","author":"Cecille Aur\u00e9lien","year":"2024","unstructured":"Aur\u00e9lien Cecille, Stefan Duffner, Franck Davoine, Thibault Neveu, and R\u00e9mi Agier. 2024. Groco: Ground constraint for metric self-supervised monocular depth. In European Conference on Computer Vision. Springer, 54\u201369."},{"key":"e_1_3_3_1_8_2","doi-asserted-by":"crossref","unstructured":"Yashashwee Chakrabarty Akanksha Dixit and Smruti\u00a0R Sarangi. 2025. Voxdepth: Rectification of depth images on edge devices. ACM Transactions on Embedded Computing Systems 24 6 (2025) 1\u201327.","DOI":"10.1145\/3763793"},{"key":"e_1_3_3_1_9_2","unstructured":"Minghao Chen Vladlen Koltun and Ren\u00e9 Ranftl. 2024. Unpaired Depth Super-Resolution in the Wild. IEEE Transactions on Pattern Analysis and Machine Intelligence (2024)."},{"key":"e_1_3_3_1_10_2","unstructured":"Alexey Dosovitskiy. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2010.11929 (2020)."},{"key":"e_1_3_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.304"},{"key":"e_1_3_3_1_12_2","unstructured":"David Eigen Christian Puhrsch and Rob Fergus. 2014. Depth map prediction from a single image using a multi-scale deep network. Advances in neural information processing systems 27 (2014)."},{"key":"e_1_3_3_1_13_2","first-page":"241","volume-title":"European Conference on Computer Vision","author":"Fu Xiao","year":"2024","unstructured":"Xiao Fu, Wei Yin, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, and Xiaoxiao Long. 2024. Geowizard: Unleashing the diffusion priors for 3d geometry estimation from a single image. In European Conference on Computer Vision. 
Springer, 241\u2013258."},{"key":"e_1_3_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.699"},{"key":"e_1_3_3_1_15_2","first-page":"6851","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Gonzalez Juan\u00a0Luis","year":"2021","unstructured":"Juan\u00a0Luis Gonzalez and Munchurl Kim. 2021. Plade-net: Towards pixel-level accuracy for self-supervised single-view depth estimation with neural positional encoding and distilled matting loss. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 6851\u20136860."},{"key":"e_1_3_3_1_16_2","unstructured":"Jing He Haodong Li Wei Yin Yixun Liang Leheng Li Kaiqiang Zhou Hongbo Zhang Bingbing Liu and Ying-Cong Chen. 2024. Lotus: Diffusion-based visual foundation model for high-quality dense prediction. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2409.18124 (2024)."},{"key":"e_1_3_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/1186822.1073232"},{"key":"e_1_3_3_1_18_2","doi-asserted-by":"crossref","unstructured":"Derek Hoiem Alexei\u00a0A Efros and Martial Hebert. 2007. Recovering surface layout from an image. International Journal of Computer Vision 75 1 (2007) 151\u2013172.","DOI":"10.1007\/s11263-006-0031-y"},{"key":"e_1_3_3_1_19_2","doi-asserted-by":"crossref","unstructured":"Mu Hu Wei Yin Chi Zhang Zhipeng Cai Xiaoxiao Long Hao Chen Kaixuan Wang Gang Yu Chunhua Shen and Shaojie Shen. 2024. Metric3d v2: A versatile monocular geometric foundation model for zero-shot metric depth and surface normal estimation. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2024).","DOI":"10.1109\/TPAMI.2024.3444912"},{"key":"e_1_3_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49660.2025.10887546"},{"key":"e_1_3_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19827-4_41"},{"key":"e_1_3_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00907"},{"key":"e_1_3_3_1_23_2","unstructured":"Mohammad Khalilian Mohammad\u00a0Ali Rahmani Taha Akhavan Neda Abedinzadeh Maryam Karami and Mahdi\u00a0Abolfazli Esfahani. 2025. DepthRL: a weakly supervised approach for monocular depth estimation using deep reinforcement learning. Complex & Intelligent Systems (2025) 1\u201314."},{"key":"e_1_3_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"e_1_3_3_1_25_2","doi-asserted-by":"crossref","unstructured":"Brian Lester Rami Al-Rfou and Noah Constant. 2021. The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2104.08691 (2021).","DOI":"10.18653\/v1\/2021.emnlp-main.243"},{"key":"e_1_3_3_1_26_2","unstructured":"Yufei Li Yucheng Xue Yingyan Tan Chao Tian Yihua Xu and Alex\u00a0C Kot. 2024. SparseDC: Depth completion from sparse and non-uniform inputs. Information Fusion 111 (2024) 102495."},{"key":"e_1_3_3_1_27_2","unstructured":"Haotong Lin Sili Chen Jun\u00a0Hao Liew Donny\u00a0Y. Chen Zhenyu Li Guang Shi Jiashi Feng and Bingyi Kang. 2025. Depth Anything 3: Recovering the visual space from any views. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2511.10647 (2025)."},{"key":"e_1_3_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52734.2025.01591"},{"key":"e_1_3_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5539823"},{"key":"e_1_3_3_1_30_2","doi-asserted-by":"crossref","unstructured":"Haotian Liu Chunyuan Li Qingyang Wu and Yong\u00a0Jae Lee. 2023. Visual instruction tuning. 
Advances in neural information processing systems 36 (2023) 34892\u201334916.","DOI":"10.52202\/075280-1516"},{"key":"e_1_3_3_1_31_2","unstructured":"Maxime Oquab Timoth\u00e9e Darcet Theo Moutakanni Huy\u00a0V. Vo Marc Szafraniec Vasil Khalidov Pierre Fernandez Daniel Haziza Francisco Massa Alaaeldin El-Nouby Russell Howes Po-Yao Huang Hu Xu Vasu Sharma Shang-Wen Li Wojciech Galuba Mike Rabbat Mido Assran Nicolas Ballas Gabriel Synnaeve Ishan Misra Herve Jegou Julien Mairal Patrick Labatut Armand Joulin and Piotr Bojanowski. 2023. DINOv2: Learning Robust Visual Features without Supervision."},{"key":"e_1_3_3_1_32_2","volume-title":"Advances in Neural Information Processing Systems","author":"Park Jun\u00a0Ho","year":"2024","unstructured":"Jun\u00a0Ho Park and Hum\u00a0Gil Jeon. 2024. A Simple yet Universal Framework for Depth Completion. In Advances in Neural Information Processing Systems."},{"key":"e_1_3_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00941"},{"key":"e_1_3_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01196"},{"key":"e_1_3_3_1_35_2","doi-asserted-by":"crossref","unstructured":"Ren\u00e9 Ranftl Katrin Lasinger David Hafner Konrad Schindler and Vladlen Koltun. 2020. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE transactions on pattern analysis and machine intelligence 44 3 (2020) 1623\u20131637.","DOI":"10.1109\/TPAMI.2020.3019967"},{"key":"e_1_3_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_3_3_1_37_2","unstructured":"Ashutosh Saxena Sung Chung and Andrew Ng. 2005. Learning depth from single monocular images. Advances in neural information processing systems 18 (2005)."},{"key":"e_1_3_3_1_38_2","doi-asserted-by":"crossref","unstructured":"Ashutosh Saxena Sung\u00a0H Chung and Andrew\u00a0Y Ng. 2008. 3-d depth reconstruction from a single still image. 
International journal of computer vision 76 1 (2008) 53\u201369.","DOI":"10.1007\/s11263-007-0071-y"},{"key":"e_1_3_3_1_39_2","doi-asserted-by":"crossref","unstructured":"Ashutosh Saxena Min Sun and Andrew\u00a0Y Ng. 2008. Make3d: Learning 3d scene structure from a single still image. IEEE transactions on pattern analysis and machine intelligence 31 5 (2008) 824\u2013840.","DOI":"10.1109\/TPAMI.2008.132"},{"key":"e_1_3_3_1_40_2","unstructured":"Saurabh Saxena Abhishek Kar Mohammad Norouzi and David\u00a0J Fleet. 2023. Monocular depth estimation using diffusion models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2302.14816 (2023)."},{"key":"e_1_3_3_1_41_2","unstructured":"Yingliang Shen Bingbing Dai Jinmiao Tang Lingxi Li and Jianwei Song. 2016. Filling Kinect depth holes via position-guided matrix completion. Neurocomputing 211 (2016) 13\u201320."},{"key":"e_1_3_3_1_42_2","unstructured":"Julian Straub Thomas Whelan et\u00a0al. 2019. The Replica dataset: A digital replica of indoor spaces. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/1906.05797 (2019)."},{"key":"e_1_3_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6385773"},{"key":"e_1_3_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00932"},{"key":"e_1_3_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2017.00012"},{"key":"e_1_3_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.02052"},{"key":"e_1_3_3_1_47_2","unstructured":"Yuli Wang Rui Zhang Jianhua Li and Liang Chen. 2024. An Efficient Diffusion Depth Image Inpainting Method Based on RGB-Guided. Proceedings of the 2024 6th International Conference on Image Video and Signal Processing (2024) 87\u201394."},{"key":"e_1_3_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52734.2025.01076"},{"key":"e_1_3_3_1_49_2","unstructured":"Chuhua Xian Kun Qian Zitian Zhang and Charlie\u00a0CL Wang. 2020. Multi-scale progressive fusion learning for depth map super-resolution. 
arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2011.11865 (2020)."},{"key":"e_1_3_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00987"},{"key":"e_1_3_3_1_51_2","doi-asserted-by":"crossref","unstructured":"Lihe Yang Bingyi Kang Zilong Huang Zhen Zhao Xiaogang Xu Jiashi Feng and Hengshuang Zhao. 2024. Depth anything v2. Advances in Neural Information Processing Systems 37 (2024) 21875\u201321911.","DOI":"10.52202\/079017-0688"},{"key":"e_1_3_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00008"},{"key":"e_1_3_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00355"},{"key":"e_1_3_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00026"}],"event":{"name":"ICMR '26: International Conference on Multimedia Retrieval","location":"Amsterdam The Netherlands","acronym":"ICMR '26","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 2026 International Conference on Multimedia Retrieval"],"original-title":[],"deposited":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T15:48:50Z","timestamp":1781538530000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3805622.3810840"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,6,15]]},"references-count":53,"alternative-id":["10.1145\/3805622.3810840","10.1145\/3805622"],"URL":"https:\/\/doi.org\/10.1145\/3805622.3810840","relation":{},"subject":[],"published":{"date-parts":[[2026,6,15]]},"assertion":[{"value":"2026-06-15","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}