{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:40:45Z","timestamp":1760233245001,"version":"build-2065373602"},"reference-count":34,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2022,12,26]],"date-time":"2022-12-26T00:00:00Z","timestamp":1672012800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62101256","2021M691591"],"award-info":[{"award-number":["62101256","2021M691591"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["62101256","2021M691591"],"award-info":[{"award-number":["62101256","2021M691591"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Due to the low cost and easy deployment, self-supervised depth completion has been widely studied in recent years. In this work, a self-supervised depth completion method is designed based on multi-modal spatio-temporal consistency (MSC). The self-supervised depth completion nowadays faces other problems: moving objects, occluded\/dark light\/low texture parts, long-distance completion, and cross-modal fusion. In the face of these problems, the most critical novelty of this work lies in that the self-supervised mechanism is designed to train the depth completion network by MSC constraint. It not only makes better use of depth-temporal data, but also plays the advantage of photometric-temporal constraint. 
With this MSC-constrained self-supervised mechanism, the overall system outperforms many other self-supervised networks and even exceeds some supervised networks.<\/jats:p>","DOI":"10.3390\/rs15010135","type":"journal-article","created":{"date-parts":[[2022,12,27]],"date-time":"2022-12-27T02:53:11Z","timestamp":1672109591000},"page":"135","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Self-Supervised Depth Completion Based on Multi-Modal Spatio-Temporal Consistency"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1074-5757","authenticated-orcid":false,"given":"Quan","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210000, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1026-2824","authenticated-orcid":false,"given":"Xiaoyu","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210000, China"}]},{"given":"Xingguo","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210000, China"}]},{"given":"Jing","family":"Han","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210000, China"}]},{"given":"Yi","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210000, China"}]},{"given":"Jiang","family":"Yue","sequence":"additional","affiliation":[{"name":"College of Science, Hohai University, Nanjing 210000, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2022,12,26]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"Newcombe, R.A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A.J., Kohi, P., Shotton, J., Hodges, S., and Fitzgibbon, A. (2011, January 26\u201329). Kinectfusion: Real-time dense surface mapping and tracking. Proceedings of the 2011 10th IEEE International Symposium on Mixed and Augmented Reality, Basel, Switzerland.","key":"ref_1","DOI":"10.1109\/ISMAR.2011.6092378"},{"doi-asserted-by":"crossref","unstructured":"Zhang, J., and Singh, S. (2014, January 12\u201316). LOAM: Lidar odometry and mapping in real-time. Proceedings of the Robotics: Science and Systems, Berkeley, CA, USA.","key":"ref_2","DOI":"10.15607\/RSS.2014.X.007"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"14588","DOI":"10.1109\/TVT.2020.3031330","article-title":"Coarse-to-fine segmentation on lidar point clouds in spherical coordinate and beyond","volume":"69","author":"Li","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1364","DOI":"10.1109\/TVT.2015.2388780","article-title":"StructSLAM: Visual SLAM with building structure lines","volume":"64","author":"Zhou","year":"2015","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"11654","DOI":"10.1109\/TITS.2021.3106055","article-title":"Self-Supervised Depth Completion From Direct Visual-LiDAR Odometry in Autonomous Driving","volume":"23","author":"Song","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"doi-asserted-by":"crossref","unstructured":"Ma, F., Cavalheiro, G.V., and Karaman, S. (2019, January 20\u201324). Self-supervised sparse-to-dense: Self-supervised depth completion from lidar and monocular camera. 
Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","key":"ref_6","DOI":"10.1109\/ICRA.2019.8793637"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"722","DOI":"10.1109\/TITS.2020.3023541","article-title":"Deep learning for image and point cloud fusion in autonomous driving: A review","volume":"23","author":"Cui","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"doi-asserted-by":"crossref","unstructured":"Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., and Geiger, A. (2017, January 10\u201312). Sparsity invariant cnns. Proceedings of the 2017 International Conference on 3D Vision (3DV), Qingdao, China.","key":"ref_8","DOI":"10.1109\/3DV.2017.00012"},{"doi-asserted-by":"crossref","unstructured":"Jaritz, M., De Charette, R., Wirbel, E., Perrotton, X., and Nashashibi, F. (2018, January 5\u20138). Sparse and dense data with cnns: Depth completion and semantic segmentation. Proceedings of the 2018 International Conference on 3D Vision (3DV), Verona, Italy.","key":"ref_9","DOI":"10.1109\/3DV.2018.00017"},{"unstructured":"Eldesokey, A., Felsberg, M., and Khan, F.S. (2018). Propagating confidences through cnns for sparse data regression. arXiv.","key":"ref_10"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"2423","DOI":"10.1109\/TPAMI.2019.2929170","article-title":"Confidence propagation through cnns for guided sparse depth regression","volume":"42","author":"Eldesokey","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"126323","DOI":"10.1109\/ACCESS.2020.3008404","article-title":"Revisiting sparsity invariant convolution: A network for image guided depth completion","volume":"8","author":"Yan","year":"2020","journal-title":"IEEE Access"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3429","DOI":"10.1109\/TIP.2019.2960589","article-title":"Hms-net: Hierarchical multi-scale sparsity-invariant network for sparse depth completion","volume":"29","author":"Huang","year":"2019","journal-title":"IEEE Trans. Image Process."},{"doi-asserted-by":"crossref","unstructured":"Ma, F., and Karaman, S. (2018, January 21\u201325). Sparse-to-dense: Depth prediction from sparse depth samples and a single image. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia.","key":"ref_14","DOI":"10.1109\/ICRA.2018.8460184"},{"doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","key":"ref_15","DOI":"10.1109\/CVPR.2016.90"},{"doi-asserted-by":"crossref","unstructured":"Wei, M., Zhu, M., Zhang, Y., Sun, J., and Wang, J. (2022). An Efficient Information-Reinforced Lidar Deep Completion Network without RGB Guided. Remote. Sens., 14.","key":"ref_16","DOI":"10.3390\/rs14194689"},{"doi-asserted-by":"crossref","unstructured":"Hu, M., Wang, S., Li, B., Ning, S., Fan, L., and Gong, X. (June, January 30). Penet: Towards precise and efficient image guided depth completion. Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi\u2019an, China.","key":"ref_17","DOI":"10.1109\/ICRA48506.2021.9561035"},{"doi-asserted-by":"crossref","unstructured":"Li, A., Yuan, Z., Ling, Y., Chi, W., and Zhang, C. (2020, January 1\u20135). 
A multi-scale guided cascade hourglass network for depth completion. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Snowmass, CO, USA.","key":"ref_18","DOI":"10.1109\/WACV45572.2020.9093407"},{"doi-asserted-by":"crossref","unstructured":"Liu, L., Song, X., Lyu, X., Diao, J., Wang, M., Liu, Y., and Zhang, L. (2020). FCFR-Net: Feature fusion based coarse-to-fine residual learning for depth completion. arXiv.","key":"ref_19","DOI":"10.1609\/aaai.v35i3.16311"},{"doi-asserted-by":"crossref","unstructured":"Zhang, Y., and Funkhouser, T. (2018, January 18\u201323). Deep depth completion of a single rgb-d image. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","key":"ref_20","DOI":"10.1109\/CVPR.2018.00026"},{"doi-asserted-by":"crossref","unstructured":"Qiu, J., Cui, Z., Zhang, Y., Zhang, X., Liu, S., Zeng, B., and Pollefeys, M. (2019, January 15\u201320). Deeplidar: Deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","key":"ref_21","DOI":"10.1109\/CVPR.2019.00343"},{"unstructured":"Xu, Y., Zhu, X., Shi, J., Zhang, G., Bao, H., and Li, H. (November, January 27). Depth completion from sparse lidar data with depth-normal constraints. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea.","key":"ref_22"},{"doi-asserted-by":"crossref","unstructured":"Nazir, D., Liwicki, M., Stricker, D., and Afzal, M.Z. (2022). SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion. 
arXiv.","key":"ref_23","DOI":"10.1109\/ACCESS.2022.3214316"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"4098","DOI":"10.1109\/TVT.2021.3069212","article-title":"3D Point Clouds Data Super Resolution-Aided LiDAR Odometry for Vehicular Positioning in Urban Canyons","volume":"70","author":"Yue","year":"2021","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2361","DOI":"10.1109\/TPAMI.2019.2947374","article-title":"Learning depth with convolutional spatial propagation network","volume":"42","author":"Cheng","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"doi-asserted-by":"crossref","unstructured":"Cheng, X., Wang, P., Guan, C., and Yang, R. (2020, January 7\u201312). Cspn++: Learning context and resource aware convolutional spatial propagation networks for depth completion. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","key":"ref_26","DOI":"10.1609\/aaai.v34i07.6635"},{"doi-asserted-by":"crossref","unstructured":"Yang, Y., Wong, A., and Soatto, S. (2019, January 15\u201320). Dense depth posterior (ddp) from single image and sparse range. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","key":"ref_27","DOI":"10.1109\/CVPR.2019.00347"},{"doi-asserted-by":"crossref","unstructured":"Shivakumar, S.S., Nguyen, T., Miller, I.D., Chen, S.W., Kumar, V., and Taylor, C.J. (2019, January 15\u201320). Dfusenet: Deep fusion of rgb and sparse depth information for image guided dense depth completion. Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Long Beach, CA, USA.","key":"ref_28","DOI":"10.1109\/ITSC.2019.8917294"},{"unstructured":"Feng, Z., Jing, L., Yin, P., Tian, Y., and Li, B. (2022, January 14\u201318). Advancing self-supervised monocular depth learning with sparse liDAR. 
Proceedings of the Conference on Robot Learning, Auckland, New Zealand.","key":"ref_29"},{"doi-asserted-by":"crossref","unstructured":"Choi, J., Jung, D., Lee, Y., Kim, D., Manocha, D., and Lee, D. (June, January 30). Selfdeco: Self-supervised monocular depth completion in challenging indoor environments. Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi\u2019an, China.","key":"ref_30","DOI":"10.1109\/ICRA48506.2021.9560831"},{"doi-asserted-by":"crossref","unstructured":"Wong, A., and Soatto, S. (2021, January 10\u201317). Unsupervised depth completion with calibrated backprojection layers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","key":"ref_31","DOI":"10.1109\/ICCV48922.2021.01251"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1899","DOI":"10.1109\/LRA.2020.2969938","article-title":"Unsupervised depth completion from visual inertial odometry","volume":"5","author":"Wong","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"unstructured":"Godard, C., Mac Aodha, O., Firman, M., and Brostow, G.J. (November, January 27). Digging into self-supervised monocular depth estimation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea.","key":"ref_33"},{"doi-asserted-by":"crossref","unstructured":"Ku, J., Harakeh, A., and Waslander, S.L. (2018, January 8\u201310). In defense of classical image processing: Fast depth completion on the cpu. 
Proceedings of the 2018 15th Conference on Computer and Robot Vision (CRV), Toronto, ON, Canada.","key":"ref_34","DOI":"10.1109\/CRV.2018.00013"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/1\/135\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:52:02Z","timestamp":1760147522000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/1\/135"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,26]]},"references-count":34,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,1]]}},"alternative-id":["rs15010135"],"URL":"https:\/\/doi.org\/10.3390\/rs15010135","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2022,12,26]]}}}