{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:26:41Z","timestamp":1760236001077,"version":"build-2065373602"},"reference-count":72,"publisher":"MDPI AG","issue":"20","license":[{"start":{"date-parts":[[2021,10,13]],"date-time":"2021-10-13T00:00:00Z","timestamp":1634083200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Pyramid architecture is a useful strategy to fuse multi-scale features in deep monocular depth estimation approaches. However, most pyramid networks fuse features only within the adjacent stages in a pyramid structure. To take full advantage of the pyramid structure, inspired by the success of DenseNet, this paper presents DCPNet, a densely connected pyramid network that fuses multi-scale features from multiple stages of the pyramid structure. DCPNet not only performs feature fusion between the adjacent stages, but also non-adjacent stages. To fuse these features, we design a simple and effective dense connection module (DCM). In addition, we offer a new consideration of the common upscale operation in our approach. We believe DCPNet offers a more efficient way to fuse features from multiple scales in a pyramid-like network. 
We perform extensive experiments using both outdoor and indoor benchmark datasets (i.e., the KITTI and the NYU Depth V2 datasets) and DCPNet achieves the state-of-the-art results.<\/jats:p>","DOI":"10.3390\/s21206780","type":"journal-article","created":{"date-parts":[[2021,10,13]],"date-time":"2021-10-13T21:48:39Z","timestamp":1634161719000},"page":"6780","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["DCPNet: A Densely Connected Pyramid Network for Monocular Depth Estimation"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4132-7942","authenticated-orcid":false,"given":"Zhitong","family":"Lai","sequence":"first","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"},{"name":"University of the Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4284-7969","authenticated-orcid":false,"given":"Rui","family":"Tian","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8111-4867","authenticated-orcid":false,"given":"Zhiguo","family":"Wu","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5070-6499","authenticated-orcid":false,"given":"Nannan","family":"Ding","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9146-6564","authenticated-orcid":false,"given":"Linjian","family":"Sun","sequence":"additional","affiliation":[{"name":"National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2305-1041","authenticated-orcid":false,"given":"Yanjie","family":"Wang","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,10,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Hoiem, D., Efros, A.A., and Hebert, M. (2005, January 17\u201320). Geometric context from a single image. Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), Beijing, China.","DOI":"10.1109\/ICCV.2005.107"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"628","DOI":"10.1111\/j.1467-9280.2006.01755.x","article-title":"The contribution of monocular depth cues to scene perception by pigeons","volume":"17","author":"Cavoto","year":"2006","journal-title":"Psychol. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"824","DOI":"10.1109\/TPAMI.2008.132","article-title":"Make3d: Learning 3d scene structure from a single still image","volume":"31","author":"Saxena","year":"2008","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_4","unstructured":"Delage, E., Lee, H., and Ng, A.Y. (2006, January 17\u201322). A dynamic bayesian network model for autonomous 3d reconstruction from a single indoor image. 
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), New York, NY, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Laina, I., Rupprecht, C., Belagiannis, V., Tombari, F., and Navab, N. (2016, January 25\u201328). Deeper depth prediction with fully convolutional residual networks. Proceedings of the 4th IEEE International Conference on 3D Vision (3DV), Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.32"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2024","DOI":"10.1109\/TPAMI.2015.2505283","article-title":"Learning depth from single monocular images using deep convolutional neural fields","volume":"38","author":"Liu","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_7","unstructured":"Eigen, D., Puhrsch, C., and Fergus, R. (2014). Depth map prediction from a single image using a multi-scale deep network. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lee, J.H., and Kim, C.S. (2020, January 23\u201328). Multi-loss rebalancing algorithm for monocular depth estimation. Proceedings of the 16th European Conference on Computer Vision (ECCV), Glasgow, UK. Part XVII 16.","DOI":"10.1007\/978-3-030-58520-4_46"},{"key":"ref_9","unstructured":"Yu, F., and Koltun, V. (2015). Multi-scale context aggregation by dilated convolutions. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Yang, M., Yu, K., Zhang, C., Li, Z., and Yang, K. (2018, January 18\u201323). Denseaspp for semantic segmentation in street scenes. 
Proceedings of the 31st IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00388"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Huynh, L., Nguyen-Ha, P., Matas, J., Rahtu, E., and Heikkil\u00e4, J. (2020, January 23\u201328). Guiding monocular depth estimation using depth-attention volume. Proceedings of the 16th European Conference on Computer Vision (ECCV), Virtual Event.","DOI":"10.1007\/978-3-030-58574-7_35"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1583","DOI":"10.1007\/s13042-020-01251-y","article-title":"Attention-based context aggregation network for monocular depth estimation","volume":"12","author":"Chen","year":"2021","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Ling, C., Zhang, X., and Chen, H. (2021). Unsupervised Monocular Depth Estimation using Attention and Multi-Warp Reconstruction. IEEE Trans. Multimed.","DOI":"10.1109\/TMM.2021.3091308"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the 30th IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Pan, X., Gao, L., Zhang, B., Yang, F., and Liao, W. (2018). High-resolution aerial imagery semantic labeling with dense pyramid network. Sensors, 18.","DOI":"10.3390\/s18113774"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chen, X., Chen, X., and Zha, Z.J. (2019). Structure-aware residual pyramid network for monocular depth estimation. arXiv.","DOI":"10.24963\/ijcai.2019\/98"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Miangoleh, S.M.H., Dille, S., Mai, L., Paris, S., and Aksoy, Y. (2021, January 19\u201325). 
Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Event.","DOI":"10.1109\/CVPR46437.2021.00956"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/978-3-030-30949-7_30","article-title":"Residual Feature Pyramid Architecture for Monocular Depth Estimation","volume":"Volume 11792","author":"Luo","year":"2019","journal-title":"Cooperative Design, Visualization, and Engineering"},{"key":"ref_20","unstructured":"Lee, J.H., Han, M.K., Ko, D.W., and Suh, I.H. (2019). From big to small: Multi-scale local planar guidance for monocular depth estimation. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, J., Zhang, X., Li, Z., and Mao, T. (2021, January 10\u201315). Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation. Proceedings of the 25th International Conference on Pattern Recognition (ICPR), Milan, Italy.","DOI":"10.1109\/ICPR48806.2021.9412670"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the 30th IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets robotics: The kitti dataset","volume":"32","author":"Geiger","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Silberman, N., Hoiem, D., Kohli, P., and Fergus, R. (2012, January 7\u201313). Indoor segmentation and support inference from rgbd images. 
Proceedings of the 12th European Conference on Computer Vision (ECCV), Florence, Italy.","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1023\/A:1026598000963","article-title":"Single view metrology","volume":"40","author":"Criminisi","year":"2000","journal-title":"IJCV"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Xu, D., Ricci, E., Ouyang, W., Wang, X., and Sebe, N. (2017, January 21\u201326). Multi-scale continuous crfs as sequential deep networks for monocular depth estimation. Proceedings of the 30th IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.25"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Hao, Z., Li, Y., You, S., and Lu, F. (2018, January 5\u20138). Detail preserving depth estimation from a single image using attention guided networks. Proceedings of the 6th International Conference on 3D Vision (3DV), Verona, Italy.","DOI":"10.1109\/3DV.2018.00043"},{"key":"ref_28","first-page":"107578","article-title":"DPNet: Detail-preserving network for high quality monocular depth estimation","volume":"109","author":"Ye","year":"2021","journal-title":"PR"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Godard, C., Mac Aodha, O., and Brostow, G.J. (2017, January 21\u201326). Unsupervised monocular depth estimation with left-right consistency. Proceedings of the 30th IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.699"},{"key":"ref_30","unstructured":"Godard, C., Aodha, M.O., Firman, M., and Brostow, G.J. (November, January 27). Digging into self-supervised monocular depth estimation. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1016\/j.neucom.2019.12.049","article-title":"Unsupervised framework for depth estimation and camera motion prediction from video","volume":"385","author":"Yang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Johnston, A., and Carneiro, G. (2020, January 14\u201319). Self-supervised monocular trained depth estimation using self-attention and discrete disparity volume. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00481"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"41337","DOI":"10.1109\/ACCESS.2018.2857703","article-title":"Wearable depth camera: Monocular depth estimation via sparse optimization under weak supervision","volume":"6","author":"He","year":"2018","journal-title":"IEEE Access"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Qi, X., Liao, R., Liu, Z., Urtasun, R., and Jia, J. (2018, January 18\u201322). Geonet: Geometric neural network for joint depth and surface normal estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00037"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1007\/s10846-020-01205-0","article-title":"Semi-Supervised Monocular Depth Estimation Based on Semantic Supervision","volume":"100","author":"Yue","year":"2020","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Chang, J.R., and Chen, Y.S. (2018, January 18\u201323). Pyramid stereo matching network. 
Proceedings of the 31st IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00567"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2548","DOI":"10.1007\/s11263-021-01484-6","article-title":"Unsupervised Scale-consistent Depth Learning from Video","volume":"129","author":"Bian","year":"2021","journal-title":"Int. J. Comput. Vis."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Fang, Z., Chen, X., Chen, Y., and Gool, L.V. (2020, January 1\u20135). Towards good practice for CNN-based monocular depth estimation. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass, CO, USA.","DOI":"10.1109\/WACV45572.2020.9093334"},{"key":"ref_39","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated residual transformations for deep neural networks. Proceedings of the 30th IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Jiao, J., Cao, Y., Song, Y., and Lau, R. (2018, January 8\u201314). Look deeper into depth: Monocular depth estimation with semantic booster and attention-driven loss. 
Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01267-0_4"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Ramamonjisoa, M., Firman, M., Watson, J., Lepetit, V., and Turmukhambetov, D. (2021, January 19\u201325). Single Image Depth Prediction with Wavelet Decomposition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Event.","DOI":"10.1109\/CVPR46437.2021.01094"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Yang, G., Tang, H., Ding, M., Sebe, N., and Ricci, E. (2021, January 11\u201317). Transformer-Based Attention Networks for Continuous Pixel-Wise Prediction. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Virtual Event.","DOI":"10.1109\/ICCV48922.2021.01596"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Kaushik, V., Jindgar, K., and Lall, B. (2021). ADAADepth: Adapting Data Augmentation and Attention for Self-Supervised Monocular Depth Estimation. arXiv.","DOI":"10.1109\/LRA.2021.3101049"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Seferbekov, S., Iglovikov, V., Buslaev, A., and Shvets, A. (2018, January 18\u201322). Feature pyramid network for multi-class land segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00051"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"2990","DOI":"10.1109\/TITS.2019.2922252","article-title":"Residual pyramid learning for single-shot semantic segmentation","volume":"21","author":"Chen","year":"2019","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"3008","DOI":"10.1109\/TMI.2020.2983721","article-title":"CPFNet: Context pyramid fusion network for medical image segmentation","volume":"39","author":"Feng","year":"2020","journal-title":"IEEE Trans. Med. 
Imag."},{"key":"ref_49","unstructured":"Nie, D., Xue, J., and Ren, X. (December, January 30). Bidirectional Pyramid Networks for Semantic Segmentation. Proceedings of the Asia Conference on Computer Vision (ACCV), Online Conference."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"4673","DOI":"10.1109\/TGRS.2020.3016086","article-title":"Road segmentation for remote sensing images using adversarial spatial pyramid networks","volume":"59","author":"Shamsolmoali","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"107940","DOI":"10.1016\/j.patcog.2021.107940","article-title":"GPNet: Gated pyramid network for semantic segmentation","volume":"115","author":"Zhang","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Xin, Y., Wang, S., Li, L., Zhang, W., and Huang, Q. (2018, January 2\u20136). Reverse densely connected feature pyramid network for object detection. Proceedings of the 14th Asian Conference on Computer Vision (ACCV), Perth, Australia.","DOI":"10.1007\/978-3-030-20873-8_34"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Wang, T., Zhang, X., and Sun, J. (2020). Implicit feature pyramid network for object detection. arXiv.","DOI":"10.1109\/CAC53003.2021.9727887"},{"key":"ref_54","unstructured":"Ma, J., and Chen, B. (2020). Dual Refinement Feature Pyramid Networks for Object Detection. arXiv."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"242","DOI":"10.23919\/JCC.2020.08.020","article-title":"Dual attention based feature pyramid network","volume":"17","author":"Xing","year":"2020","journal-title":"China Commun."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"2738","DOI":"10.1109\/JSTARS.2020.2997081","article-title":"Attention receptive pyramid network for ship detection in SAR images","volume":"13","author":"Zhao","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. 
Earth Obs. Remote Sens."},{"key":"ref_57","unstructured":"Liang, T., Wang, Y., Zhao, Q., Tang, Z., and Ling, H. (2019). MFPN: A novel mixture feature pyramid network of multiple architectures for object detection. arXiv."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"678","DOI":"10.1109\/LSP.2021.3067498","article-title":"Monocular Depth Estimation With Multi-Scale Feature Fusion","volume":"28","author":"Xu","year":"2021","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_59","unstructured":"Deng, Z., Yu, H., and Long, Y. (2021). Fractal Pyramid Networks. arXiv."},{"key":"ref_60","unstructured":"Kaushik, V., and Lall, B. (2020). Deep feature fusion for self-supervised monocular depth prediction. arXiv."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Poggi, M., Aleotti, F., Tosi, F., and Mattoccia, S. (2018, January 1\u20135). Towards real-time unsupervised monocular depth estimation on cpu. Proceedings of the 25th IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593814"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Kim, S.W., Kook, H.K., Sun, J.Y., Kang, M.C., and Ko, S.J. (2018, January 8\u201314). Parallel feature pyramid network for object detection. Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01228-1_15"},{"key":"ref_63","unstructured":"Liu, R., Lehman, J., Molino, P., Such, F.P., Frank, E., Sergeev, A., and Yosinski, J. (2018). An intriguing failing of convolutional neural networks and the coordconv solution. arXiv."},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Hu, J., Ozay, M., Zhang, Y., and Okatani, T. (2019, January 7\u201311). Revisiting single image depth estimation: Toward higher resolution maps with accurate object boundaries. 
Proceedings of the 19th IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA.","DOI":"10.1109\/WACV.2019.00116"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Fu, H., Gong, M., Wang, C., Batmanghelich, K., and Tao, D. (2018, January 18\u201323). Deep ordinal regression network for monocular depth estimation. Proceedings of the 31st IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00214"},{"key":"ref_66","unstructured":"Yin, W., Liu, Y., Shen, C., and Yan, Y. (November, January 27). Enforcing geometric constraints of virtual normal for depth prediction. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_67","unstructured":"Alhashim, I., and Wonka, P. (2018). High quality monocular depth estimation via transfer learning. arXiv."},{"key":"ref_68","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019, January 8\u201314). Pytorch: An imperative style, high-performance deep learning library. Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_69","unstructured":"Loshchilov, I., and Hutter, F. (2017). Decoupled weight decay regularization. arXiv."},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the IEEE-Computer-Society Conference on Computer Vision and Pattern Recognition Workshops, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Cheng, B., Saggu, I.S., Shah, R., Bansal, G., and Bharadia, D. (2020, January 23\u201328). 
S3Net: Semantic-Aware Self-supervised Depth Estimation with Monocular Videos and Synthetic Data. Proceedings of the 15th European Conference on Computer Vision (ECCV), Virtual Event.","DOI":"10.1007\/978-3-030-58577-8_4"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Tiwari, L., Ji, P., Tran, Q.H., Zhuang, B., Anand, S., and Chandraker, M. (2020, January 23\u201328). Pseudo rgb-d for self-improving monocular slam and depth prediction. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Virtual Event.","DOI":"10.1007\/978-3-030-58621-8_26"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/20\/6780\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:12:21Z","timestamp":1760166741000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/20\/6780"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,13]]},"references-count":72,"journal-issue":{"issue":"20","published-online":{"date-parts":[[2021,10]]}},"alternative-id":["s21206780"],"URL":"https:\/\/doi.org\/10.3390\/s21206780","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,10,13]]}}}