{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,22]],"date-time":"2026-03-22T07:58:05Z","timestamp":1774166285006,"version":"3.50.1"},"reference-count":54,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T00:00:00Z","timestamp":1649721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["2021M702232"],"award-info":[{"award-number":["2021M702232"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Shenzhen Institute of Artificial Intelligence and Robotics for Society, the Science and Research Service Project of Shenzhen Metro Group Co., Ltd","award":["STJS-DT413-KY002\/2021"],"award-info":[{"award-number":["STJS-DT413-KY002\/2021"]}]},{"name":"Basic and Applied Basic Research Funding Program of Guangdong Province of China","award":["2019A1515110303"],"award-info":[{"award-number":["2019A1515110303"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Visual odometry is the task of estimating the trajectory of the moving agents from consecutive images. It is a hot research topic both in robotic and computer vision communities and facilitates many applications, such as autonomous driving and virtual reality. The conventional odometry methods predict the trajectory by utilizing the multiple view geometry between consecutive overlapping images. However, these methods need to be carefully designed and fine-tuned to work well in different environments. Deep learning has been explored to alleviate the challenge by directly predicting the relative pose from the paired images. 
Deep learning-based methods usually focus on the consecutive images that are feasible to propagate the error over time. In this paper, graph loss and geodesic rotation loss are proposed to enhance deep learning-based visual odometry methods based on graph constraints and geodesic distance, respectively. The graph loss not only considers the relative pose loss of consecutive images, but also the relative pose of non-consecutive images. The relative pose of non-consecutive images is not directly predicted but computed from the relative pose of consecutive ones. The geodesic rotation loss is constructed by the geodesic distance and the model regresses a Lie algebra so(3) (3D vector). This allows a robust and stable convergence. To increase the efficiency, a random strategy is adopted to select the edges of the graph instead of using all of the edges. This strategy provides additional regularization for training the networks. Extensive experiments are conducted on visual odometry benchmarks, and the obtained results demonstrate that the proposed method has comparable performance to other supervised learning-based methods, as well as monocular camera-based methods. 
The source code and the weight are made publicly available.<\/jats:p>","DOI":"10.3390\/rs14081854","type":"journal-article","created":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T22:48:45Z","timestamp":1649803725000},"page":"1854","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Exploiting Graph and Geodesic Distance Constraint for Deep Learning-Based Visual Odometry"],"prefix":"10.3390","volume":"14","author":[{"given":"Xu","family":"Fang","sequence":"first","affiliation":[{"name":"Guangdong Key Laboratory of Urban Informatics, Shenzhen University, Shenzhen 518060, China"},{"name":"College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China"}]},{"given":"Qing","family":"Li","sequence":"additional","affiliation":[{"name":"Guangdong Key Laboratory of Urban Informatics, Shenzhen University, Shenzhen 518060, China"},{"name":"College of Civil and Transportation Engineering, Shenzhen University, Shenzhen 518060, China"}]},{"given":"Qingquan","family":"Li","sequence":"additional","affiliation":[{"name":"Guangdong Key Laboratory of Urban Informatics, Shenzhen University, Shenzhen 518060, China"},{"name":"College of Civil and Transportation Engineering, Shenzhen University, Shenzhen 518060, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4214-1923","authenticated-orcid":false,"given":"Kai","family":"Ding","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Dongguan University of Technology, Dongguan 523808, China"}]},{"given":"Jiasong","family":"Zhu","sequence":"additional","affiliation":[{"name":"College of Civil and Transportation Engineering, Shenzhen University, Shenzhen 518060, China"},{"name":"Shenzhen University Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen 518060, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2022,4,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1109\/MRA.2011.943233","article-title":"Visual Odometry [Tutorial]","volume":"18","author":"Scaramuzza","year":"2011","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_2","unstructured":"Nister, D., Naroditsky, O., and Bergen, J. (July, January 27). Visual odometry. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004, Washington, DC, USA."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"854","DOI":"10.1002\/rob.20412","article-title":"Monocular-SLAM\u2013based navigation for autonomous micro helicopters in GPS-denied environments","volume":"28","author":"Weiss","year":"2011","journal-title":"J. Field Robot."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1016\/S0262-8856(00)00086-X","article-title":"A probabilistic model for appearance-based robot localization","volume":"19","author":"Vlassis","year":"2001","journal-title":"Image Vis. Comput."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1109\/TRO.2004.835453","article-title":"Robust vision-based localization by combining an image-retrieval system with Monte Carlo localization","volume":"21","author":"Wolf","year":"2005","journal-title":"IEEE Trans. Robot."},{"key":"ref_6","first-page":"49","article-title":"Ancillary ultrasonic rangefinder for autonomous vehicles","volume":"12","author":"Wiseman","year":"2018","journal-title":"Int. J. Secur. Its Appl."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, S., Clark, R., Wen, H., and Trigoni, N. (June, January 29). DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks. 
Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989236"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Saputra, M.R.U., de Gusmao, P.P., Wang, S., Markham, A., and Trigoni, N. (2019, January 20\u201324). Learning monocular visual odometry through geometry-aware curriculum learning. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793581"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhan, H., Weerasekera, C.S., Bian, J.-W., and Reid, I. (August, January 31). Visual odometry revisited: What should be learnt?. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9197374"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Clark, R., Wang, S., Markham, A., Trigoni, N., and Wen, H. (2017, January 21\u201326). Vidloc: A deep spatio-temporal model for 6-dof video-clip relocalization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.284"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1016\/j.neucom.2020.09.071","article-title":"Relative Geometry-Aware Siamese Neural Network for 6DOF Camera Relocalization","volume":"426","author":"Li","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Xue, F., Wu, X., Cai, S., and Wang, J. (2020, January 13\u201319). Learning Multi-View Camera Relocalization With Graph Neural Networks. 
Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01139"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1330","DOI":"10.1109\/34.888718","article-title":"A flexible new technique for camera calibration","volume":"22","author":"Zhang","year":"2000","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kumar, G., and Bhatia, P.K. (2014, January 8\u20139). A detailed review of feature extraction in image processing systems. Proceedings of the 2014 Fourth International Conference on Advanced Computing & Communication Technologies, Rohtak, India.","DOI":"10.1109\/ACCT.2014.74"},{"key":"ref_16","unstructured":"Kummerle, R., Grisetti, G., Strasdat, H., Konolige, K., and Burgard, W. (2011, January 9\u201313). G2O: A general framework for graph optimization. Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2300000043","article-title":"Factor Graphs for Robot Perception","volume":"6","author":"Dellaert","year":"2017","journal-title":"Found. Trends Robot."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1016\/j.isprsjprs.2020.04.016","article-title":"Efficient structure from motion for large-scale UAV images: A review and a comparison of SfM tools","volume":"167","author":"Jiang","year":"2020","journal-title":"ISPRS J. Photogramm. 
Remote Sens."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1016\/j.isprsjprs.2019.11.014","article-title":"Panoramic SLAM from a multiple fisheye camera rig","volume":"159","author":"Ji","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1004","DOI":"10.1109\/TRO.2018.2853729","article-title":"VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator","volume":"34","author":"Qin","year":"2018","journal-title":"IEEE Trans. Robot."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1874","DOI":"10.1109\/TRO.2021.3075644","article-title":"ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual\u2013Inertial, and Multimap SLAM","volume":"37","author":"Campos","year":"2021","journal-title":"IEEE Trans. Robot."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Rosinol, A., Abate, M., Chang, Y., and Carlone, L. (August, January 31). Kimera: An open-source library for real-time metric-semantic localization and mapping. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9196885"},{"key":"ref_23","unstructured":"Zhang, G. (2021). Towards Optimal 3D Reconstruction and Semantic Mapping. [Ph.D. Thesis, University of California]."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Rosten, E., and Drummond, T. (2006, January 7\u201313). Machine learning for high-speed corner detection. Proceedings of the European Conference on Computer Vision, Graz, Austria.","DOI":"10.1007\/11744023_34"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cviu.2007.09.014","article-title":"Speeded-up robust features (SURF)","volume":"110","author":"Bay","year":"2008","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Calonder, M., Lepetit, V., Strecha, C., and Fua, P. 
(2010, January 5\u201311). Brief: Binary robust independent elementary features. Proceedings of the European Conference on Computer Vision, Heraklion, Greece.","DOI":"10.1007\/978-3-642-15561-1_56"},{"key":"ref_27","first-page":"10","article-title":"A combined corner and edge detector","volume":"15","author":"Harris","year":"1988","journal-title":"Alvey Vis. Conf."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Aguiar, A., Sousa, A., Santos, F.N.d., and Oliveira, M. (2019, January 24\u201326). Monocular Visual Odometry Benchmarking and Turn Performance Optimization. Proceedings of the 2019 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Porto, Portugal.","DOI":"10.1109\/ICARSC.2019.8733633"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Geiger, A., Ziegler, J., and Stiller, C. (2011, January 5\u20139). StereoScan: Dense 3d reconstruction in real-time. Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV), Baden-Baden, Germany.","DOI":"10.1109\/IVS.2011.5940405"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1109\/TRO.2017.2705103","article-title":"ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras","volume":"33","year":"2017","journal-title":"IEEE Trans. Robot."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1007\/978-3-319-10605-2_54","article-title":"LSD-SLAM: Large-Scale Direct Monocular SLAM","volume":"Volume 8690","author":"Fleet","year":"2014","journal-title":"Proceedings of the Computer Vision\u2014ECCV 2014"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1109\/TPAMI.2017.2658577","article-title":"Direct sparse odometry","volume":"40","author":"Engel","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_34","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv, Available online: http:\/\/arxiv.org\/abs\/1804.02767."},{"key":"ref_35","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2016). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv, Available online: http:\/\/arxiv.org\/abs\/1506.01497."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"6217","DOI":"10.1007\/s11042-021-11135-0","article-title":"BlindNet backdoor: Attack on deep neural network using blind watermark","volume":"81","author":"Kwon","year":"2022","journal-title":"Multimed. Tools Appl."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Kendall, A., and Cipolla, R. (2017, January 21\u201326). Geometric Loss Functions for Camera Pose Regression With Deep Learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.694"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Konda, K.R., and Memisevic, R. (2015, January 11\u201314). Learning visual odometry with a convolutional network. Proceedings of the VISAPP (1), Berlin, Germany.","DOI":"10.5220\/0005299304860490"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Vu, T., van Nguyen, C., Pham, T.X., Luu, T.M., and Yoo, C.D. (2018, January 8\u201314). Fast and efficient image quality enhancement via desubpixel convolutional neural networks. Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany.","DOI":"10.1007\/978-3-030-11021-5_16"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Jeon, M., and Jeong, Y.-S. (2020). 
Compact and Accurate Scene Text Detector. Appl. Sci., 10.","DOI":"10.3390\/app10062096"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Kendall, A., Grimes, M., and Cipolla, R. (2015, January 7\u201313). PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.336"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhou, L., Luo, Z., Shen, T., Zhang, J., Zhen, M., Yao, Y., Fang, T., and Quan, L. (2020, January 13\u201319). KFNet: Learning Temporal Camera Relocalization using Kalman Filtering. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00497"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.isprsjprs.2020.01.008","article-title":"3D map-guided single indoor image localization refinement","volume":"161","author":"Li","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/LRA.2015.2505717","article-title":"Exploring Representation Learning With CNNs for Frame-to-Frame Ego-Motion Estimation","volume":"1","author":"Costante","year":"2016","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Muller, P., and Savakis, A. (2017, January 24\u201331). Flowdometry: An optical flow and deep learning based approach to visual odometry. Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA.","DOI":"10.1109\/WACV.2017.75"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., Van Der Smagt, P., Cremers, D., and Brox, T. (2015, January 7\u201313). Flownet: Learning optical flow with convolutional networks. 
Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.316"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1177\/0278364917734298","article-title":"End-to-end, sequence-to-sequence probabilistic visual odometry through deep neural networks","volume":"37","author":"Wang","year":"2018","journal-title":"Int. J. Robot. Res."},{"key":"ref_48","first-page":"35","article-title":"Unsupervised scale-consistent depth and ego-motion learning from monocular video","volume":"32","author":"Bian","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_49","unstructured":"Zhao, C., Tang, Y., Sun, Q., and Vasilakos, A.V. (2021). Deep Direct Visual Odometry. IEEE Trans. Intell. Transp. Syst., 1\u201310."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Clark, R., Wang, S., Wen, H., Markham, A., and Trigoni, N. (2017). VINet: Visual-Inertial Odometry as a Sequence-to-Sequence Learning Problem. arXiv.","DOI":"10.1609\/aaai.v31i1.11215"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"18076","DOI":"10.1109\/ACCESS.2019.2896988","article-title":"Using Unsupervised Deep Learning Technique for Monocular Visual Odometry","volume":"7","author":"Liu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_52","unstructured":"Jiao, J., Jiao, J., Mo, Y., Liu, W., and Deng, Z. (2018). Magicvo: End-to-end monocular visual odometry through deep bi-directional recurrent convolutional neural network. arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Fang, Q., and Hu, T. (2018, January 19\u201323). Euler angles based loss function for camera relocalization with Deep learning. Proceedings of the 2018 IEEE 8th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Tianjin, China.","DOI":"10.1109\/CYBER.2018.8688359"},{"key":"ref_54","unstructured":"Li, D., and Dunson, D.B. (2020). 
Geodesic Distance Estimation with Spherelets. arXiv, Available online: http:\/\/arxiv.org\/abs\/1907.00296."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/8\/1854\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:52:37Z","timestamp":1760136757000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/8\/1854"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,12]]},"references-count":54,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2022,4]]}},"alternative-id":["rs14081854"],"URL":"https:\/\/doi.org\/10.3390\/rs14081854","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,12]]}}}