{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:45:11Z","timestamp":1760150711992,"version":"build-2065373602"},"reference-count":41,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2023,12,12]],"date-time":"2023-12-12T00:00:00Z","timestamp":1702339200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Fundamental Research Funds for the Central Universities","award":["QZJC20230304","62073163"],"award-info":[{"award-number":["QZJC20230304","62073163"]}]},{"name":"National Natural Science Foundation of China","award":["QZJC20230304","62073163"],"award-info":[{"award-number":["QZJC20230304","62073163"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Accurate localization between cameras is a prerequisite for a vision-based heterogeneous robot systems task. The core issue is how to accurately perform place recognition from different view-points. Traditional appearance-based methods have a high probability of failure in place recognition and localization under large view-point changes. In recent years, semantic graph matching-based place recognition methods have been proposed to solve the above problem. However, these methods rely on high-precision semantic segmentation results and have a high time complexity in node extraction or graph matching. In addition, methods only utilize the semantic labels of the landmarks themselves to construct graphs and descriptors, making such approaches fail in some challenging scenarios (e.g., scene repetition). In this paper, we propose a graph-matching method based on a novel landmark topology descriptor, which is robust to view-point changes. According to the experiment on real-world data, our algorithm can run in real-time and is approximately four times and three times faster than state-of-the-art algorithms in the graph extraction and matching phases, respectively. In terms of place recognition performance, our algorithm achieves the best place recognition precision at a recall of 0\u201370% compared with classic appearance-based algorithms and an advanced graph-based algorithm in the scene of significant view-point changes. In terms of positioning accuracy, compared to the traditional appearance-based DBoW2 and NetVLAD algorithms, our method outperforms by 95%, on average, in terms of the mean translation error and 95% in terms of the mean RMSE. Compared to the state-of-the-art SHM algorithm, our method outperforms by 30%, on average, in terms of the mean translation error and 29% in terms of the mean RMSE. In addition, our method outperforms the current state-of-the-art algorithm, even in challenging scenarios where the benchmark algorithms fail.<\/jats:p>","DOI":"10.3390\/s23249775","type":"journal-article","created":{"date-parts":[[2023,12,12]],"date-time":"2023-12-12T05:23:22Z","timestamp":1702358602000},"page":"9775","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Landmark Topology Descriptor-Based Place Recognition and Localization under Large View-Point Changes"],"prefix":"10.3390","volume":"23","author":[{"given":"Guanhong","family":"Gao","sequence":"first","affiliation":[{"name":"College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China"}]},{"given":"Zhi","family":"Xiong","sequence":"additional","affiliation":[{"name":"College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9370-7934","authenticated-orcid":false,"given":"Yao","family":"Zhao","sequence":"additional","affiliation":[{"name":"College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China"}]},{"given":"Ling","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,12,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1002\/rob.21620","article-title":"Multiple-robot simultaneous localization and mapping: A review","volume":"33","author":"Saeedi","year":"2016","journal-title":"J. Field Robot."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"103072","DOI":"10.1016\/j.jvcir.2021.103072","article-title":"Semantic loop closure detection based on graph matching in multi-objects scenes","volume":"76","author":"Qin","year":"2021","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cviu.2007.09.014","article-title":"Speeded-up robust features (SURF)","volume":"110","author":"Bay","year":"2008","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011, January 6\u201313). ORB: An efficient alternative to SIFT or SURF. Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1109\/TRO.2012.2197158","article-title":"Bags of binary words for fast place recognition in image sequences","volume":"28","author":"Tardos","year":"2012","journal-title":"IEEE Trans. Robot."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1023\/A:1011139631724","article-title":"Modeling the shape of the scene: A holistic representation of the spatial envelope","volume":"42","author":"Oliva","year":"2001","journal-title":"Int. J. Comput. Vis."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/j.jvcir.2011.11.002","article-title":"Comparative study of global color and texture descriptors for web image retrieval","volume":"23","author":"Penatti","year":"2012","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"886","DOI":"10.1109\/CVPR.2005.177","article-title":"Histograms of oriented gradients for human detection","volume":"Volume 1","author":"Dalal","year":"2005","journal-title":"Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905)"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., and Sivic, J. (2016, January 27\u201330). NetVLAD: CNN architecture for weakly supervised place recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.572"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"S\u00fcnderhauf, N., Shirazi, S., Dayoub, F., Upcroft, B., and Milford, M. (October, January 28). On the performance of convnet features for place recognition. Proceedings of the 2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7353986"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Merrill, N., and Huang, G. (2018). Lightweight unsupervised deep loop closure. arXiv.","DOI":"10.15607\/RSS.2018.XIV.032"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/j.robot.2017.03.004","article-title":"Robust visual semi-semantic loop closure detection by a covisibility graph and CNN features","volume":"92","author":"Cascianelli","year":"2017","journal-title":"Robot. Auton. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Ma, Y., Chen, L., Liu, C., Ye, M., and Li, L. (2020, January 25\u201329). Gosmatch: Graph-of-semantics matching for detecting loop closures in 3d lidar data. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9341299"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Stumm, E., Mei, C., Lacroix, S., Nieto, J., Hutter, M., and Siegwart, R. (2016, January 27\u201330). Robust visual place recognition with graph kernels. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.491"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1687","DOI":"10.1109\/LRA.2018.2801879","article-title":"X-view: Graph-based semantic multi-view localization","volume":"3","author":"Gawel","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"7041","DOI":"10.1109\/LRA.2021.3097242","article-title":"Topology aware object-level semantic mapping towards more robust loop closure","volume":"6","author":"Lin","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Liu, Y., Petillot, Y., Lane, D., and Wang, S. (2019, January 20\u201324). Global localization with object-level semantics and topology. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8794475"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"8349","DOI":"10.1109\/LRA.2021.3058935","article-title":"Semantic histogram based graph matching for real-time multi-robot global localization in large scale environment","volume":"6","author":"Guo","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1177\/0278364908090961","article-title":"FAB-MAP: Probabilistic localization and mapping in the space of appearance","volume":"27","author":"Cummins","year":"2008","journal-title":"Int. J. Robot. Res."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Rosten, E., and Drummond, T. (2006, January 7\u201313). Machine learning for high-speed corner detection. Proceedings of the Computer Vision\u2013ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria. Proceedings; Part I 9.","DOI":"10.1007\/11744023_34"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Calonder, M., Lepetit, V., Strecha, C., and Fua, P. (2010, January 5\u201311). Brief: Binary robust independent elementary features. Proceedings of the Computer Vision\u2013ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece. Proceedings; Part IV 11.","DOI":"10.1007\/978-3-642-15561-1_56"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Schmuck, P., and Chli, M. (June, January 29). Multi-uav collaborative monocular slam. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989445"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Chen, X., Lu, H., Xiao, J., and Zhang, H. (2018, January 19\u201323). Distributed monocular multi-robot slam. Proceedings of the 2018 IEEE 8th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Tianjin, China.","DOI":"10.1109\/CYBER.2018.8688219"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1109\/TRO.2017.2705103","article-title":"Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras","volume":"33","year":"2017","journal-title":"IEEE Trans. Robot."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1874","DOI":"10.1109\/TRO.2021.3075644","article-title":"Orb-slam3: An accurate open-source library for visual, visual\u2013inertial, and multimap slam","volume":"37","author":"Campos","year":"2021","journal-title":"IEEE Trans. Robot."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1004","DOI":"10.1109\/TRO.2018.2853729","article-title":"Vins-mono: A robust and versatile monocular visual-inertial state estimator","volume":"34","author":"Qin","year":"2018","journal-title":"IEEE Trans. Robot."},{"key":"ref_28","unstructured":"Qin, T., Pan, J., Cao, S., and Shen, S. (2019). A General Optimization-based Framework for Local Odometry Estimation with Multiple Sensors. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10846-022-01613-4","article-title":"KSF-SLAM: A Key Segmentation Frame Based Semantic SLAM in Dynamic Environments","volume":"105","author":"Zhao","year":"2022","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1286","DOI":"10.1177\/0278364917732640","article-title":"Distributed mapping with privacy and communication constraints: Lightweight algorithms and object-based models","volume":"36","author":"Choudhary","year":"2017","journal-title":"Int. J. Robot. Res."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Mousavian, A., Ko\u0161eck\u00e1, J., and Lien, J.M. (2015, January 26\u201330). Semantically guided location recognition for outdoors scenes. Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA.","DOI":"10.1109\/ICRA.2015.7139877"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Li, J., Meger, D., and Dudek, G. (2019, January 20\u201324). Semantic mapping for view-invariant relocalization. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793624"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Liu, Z., Wu, Z., and T\u00f3th, R. (2020, January 14\u201319). Smoke: Single-stage monocular 3d object detection via keypoint estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00506"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for autonomous driving? The kitti vision benchmark suite. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1145\/358669.358692","article-title":"Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography","volume":"24","author":"Fischler","year":"1981","journal-title":"Commun. ACM"},{"key":"ref_37","first-page":"586","article-title":"Method for registration of 3-D shapes","volume":"Volume 1611","author":"Besl","year":"1992","journal-title":"Proceedings of the Sensor Fusion IV: Control Paradigms and Data Structures"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1364\/JOSAA.4.000629","article-title":"Closed-form solution of absolute orientation using unit quaternions","volume":"4","author":"Horn","year":"1987","journal-title":"Josa A"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Lambert, J., Liu, Z., Sener, O., Hays, J., and Koltun, V. (2020, January 14\u201319). MSeg: A composite dataset for multi-domain semantic segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00295"},{"key":"ref_40","unstructured":"Lee, J.H., Han, M.K., Ko, D.W., and Suh, I.H. (2019). From big to small: Multi-scale local planar guidance for monocular depth estimation. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1016\/j.cag.2005.03.005","article-title":"Note: An algorithm for contour-based region filling","volume":"29","author":"Codrea","year":"2005","journal-title":"Comput. Graph."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/24\/9775\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:37:09Z","timestamp":1760132229000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/24\/9775"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,12]]},"references-count":41,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["s23249775"],"URL":"https:\/\/doi.org\/10.3390\/s23249775","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,12,12]]}}}