{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T06:44:10Z","timestamp":1774421050610,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2023,12,3]],"date-time":"2023-12-03T00:00:00Z","timestamp":1701561600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Central Government Guides Local Science and Technology Development Foundation Projects","award":["ZY19183003"],"award-info":[{"award-number":["ZY19183003"]}]},{"name":"Central Government Guides Local Science and Technology Development Foundation Projects","award":["AB20058001"],"award-info":[{"award-number":["AB20058001"]}]},{"name":"Guangxi Key Research and Development Project","award":["ZY19183003"],"award-info":[{"award-number":["ZY19183003"]}]},{"name":"Guangxi Key Research and Development Project","award":["AB20058001"],"award-info":[{"award-number":["AB20058001"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Simultaneous localization and mapping (SLAM) technology is key to robot autonomous navigation. Most visual SLAM (VSLAM) algorithms for dynamic environments cannot achieve sufficient positioning accuracy and real-time performance simultaneously. When the proportion of dynamic objects is too high, the VSLAM algorithm collapses. To solve these problems, this paper proposes an indoor dynamic VSLAM algorithm called YDD-SLAM based on ORB-SLAM3, which introduces the YOLOv5 object detection algorithm and integrates depth information. Firstly, the objects detected by YOLOv5 are divided into eight subcategories according to their motion characteristics and depth values. Secondly, the depth ranges of the dynamic object and potentially dynamic object in the moving state in the scene are calculated. 
Simultaneously, the depth value of each feature point in the detection box is compared with the calculated depth range of the corresponding object to determine whether the point is a dynamic feature point; if it is, the dynamic feature point is eliminated. Further, multiple feature point optimization strategies were developed for VSLAM in dynamic environments. A public dataset and an actual dynamic scenario were used for testing. The accuracy of the proposed algorithm was significantly improved compared to that of ORB-SLAM3. This work provides a theoretical foundation for the practical application of a dynamic VSLAM algorithm.<\/jats:p>","DOI":"10.3390\/s23239592","type":"journal-article","created":{"date-parts":[[2023,12,3]],"date-time":"2023-12-03T04:59:16Z","timestamp":1701579556000},"page":"9592","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["YDD-SLAM: Indoor Dynamic Visual SLAM Fusing YOLOv5 with Depth Information"],"prefix":"10.3390","volume":"23","author":[{"given":"Peichao","family":"Cong","sequence":"first","affiliation":[{"name":"School of Mechanical and Automotive Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China"}]},{"given":"Junjie","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China"}]},{"given":"Jiaxing","family":"Li","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-0461-9374","authenticated-orcid":false,"given":"Yixuan","family":"Xiao","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, Guangxi University of Science and Technology, Liuzhou 545006, 
China"}]},{"given":"Xilai","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China"}]},{"given":"Xinjie","family":"Feng","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China"}]},{"given":"Xin","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,12,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Chen, W., Shang, G., Ji, A., Zhou, C., Wang, X., Xu, C., and Li, Z. (2022). An overview on visual slam: From tradition to semantic. Remote Sens., 14.","DOI":"10.3390\/rs14133010"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"5462","DOI":"10.1109\/TIP.2017.2735192","article-title":"Unified blind quality assessment of compressed natural, graphic, and screen content images","volume":"26","author":"Min","year":"2017","journal-title":"IEEE Trans. Image Process."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2879","DOI":"10.1109\/TITS.2018.2868771","article-title":"Objective quality evaluation of dehazed images","volume":"20","author":"Min","year":"2018","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3790","DOI":"10.1109\/TIP.2020.2966081","article-title":"A metric for light field reconstruction, compression, and display quality evaluation","volume":"29","author":"Min","year":"2020","journal-title":"IEEE Trans. 
Image Process."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/TIE.2018.2826471","article-title":"A monocular vision sensor-based efficient SLAM method for indoor service robots","volume":"66","author":"Lee","year":"2018","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"107822","DOI":"10.1016\/j.patcog.2021.107822","article-title":"Visual SLAM for robot navigation in healthcare facility","volume":"113","author":"Fang","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1004","DOI":"10.1109\/TRO.2018.2853729","article-title":"VINS-Mono: A robust and versatile monocular visual-inertial state estimator","volume":"34","author":"Qin","year":"2018","journal-title":"IEEE Trans. Robot."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2004","DOI":"10.1109\/TRO.2021.3133730","article-title":"GVINS: Tightly coupled GNSS\u2013visual\u2013inertial fusion for smooth and consistent state estimation","volume":"38","author":"Cao","year":"2022","journal-title":"IEEE Trans. Robot."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1109\/TRO.2015.2463671","article-title":"ORB-SLAM: A versatile and accurate monocular SLAM system","volume":"31","author":"Montiel","year":"2015","journal-title":"IEEE Trans. Robot."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1109\/TRO.2017.2705103","article-title":"Orb-slam2: An open-source slam system for monocular, stereo, and RGB-D cameras","volume":"33","year":"2017","journal-title":"IEEE Trans. Robot."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1874","DOI":"10.1109\/TRO.2021.3075644","article-title":"Orb-slam3: An accurate open-source library for visual, visual\u2013inertial, and multimap slam","volume":"37","author":"Campos","year":"2021","journal-title":"IEEE Trans. 
Robot."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011, January 6\u201313). ORB: An Efficient Alternative to SIFT or SURF. Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lu, X., Wang, H., Tang, S., Huang, H., and Li, C. (2020). DM-SLAM: Monocular SLAM in dynamic environments. Appl. Sci., 10.","DOI":"10.20944\/preprints202001.0123.v1"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1016\/j.robot.2018.07.002","article-title":"Motion removal for reliable RGB-D SLAM in dynamic environments","volume":"108","author":"Sun","year":"2018","journal-title":"Robot. Auton. Syst."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Fu, Y., Han, B., Hu, Z., Shen, X., and Zhao, Y. (2022, January 9\u201311). CBAM-SLAM: A Semantic SLAM Based on Attention Module in Dynamic Environment. Proceedings of the 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT), Changzhou, China.","DOI":"10.1109\/ACAIT56212.2022.10137973"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"106981","DOI":"10.1109\/ACCESS.2021.3100426","article-title":"RDMO-SLAM: Real-time visual SLAM for dynamic environments using semantic label prediction with optical flow","volume":"9","author":"Liu","year":"2021","journal-title":"IEEE Access"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Sun, D., Yang, X., Liu, M.Y., and Kautz, J. (2018, January 18\u201322). PWC-Net: Cnns for Optical Flow Using Pyramid, Warping, and Cost Volume. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00931"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017). Mask R-CNN. Proc. IEEE Int. Conf. 
Comput. Vis., 2961\u20132969.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Yan, H., Zhou, X., Liu, J., Yin, Z., and Yang, Z. (2022, January 11\u201314). Robust Vision SLAM Based on YOLOX for Dynamic Environments. Proceedings of the 2022 IEEE 22nd International Conference on Communication Technology (ICCT), Nanjing, China.","DOI":"10.1109\/ICCT56141.2022.10073383"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"G\u00f6kcen, B., and Uslu, E. (2022, January 8\u201310). Object aware RGBD SLAM in Dynamic Environments. Proceedings of the 2022 International Conference on Innovations in Intelligent Systems and Applications (INISTA), Biarritz, France.","DOI":"10.1109\/INISTA55318.2022.9894245"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Gong, H., Gong, L., Ma, T., Sun, Z., and Li, L. (2023). AHY-SLAM: Toward faster and more accurate visual SLAM in dynamic scenes using homogenized feature extraction and object detection method. Sensors, 23.","DOI":"10.3390\/s23094241"},{"key":"ref_22","unstructured":"(2021, October 12). YOLO-V5. Available online: https:\/\/github.com\/ultralytics\/yolov5\/releases."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wang, Y., Bu, H., Zhang, X., and Cheng, J. (2022). YPD-SLAM: A real-time VSLAM system for handling dynamic indoor environments. Sensors, 22.","DOI":"10.3390\/s22218561"},{"key":"ref_24","first-page":"7501012","article-title":"SG-SLAM: A real-time RGB-D visual SLAM toward dynamic scenes with semantic and geometric information","volume":"72","author":"Cheng","year":"2022","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhao, X., and Ye, L. (2022, January 7\u201310). Object Detection-Based Visual SLAM for Dynamic Scenes. 
Proceedings of the 2022 IEEE International Conference on Mechatronics and Automation (ICMA), Guilin, China.","DOI":"10.1109\/ICMA54519.2022.9856202"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"87754","DOI":"10.1109\/ACCESS.2022.3199350","article-title":"Real-time dynamic SLAM algorithm based on deep learning","volume":"10","author":"Su","year":"2022","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"4076","DOI":"10.1109\/LRA.2018.2860039","article-title":"DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes","volume":"3","author":"Bescos","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"10818","DOI":"10.1109\/JSEN.2022.3169340","article-title":"WF-SLAM: A robust VSLAM for dynamic scenarios via weighted features","volume":"22","author":"Zhong","year":"2022","journal-title":"IEEE Sens. J."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Sun, L., Wei, J., Su, S., and Wu, P. (2022). Solo-slam: A parallel semantic slam algorithm for dynamic scenes. Sensors, 22.","DOI":"10.3390\/s22186977"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"69636","DOI":"10.1109\/ACCESS.2022.3185766","article-title":"Visual SLAM based on semantic segmentation and geometric constraints for dynamic indoor environments","volume":"10","author":"Yang","year":"2022","journal-title":"IEEE Access"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Eslamian, A., and Ahmadzadeh, M.R. (2022, January 28\u201329). Det-SLAM: A Semantic Visual SLAM for Highly Dynamic Scenes using Detectron 2. Proceedings of the 2022 8th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS), Mazandaran, Iran.","DOI":"10.1109\/ICSPIS56952.2022.10043931"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Tian, Y.L., Xu, G.C., Li, J.X., and Sun, Y. (2022, January 28\u201330). Visual SLAM Based on YOLOX-S in Dynamic Scenes. 
Proceedings of the 2022 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML), Xi\u2019an, China.","DOI":"10.1109\/ICICML57342.2022.10009828"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"9573","DOI":"10.1109\/LRA.2022.3191193","article-title":"RGB-D inertial odometry for a resource-restricted robot in dynamic environments","volume":"7","author":"Liu","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, Y.I., Mikawa, M., and Fujisawa, M. (2022, January 12\u201313). FCH-SLAM: A SLAM Method for Dynamic Environments using Semantic Segmentation. Proceedings of the 2022 2nd International Conference on Image Processing and Robotics (ICIPRob), Colombo, Sri Lanka.","DOI":"10.1109\/ICIPRob54042.2022.9798717"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1016\/j.mechatronics.2017.12.002","article-title":"SLAM in dynamic environments via ML-RANSAC","volume":"49","author":"Bahraini","year":"2018","journal-title":"Mechatronics"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"166528","DOI":"10.1109\/ACCESS.2019.2952161","article-title":"SOF-SLAM: A semantic visual SLAM for dynamic environments","volume":"7","author":"Cui","year":"2019","journal-title":"IEEE Access"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"B\u00e2rsan, I.A., Liu, P., Pollefeys, M., and Geiger, A. (2018, January 21\u201325). Robust Dense Mapping for Large-Scale Dynamic Environments. 
Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8462974"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"20657","DOI":"10.1109\/JSEN.2021.3099511","article-title":"RS-SLAM: A robust semantic SLAM in dynamic environments based on RGB-D sensor","volume":"21","author":"Ran","year":"2021","journal-title":"IEEE Sens. J."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"21160","DOI":"10.1109\/ACCESS.2022.3154086","article-title":"Semantic SLAM based on improved DeepLabv3\u207a in dynamic scenarios","volume":"10","author":"Hu","year":"2022","journal-title":"IEEE Access"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1007\/s11370-021-00400-8","article-title":"An improved multi-object classification algorithm for visual SLAM under dynamic environment","volume":"15","author":"Wen","year":"2022","journal-title":"Intell. Serv. Robot."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"3947","DOI":"10.1109\/TMM.2021.3110667","article-title":"Multi-classes and motion properties for concurrent visual slam in dynamic environments","volume":"24","author":"Yang","year":"2021","journal-title":"IEEE Trans. Multimed."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Sturm, J., Engelhard, N., Endres, F., Burgard, W., and Cremers, D. (2012, January 7\u201312). A Benchmark for the Evaluation of RGB-D SLAM Systems. Proceedings of the 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal.","DOI":"10.1109\/IROS.2012.6385773"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Yu, C., Liu, Z., Liu, X.J., Xie, F., Yang, Y., Wei, Q., and Fei, Q. (2018, January 1\u20135). DS-SLAM: A Semantic Visual SLAM Towards Dynamic Environments. 
Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593691"},{"key":"ref_45","first-page":"1","article-title":"Fixation Prediction through Multimodal Analysis","volume":"Volume 13","author":"Min","year":"2016","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"3805","DOI":"10.1109\/TIP.2020.2966082","article-title":"A multimodal saliency model for videos with high audio-visual correspondence","volume":"29","author":"Min","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1882","DOI":"10.1109\/TIP.2023.3251695","article-title":"Attention-Guided Neural Networks for Full-Reference and No-Reference Audio-Visual Quality Assessment","volume":"32","author":"Cao","year":"2023","journal-title":"IEEE Trans. Image Process."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1109\/MSP.2018.2885359","article-title":"Protecting water infrastructure from cyber and physical threats: Using multimodal data fusion and adaptive deep learning to monitor critical systems","volume":"36","author":"Bakalos","year":"2019","journal-title":"IEEE Signal Process. 
Mag."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/23\/9592\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:36:59Z","timestamp":1760132219000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/23\/9592"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,3]]},"references-count":48,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["s23239592"],"URL":"https:\/\/doi.org\/10.3390\/s23239592","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,3]]}}}