{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T05:56:08Z","timestamp":1775109368144,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2020,9,9]],"date-time":"2020-09-09T00:00:00Z","timestamp":1599609600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["91748101"],"award-info":[{"award-number":["91748101"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Indoor service robots need to build an object-centric semantic map to understand and execute human instructions. Conventional visual simultaneous localization and mapping (SLAM) systems build a map using geometric features such as points, lines, and planes as landmarks. However, they lack a semantic understanding of the environment. This paper proposes an object-level semantic SLAM algorithm based on RGB-D data, which uses a quadric surface as an object model to compactly represent the object\u2019s position, orientation, and shape. This paper proposes and derives two types of RGB-D camera-quadric observation models: a complete model and a partial model. The complete model combines object detection and point cloud data to estimate a complete ellipsoid in a single RGB-D frame. The partial model is activated when the depth data is severely missing because of illuminations or occlusions, which uses bounding boxes from object detection to constrain objects. Compared with the state-of-the-art quadric SLAM algorithms that use a monocular observation model, the RGB-D observation model reduces the requirements of the observation number and viewing angle changes, which helps improve the accuracy and robustness. This paper introduces a nonparametric pose graph to solve data associations in the back end, and innovatively applies it to the quadric surface model. We thoroughly evaluated the algorithm on two public datasets and an author-collected mobile robot dataset in a home-like environment. We obtained obvious improvements on the localization accuracy and mapping effects compared with two state-of-the-art object SLAM algorithms.<\/jats:p>","DOI":"10.3390\/s20185150","type":"journal-article","created":{"date-parts":[[2020,9,10]],"date-time":"2020-09-10T09:10:09Z","timestamp":1599729009000},"page":"5150","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["RGB-D Object SLAM Using Quadrics for Indoor Environments"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4304-6020","authenticated-orcid":false,"given":"Ziwei","family":"Liao","sequence":"first","affiliation":[{"name":"Robotics Institute, Beihang University, Beijing 100191, China"}]},{"given":"Wei","family":"Wang","sequence":"additional","affiliation":[{"name":"Robotics Institute, Beihang University, Beijing 100191, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6636-374X","authenticated-orcid":false,"given":"Xianyu","family":"Qi","sequence":"additional","affiliation":[{"name":"Robotics Institute, Beihang University, Beijing 100191, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8674-3790","authenticated-orcid":false,"given":"Xiaoyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Robotics Institute, Beihang University, Beijing 100191, China"}]}],"member":"1968","published-online":{"date-parts":[[2020,9,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.1109\/TRO.2016.2624754","article-title":"Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age","volume":"32","author":"Cadena","year":"2016","journal-title":"IEEE Trans. Robot."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"3033","DOI":"10.3390\/s140203033","article-title":"Performance of Global-Appearance Descriptors in Map Building and Localization Using Omnidirectional Vision","volume":"14","author":"Paya","year":"2014","journal-title":"Sensors"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Paya, L., Peidr\u00f3, A., Amoros, F., Valiente, D., and Reinoso, O. (2018). Modeling Environments Hierarchically with Omnidirectional Imaging and Global-Appearance Descriptors. Remote. Sens., 10.","DOI":"10.3390\/rs10040522"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"146","DOI":"10.1109\/TRO.2012.2220211","article-title":"Localization in Urban Environments Using a Panoramic Gist Descriptor","volume":"29","author":"Murillo","year":"2012","journal-title":"IEEE Trans. Robot."},{"key":"ref_5","unstructured":"Shi, X., Li, D., Zhao, P., Tian, Q., Tian, Y., Long, Q., Zhu, C., Song, J., Qiao, F., and Song, L. (August, January 31). Are We Ready for Service Robots? The OpenLORIS-Scene Datasets for Lifelong SLAM. Proceedings of the International Conference on Robotics and Automation (ICRA), Paris, France."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Rublee, E., Rabaud, V., Konolige, K., and Bradski, G.R. (2011, January 25\u201327). ORB: An efficient alternative to SIFT or SURF. Proceedings of the 2011 International Conference on Computer Vision, Tokyo, Japan.","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1109\/TPAMI.2017.2658577","article-title":"Direct Sparse Odometry","volume":"40","author":"Engel","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Engel, J., Sch\u00f6ps, T., and Cremers, D. (2014). LSD-SLAM: Large-Scale Direct Monocular SLAM. Haptics: Science, Technology, Applications, Springer Science and Business Media LLC.","DOI":"10.1007\/978-3-319-10605-2_54"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, X., Wang, W., Qi, X., Liao, Z., and Wei, R. (2019). Point-Plane SLAM Using Supposed Planes for Indoor Environments. Sensors, 19.","DOI":"10.3390\/s19173795"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Kaess, M. (2015, January 26\u201330). Simultaneous localization and mapping with infinite planes. Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA.","DOI":"10.1109\/ICRA.2015.7139837"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1364","DOI":"10.1109\/TVT.2015.2388780","article-title":"StructSLAM: Visual Slam with Building Structure Lines","volume":"64","author":"Zhou","year":"2015","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Hartley, R., and Zisserman, A. (2004). Multiple View Geometry in Computer Vision, Cambridge University Press (CUP).","DOI":"10.1017\/CBO9780511811685"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/LRA.2018.2866205","article-title":"QuadricSLAM: Dual Quadrics from Object Detections as Landmarks in Object-Oriented SLAM","volume":"4","author":"Nicholson","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ok, K., Liu, K., Frey, K., How, J.P., and Roy, N. (2019, January 20\u201324). Robust Object-based SLAM for High-speed Autonomous Navigation. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8794344"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Mu, B., Liu, S.-Y., Paull, L., Leonard, J., and How, J.P. (2016, January 9\u201314). SLAM with objects using a nonparametric pose graph. Proceedings of the 2016 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759677"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Handa, A., Whelan, T., McDonald, J., Davison, A.J., and Handa, A. (June, January 31). A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM. Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China.","DOI":"10.1109\/ICRA.2014.6907054"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Sturm, J., Engelhard, N., Endres, F., Burgard, W., and Cremers, D. (2012, January 7\u201312). A benchmark for the evaluation of RGB-D SLAM systems. Proceedings of the 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Portugal.","DOI":"10.1109\/IROS.2012.6385773"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1109\/TRO.2017.2705103","article-title":"ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras","volume":"33","author":"Tardos","year":"2017","journal-title":"IEEE Trans. Robot."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Salas-Moreno, R.F., Newcombe, R.A., Strasdat, H., Kelly, P.H., and Davison, A.J. (2013, January 23\u201328). SLAM++: Simultaneous Localisation and Mapping at the Level of Objects. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.178"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Runz, M., Buffier, M., and Agapito, L. (2018, January 16\u201320). MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects. Proceedings of the 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Munich, Germany.","DOI":"10.1109\/ISMAR.2018.00024"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"McCormac, J., Clark, R., Bloesch, M., Davison, A., and Leutenegger, S. (2018, January 5\u20138). Fusion++: Volumetric Object-Level SLAM. Proceedings of the 2018 International Conference on 3D Vision (3DV), Verona, Italy.","DOI":"10.1109\/3DV.2018.00015"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Xu, B., Li, W., Tzoumanikas, D., Bloesch, M., Davison, A., and Leutenegger, S. (2019, January 20\u201324). MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8794371"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Rosinol, A., Abate, M., Chang, Y., and Carlone, L. (2019). Kimera: An Open-Source Library for Real-Time Metric-Semantic Localization and Mapping. arXiv.","DOI":"10.1109\/ICRA40945.2020.9196885"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1109\/TRO.2019.2909168","article-title":"CubeSLAM: Monocular 3-D Object SLAM","volume":"35","author":"Yang","year":"2019","journal-title":"IEEE Trans. Robot."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TPAMI.2017.2701373","article-title":"3D Object Localisation from Multi-view Image Detections","volume":"40","author":"Rubino","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Crocco, M., Rubino, C., and Del Bue, A. (2016, January 27\u201330). Structure from Motion with Objects. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.449"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gay, P., Bansal, V., Rubino, C., and Del Bue, A. (2017, January 22\u201329). Probabilistic Structure from Motion with Objects (PSfMO). Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.334"},{"key":"ref_31","unstructured":"Jablonsky, N., Milford, M., and S\u00fcnderhauf, N. (2018). An Orientation Factor for Object-Oriented SLAM. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hosseinzadeh, M., Latif, Y., Pham, T., Suenderhauf, N., and Reid, I. (2019). Structure Aware SLAM Using Quadrics and Planes. Proceedings of the Haptics: Science, Technology, Applications, Springer Science and Business Media LLC.","DOI":"10.1007\/978-3-030-20893-6_26"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Gaudilliere, V., Simon, G., and Berger, M.-O. (2019, January 10\u201318). Camera Relocalization with Ellipsoidal Abstraction of Objects. Proceedings of the 2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Beijing, China.","DOI":"10.1109\/ISMAR.2019.00017"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Hosseinzadeh, M., Li, K., Latif, Y., and Reid, I. (2019, January 20\u201324). Real-Time Monocular Object-Model Aware Sparse SLAM. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793728"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Schiebener, D., Schmidt, A., Vahrenkamp, N., and Asfour, T. (2016, January 9\u201314). Heuristic 3D object shape completion based on symmetry and scene context. Proceedings of the 2016 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759037"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Vezzani, G., Pattacini, U., and Natale, L. (June, January 29). A grasping approach based on superquadric models. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989187"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Makhal, A., Thomas, F., and Perez-Gracia, A. (February, January 31). Grasping Unknown Objects in Clutter by Superquadric Representation. Proceedings of the 2018 Second IEEE International Conference on Robotic Computing (IRC), Laguna Hills, CA, USA.","DOI":"10.1109\/IRC.2018.00062"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Vezzani, G., Pattacini, U., Pasquale, G., and Natale, L. (2018, January 21\u201325). Improving Superquadric Modeling and Grasping with Prior on Object Shapes. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8463161"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Iqbal, A., and Gans, N.R. (2020). Data Association and Localization of Classified Objects in Visual SLAM. J. Intell. Robot. Syst., 1\u201318.","DOI":"10.1007\/s10846-020-01189-x"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Bowman, S.L., Atanasov, N., Daniilidis, K., and Pappas, G.J. (June, January 29). Probabilistic data association for semantic SLAM. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989203"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Doherty, K., Fourie, D., and Leonard, J. (2019, January 20\u201324). Multimodal Semantic SLAM with Probabilistic Data Association. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8794244"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Coughlan, J., and Yuille, A. (1999, January 20\u201327). Manhattan World: Compass direction from a single image by Bayesian inference. Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece.","DOI":"10.1109\/ICCV.1999.790349"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"K\u0131rcal\u0131, D., Tek, F.B., and Kircali, D. (2014). Ground Plane Detection Using an RGB-D Sensor. Information Sciences and Systems 2014, Springer Science and Business Media LLC.","DOI":"10.1007\/978-3-319-09465-6_8"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014). Microsoft COCO: Common Objects in Context. Haptics: Science, Technology, Applications, Proceedings of the 13th European Conference, Zurich, Switzerland, 6\u201312 September 2014, Springer Science and Business Media LLC.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Whelan, T., Leutenegger, S., Moreno, R.S., Glocker, B., and Davison, A. (2015, January 13\u201317). ElasticFusion: Dense SLAM without A Pose Graph. Proceedings of the Robotics: Science and Systems Foundation, Rome, Italy.","DOI":"10.15607\/RSS.2015.XI.001"},{"key":"ref_46","unstructured":"Grisetti, G., K\u00fcmmerle, R., Strasdat, H., and Konolige, K. (2011, January 9\u201313). g2o: A general Framework for (Hyper) Graph Optimization. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Rusu, R.B., and Cousins, S. (2011, January 9\u201313). 3D is here: Point Cloud Library (PCL). Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China.","DOI":"10.1109\/ICRA.2011.5980567"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"8714","DOI":"10.1109\/ACCESS.2018.2801813","article-title":"Fast and Lightweight Object Detection Network: Detection and Recognition on Resource Constrained Devices","volume":"6","author":"Oliveira","year":"2018","journal-title":"IEEE Access"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/18\/5150\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:08:28Z","timestamp":1760177308000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/18\/5150"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,9]]},"references-count":48,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["s20185150"],"URL":"https:\/\/doi.org\/10.3390\/s20185150","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,9]]}}}