{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:12:40Z","timestamp":1760209960963,"version":"build-2065373602"},"reference-count":44,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2017,10,1]],"date-time":"2017-10-01T00:00:00Z","timestamp":1506816000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Given a stream of depth images with a known cuboid reference object present in the scene, we propose a novel approach for accurate camera tracking and volumetric surface reconstruction in real-time. Our contribution in this paper is threefold: (a) utilizing a priori knowledge of the precisely manufactured cuboid reference object, we keep drift-free camera tracking without explicit global optimization; (b) we improve the fineness of the volumetric surface representation by proposing a prediction-corrected data fusion strategy rather than a simple moving average, which enables accurate reconstruction of high-frequency details such as the sharp edges of objects and geometries of high curvature; (c) we introduce a benchmark dataset CU3D that contains both synthetic and real-world scanning sequences with ground-truth camera trajectories and surface models for the quantitative evaluation of 3D reconstruction algorithms. We test our algorithm on our dataset and demonstrate its accuracy compared with other state-of-the-art algorithms. 
We release both our dataset and code as open-source (https:\/\/github.com\/zhangxaochen\/CuFusion) for other researchers to reproduce and verify our results.<\/jats:p>","DOI":"10.3390\/s17102260","type":"journal-article","created":{"date-parts":[[2017,10,2]],"date-time":"2017-10-02T13:10:05Z","timestamp":1506949805000},"page":"2260","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["CuFusion: Accurate Real-Time Camera Tracking and Volumetric Scene Reconstruction with a Cuboid"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5250-4962","authenticated-orcid":false,"given":"Chen","family":"Zhang","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China"}]},{"given":"Yu","family":"Hu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China"}]}],"member":"1968","published-online":{"date-parts":[[2017,10,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Newcombe, R.A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A.J., Kohli, P., Shotton, J., Hodges, S., and Fitzgibbon, A. (2011, January 26\u201329). KinectFusion: Real-time Dense Surface Mapping and Tracking. Proceedings of the 2011 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR \u201911), Basel, Switzerland.","DOI":"10.1109\/ISMAR.2011.6162880"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1109\/34.121791","article-title":"A method for registration of 3-D shapes","volume":"14","author":"Besl","year":"1992","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Curless, B., and Levoy, M. (1996, January 4\u20139). A Volumetric Method for Building Complex Models from Range Images. 
Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH \u201996), New Orleans, LA, USA.","DOI":"10.1145\/237170.237269"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Lorensen, W.E., and Cline, H.E. (1987, January 27\u201331). Marching Cubes: A High Resolution 3D Surface Construction Algorithm. Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH\u201987), Anaheim, CA, USA.","DOI":"10.1145\/37401.37422"},{"key":"ref_5","unstructured":"Rusinkiewicz, S., and Levoy, M. (June, January 28). Efficient variants of the ICP algorithm. Proceedings of the Third International Conference on 3-D Digital Imaging and Modeling, Quebec City, QC, Canada."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Hernandez, C., Vogiatzis, G., and Cipolla, R. (2007, January 17\u201322). Probabilistic visibility for multi-view stereo. Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383193"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1109\/TRO.2013.2279412","article-title":"3-D Mapping with an RGB-D Camera","volume":"30","author":"Endres","year":"2014","journal-title":"IEEE Trans. Robot."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1016\/S0924-2716(99)00008-8","article-title":"Processing of laser scanner data\u2014Algorithms and applications","volume":"54","author":"Axelsson","year":"1999","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_9","first-page":"33","article-title":"Recognising structure in laser scanner point clouds","volume":"46","author":"Vosselman","year":"2004","journal-title":"Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Cui, Y., Schuon, S., Chan, D., Thrun, S., and Theobalt, C. (2010, January 13\u201318). 
3D shape scanning with a time-of-flight camera. Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5540082"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1109\/3.910448","article-title":"Solid-state time-of-flight range camera","volume":"37","author":"Lange","year":"2001","journal-title":"IEEE J. Quantum Electron."},{"key":"ref_12","unstructured":"Scharstein, D., and Szeliski, R. (2003, January 16\u201322). High-accuracy stereo depth maps using structured light. Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Madison, WI, USA."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1177\/0278364911434148","article-title":"RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments","volume":"31","author":"Henry","year":"2012","journal-title":"Int. J. Robot. Res."},{"key":"ref_14","unstructured":"Segal, A., Haehnel, D., and Thrun, S. (July, January 28). Generalized-ICP. Proceedings of the Robotics: Science and Systems Conference, Seattle, WA, USA."},{"key":"ref_15","unstructured":"Nist\u00e9r, D., Naroditsky, O., and Bergen, J. (July, January 27). Visual odometry. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), Washington, DC, USA."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Endres, F., Hess, J., Engelhard, N., Sturm, J., Cremers, D., and Burgard, W. (2012, January 14\u201318). An evaluation of the RGB-D SLAM system. Proceedings of the 2012 IEEE International Conference on Robotics and Automation (ICRA), St. Paul, MN, USA.","DOI":"10.1109\/ICRA.2012.6225199"},{"key":"ref_17","unstructured":"Whelan, T., Kaess, M., Fallon, M., Johannsson, H., Leonard, J., and McDonald, J. (2017, September 30). Kintinuous: Spatially Extended Kinectfusion. 
Available online: https:\/\/dspace.mit.edu\/handle\/1721.1\/71756."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1177\/0278364914551008","article-title":"Real-time large-scale dense RGB-D SLAM with volumetric fusion","volume":"34","author":"Whelan","year":"2015","journal-title":"Int. J. Robot. Res."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kerl, C., Sturm, J., and Cremers, D. (2013, January 3\u20137). Dense visual SLAM for RGB-D cameras. Proceedings of the 2013 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, Japan.","DOI":"10.1109\/IROS.2013.6696650"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Whelan, T., Leutenegger, S., Moreno, R.S., Glocker, B., and Davison, A. (2015, January 13\u201317). ElasticFusion: Dense SLAM without A Pose Graph. Proceedings of the 2015 Robotics: Science and Systems, Rome, Italy.","DOI":"10.15607\/RSS.2015.XI.001"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Bose, L., and Richards, A. (2016, January 16\u201321). Fast depth edge detection and edge based RGB-D SLAM. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487265"},{"key":"ref_22","unstructured":"Choi, C., Trevor, A.J.B., and Christensen, H.I. (2013, January 3\u20137). RGB-D edge detection and edge-based registration. Proceedings of the 2013 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan."},{"key":"ref_23","unstructured":"Zhou, Q.-Y., and Koltun, V. (2015, January 7\u201312). Depth camera tracking with contour cues. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA."},{"key":"ref_24","first-page":"1","article-title":"Comprehensive Use of Curvature for Robust and Accurate Online Surface Reconstruction","volume":"PP","author":"Lefloch","year":"2017","journal-title":"IEEE Trans. 
Pattern Anal. Mach. Intell."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Pumarola, A., Vakhitov, A., Agudo, A., Sanfeliu, A., and Moreno-Noguer, F. (June, January 29). PL-SLAM: Real-Time Monocular Visual SLAM with Points and Lines. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989522"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Ma, L., Kerl, C., St\u00fcckler, J., and Cremers, D. (2016, January 16\u201321). CPA-SLAM: Consistent plane-model alignment for direct RGB-D SLAM. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487260"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1230","DOI":"10.1109\/TVCG.2015.2459831","article-title":"Structural modeling from depth images","volume":"21","author":"Nguyen","year":"2015","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Salas-Moreno, R.F., Glocker, B., Kelly, P.H., and Davison, A.J. (2014, January 10\u201312). Dense planar SLAM. Proceedings of the 2014 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Munich, Germany.","DOI":"10.1109\/ISMAR.2014.6948422"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Taguchi, Y., Jian, Y.-D., Ramalingam, S., and Feng, C. (2013, January 6\u201310). Point-plane SLAM for hand-held 3D sensors. Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany.","DOI":"10.1109\/ICRA.2013.6631318"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1007\/s10514-012-9321-0","article-title":"OctoMap: An efficient probabilistic 3D mapping framework based on octrees","volume":"34","author":"Hornung","year":"2013","journal-title":"Auton. 
Robots"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Keller, M., Lefloch, D., Lambers, M., Izadi, S., Weyrich, T., and Kolb, A. (July, January 29). Real-time 3D reconstruction in dynamic scenes using point-based fusion. Proceedings of the 2013 International Conference on 3DTV-Conference, Zurich, Switzerland.","DOI":"10.1109\/3DV.2013.9"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Serafin, J., and Grisetti, G. (October, January 28). NICP: Dense normal based point cloud registration. Proceedings of the 2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7353455"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"438","DOI":"10.1145\/566654.566600","article-title":"Real-time 3D model acquisition","volume":"21","author":"Rusinkiewicz","year":"2002","journal-title":"ACM Trans. Graph."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Weise, T., Wismer, T., Leibe, B., and Van Gool, L. (October, January 27). In-hand scanning with online loop closure. Proceedings of the 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), Kyoto, Japan.","DOI":"10.1109\/ICCVW.2009.5457479"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Pfister, H., Zwicker, M., Van Baar, J., and Gross, M. (2000, January 23\u201328). Surfels: Surface elements as rendering primitives. Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA.","DOI":"10.1145\/344779.344936"},{"key":"ref_36","unstructured":"Meister, S., Izadi, S., Kohli, P., H\u00e4mmerle, M., Rother, C., and Kondermann, D. (2012, January 7). When can we use kinectfusion for ground truth acquisition. Proceedings of the Workshop on Color-Depth Camera Fusion in Robotics, Algarve, Portugal."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Rusu, R.B., and Cousins, S. (2011, January 9\u201313). 
3D is here: Point Cloud Library (PCL). Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China.","DOI":"10.1109\/ICRA.2011.5980567"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Feng, C., Taguchi, Y., and Kamat, V.R. (June, January 31). Fast plane extraction in organized point clouds using agglomerative hierarchical clustering. Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China.","DOI":"10.1109\/ICRA.2014.6907776"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1437","DOI":"10.3390\/s120201437","article-title":"Accuracy and resolution of kinect depth data for indoor mapping applications","volume":"12","author":"Khoshelham","year":"2012","journal-title":"Sensors"},{"key":"ref_40","unstructured":"Low, K.-L. (2004). Linear Least-Squares Optimization for Point-to-Plane ICP Surface Registration, University of North Carolina."},{"key":"ref_41","unstructured":"(2017, July 18). The Stanford 3D Scanning Repository. Available online: http:\/\/graphics.stanford.edu\/data\/3Dscanrep\/."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Sturm, J., Engelhard, N., Endres, F., Burgard, W., and Cremers, D. (2012, January 7\u201312). A benchmark for the evaluation of RGB-D SLAM systems. Proceedings of the 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura-Algarve, Portugal.","DOI":"10.1109\/IROS.2012.6385773"},{"key":"ref_43","unstructured":"Handa, A., Whelan, T., McDonald, J.B., and Davison, A.J. (June, January 31). A Benchmark for RGB-D Visual Odometry, 3D Reconstruction and SLAM. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China."},{"key":"ref_44","unstructured":"(2017, July 19). CloudCompare\u2014Open Source Project. 
Available online: http:\/\/www.danielgm.net\/cc\/."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/17\/10\/2260\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T18:46:26Z","timestamp":1760208386000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/17\/10\/2260"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,10,1]]},"references-count":44,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2017,10]]}},"alternative-id":["s17102260"],"URL":"https:\/\/doi.org\/10.3390\/s17102260","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2017,10,1]]}}}