{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T18:32:38Z","timestamp":1770748358265,"version":"3.49.0"},"reference-count":61,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2022,4,23]],"date-time":"2022-04-23T00:00:00Z","timestamp":1650672000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2018YFB2100503"],"award-info":[{"award-number":["2018YFB2100503"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>In this paper, we present a challenging stereo-inertial dataset collected onboard a sports utility vehicle (SUV) for the tasks of visual-inertial odometry (VIO), simultaneous localization and mapping (SLAM), autonomous driving, object detection, and other computer vision techniques. We recorded a large set of time-synchronized stereo image sequences (2 \u00d7 1280 \u00d7 720 @ 30 fps RGB) and corresponding inertial measurement unit (IMU) readings (400 Hz) from a Stereolabs ZED2 camera, along with centimeter-level-accurate six-degree-of-freedom ground truth (100 Hz) from a u-blox GNSS-IMU navigation device with real-time kinematic correction signals. The dataset comprises 34 sequences recorded during November 2020 in Wuhan, the largest city of Central China. Further, the dataset contains abundant unique urban scenes and features of a complex modern metropolis, which have rarely appeared in previously released benchmarks. 
Results from milestone VIO\/SLAM algorithms reveal that methods exhibiting excellent performance on established datasets such as KITTI and EuRoC perform unsatisfactorily when moved out of the laboratory into the real world. We expect our dataset to help overcome this limitation by providing more challenging and diverse scenarios to the research community. The full dataset with raw and calibrated data is publicly available along with a lightweight MATLAB\/Python toolbox for preprocessing and evaluation. The dataset can be downloaded in its entirety from the uniform resource locator (URL) we provide in the main text.<\/jats:p>","DOI":"10.3390\/rs14092033","type":"journal-article","created":{"date-parts":[[2022,4,24]],"date-time":"2022-04-24T00:45:21Z","timestamp":1650761121000},"page":"2033","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["WHUVID: A Large-Scale Stereo-IMU Dataset for Visual-Inertial Odometry and Autonomous Driving in Chinese Urban Scenarios"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4923-8250","authenticated-orcid":false,"given":"Tianyang","family":"Chen","sequence":"first","affiliation":[{"name":"Electronic Information School, Wuhan University, Wuhan 430072, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1490-0347","authenticated-orcid":false,"given":"Fangling","family":"Pu","sequence":"additional","affiliation":[{"name":"Electronic Information School, Wuhan University, Wuhan 430072, China"}]},{"given":"Hongjia","family":"Chen","sequence":"additional","affiliation":[{"name":"Electronic Information School, Wuhan University, Wuhan 430072, China"}]},{"given":"Zhihong","family":"Liu","sequence":"additional","affiliation":[{"name":"Electronic Information School, Wuhan University, Wuhan 430072, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2022,4,23]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1177\/0278364911398404","article-title":"The UTIAS multi-robot cooperative localization and mapping dataset","volume":"30","author":"Leung","year":"2011","journal-title":"Int. J. Rob. Res."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Chen, D.M., Baatz, G., K\u00f6ser, K., Tsai, S.S., Vedantham, R., Pylv\u00e4n\u00e4inen, T., Roimela, K., Chen, X., Bach, J., and Pollefeys, M. (2011, January 20\u201325). City-scale landmark identification on mobile devices. Proceedings of the CVPR 2011, Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995610"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Milford, M.J., and Wyeth, G.F. (2012, January 11\u201314). SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights. Proceedings of the 2012 IEEE International Conference on Robotics and Automation, Guangzhou, China.","DOI":"10.1109\/ICRA.2012.6224623"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Guzm\u00e1n, R., Hayet, J.B., and Klette, R. (2015). Towards ubiquitous autonomous driving: The CCSAD dataset. International Conference on Computer Analysis of Images and Patterns, Springer.","DOI":"10.1007\/978-3-319-23192-1_49"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, January 27\u201330). The cityscapes dataset for semantic urban scene understanding. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.350"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1023","DOI":"10.1177\/0278364915614638","article-title":"University of Michigan North Campus long-term vision and lidar dataset","volume":"35","author":"Ushani","year":"2016","journal-title":"Int. J. Rob. Res."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Jung, H., Oto, Y., Mozos, O.M., Iwashita, Y., and Kurazume, R. (2016, January 9\u201314). Multi-modal panoramic 3D outdoor datasets for place categorization. Proceedings of the 2016 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759669"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1543","DOI":"10.1177\/0278364911400640","article-title":"Ford campus vision and lidar data set","volume":"30","author":"Pandey","year":"2011","journal-title":"Int. J. Rob. Res."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets robotics: The kitti dataset","volume":"32","author":"Geiger","year":"2013","journal-title":"Int. J. Rob. Res."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for autonomous driving? The kitti vision benchmark suite. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Sturm, J., Engelhard, N., Endres, F., Burgard, W., and Cremers, D. (2012, January 7\u201312). A benchmark for the evaluation of RGB-D SLAM systems. 
Proceedings of the 2012 IEEE\/RSJ international conference on intelligent robots and systems, Faro, Portugal.","DOI":"10.1109\/IROS.2012.6385773"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2032","DOI":"10.1109\/LRA.2018.2800793","article-title":"The multivehicle stereo event camera dataset: An event camera dataset for 3D perception","volume":"3","author":"Zhu","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1045","DOI":"10.1177\/0278364917720510","article-title":"Agricultural robot dataset for plant classification, localization and mapping on sugar beet fields","volume":"36","author":"Chebrolu","year":"2017","journal-title":"Int. J. Rob. Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/0278364917737153","article-title":"The Katwijk beach planetary rover dataset","volume":"37","author":"Hewitt","year":"2018","journal-title":"Int. J. Rob. Res."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1157","DOI":"10.1177\/0278364915620033","article-title":"The EuRoC micro aerial vehicle datasets","volume":"35","author":"Burri","year":"2016","journal-title":"Int. J. Rob. Res."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1177\/0278364917702237","article-title":"The Zurich urban micro aerial vehicle dataset","volume":"36","author":"Majdik","year":"2017","journal-title":"Int. J. Rob. Res."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1549","DOI":"10.1177\/0278364919883346","article-title":"AQUALOC: An underwater dataset for visual\u2013inertial\u2013pressure localization","volume":"38","author":"Ferrera","year":"2019","journal-title":"Int. J. Rob. Res."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1247","DOI":"10.1177\/0278364917732838","article-title":"Underwater caves sonar data set","volume":"36","author":"Mallios","year":"2017","journal-title":"Int. J. Rob. 
Res."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Bender, A., Williams, S.B., and Pizarro, O. (2013, January 6\u201310). Autonomous exploration of large-scale benthic environments. Proceedings of the 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany.","DOI":"10.1109\/ICRA.2013.6630605"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2895","DOI":"10.1109\/TVCG.2018.2868533","article-title":"Collaborative large-scale dense 3d reconstruction with online inter-agent pose optimisation","volume":"24","author":"Golodetz","year":"2018","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Schubert, D., Goll, T., Demmel, N., Usenko, V., St\u00fcckler, J., and Cremers, D. (2018, January 1\u20135). The TUM VI benchmark for evaluating visual-inertial odometry. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593419"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"934","DOI":"10.1109\/TITS.2018.2791533","article-title":"KAIST multi-spectral day\/night data set for autonomous and assisted driving","volume":"19","author":"Choi","year":"2018","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"642","DOI":"10.1177\/0278364919843996","article-title":"Complex urban dataset with multi-level sensors from highly diverse urban environments","volume":"38","author":"Jeong","year":"2019","journal-title":"Int. J. Rob. Res."},{"key":"ref_24","first-page":"80","article-title":"Visual odometry: Part i: The first 30 years and fundamentals","volume":"18","author":"Fraundorfer","year":"2011","journal-title":"IEEE Robot. Autom. 
Mag."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1109\/MRA.2012.2182810","article-title":"Visual odometry: Part ii: Matching, robustness, optimization, and applications","volume":"19","author":"Fraundorfer","year":"2012","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1897","DOI":"10.1186\/s40064-016-3573-7","article-title":"Review of visual odometry: Types, approaches, challenges, and applications","volume":"5","author":"Aqel","year":"2016","journal-title":"Springerplus"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1109\/MRA.2006.1678144","article-title":"Simultaneous localization and mapping: Part I","volume":"13","author":"Bailey","year":"2006","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1109\/MRA.2006.1678144","article-title":"Simultaneous localization and mapping (SLAM): Part II","volume":"13","author":"Bailey","year":"2006","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.1109\/TRO.2016.2624754","article-title":"Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age","volume":"32","author":"Cadena","year":"2016","journal-title":"IEEE Trans. Robot."},{"key":"ref_30","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1109\/TRO.2015.2463671","article-title":"ORB-SLAM: A versatile and accurate monocular SLAM system","volume":"31","author":"Montiel","year":"2015","journal-title":"IEEE Trans. 
Robot."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1109\/TRO.2017.2705103","article-title":"Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras","volume":"33","year":"2017","journal-title":"IEEE Trans. Robot."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1004","DOI":"10.1109\/TRO.2018.2853729","article-title":"Vins-mono: A robust and versatile monocular visual-inertial state estimator","volume":"34","author":"Qin","year":"2018","journal-title":"IEEE Trans. Robot."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Qin, T., and Shen, S. (2018, January 1\u20135). Online temporal calibration for monocular visual-inertial systems. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593603"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1177\/0278364913507326","article-title":"The M\u00e1laga urban dataset: High-rate stereo and LiDAR in a realistic urban scenario","volume":"33","year":"2014","journal-title":"Int. J. Rob. Res."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/0278364916679498","article-title":"1 year, 1000 km: The Oxford RobotCar dataset","volume":"36","author":"Maddern","year":"2017","journal-title":"Int. J. Rob. Res."},{"key":"ref_37","unstructured":"Walter, L., Oliver, B., Oliver, B., Karol, P., Pierre, Y., and Max, P. (2018, May 22). A Platform-Agnostic Camera and Sensor Capture API for the ZED Stereo Camera Family. Available online: https:\/\/github.com\/stereolabs\/zed-open-capture."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Grafarend, E. (1995). The optimal universal transverse Mercator projection. Geodetic Theory Today, Springer.","DOI":"10.1007\/978-3-642-79824-5_13"},{"key":"ref_39","unstructured":"Woodman, O.J. (2007). An introduction to inertial navigation. 
Technical Report UCAM-CL-TR-696, University of Cambridge, Computer Laboratory."},{"key":"ref_40","unstructured":"Zhang, Z. (1999, January 20\u201325). Flexible camera calibration by viewing a plane from unknown orientations. Proceedings of the seventh IEEE international conference on computer vision, Washington, DC, USA."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Rehder, J., Nikolic, J., Schneider, T., Hinzmann, T., and Siegwart, R. (2016, January 16\u201321). Extending kalibr: Calibrating the extrinsics of multiple IMUs and of individual axes. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487628"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Furgale, P., Rehder, J., and Siegwart, R. (2013, January 3\u20137). Unified temporal and spatial calibration for multi-sensor systems. Proceedings of the 2013 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan.","DOI":"10.1109\/IROS.2013.6696514"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Furgale, P., Barfoot, T.D., and Sibley, G. (2012, January 11\u201314). Continuous-time batch estimation using temporal basis functions. Proceedings of the 2012 IEEE International Conference on Robotics and Automation, Guangzhou, China.","DOI":"10.1109\/ICRA.2012.6225005"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Maye, J., Furgale, P., and Siegwart, R. (2013, January 23\u201326). Self-supervised calibration for robotic systems. Proceedings of the 2013 IEEE Intelligent Vehicles Symposium (IV), Gold Coast City, Australia.","DOI":"10.1109\/IVS.2013.6629513"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Oth, L., Furgale, P., Kneip, L., and Siegwart, R. (2013, January 23\u201328). Rolling shutter camera calibration. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.179"},{"key":"ref_46","unstructured":"Grupp, M. (2017, September 14). Evo: Python Package for the Evaluation of Odometry and SLAM. Available online: https:\/\/github.com\/MichaelGrupp\/evo."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Klein, G., and Murray, D. (2007, January 13\u201316). Parallel tracking and mapping for small AR workspaces. Proceedings of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, Nara, Japan.","DOI":"10.1109\/ISMAR.2007.4538852"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"1052","DOI":"10.1109\/TPAMI.2007.1049","article-title":"MonoSLAM: Real-time single camera SLAM","volume":"29","author":"Davison","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Newcombe, R.A., Lovegrove, S.J., and Davison, A.J. (2011, January 6\u201313). DTAM: Dense tracking and mapping in real-time. Proceedings of the 2011 IEEE international conference on computer vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126513"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Kerl, C., Sturm, J., and Cremers, D. (2013, January 3\u20137). Dense visual SLAM for RGB-D cameras. Proceedings of the 2013 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan.","DOI":"10.1109\/IROS.2013.6696650"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Engel, J., Sch\u00f6ps, T., and Cremers, D. (2014). LSD-SLAM: Large-scale direct monocular SLAM. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10605-2_54"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Forster, C., Pizzoli, M., and Scaramuzza, D. (June, January 31). SVO: Fast semi-direct monocular visual odometry. 
Proceedings of the 2014 IEEE international conference on robotics and automation (ICRA), Hong Kong, China.","DOI":"10.1109\/ICRA.2014.6906584"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1177\/0278364914554813","article-title":"Keyframe-based visual\u2013inertial odometry using nonlinear optimization","volume":"34","author":"Leutenegger","year":"2015","journal-title":"Int. J. Rob. Res."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Bloesch, M., Omari, S., Hutter, M., and Siegwart, R. (October, January 28). Robust visual inertial odometry using a direct EKF-based approach. Proceedings of the 2015 IEEE\/RSJ international conference on intelligent robots and systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7353389"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Zihao Zhu, A., Atanasov, N., and Daniilidis, K. (2017, January 21\u201326). Event-based visual inertial odometry. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.616"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Yu, C., Liu, Z., Liu, X.J., Xie, F., Yang, Y., Wei, Q., and Fei, Q. (2018, January 1\u20135). DS-SLAM: A semantic visual SLAM towards dynamic environments. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593691"},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"4076","DOI":"10.1109\/LRA.2018.2860039","article-title":"DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes","volume":"3","author":"Bescos","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Henein, M., Zhang, J., Mahony, R., and Ila, V. (August, January 31). Dynamic SLAM: The need for speed. 
Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9196895"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Nair, G.B., Daga, S., Sajnani, R., Ramesh, A., Ansari, J.A., Jatavallabhula, K.M., and Krishna, K.M. (November, January 19). Multi-object monocular SLAM for dynamic environments. Proceedings of the 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, NV, USA.","DOI":"10.1109\/IV47402.2020.9304648"},{"key":"ref_60","unstructured":"Zhang, J., Henein, M., Mahony, R., and Ila, V. (2020). VDO-SLAM: A visual dynamic object-aware SLAM system. arXiv."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"1874","DOI":"10.1109\/TRO.2021.3075644","article-title":"Orb-slam3: An accurate open-source library for visual, visual\u2013inertial, and multimap slam","volume":"37","author":"Campos","year":"2021","journal-title":"IEEE Trans. Robot."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/9\/2033\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:59:22Z","timestamp":1760137162000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/9\/2033"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,23]]},"references-count":61,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2022,5]]}},"alternative-id":["rs14092033"],"URL":"https:\/\/doi.org\/10.3390\/rs14092033","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,23]]}}}