{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T05:24:17Z","timestamp":1778217857069,"version":"3.51.4"},"reference-count":82,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,10,13]],"date-time":"2024-10-13T00:00:00Z","timestamp":1728777600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,10,13]],"date-time":"2024-10-13T00:00:00Z","timestamp":1728777600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100005713","name":"Technische Universit\u00e4t M\u00fcnchen","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005713","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2025,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>In this paper, we present a novel visual SLAM and long-term localization benchmark for autonomous driving in challenging conditions based on the large-scale 4Seasons dataset. The proposed benchmark provides drastic appearance variations caused by seasonal changes and diverse weather and illumination conditions. While significant progress has been made in advancing visual SLAM on small-scale datasets with similar conditions, there is still a lack of unified benchmarks representative of real-world scenarios for autonomous driving. We introduce a new unified benchmark for jointly evaluating visual odometry, global place recognition, and map-based visual localization performance which is crucial to successfully enable autonomous driving in any condition. 
The data has been collected for more than one year, resulting in more than 300\u00a0km of recordings in nine different environments ranging from a multi-level parking garage to urban (including tunnels) to countryside and highway. We provide globally consistent reference poses with up to centimeter-level accuracy obtained from the fusion of direct stereo-inertial odometry with RTK GNSS. We evaluate the performance of several state-of-the-art visual odometry and visual localization baseline approaches on the benchmark and analyze their properties. The experimental results provide new insights into current approaches and show promising potential for future research. Our benchmark and evaluation protocols will be available at\u00a0<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/go.vision.in.tum.de\/4seasons\" ext-link-type=\"uri\">https:\/\/go.vision.in.tum.de\/4seasons<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s11263-024-02230-4","type":"journal-article","created":{"date-parts":[[2024,10,13]],"date-time":"2024-10-13T21:01:43Z","timestamp":1728853303000},"page":"1564-1586","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging 
Conditions"],"prefix":"10.1007","volume":"133","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1846-6701","authenticated-orcid":false,"given":"Patrick","family":"Wenzel","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nan","family":"Yang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rui","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Niclas","family":"Zeller","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Cremers","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,10,13]]},"reference":[{"key":"2230_CR1","doi-asserted-by":"crossref","unstructured":"Angeli, A., Filliat, D., Doncieux, S., et\u00a0al. (2008) Fast and incremental method for loop-closure detection using bags of visual words. IEEE Transactions on Robotics (T-RO) 24(5), 1027\u20131037","DOI":"10.1109\/TRO.2008.2004514"},{"key":"2230_CR2","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., & Zisserman, A. (2013). All about VLAD. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1578\u20131585.","DOI":"10.1109\/CVPR.2013.207"},{"key":"2230_CR3","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., Gronat, P., Torii, A., et\u00a0al. (2016). NetVLAD: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5297\u20135307.","DOI":"10.1109\/CVPR.2016.572"},{"key":"2230_CR4","doi-asserted-by":"crossref","unstructured":"Babenko, A., Slesarev, A., Chigorin, A., et\u00a0al. (2014). Neural codes for image retrieval. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 
584\u2013599","DOI":"10.1007\/978-3-319-10590-1_38"},{"key":"2230_CR5","doi-asserted-by":"crossref","unstructured":"Badino, H., Huber, D., & Kanade, T. (2011). Visual topometric localization. In: Proceedings of the IEEE Intelligent Vehicles Symposium (IV), pp. 794\u2013799.","DOI":"10.1109\/IVS.2011.5940504"},{"key":"2230_CR6","doi-asserted-by":"crossref","unstructured":"Barnes, D., Gadd, M., Murcutt, P., et\u00a0al. (2020). The oxford radar robotcar dataset: A radar extension to the oxford robotcar dataset. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)","DOI":"10.1109\/ICRA40945.2020.9196884"},{"key":"2230_CR7","doi-asserted-by":"crossref","unstructured":"Bijelic, M., Gruber, T., Mannan, F., et\u00a0al. (2020). Seeing through fog without seeing fog: Deep multimodal sensor fusion in unseen adverse weather. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","DOI":"10.1109\/CVPR42600.2020.01170"},{"issue":"2","key":"2230_CR8","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1177\/0278364913507326","volume":"33","author":"JL Blanco-Claraco","year":"2014","unstructured":"Blanco-Claraco, J. L., \u00c1ngel Moreno-Due\u00f1as, F., & Gonz\u00e1lez-Jim\u00e9nez, J. (2014). The M\u00e1laga urban dataset: High-rate stereo and LiDAR in a realistic urban scenario. International Journal of Robotics Research (IJRR), 33(2), 207\u2013214.","journal-title":"International Journal of Robotics Research (IJRR)"},{"issue":"10","key":"2230_CR9","doi-asserted-by":"publisher","first-page":"1157","DOI":"10.1177\/0278364915620033","volume":"35","author":"M Burri","year":"2016","unstructured":"Burri, M., Nikolic, J., Gohl, P., et al. (2016). The EuRoC micro aerial vehicle datasets. 
International Journal of Robotics Research (IJRR), 35(10), 1157\u20131163.","journal-title":"International Journal of Robotics Research (IJRR)"},{"key":"2230_CR10","doi-asserted-by":"crossref","unstructured":"Caesar, H., Bankiti, V., Lang, A.H., et\u00a0al. (2020). nuScenes: A multimodal dataset for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11,621\u201311,631","DOI":"10.1109\/CVPR42600.2020.01164"},{"key":"2230_CR11","doi-asserted-by":"crossref","unstructured":"Campos, C., Elvira, R., G\u00f3mez, J.J., et\u00a0al. (2020). ORB-SLAM3: An accurate open-source library for visual, visual-inertial and multi-map SLAM. In: arXiv preprint arXiv:2007.11898","DOI":"10.1109\/TRO.2021.3075644"},{"key":"2230_CR12","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., et\u00a0al. (2016). The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 3213\u20133223","DOI":"10.1109\/CVPR.2016.350"},{"key":"2230_CR13","doi-asserted-by":"crossref","unstructured":"DeTone, D., Malisiewicz, T., Rabinovich, A. (2018). SuperPoint: Self-supervised interest point detection and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp 224\u2013236","DOI":"10.1109\/CVPRW.2018.00060"},{"key":"2230_CR14","doi-asserted-by":"crossref","unstructured":"Diaz-Ruiz, C.A., Xia, Y., You, Y., et\u00a0al. (2022). Ithaca365: Dataset and driving perception under repeated and challenging weather conditions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 21,351\u201321,360","DOI":"10.1109\/CVPR52688.2022.02069"},{"key":"2230_CR15","doi-asserted-by":"crossref","unstructured":"Dusmanu, M., Rocco, I., Pajdla, T., et\u00a0al. (2019). D2-Net: A trainable CNN for joint detection and description of local features. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 8092\u20138101","DOI":"10.1109\/CVPR.2019.00828"},{"key":"2230_CR16","doi-asserted-by":"crossref","unstructured":"Engel, J., Sch\u00f6ps, T., Cremers, D. (2014). LSD-SLAM: Large-scale direct monocular SLAM. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 834\u2013849","DOI":"10.1007\/978-3-319-10605-2_54"},{"key":"2230_CR17","doi-asserted-by":"crossref","unstructured":"Engel, J., St\u00fcckler, J., Cremers, D. (2015). Large-scale direct SLAM with stereo cameras. In: Proceedings of the IEEE\/RSJ Conference on Intelligent Robots and Systems (IROS), pp 1935\u20131942","DOI":"10.1109\/IROS.2015.7353631"},{"key":"2230_CR18","unstructured":"Engel, J., Usenko, V., Cremers, D. (2016). A photometrically calibrated benchmark for monocular visual odometry. In: arXiv preprint arXiv:1607.02555"},{"issue":"3","key":"2230_CR19","doi-asserted-by":"publisher","first-page":"611","DOI":"10.1109\/TPAMI.2017.2658577","volume":"40","author":"J Engel","year":"2017","unstructured":"Engel, J., Koltun, V., & Cremers, D. (2017). Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 40(3), 611\u2013625.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"2230_CR20","doi-asserted-by":"publisher","first-page":"4842","DOI":"10.1109\/TIP.2022.3187565","volume":"31","author":"B Fan","year":"2022","unstructured":"Fan, B., Zhou, J., Feng, W., et al. (2022). Learning semantic-aware local features for long term visual localization. IEEE Transactions on Image Processing, 31, 4842\u20134855.","journal-title":"IEEE Transactions on Image Processing"},{"issue":"6","key":"2230_CR21","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1145\/358669.358692","volume":"24","author":"MA Fischler","year":"1981","unstructured":"Fischler, M. A., & Bolles, R. C. (1981). 
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381\u2013395.","journal-title":"Communications of the ACM"},{"key":"2230_CR22","doi-asserted-by":"crossref","unstructured":"G\u00e1lvez-L\u00f3pez, D., Tardos, J.D. (2012). Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics (T-RO) 28(5):1188\u20131197","DOI":"10.1109\/TRO.2012.2197158"},{"key":"2230_CR23","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., Urtasun, R. (2012). Are we ready for autonomous driving? the KITTI vision benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 3354\u20133361","DOI":"10.1109\/CVPR.2012.6248074"},{"issue":"11","key":"2230_CR24","doi-asserted-by":"publisher","first-page":"1231","DOI":"10.1177\/0278364913491297","volume":"32","author":"A Geiger","year":"2013","unstructured":"Geiger, A., Lenz, P., Stiller, C., et al. (2013). Vision meets robotics: The KITTI dataset. International Journal of Robotics Research (IJRR), 32(11), 1231\u20131237.","journal-title":"International Journal of Robotics Research (IJRR)"},{"key":"2230_CR25","doi-asserted-by":"crossref","unstructured":"Gordo, A., Almaz\u00e1n, J., Revaud, J., et\u00a0al. (2016). Deep image retrieval: Learning global representations for image search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 241\u2013257","DOI":"10.1007\/978-3-319-46466-4_15"},{"issue":"2","key":"2230_CR26","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1007\/s11263-017-1016-8","volume":"124","author":"A Gordo","year":"2017","unstructured":"Gordo, A., Almazan, J., Revaud, J., et al. (2017). End-to-end learning of deep visual representations for image retrieval. 
International Journal of Computer Vision (IJCV), 124(2), 237\u2013254.","journal-title":"International Journal of Computer Vision (IJCV)"},{"issue":"3","key":"2230_CR27","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1007\/s11263-012-0601-0","volume":"103","author":"R Hartley","year":"2013","unstructured":"Hartley, R., Trumpf, J., Dai, Y., et al. (2013). Rotation averaging. International Journal of Computer Vision (IJCV), 103(3), 267\u2013305.","journal-title":"International Journal of Computer Vision (IJCV)"},{"key":"2230_CR28","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., et\u00a0al. (2016). Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"2230_CR29","doi-asserted-by":"crossref","unstructured":"Hu, H., & de\u00a0Haan, G. (2006). Low cost robust blur estimator. In: Proceedings of the IEEE International Conference on Image Processing (ICIP), pp 617\u2013620","DOI":"10.1109\/ICIP.2006.312411"},{"key":"2230_CR30","doi-asserted-by":"crossref","unstructured":"Huang, X., Cheng, X., Geng, Q., et\u00a0al. (2018). The ApolloScape dataset for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp 954\u2013960","DOI":"10.1109\/CVPRW.2018.00141"},{"key":"2230_CR31","doi-asserted-by":"crossref","unstructured":"Jafarzadeh, A., Antequera, M.L., Gargallo, P., et\u00a0al. (2021). Crowddriven: A new challenging dataset for outdoor visual localization. In: Proceedings of the International Conference on Computer Vision (ICCV), pp 9845\u20139855","DOI":"10.1109\/ICCV48922.2021.00970"},{"key":"2230_CR32","doi-asserted-by":"crossref","unstructured":"Jaramillo, C. (2017). Direct multichannel tracking. 
In: Proceedings of the International Conference on 3D Vision (3DV), pp 347\u2013355","DOI":"10.1109\/3DV.2017.00047"},{"key":"2230_CR33","doi-asserted-by":"crossref","unstructured":"J\u00e9gou, H., Douze, M., Schmid, C., et\u00a0al. (2010). Aggregating local descriptors into a compact image representation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 3304\u20133311","DOI":"10.1109\/CVPR.2010.5540039"},{"key":"2230_CR34","unstructured":"Jung, E., Yang, N., Cremers, D. (2019). Multi-frame GAN: Image enhancement for stereo visual odometry in low light. In: Conference on Robot Learning (CoRL), pp 651\u2013660"},{"issue":"8","key":"2230_CR35","doi-asserted-by":"publisher","first-page":"1335","DOI":"10.1109\/TPAMI.2006.153","volume":"28","author":"J Kannala","year":"2006","unstructured":"Kannala, J., & Brandt, S. S. (2006). A generic camera model and calibration method for conventional, wide-angle, and fish-eye lenses. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28(8), 1335\u20131340.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"2230_CR36","doi-asserted-by":"crossref","unstructured":"Kendall, A., Grimes, M., Cipolla, R. (2015). Posenet: A convolutional network for real-time 6-dof camera relocalization. In: Proceedings of the International Conference on Computer Vision (ICCV), pp 2938\u20132946","DOI":"10.1109\/ICCV.2015.336"},{"key":"2230_CR37","unstructured":"Kenk, M.A., Hassaballah, M. (2020). Dawn: Vehicle detection in adverse weather nature. In: arXiv preprint arXiv:2008.05402"},{"key":"2230_CR38","unstructured":"Krizhevsky, A., Sutskever, I., Hinton, G.E. (2012). Imagenet classification with deep convolutional neural networks. In: Neural Information Processing Systems (NIPS), pp 1097\u20131105"},{"key":"2230_CR39","doi-asserted-by":"crossref","unstructured":"K\u00fcmmerle, R., Grisetti, G., Strasdat, H., et\u00a0al. (2011). 
g2o: A general framework for graph optimization. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp 3607\u20133613","DOI":"10.1109\/ICRA.2011.5979949"},{"issue":"2","key":"2230_CR40","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","volume":"60","author":"DG Lowe","year":"2004","unstructured":"Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV), 60(2), 91\u2013110.","journal-title":"International Journal of Computer Vision (IJCV)"},{"key":"2230_CR41","doi-asserted-by":"crossref","unstructured":"Lowry, S., S\u00fcnderhauf, N., Newman, P., et\u00a0al. (2015). Visual place recognition: A survey. IEEE Transactions on Robotics (T-RO) 32(1):1\u201319","DOI":"10.1109\/TRO.2015.2496823"},{"issue":"1","key":"2230_CR42","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1177\/0278364916679498","volume":"36","author":"W Maddern","year":"2017","unstructured":"Maddern, W., Pascoe, G., Linegar, C., et al. (2017). 1 year, 1000 km: The oxford robotcar dataset. International Journal of Robotics Research (IJRR), 36(1), 3\u201315.","journal-title":"International Journal of Robotics Research (IJRR)"},{"key":"2230_CR43","doi-asserted-by":"crossref","unstructured":"Mur-Artal, R., & Tard\u00f3s, J.D. (2017). ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics (T-RO) 33(5):1255\u20131262","DOI":"10.1109\/TRO.2017.2705103"},{"key":"2230_CR44","doi-asserted-by":"crossref","unstructured":"Mur-Artal, R., Montiel, J.M.M., & Tardos, J.D. (2015). ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics (T-RO) 31(5):1147\u20131163","DOI":"10.1109\/TRO.2015.2463671"},{"key":"2230_CR45","doi-asserted-by":"crossref","unstructured":"Newcombe, R.A., Lovegrove, S.J., & Davison, A.J. (2011). DTAM: dense tracking and mapping in real-time. 
In: Proceedings of the International Conference on Computer Vision (ICCV), pp 2320\u20132327","DOI":"10.1109\/ICCV.2011.6126513"},{"issue":"4\u20135","key":"2230_CR46","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1177\/0278364920979368","volume":"40","author":"M Pitropov","year":"2021","unstructured":"Pitropov, M., Garcia, D. E., Rebello, J., et al. (2021). Canadian adverse driving conditions dataset. International Journal of Robotics Research (IJRR), 40(4\u20135), 681\u2013690.","journal-title":"International Journal of Robotics Research (IJRR)"},{"key":"2230_CR47","doi-asserted-by":"crossref","unstructured":"Radenovi\u0107, F., Tolias, G., & Chum, O. (2016). CNN image retrieval learns from BoW: Unsupervised fine-tuning with hard examples. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 3\u201320","DOI":"10.1007\/978-3-319-46448-0_1"},{"issue":"7","key":"2230_CR48","doi-asserted-by":"publisher","first-page":"1655","DOI":"10.1109\/TPAMI.2018.2846566","volume":"41","author":"F Radenovi\u0107","year":"2018","unstructured":"Radenovi\u0107, F., Tolias, G., & Chum, O. (2018). Fine-tuning CNN image retrieval with no human annotation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 41(7), 1655\u20131668.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"2230_CR49","doi-asserted-by":"crossref","unstructured":"Rehder, J., Nikolic, J., Schneider, T., et\u00a0al. (2016). Extending kalibr: Calibrating the extrinsics of multiple IMUs and of individual axes. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp 4304\u20134311","DOI":"10.1109\/ICRA.2016.7487628"},{"key":"2230_CR50","doi-asserted-by":"crossref","unstructured":"Revaud, J., Almazan, J., Rezende, R., et\u00a0al. (2019a). Learning with average precision: Training image retrieval with a listwise loss. 
In: Proceedings of the International Conference on Computer Vision (ICCV), pp 5107\u20135116","DOI":"10.1109\/ICCV.2019.00521"},{"key":"2230_CR51","unstructured":"Revaud, J., Weinzaepfel, P., de\u00a0Souza, C.R., et\u00a0al. (2019b). R2D2: repeatable and reliable detector and descriptor. In: Neural Information Processing Systems (NeurIPS), pp 12,405\u201312,415"},{"key":"2230_CR52","doi-asserted-by":"crossref","unstructured":"Sakaridis, C., Dai, D., Van\u00a0Gool, L. (2021). ACDC: The adverse conditions dataset with correspondences for semantic driving scene understanding. In: Proceedings of the International Conference on Computer Vision (ICCV)","DOI":"10.1109\/ICCV48922.2021.01059"},{"key":"2230_CR53","doi-asserted-by":"crossref","unstructured":"Sarlin, P.E., Cadena, C., Siegwart, R., et\u00a0al. (2019). From coarse to fine: Robust hierarchical localization at large scale. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","DOI":"10.1109\/CVPR.2019.01300"},{"key":"2230_CR54","doi-asserted-by":"crossref","unstructured":"Sarlin, P.E., DeTone, D., Malisiewicz, T., et\u00a0al. (2020). SuperGlue: Learning feature matching with graph neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","DOI":"10.1109\/CVPR42600.2020.00499"},{"key":"2230_CR55","doi-asserted-by":"crossref","unstructured":"Sarlin, P.E., Dusmanu, M., Sch\u00f6nberger, J.L., et\u00a0al. (2022). Lamar: Benchmarking localization and mapping for augmented reality. In: Proceedings of the European Conference on Computer Vision (ECCV)","DOI":"10.1007\/978-3-031-20071-7_40"},{"key":"2230_CR56","doi-asserted-by":"crossref","unstructured":"Sattler, T., Weyand, T., Leibe, B., et\u00a0al. (2012). Image retrieval for image-based localization revisited. 
In: Proceedings of the British Machine Vision Conference (BMVC)","DOI":"10.5244\/C.26.76"},{"key":"2230_CR57","doi-asserted-by":"crossref","unstructured":"Sattler, T., Maddern, W., Toft, C., et\u00a0al. (2018). Benchmarking 6DOF outdoor visual localization in changing conditions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 8601\u20138610","DOI":"10.1109\/CVPR.2018.00897"},{"key":"2230_CR58","doi-asserted-by":"crossref","unstructured":"Sch\u00f6nberger, J.L., & Frahm, J.M. (2016). Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 4104\u20134113","DOI":"10.1109\/CVPR.2016.445"},{"key":"2230_CR59","doi-asserted-by":"crossref","unstructured":"Schonberger, J.L., Radenovic, F., Chum, O., et\u00a0al. (2015). From single image query to detailed 3D reconstruction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 5126\u20135134","DOI":"10.1109\/CVPR.2015.7299148"},{"key":"2230_CR60","doi-asserted-by":"crossref","unstructured":"Schubert, D., Goll, T., Demmel, N., et\u00a0al. (2018). The TUM VI benchmark for evaluating visual-inertial odometry. In: Proceedings of the IEEE\/RSJ Conference on Intelligent Robots and Systems (IROS), pp 1680\u20131687","DOI":"10.1109\/IROS.2018.8593419"},{"key":"2230_CR61","doi-asserted-by":"crossref","unstructured":"Sheeny, M., De\u00a0Pellegrin, E., Mukherjee, S., et\u00a0al. (2021). Radiate: A radar dataset for automotive perception in bad weather. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)","DOI":"10.1109\/ICRA48506.2021.9562089"},{"key":"2230_CR62","unstructured":"Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. 
In: Proceedings of the International Conference on Learning Representations (ICLR)"},{"key":"2230_CR63","doi-asserted-by":"crossref","unstructured":"Spencer, J., Bowden, R., & Hadfield, S. (2020). Same features, different day: Weakly supervised feature learning for seasonal invariance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 6459\u20136468","DOI":"10.1109\/CVPR42600.2020.00649"},{"key":"2230_CR64","doi-asserted-by":"crossref","unstructured":"von Stumberg, L., & Cremers, D. (2022). DM-VIO: Delayed marginalization visual-inertial odometry. IEEE Robotics and Automation Letters (RA-L) 7(2):1408\u20131415","DOI":"10.1109\/LRA.2021.3140129"},{"key":"2230_CR65","doi-asserted-by":"crossref","unstructured":"von Stumberg, L., Wenzel, P., Khan, Q., et\u00a0al. (2020). GN-Net: The gauss-newton loss for multi-weather relocalization. IEEE Robotics and Automation Letters (RA-L) 5(2):890\u2013897","DOI":"10.1109\/LRA.2020.2965031"},{"key":"2230_CR66","doi-asserted-by":"crossref","unstructured":"Sturm, J., Engelhard, N., Endres, F., et\u00a0al. (2012). A benchmark for the evaluation of RGB-D SLAM systems. In: Proceedings of the IEEE\/RSJ Conference on Intelligent Robots and Systems (IROS), pp 573\u2013580","DOI":"10.1109\/IROS.2012.6385773"},{"key":"2230_CR67","doi-asserted-by":"crossref","unstructured":"Sun, J., Shen, Z., Wang, Y., et\u00a0al. (2021). Loftr: Detector-free local feature matching with transformers. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 8922\u20138931","DOI":"10.1109\/CVPR46437.2021.00881"},{"key":"2230_CR68","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., et\u00a0al. (2015). Going deeper with convolutions. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 1\u20139","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"2230_CR69","doi-asserted-by":"crossref","unstructured":"Taira, H., Okutomi, M., Sattler, T., et\u00a0al. (2018). Inloc: Indoor visual localization with dense matching and view synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 7199\u20137209","DOI":"10.1109\/CVPR.2018.00752"},{"key":"2230_CR70","doi-asserted-by":"crossref","unstructured":"Teed, Z., Lipson, L., & Deng, J. (2023). Deep patch visual odometry. In: Neural Information Processing Systems (NeurIPS)","DOI":"10.1007\/978-3-031-72627-9_24"},{"key":"2230_CR71","unstructured":"Tolias, G., Sicre, R., & J\u00e9gou, H. (2015). Particular object retrieval with integral max-pooling of CNN activations. In: arXiv preprint arXiv:1511.05879"},{"key":"2230_CR72","doi-asserted-by":"crossref","unstructured":"Torii, A., Sivic, J., Pajdla, T., et\u00a0al. (2013). Visual place recognition with repetitive structures. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 883\u2013890","DOI":"10.1109\/CVPR.2013.119"},{"key":"2230_CR73","doi-asserted-by":"crossref","unstructured":"Torii, A., Arandjelovic, R., Sivic, J., et\u00a0al. (2015). 24\/7 place recognition by view synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 1808\u20131817","DOI":"10.1109\/CVPR.2015.7298790"},{"issue":"4","key":"2230_CR74","doi-asserted-by":"publisher","first-page":"376","DOI":"10.1109\/34.88573","volume":"13","author":"S Umeyama","year":"1991","unstructured":"Umeyama, S. (1991). Least-squares estimation of transformation parameters between two point patterns. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 13(4), 376\u2013380.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"2230_CR75","doi-asserted-by":"crossref","unstructured":"Usenko, V., Demmel, N., Schubert, D., et\u00a0al. (2019). Visual-inertial mapping with non-linear factor recovery. IEEE Robotics and Automation Letters (RA-L) 5(2):422\u2013429","DOI":"10.1109\/LRA.2019.2961227"},{"key":"2230_CR76","doi-asserted-by":"crossref","unstructured":"Valentin, J., Dai, A., Nie\u00dfner, M., et\u00a0al. (2016). Learning to navigate the energy landscape. In: Proceedings of the International Conference on 3D Vision (3DV), pp 323\u2013332","DOI":"10.1109\/3DV.2016.41"},{"key":"2230_CR77","doi-asserted-by":"crossref","unstructured":"Von\u00a0Stumberg, L., Usenko, V., & Cremers, D. (2018). Direct sparse visual-inertial odometry using dynamic marginalization. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp 2510\u20132517","DOI":"10.1109\/ICRA.2018.8462905"},{"key":"2230_CR78","doi-asserted-by":"crossref","unstructured":"Wang, R., Schw\u00f6rer, M., & Cremers, D. (2017a). Stereo DSO: Large-scale direct sparse visual odometry with stereo cameras. In: Proceedings of the International Conference on Computer Vision (ICCV), pp 3903\u20133911","DOI":"10.1109\/ICCV.2017.421"},{"key":"2230_CR79","doi-asserted-by":"crossref","unstructured":"Wang, S., Bai, M., Mattyus, G., et\u00a0al. (2017b). TorontoCity: Seeing the world with a million eyes. In: Proceedings of the International Conference on Computer Vision (ICCV)","DOI":"10.1109\/ICCV.2017.327"},{"key":"2230_CR80","doi-asserted-by":"crossref","unstructured":"Warburg, F., Hauberg, S., Lopez-Antequera, M., et\u00a0al. (2020). Mapillary street-level sequences: A dataset for lifelong place recognition. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2626\u20132635","DOI":"10.1109\/CVPR42600.2020.00270"},{"key":"2230_CR81","doi-asserted-by":"crossref","unstructured":"Wenzel, P., Wang, R., Yang, N., et\u00a0al. (2020). 4Seasons: A cross-season dataset for multi-weather SLAM in autonomous driving. In: Proceedings of the German Conference on Pattern Recognition (GCPR)","DOI":"10.1007\/978-3-030-71278-5_29"},{"key":"2230_CR82","doi-asserted-by":"crossref","unstructured":"Yang, N., Wang, R., St\u00fcckler, J., et\u00a0al. (2018). Deep virtual stereo odometry: Leveraging deep depth prediction for monocular direct sparse odometry. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 817\u2013833","DOI":"10.1007\/978-3-030-01237-3_50"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-024-02230-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-024-02230-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-024-02230-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,30]],"date-time":"2025-03-30T22:03:12Z","timestamp":1743372192000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-024-02230-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,13]]},"references-count":82,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,4]]}},"alternative-id":["2230"],"URL":"https:\/\/doi.org\/10.1007\/s11263-024-02230-4","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"p
rint"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,13]]},"assertion":[{"value":"21 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 August 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 October 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}