{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T13:55:19Z","timestamp":1774965319680,"version":"3.50.1"},"reference-count":30,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2021,5,12]],"date-time":"2021-05-12T00:00:00Z","timestamp":1620777600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Accurately estimating the current state of local traffic scenes is one of the key problems in the development of software components for automated vehicles. In addition to details on free space and drivability, static and dynamic traffic participants and information on the semantics may also be included in the desired representation. Multi-layer grid maps allow the inclusion of all of this information in a common representation. However, most existing grid mapping approaches only process range sensor measurements such as Lidar and Radar and solely model occupancy without semantic states. In order to add sensor redundancy and diversity, it is desired to add vision-based sensor setups in a common grid map representation. In this work, we present a semantic evidential grid mapping pipeline, including estimates for eight semantic classes, that is designed for straightforward fusion with range sensor data. Unlike other publications, our representation explicitly models uncertainties in the evidential model. We present results of our grid mapping pipeline based on a monocular vision setup and a stereo vision setup. Our mapping results are accurate and dense due to the incorporation of a disparity- or depth-based ground surface estimation in the inverse perspective mapping. 
We conclude this paper by providing a detailed quantitative evaluation based on real traffic scenarios in the KITTI odometry benchmark dataset and demonstrating the advantages compared to other semantic grid mapping approaches.<\/jats:p>","DOI":"10.3390\/s21103380","type":"journal-article","created":{"date-parts":[[2021,5,12]],"date-time":"2021-05-12T22:46:14Z","timestamp":1620859574000},"page":"3380","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Semantic Evidential Grid Mapping Using Monocular and Stereo Cameras"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5429-4476","authenticated-orcid":false,"given":"Sven","family":"Richter","sequence":"first","affiliation":[{"name":"Institute of Measurement and Control Systems, Karlsruhe Institute of Technology (KIT), Engler-Bunte-Ring 21, 76131 Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8459-8334","authenticated-orcid":false,"given":"Yiqun","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Measurement and Control Systems, Karlsruhe Institute of Technology (KIT), Engler-Bunte-Ring 21, 76131 Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Johannes","family":"Beck","sequence":"additional","affiliation":[{"name":"Atlatec GmbH, Haid-und-Neu-Stra\u00dfe 7, 76131 Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sascha","family":"Wirges","sequence":"additional","affiliation":[{"name":"Institute of Measurement and Control Systems, Karlsruhe Institute of Technology (KIT), Engler-Bunte-Ring 21, 76131 Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4165-2075","authenticated-orcid":false,"given":"Christoph","family":"Stiller","sequence":"additional","affiliation":[{"name":"Institute of 
Measurement and Control Systems, Karlsruhe Institute of Technology (KIT), Engler-Bunte-Ring 21, 76131 Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,5,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1515\/teme-2019-0052","article-title":"Fusion of range measurements and semantic estimates in an evidential framework\/Fusion von Distanzmessungen und semantischen Gr\u00f6\u00dfen im Rahmen der Evidenztheorie","volume":"86","author":"Richter","year":"2019","journal-title":"Tm-Tech. Mess."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Shafer, G. (1976). A Mathematical Theory of Evidence, Princeton University Press.","DOI":"10.1515\/9780691214696"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1109\/2.30720","article-title":"Using Occupancy Grids for Mobile Robot Perception and Navigation","volume":"22","author":"Elfes","year":"1989","journal-title":"Computer"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1177\/0278364918775523","article-title":"A Random Finite Set Approach for Dynamic Occupancy Grid Maps with Real-time Application","volume":"37","author":"Nuss","year":"2018","journal-title":"Int. J. Robot. Res."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"384","DOI":"10.1109\/TIV.2018.2843130","article-title":"Grid-Based Environment Estimation Using Evidential Mapping and Particle Tracking","volume":"3","author":"Steyer","year":"2018","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_6","unstructured":"Badino, H., and Franke, U. (2007). Free Space Computation Using Stochastic Occupancy Grids and Dynamic Programming, Citeseer. Technical Report."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Perrollaz, M., Spalanzani, A., and Aubert, D. (2010, January 21\u201324). 
Probabilistic Representation of the Uncertainty of Stereo-Vision and Application to Obstacle Detection. Proceedings of the IEEE Intelligent Vehicles Symposium, La Jolla, CA, USA.","DOI":"10.1109\/IVS.2010.5548010"},{"key":"ref_8","unstructured":"Pocol, C., Nedevschi, S., and Meinecke, M.M. (2008, January 18\u201319). Obstacle Detection Based on Dense Stereovision for Urban ACC Systems. Proceedings of the 5th International Workshop on Intelligent Transportation, Hamburg, Germany."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/MITS.2011.2178492","article-title":"Particle Grid Tracking System Stereovision Based Obstacle Perception in Driving Environments","volume":"4","author":"Danescu","year":"2012","journal-title":"IEEE Intell. Transp. Syst. Mag."},{"key":"ref_10","unstructured":"Yu, C., Cherfaoui, V., and Bonnifait, P. (July, January 28). Evidential Occupancy Grid Mapping with Stereo-Vision. Proceedings of the IEEE Intelligent Vehicles Symposium, Seoul, Korea."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Giovani, B.V., Victorino, A.C., and Ferreira, J.V. (2015, January 15\u201318). Stereo Vision for Dynamic Urban Environment Perception Using Semantic Context in Evidential Grid. Proceedings of the IEEE Conference on Intelligent Transportation Systems, Gran Canaria, Spain.","DOI":"10.1109\/ITSC.2015.398"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Valente, M., Joly, C., and de la Fortelle, A. (2018, January 18\u201321). Fusing Laser Scanner and Stereo Camera in Evidential Grid Maps. Proceedings of the 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV), Singapore.","DOI":"10.1109\/ICARCV.2018.8580635"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Thomas, J., Tatsch, J., Van Ekeren, W., Rojas, R., and Knoll, A. (2019, January 9\u201312). Semantic grid-based road model estimation for autonomous driving. 
Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France.","DOI":"10.1109\/IVS.2019.8813790"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Erkent, O., Wolf, C., Laugier, C., Gonzalez, D.S., and Cano, V.R. (2018, January 1\u20135). Semantic Grid Estimation with a Hybrid Bayesian and Deep Neural Network Approach. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593434"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1109\/LRA.2019.2891028","article-title":"Monocular Semantic Occupancy Grid Mapping With Convolutional Variational Encoder\u2013Decoder Networks","volume":"4","author":"Lu","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_16","unstructured":"Eigen, D., Puhrsch, C., and Fergus, R. (2014). Depth map prediction from a single image using a multi-scale deep network. Advances in neural information processing systems. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2024","DOI":"10.1109\/TPAMI.2015.2505283","article-title":"Learning depth from single monocular images using deep convolutional neural fields","volume":"38","author":"Liu","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Fu, H., Gong, M., Wang, C., Batmanghelich, K., and Tao, D. (2018, January 18\u201323). Deep ordinal regression network for monocular depth estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00214"},{"key":"ref_19","unstructured":"Alhashim, I., and Wonka, P. (2018). High quality monocular depth estimation via transfer learning. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhou, T., Brown, M., Snavely, N., and Lowe, D.G. (2017, January 21\u201326). 
Unsupervised learning of depth and ego-motion from video. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.700"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Godard, C., Mac Aodha, O., and Brostow, G.J. (2017, January 21\u201326). Unsupervised monocular depth estimation with left-right consistency. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.699"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Qiao, S., Zhu, Y., Adam, H., Yuille, A., and Chen, L.C. (2020). ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation. arXiv.","DOI":"10.1109\/CVPR46437.2021.00399"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Richter, S., Beck, J., Wirges, S., and Stiller, C. (2020, January 14\u201316). Semantic Evidential Grid Mapping based on Stereo Vision. Proceedings of the 2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Karlsruhe, Germany.","DOI":"10.1109\/MFI49285.2020.9235217"},{"key":"ref_24","unstructured":"Bertalmio, M., Bertozzi, A., and Sapiro, G. (2001, January 8\u201314). Navier-Stokes, Fluid Dynamics, and Image and Video Inpainting. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001, Kauai, HI, USA."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1504\/IJVAS.2008.016478","article-title":"Efficient GPU-based construction of occupancy grids using several laser range-finders","volume":"6","author":"Yguel","year":"2008","journal-title":"Int. J. Veh. Auton. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets robotics: The kitti dataset","volume":"32","author":"Geiger","year":"2013","journal-title":"Int. J. Robot. 
Res."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zhang, F., Prisacariu, V., Yang, R., and Torr, P.H.S. (2019, January 16\u201320). GA-Net: Guided Aggregation Net for End-To-End Stereo Matching. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00027"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Sapra, K., Reda, F.A., Shih, K.J., Newsam, S., Tao, A., and Catanzaro, B. (2019, January 16\u201320). Improving semantic segmentation via video propagation and label relaxation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00906"},{"key":"ref_29","unstructured":"Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., and Gall, J. (November, January 27). SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Bieder, F., Wirges, S., Janosovits, J., Richter, S., Wang, Z., and Stiller, C. (2020). Exploiting Multi-Layer Grid Maps for Surround-View Semantic Segmentation of Sparse LiDAR Data. 
arXiv.","DOI":"10.1109\/IV47402.2020.9304848"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/10\/3380\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:59:57Z","timestamp":1760162397000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/10\/3380"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,12]]},"references-count":30,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2021,5]]}},"alternative-id":["s21103380"],"URL":"https:\/\/doi.org\/10.3390\/s21103380","relation":{"has-preprint":[{"id-type":"doi","id":"10.20944\/preprints202105.0119.v1","asserted-by":"object"}]},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5,12]]}}}