{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T17:51:33Z","timestamp":1781286693399,"version":"3.54.1"},"reference-count":46,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T00:00:00Z","timestamp":1678233600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Secure Systems Research Center (SSRC), Technology Innovation Institute (TII)"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Over the last decade, robotic perception algorithms have significantly benefited from the rapid advances in deep learning (DL). Indeed, a significant amount of the autonomy stack of different commercial and research platforms relies on DL for situational awareness, especially vision sensors. This work explored the potential of general-purpose DL perception algorithms, specifically detection and segmentation neural networks, for processing image-like outputs of advanced lidar sensors. Rather than processing the three-dimensional point cloud data, this is, to the best of our knowledge, the first work to focus on low-resolution images with a 360\u00b0 field of view obtained with lidar sensors by encoding either depth, reflectivity, or near-infrared light in the image pixels. We showed that with adequate preprocessing, general-purpose DL models can process these images, opening the door to their usage in environmental conditions where vision sensors present inherent limitations. We provided both a qualitative and quantitative analysis of the performance of a variety of neural network architectures. We believe that using DL models built for visual cameras offers significant advantages due to their much wider availability and maturity compared to point cloud-based perception.<\/jats:p>","DOI":"10.3390\/s23062936","type":"journal-article","created":{"date-parts":[[2023,3,9]],"date-time":"2023-03-09T02:01:47Z","timestamp":1678327307000},"page":"2936","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["General-Purpose Deep Learning Detection and Segmentation Models for Images from a Lidar-Based Camera Sensor"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9042-3730","authenticated-orcid":false,"given":"Xianjia","family":"Yu","sequence":"first","affiliation":[{"name":"Turku Intelligent Embedded and Robotic Systems Laboratory, Faculty of Technology, University of Turku, 20500 Turku, Finland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4954-0000","authenticated-orcid":false,"given":"Sahar","family":"Salimpour","sequence":"additional","affiliation":[{"name":"Turku Intelligent Embedded and Robotic Systems Laboratory, Faculty of Technology, University of Turku, 20500 Turku, Finland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3091-3217","authenticated-orcid":false,"given":"Jorge Pe\u00f1a","family":"Queralta","sequence":"additional","affiliation":[{"name":"Turku Intelligent Embedded and Robotic Systems Laboratory, Faculty of Technology, University of Turku, 20500 Turku, Finland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1793-2694","authenticated-orcid":false,"given":"Tomi","family":"Westerlund","sequence":"additional","affiliation":[{"name":"Turku Intelligent Embedded and Robotic Systems Laboratory, Faculty of Technology, University of Turku, 20500 Turku, Finland"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,8]]},"reference":[{"key":"ref_1","unstructured":"Fan, R., Jiao, J., Ye, H., Yu, Y., Pitas, I., and Liu, M. (2019). Key ingredients of self-driving cars. arXiv."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kato, S., Tokunaga, S., Maruyama, Y., Maeda, S., Hirabayashi, M., Kitsukawa, Y., Monrroy, A., Ando, T., Fujii, Y., and Azumi, T. (2018, January 11\u201313). Autoware on board: Enabling autonomous vehicles with embedded systems. Proceedings of the 2018 ACM\/IEEE 9th International Conference on Cyber-Physical Systems (ICCPS), Porto, Portugal.","DOI":"10.1109\/ICCPS.2018.00035"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"5512","DOI":"10.1109\/LRA.2022.3154047","article-title":"Large-scale Autonomous Flight with Real-time Semantic SLAM under Dense Forest Canopy","volume":"7","author":"Liu","year":"2022","journal-title":"IEEE Robot. Autom. Lett. (RA-L)"},{"key":"ref_4","first-page":"852","article-title":"Review of LiDAR sensor data acquisition and compression for automotive applications","volume":"2","author":"Maksymova","year":"2018","journal-title":"Multidiscip. Digit. Publ. Inst. Proc."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Yoo, J.H., Kim, Y., Kim, J., and Choi, J.W. (2020, January 23\u201328). 3d-cvf: Generating joint camera and lidar features using cross-view spatial feature fusion for 3d object detection. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58583-9_43"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"579","DOI":"10.1016\/j.procs.2021.02.100","article-title":"A survey of LiDAR and camera fusion enhancement","volume":"183","author":"Zhong","year":"2021","journal-title":"Procedia Comput. Sci."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1142\/S2301385020500168","article-title":"Multi-sensor fusion for navigation and mapping in autonomous vehicles: Accurate localization in urban environments","volume":"8","author":"Li","year":"2020","journal-title":"Unmanned Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"722","DOI":"10.1109\/TITS.2020.3023541","article-title":"Deep learning for image and point cloud fusion in autonomous driving: A review","volume":"23","author":"Cui","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_9","unstructured":"Li, Q., Queralta, J.P., Gia, T.N., and Westerlund, T. (2019, January 21\u201323). Offloading Monocular Visual Odometry with Edge Computing: Optimizing Image Compression Ratios in Multi-Robot Systems. Proceedings of the 5th ICSCC, Wuhan, China."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"821","DOI":"10.1080\/01691864.2017.1365009","article-title":"Deep learning in robotics: A review of recent research","volume":"31","author":"Pierson","year":"2017","journal-title":"Adv. Robot."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zhao, W., Queralta, J.P., and Westerlund, T. (2020, January 1\u20134). Sim-to-real transfer in deep reinforcement learning for robotics: A survey. Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia.","DOI":"10.1109\/SSCI47803.2020.9308468"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"191617","DOI":"10.1109\/ACCESS.2020.3030190","article-title":"Collaborative multi-robot search and rescue: Planning, coordination, perception, and active vision","volume":"8","author":"Queralta","year":"2020","journal-title":"IEEE Access"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3412","DOI":"10.1109\/TNNLS.2020.3015992","article-title":"Deep learning for lidar point clouds in autonomous driving: A review","volume":"32","author":"Li","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1016\/j.robot.2018.11.002","article-title":"LIDAR\u2013camera fusion for road detection using fully convolutional neural networks","volume":"111","author":"Caltagirone","year":"2019","journal-title":"Robot. Auton. Syst."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tsiourva, M., and Papachristos, C. (2020, January 1\u20134). LiDAR Imaging-Based Attentive Perception. Proceedings of the 2020 International Conference on Unmanned Aircraft Systems (ICUAS), Athens, Greece.","DOI":"10.1109\/ICUAS48674.2020.9213910"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Tampuu, A., Aidla, R., van Gent, J.A., and Matiisen, T. (2022). LiDAR-as-Camera for End-to-End Driving. arXiv.","DOI":"10.3390\/s23052845"},{"key":"ref_17","unstructured":"Pacala, A. (2023, March 03). Lidar as a Camera-Digital Lidar\u2019s Implications for Computer Vision, Ouster Blog Online Resource. Available online: https:\/\/ouster.com\/blog\/the-camera-is-in-the-lidar\/."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhou, Y., and Tuzel, O. (2018, January 18\u201323). Voxelnet: End-to-end learning for point cloud based 3d object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00472"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Li, B. (2017, January 24\u201328). 3d fully convolutional network for vehicle detection in point cloud. Proceedings of the 2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8205955"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Pang, S., Morris, D., and Radha, H. (2020, January 25\u201329). CLOCs: Camera-LiDAR object candidates fusion for 3D object detection. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9341791"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"22080","DOI":"10.1109\/ACCESS.2021.3055491","article-title":"Fast and accurate 3D object detection for lidar-camera-based autonomous vehicles using one shared voxel-based backbone","volume":"9","author":"Wen","year":"2021","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2708","DOI":"10.1109\/TIE.2021.3070508","article-title":"OpenStreetMap-based autonomous navigation for the four wheel-legged robot via 3D-Lidar and CCD camera","volume":"69","author":"Li","year":"2021","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Schlosser, J., Chow, C.K., and Kira, Z. (2016, January 16\u201321). Fusing lidar and images for pedestrian detection using convolutional neural networks. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487370"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.patrec.2017.09.038","article-title":"Multimodal vehicle detection: Fusing 3D-LIDAR and color camera data","volume":"115","author":"Asvadi","year":"2018","journal-title":"Pattern Recognit. Lett."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Sier, H., Yu, X., Catalano, I., Pe\u00f1a Queralta, J., Zou, Z., and Westerlund, T. (2023). UAV Tracking with Lidar as a Camera Sensors in GNSS-Denied Environments. arXiv.","DOI":"10.1109\/ICL-GNSS57829.2023.10148919"},{"key":"ref_26","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (July, January 26). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_28","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Kim, J., Kim, J., and Cho, J. (2019, January 16\u201318). An advanced object classification strategy using YOLO through camera and LiDAR sensor fusion. Proceedings of the 2019 13th International Conference on Signal Processing and Communication Systems (ICSPCS), Gold Coast, Australia.","DOI":"10.1109\/ICSPCS47537.2019.9008742"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Chen, X., Ma, H., Wan, J., Li, B., and Xia, T. (2017, January 21\u201326). Multi-view 3D Object Detection Network for Autonomous Driving. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.691"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Ku, J., Mozifian, M., Lee, J., Harakeh, A., and Waslander, S.L. (2018, January 1\u20135). Joint 3d proposal generation and object detection from view aggregation. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8594049"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Geng, K., Dong, G., Yin, G., and Hu, J. (2020). Deep dual-modal traffic objects instance segmentation method using camera and lidar data for autonomous driving. Remote Sens., 12.","DOI":"10.3390\/rs12203274"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Wu, B., Wan, A., Yue, X., and Keutzer, K. (2018, January 21\u201325). Squeezeseg: Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3d lidar point cloud. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8462926"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Imad, M., Doukhi, O., and Lee, D.J. (2021). Transfer learning based semantic segmentation for 3D object detection from point cloud. Sensors, 21.","DOI":"10.3390\/s21123964"},{"key":"ref_39","unstructured":"Zou, Z., Shi, Z., Guo, Y., and Ye, J. (2019). Object detection in 20 years: A survey. arXiv."},{"key":"ref_40","first-page":"91","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_41","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). Yolox: Exceeding yolo series in 2021. arXiv."},{"key":"ref_42","unstructured":"Jocher, G., Nishimura, K., Mineeva, T., and Vilari\u00f1o, R. (2023, March 03). yolov5. Code Repository. Available online: https:\/\/github.com\/ultralytics\/yolov5."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.asoc.2018.05.018","article-title":"A survey on deep learning techniques for image and video semantic segmentation","volume":"70","author":"Oprea","year":"2018","journal-title":"Appl. Soft Comput."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1007\/s13735-020-00195-x","article-title":"A survey on instance segmentation: State of the art","volume":"9","author":"Hafiz","year":"2020","journal-title":"Int. J. Multimed. Inf. Retr."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Yuan, Y., Chen, X., Chen, X., and Wang, J. (2019). Segmentation transformer: Object-contextual representations for semantic segmentation. arXiv.","DOI":"10.1007\/978-3-030-58539-6_11"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Wu, Y., He, K., and Girshick, R. (2020, January 14\u201319). Pointrend: Image segmentation as rendering. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00982"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/6\/2936\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:51:03Z","timestamp":1760122263000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/6\/2936"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,8]]},"references-count":46,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["s23062936"],"URL":"https:\/\/doi.org\/10.3390\/s23062936","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,8]]}}}