{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T15:39:41Z","timestamp":1773157181031,"version":"3.50.1"},"reference-count":33,"publisher":"MDPI AG","issue":"13","license":[{"start":{"date-parts":[[2023,7,7]],"date-time":"2023-07-07T00:00:00Z","timestamp":1688688000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Major Science and Technology Projects of Xiamen of China","award":["3502Z20201015"],"award-info":[{"award-number":["3502Z20201015"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Road scene understanding is crucial to the safe driving of autonomous vehicles. Comprehensive road scene understanding requires a visual perception system to deal with a large number of tasks at the same time, which needs a perception model with a small size, fast speed, and high accuracy. As multi-task learning has evident advantages in performance and computational resources, in this paper, a multi-task model YOLO-Object, Drivable Area, and Lane Line Detection (YOLO-ODL) based on hard parameter sharing is proposed to realize joint and efficient detection of traffic objects, drivable areas, and lane lines. In order to balance tasks of YOLO-ODL, a weight balancing strategy is introduced so that the weight parameters of the model can be automatically adjusted during training, and a Mosaic migration optimization scheme is adopted to improve the evaluation indicators of the model. 
Our YOLO-ODL model performs well on the challenging BDD100K dataset, achieving the state of the art in terms of accuracy and computational efficiency.<\/jats:p>","DOI":"10.3390\/s23136238","type":"journal-article","created":{"date-parts":[[2023,7,10]],"date-time":"2023-07-10T01:02:50Z","timestamp":1688950970000},"page":"6238","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["Research on Road Scene Understanding of Autonomous Vehicles Based on Multi-Task Learning"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9123-6817","authenticated-orcid":false,"given":"Jinghua","family":"Guo","sequence":"first","affiliation":[{"name":"Department of Mechanical and Electrical Engineering, Xiamen University, Xiamen 361005, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1013-5374","authenticated-orcid":false,"given":"Jingyao","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Automation, Xiamen University, Xiamen 361005, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huinian","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Electrical Engineering, Xiamen University, Xiamen 361005, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Baoping","family":"Xiao","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Electrical Engineering, Xiamen University, Xiamen 361005, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhifei","family":"He","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Electrical Engineering, Xiamen University, Xiamen 361005, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lubin","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Electrical Engineering, 
Xiamen University, Xiamen 361005, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4670","DOI":"10.1109\/TITS.2019.2943777","article-title":"DLT-Net: Joint detection of drivable areas, lane lines, and traffic objects","volume":"21","author":"Qian","year":"2019","journal-title":"IEEE Trans. Intell. Transp. Syst. (IVS)"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Teichmann, M., Weber, M., Zollner, M., Cipolla, R., and Urtasun, R. (2018, January 26\u201330). MultiNet: Real-time joint semantic reasoning for autonomous driving. Proceedings of the 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, China.","DOI":"10.1109\/IVS.2018.8500504"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"694","DOI":"10.1109\/TPAMI.2006.104","article-title":"On-road vehicle detection: A review","volume":"28","author":"Sun","year":"2006","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"118134","DOI":"10.1016\/j.eswa.2022.118134","article-title":"Traffic sensor location problem: Three decades of research","volume":"208","author":"Owais","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bhaggiaraj, S., Priyadharsini, M., Karuppasamy, K., and Snegha, R. (2023, January 5\u20136). Deep Learning Based Self Driving Cars Using Computer Vision. Proceedings of the 2023 International Conference on Networking and Communications (ICNWC), Chennai, India.","DOI":"10.1109\/ICNWC57852.2023.10127448"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Hu, L. (2023, January 24\u201326). An Improved YOLOv5 Algorithm of Target Recognition. 
Proceedings of the 2023 IEEE 2nd International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA), Changchun, China.","DOI":"10.1109\/EEBDA56825.2023.10090620"},{"key":"ref_7","unstructured":"Jocher, G. (2023, June 01). 2020. Available online: https:\/\/github.com\/ultralytics\/yolov5."},{"key":"ref_8","unstructured":"Yu, F., Chen, H., Wang, X., Xian, W., Chen, Y., Liu, F., Madhavan, V., and Darrell, T. (2018). BDD100K: A diverse driving video database with scalable annotation tooling. arXiv, Available online: https:\/\/arxiv.org\/abs\/1805.04687."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhu, X., Lyu, S., Wang, X., and Zhao, Q. (2021, January 11\u201317). TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) Workshops, Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00312"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Railkar, Y., Nasikkar, A., Pawar, S., Patil, P., and Pise, R. (2023, January 7\u20139). Object Detection and Recognition System Using Deep Learning Method. Proceedings of the 2023 IEEE 8th International Conference for Convergence in Technology (I2CT), Lonavla, India.","DOI":"10.1109\/I2CT57861.2023.10126316"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Maurya, J., Ranipa, K.R., Yamaguchi, O., Shibata, T., and Kobayashi, D. (2023, January 2\u20137). Domain Adaptation using Self-Training with Mixup for One-Stage Object Detection. Proceedings of the 2023 IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA.","DOI":"10.1109\/WACV56688.2023.00417"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A.C. (2016, January 8\u201316). SSD: Single shot multibox detector. 
Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_14","unstructured":"Redmon, J., and Farhadi, A. (2020). YOLOv3: An incremental improvement. arXiv, Available online: https:\/\/arxiv.org\/abs\/1804.02767."},{"key":"ref_15","unstructured":"Bochkovskiy, A., Wang, C.-Y., and Liao, H.-J.M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv, Available online: https:\/\/arxiv.org\/abs\/2004.10934."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TIV.2023.3270878","article-title":"Real-Time Memory Efficient Multitask Learning Model for Autonomous Driving","volume":"8","author":"Miraliev","year":"2023","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. 
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Tian, Z., He, T., Shen, C., and Yan, Y. (2019, January 15\u201320). Decoders matter for semantic segmentation: Data-dependent decoding enables flexible feature aggregation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00324"},{"key":"ref_21","unstructured":"Takikawa, T., Acuna, D., Jampani, V., and Fidler, S. (November, January 27). Gated-SCNN: Gated shape cnns for semantic segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"3729","DOI":"10.1109\/ACCESS.2023.3234442","article-title":"Lane Detection in Autonomous Vehicles: A Systematic Review","volume":"11","author":"Zakaria","year":"2023","journal-title":"IEEE Access"},{"key":"ref_23","unstructured":"Pan, X., Shi, J., Luo, P., Wang, X., and Tang, X. (2023, January 7\u201314). Spatial as deep: Spatial cnn for traffic scene understanding. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Washington DC, USA."},{"key":"ref_24","unstructured":"Hou, Y., Ma, Z., Liu, C., and Loy, C.C. (November, January 27). Learning lightweight lane detection cnns by self attention distillation. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zheng, T., Fang, H., Zhang, Y., Tang, W., Yang, Z., Liu, H., and Cai, D. (2021, January 2\u20139). RESA: Recurrent feature-shift aggregator for lane detection. 
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Virtual.","DOI":"10.1609\/aaai.v35i4.16469"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Lee, T., and Seok, J. (2023, January 20\u201323). Multi Task Learning: A Survey and Future Directions. Proceedings of the 2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Virtual.","DOI":"10.1109\/ICAIIC57133.2023.10067098"},{"key":"ref_27","unstructured":"Wu, D., Liao, M.-W., Zhang, W.-T., Wang, X.-G., Bai, X., Cheng, W.-Q., and Liu, W.-Y. (2021). YOLOP: You only look once for panoptic driving perception. arXiv, Available online: https:\/\/arxiv.org\/abs\/2108.11250."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kim, D., Lan, T., Zou, C., Xu, N., Plummer, B.A., Sclaroff, S., Eledath, J., and Medioni, G. (2021, January 11\u201317). MILA: Multi-task learning from videos via efficient inter-frame attention. Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00251"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Lin, T., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Cipolla, R., Gal, Y., and Kendall, A. (2018, January 18\u201323). Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00781"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., and Garcia-Rodriguez, J. 
(2017). A review on deep learning techniques applied to semantic segmentation. arXiv, Available online: https:\/\/arxiv.org\/abs\/1704.06857.","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"ref_32","unstructured":"Wirthmuller, F., Schlechtriemen, J., Hipp, J., and Reichert, M. (2021, January 13\u201316). Teaching vehicles to anticipate: A systematic study on probabilistic behavior prediction using large data sets. Proceedings of the 20th IEEE International Conference on Machine Learning and Applications (ICMLA), Virtually, Online."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"4986","DOI":"10.1109\/TITS.2020.2983077","article-title":"SALMNet: A structure-aware lane marking detection network","volume":"22","author":"Xu","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/13\/6238\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:08:21Z","timestamp":1760126901000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/13\/6238"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,7]]},"references-count":33,"journal-issue":{"issue":"13","published-online":{"date-parts":[[2023,7]]}},"alternative-id":["s23136238"],"URL":"https:\/\/doi.org\/10.3390\/s23136238","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,7]]}}}