{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,30]],"date-time":"2026-03-30T22:26:32Z","timestamp":1774909592653,"version":"3.50.1"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2022,11,24]],"date-time":"2022-11-24T00:00:00Z","timestamp":1669248000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,11,24]],"date-time":"2022-11-24T00:00:00Z","timestamp":1669248000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51605003"],"award-info":[{"award-number":["51605003"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51575001"],"award-info":[{"award-number":["51575001"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Natural Science Foundation of the Anhui Higher Education Institutions of China","award":["KJ2020A0358"],"award-info":[{"award-number":["KJ2020A0358"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2023,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Lane detection is one of the key techniques to realize advanced driving assistance and automatic driving. However, lane detection networks based on deep learning have significant shortcomings. The detection results are often unsatisfactory when there are shadows, degraded lane markings, and vehicle occlusion lanes. 
Therefore, a lane detection network operating on continuous multi-frame image sequences is proposed. Specifically, a sequence of six consecutive frames is input into the network; the scene information of each frame is extracted by an encoder composed of Swin Transformer blocks and passed to the PredRNN. The consecutive frames of the driving scene are modeled as a time series by ST-LSTM blocks, so that shape changes and motion trajectories in the spatiotemporal sequence are effectively captured. Finally, a decoder composed of Swin Transformer blocks reconstructs the features to complete the detection task. Extensive experiments on two large-scale datasets demonstrate that the proposed method outperforms competing methods in lane detection, especially in difficult situations. On the TuSimple dataset, for easy scenes, the validation accuracy is 97.46%, the test accuracy is 97.37%, and the precision is 0.865; for complex scenes, the validation accuracy is 97.38%, the test accuracy is 97.29%, and the precision is 0.859. The running time is 4.4\u00a0ms. On the CULane dataset, for easy scenes, the validation accuracy is 97.03%, the test accuracy is 96.84%, and the precision is 0.837; for complex scenes, the validation accuracy is 96.18%, the test accuracy is 95.92%, and the precision is 0.829. 
The running time is 6.5\u00a0ms.<\/jats:p>","DOI":"10.1007\/s40747-022-00909-0","type":"journal-article","created":{"date-parts":[[2022,11,24]],"date-time":"2022-11-24T19:13:38Z","timestamp":1669317218000},"page":"4837-4855","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["ST-MAE: robust lane detection in continuous multi-frame driving scenes based on a deep hybrid network"],"prefix":"10.1007","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3030-2125","authenticated-orcid":false,"given":"Rongyun","family":"Zhang","sequence":"first","affiliation":[]},{"given":"Yufeng","family":"Du","sequence":"additional","affiliation":[]},{"given":"Peicheng","family":"Shi","sequence":"additional","affiliation":[]},{"given":"Lifeng","family":"Zhao","sequence":"additional","affiliation":[]},{"given":"Yaming","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Haoran","family":"Li","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,11,24]]},"reference":[{"key":"909_CR1","doi-asserted-by":"publisher","first-page":"9711","DOI":"10.1109\/CVPR46437.2021.00959","volume":"2","author":"CY Wang","year":"2021","unstructured":"Wang CY, Bochkovskiy A, Liao HYM (2021) Rethinking BiSeNet for real-time semantic segmentation. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2:9711\u20139720. https:\/\/doi.org\/10.1109\/CVPR46437.2021.00959","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"key":"909_CR2","doi-asserted-by":"publisher","first-page":"16750","DOI":"10.1109\/CVPR46437.2021.01648","volume":"1","author":"C Huynh","year":"2021","unstructured":"Huynh C, Tran AT, Luu K et al (2021) Progressive semantic segmentation. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 1:16750\u201316759. 
https:\/\/doi.org\/10.1109\/CVPR46437.2021.01648","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"key":"909_CR3","doi-asserted-by":"publisher","unstructured":"Gwilliam M, Teuscher A, Anderson C et al (2021) Fair comparison: quantifying variance in results for fine-grained visual categorization. In: Proc\u20142021 IEEE Winter Conf Appl Comput Vision, WACV 2021, pp 3308\u20133317. https:\/\/doi.org\/10.1109\/WACV48630.2021.00335","DOI":"10.1109\/WACV48630.2021.00335"},{"key":"909_CR4","doi-asserted-by":"publisher","unstructured":"Mafla A, Dey S, Biten AF et al (2021) Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval. In: Proc\u20142021 IEEE Winter Conf Appl Comput Vision, WACV 2021, pp 4022\u20134032. https:\/\/doi.org\/10.1109\/WACV48630.2021.00407","DOI":"10.1109\/WACV48630.2021.00407"},{"key":"909_CR5","doi-asserted-by":"publisher","unstructured":"Gong X, Xia X, Zhu W et al (2021) Deformable gabor feature networks for biomedical image classification. In: Proc\u20142021 IEEE winter conf appl comput vision, WACV 2021, pp 4003\u20134011. https:\/\/doi.org\/10.1109\/WACV48630.2021.00405","DOI":"10.1109\/WACV48630.2021.00405"},{"key":"909_CR6","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01284","author":"Q Chen","year":"2021","unstructured":"Chen Q, Wang Y, Yang T et al (2021) You only look one-level feature. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. https:\/\/doi.org\/10.1109\/CVPR46437.2021.01284","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"key":"909_CR7","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01422","author":"P Sun","year":"2021","unstructured":"Sun P, Zhang R, Jiang Y et al (2021) Sparse R-CNN: end-to-end object detection with learnable proposals. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. 
https:\/\/doi.org\/10.1109\/CVPR46437.2021.01422","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"key":"909_CR8","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01283","author":"CY Wang","year":"2021","unstructured":"Wang CY, Bochkovskiy A, Liao HYM (2021) Scaled-yolov4: scaling cross stage partial network. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. https:\/\/doi.org\/10.1109\/CVPR46437.2021.01283","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"key":"909_CR9","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00399","author":"S Qiao","year":"2021","unstructured":"Qiao S, Zhu Y, Adam H et al (2021) VIP-DeepLab: learning visual perception with depth-aware video panoptic segmentation. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. https:\/\/doi.org\/10.1109\/CVPR46437.2021.00399","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"key":"909_CR10","unstructured":"Naseer M, Ranasinghe K, Khan S et al (2021) Intriguing properties of vision transformers. no. NeurIPS, [Online]. arXiv:2105.10497"},{"key":"909_CR11","unstructured":"Radford A, Kim JW, Hallacy C et al (2021) Learning transferable visual models from natural language supervision. [Online]. arXiv:2103.00020"},{"key":"909_CR12","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1109\/2943.974352","volume":"8","author":"FA Furfari(tony)","year":"2002","unstructured":"Furfari(tony) FA (2002) The transformer. IEEE Ind Appl Mag 8:8\u201315. https:\/\/doi.org\/10.1109\/2943.974352","journal-title":"IEEE Ind Appl Mag"},{"key":"909_CR13","unstructured":"Cao H, Wang Y, Chen J et al (2021) Swin-Unet: Unet-like pure transformer for medical image segmentation, pp 1\u201314. [Online]. arXiv:2105.05537"},{"key":"909_CR14","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A et al (2020) An image is worth 16 \u00d7 16 words: transformers for image recognition at scale. [Online]. 
arXiv:2010.11929"},{"key":"909_CR15","unstructured":"Wang Y, Wu H, Zhang J et al (2021) PredRNN: a recurrent neural network for spatiotemporal predictive learning. 1\u201314."},{"key":"909_CR16","doi-asserted-by":"publisher","unstructured":"He K, Chen X, Xie S, et al (2022) Masked Autoencoders Are Scalable Vision Learners. 15979\u201315988. https:\/\/doi.org\/10.1109\/cvpr52688.2022.01553","DOI":"10.1109\/cvpr52688.2022.01553"},{"key":"909_CR17","doi-asserted-by":"crossref","unstructured":"Chougule S, Koznek N, Adam G et al (2019) Reliable multilane detection and classification by utilizing CNN as a regression network. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp 740\u2013752","DOI":"10.1007\/978-3-030-11021-5_46"},{"key":"909_CR18","doi-asserted-by":"crossref","unstructured":"Garnett N, Cohen R, Pe\u2019Er T et al (2019) 3D-LaneNet: end-to-end 3D multiple lane detection. In: Proceedings of the IEEE international conference on computer vision, pp 2921\u20132930","DOI":"10.1109\/ICCV.2019.00301"},{"key":"909_CR19","doi-asserted-by":"crossref","unstructured":"Su J, Chen C, Zhang K et al (2021) Structure guided lane detection. In: IJCAI international joint conference on artificial intelligence, pp 997\u20131003","DOI":"10.24963\/ijcai.2021\/138"},{"key":"909_CR20","doi-asserted-by":"crossref","unstructured":"Gansbeke WV, Brabandere BD, Neven D et al (2019) End-to-end lane detection through differentiable least-squares fitting. In: Proceedings\u20142019 international conference on computer vision workshop, ICCVW 2019, pp 905\u2013913","DOI":"10.1109\/ICCVW.2019.00119"},{"key":"909_CR21","doi-asserted-by":"crossref","unstructured":"Borkar A, Hayes M, Smith MT (2011) Polar randomized hough transform for lane detection using loose constraints of parallel lines. 
In: ICASSP, IEEE international conference on acoustics, speech and signal processing\u2014proceedings, pp 1037\u20131040","DOI":"10.1109\/ICASSP.2011.5946584"},{"key":"909_CR22","doi-asserted-by":"crossref","unstructured":"Yoo S, Seok L, Myeong H et al (2020) End-to-end lane marker detection via row-wise classification. In: IEEE computer society conference on computer vision and pattern recognition workshops, pp 4335\u20134343","DOI":"10.1109\/CVPRW50498.2020.00511"},{"key":"909_CR23","doi-asserted-by":"crossref","unstructured":"Aly M (2008) Real time detection of lane markers in urban streets. In: IEEE intelligent vehicles symposium, proceedings, p 7\u201312","DOI":"10.1109\/IVS.2008.4621152"},{"key":"909_CR24","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1109\/TITS.2006.869595","volume":"7","author":"JC McCall","year":"2006","unstructured":"McCall JC, Trivedi MM (2006) Video-based lane estimation and tracking for driver assistance: survey, system, and evaluation. IEEE Trans Intell Transp Syst 7:20\u201337","journal-title":"IEEE Trans Intell Transp Syst"},{"key":"909_CR25","doi-asserted-by":"crossref","unstructured":"Zhou S, Jiang Y, Xi J et al (2010) A novel lane detection based on geometrical model and Gabor filter. In: IEEE intelligent vehicles symposium, proceedings. IEEE, pp 59\u201364","DOI":"10.1109\/IVS.2010.5548087"},{"key":"909_CR26","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1109\/ICIRT.2016.7588744","volume":"2016","author":"MA Selver","year":"2016","unstructured":"Selver MA, Er E, Belenlioglu B et al (2016) Camera based driver support system for rail extraction using 2-D Gabor wavelet decompositions and morphological analysis. IEEE Int Conf Intell Rail Transp ICIRT 2016:270\u2013275. 
https:\/\/doi.org\/10.1109\/ICIRT.2016.7588744","journal-title":"IEEE Int Conf Intell Rail Transp ICIRT"},{"key":"909_CR27","doi-asserted-by":"publisher","first-page":"254","DOI":"10.1134\/S1054661818020049","volume":"28","author":"F Zheng","year":"2018","unstructured":"Zheng F, Luo S, Song K et al (2018) Improved lane line detection algorithm based on Hough transform. Pattern Recognit Image Anal 28:254\u2013260. https:\/\/doi.org\/10.1134\/S1054661818020049","journal-title":"Pattern Recognit Image Anal"},{"key":"909_CR28","doi-asserted-by":"crossref","unstructured":"Hur J, Kang SN, Seo SW (2013) Multi-lane detection in urban driving environments using conditional random fields. In: IEEE intelligent vehicles symposium, proceedings, pp 1297\u20131302","DOI":"10.1109\/IVS.2013.6629645"},{"key":"909_CR29","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2644615","author":"V Badrinarayanan","year":"2017","unstructured":"Badrinarayanan V, Kendall A, Cipolla R (2017) SegNet: a deep convolutional encoder\u2013decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. https:\/\/doi.org\/10.1109\/TPAMI.2016.2644615","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"909_CR30","doi-asserted-by":"publisher","first-page":"16591","DOI":"10.1109\/ACCESS.2021.3053408","volume":"9","author":"W Weng","year":"2021","unstructured":"Weng W, Zhu X (2021) INet: convolutional networks for biomedical image segmentation. IEEE Access 9:16591\u201316603. https:\/\/doi.org\/10.1109\/ACCESS.2021.3053408","journal-title":"IEEE Access"},{"key":"909_CR31","doi-asserted-by":"crossref","unstructured":"Wang Z, Ren W, Qiu Q (2018) LaneNet: real-time lane detection networks for autonomous driving. [Online]. arXiv:1807.01726","DOI":"10.1109\/ICoIAS.2018.8494031"},{"key":"909_CR32","doi-asserted-by":"crossref","unstructured":"Liu R, Yuan Z, Liu T et al (2021) End-to-end lane shape prediction with transformers. 
In: Proceedings\u20142021 IEEE winter conference on applications of computer vision, WACV 2021, pp 3693\u20133701","DOI":"10.1109\/WACV48630.2021.00374"},{"key":"909_CR33","doi-asserted-by":"crossref","unstructured":"Tabelini L, Berriel R, Paix\u00e3o TM et al (2021) Keep your eyes on the lane: real-time attention-guided lane detection. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 294\u2013302","DOI":"10.1109\/CVPR46437.2021.00036"},{"key":"909_CR34","doi-asserted-by":"crossref","unstructured":"Hou Y, Ma Z, Liu C et al (2019) Learning lightweight lane detection CNNS by self attention distillation. In: Proceedings of the IEEE international conference on computer vision, pp 1013\u20131021","DOI":"10.1109\/ICCV.2019.00110"},{"key":"909_CR35","doi-asserted-by":"crossref","unstructured":"Qin Z, Wang H, Li X (2020) Ultra fast structure-aware deep lane detection. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp 276\u2013291","DOI":"10.1007\/978-3-030-58586-0_17"},{"key":"909_CR36","doi-asserted-by":"crossref","unstructured":"Lo SY, Hang HM, Chan SW et al (2019) Multi-class lane semantic segmentation using efficient convolutional networks. In: IEEE 21st international workshop on multimedia signal processing, MMSP 2019","DOI":"10.1109\/MMSP.2019.8901686"},{"key":"909_CR37","doi-asserted-by":"publisher","first-page":"690","DOI":"10.1109\/TNNLS.2016.2522428","volume":"28","author":"J Li","year":"2017","unstructured":"Li J, Mei X, Prokhorov D et al (2017) Deep neural network for structural prediction and lane detection in traffic scene. IEEE Trans Neural Networks Learn Syst 28:690\u2013703. 
https:\/\/doi.org\/10.1109\/TNNLS.2016.2522428","journal-title":"IEEE Trans Neural Networks Learn Syst"},{"key":"909_CR38","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1145\/3422622","volume":"63","author":"I Goodfellow","year":"2020","unstructured":"Goodfellow I, Pouget-Abadie J, Mirza M et al (2020) Generative adversarial networks. Commun ACM 63:139\u2013144. https:\/\/doi.org\/10.1145\/3422622","journal-title":"Commun ACM"},{"key":"909_CR39","doi-asserted-by":"crossref","unstructured":"Ghafoorian M, Nugteren C, Baka N et al (2019) EL-GAN: embedding loss driven generative adversarial networks for lane detection. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp 256\u2013272","DOI":"10.1007\/978-3-030-11009-3_15"},{"key":"909_CR40","doi-asserted-by":"crossref","unstructured":"Liu Z, Lin Y, Cao Y et al (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE international conference on computer vision, pp 9992\u201310002","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"909_CR41","doi-asserted-by":"crossref","unstructured":"Neven D, De Brabandere B, Georgoulis S et al (2018) Towards end-to-end lane detection: an instance segmentation approach. In: IEEE intelligent vehicles symposium, proceedings, pp 286\u2013291","DOI":"10.1109\/IVS.2018.8500547"},{"key":"909_CR42","unstructured":"Shi X, Chen Z, Wang H et al (2015) Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Advances in neural information processing systems, pp 802\u2013810"},{"key":"909_CR43","doi-asserted-by":"crossref","unstructured":"Cho K, Van Merri\u00ebnboer B, Gulcehre C et al (2014) Learning phrase representations using RNN encoder\u2013decoder for statistical machine translation. 
In: EMNLP 2014\u20142014 conference on empirical methods in natural language processing, proceedings of the conference, pp 1724\u20131734","DOI":"10.3115\/v1\/D14-1179"},{"key":"909_CR44","unstructured":"Ruder S (2016) An overview of gradient descent optimization, pp 1\u201314"},{"key":"909_CR45","unstructured":"Paszke A, Chaurasia A, Kim S et al (2016) ENet: a deep neural network architecture for real-time semantic segmentation, pp 1\u201310"},{"key":"909_CR46","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"909_CR47","doi-asserted-by":"crossref","unstructured":"Jia Deng, Wei Dong, Socher R et al (2009) ImageNet: a large-scale hierarchical image database. IEEE, pp 248\u2013255","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"909_CR48","unstructured":"Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. 
In: 3rd International conference on learning representations, ICLR 2015\u2014conference track proceedings, pp 1\u201315"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00909-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-022-00909-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00909-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T17:10:15Z","timestamp":1695402615000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-022-00909-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,24]]},"references-count":48,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,10]]}},"alternative-id":["909"],"URL":"https:\/\/doi.org\/10.1007\/s40747-022-00909-0","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,24]]},"assertion":[{"value":"27 April 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 October 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 November 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The author(s) declared no potential conflicts of interest with 
respect to the research, authorship, and\/or publication of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}