{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T23:16:22Z","timestamp":1771888582620,"version":"3.50.1"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2023,4,17]],"date-time":"2023-04-17T00:00:00Z","timestamp":1681689600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,4,17]],"date-time":"2023-04-17T00:00:00Z","timestamp":1681689600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100013101","name":"Scientific Research Plan Projects of Shaanxi Education Department","doi-asserted-by":"publisher","award":["No.21JK0684"],"award-info":[{"award-number":["No.21JK0684"]}],"id":[{"id":"10.13039\/501100013101","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2023,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>It has been difficult to achieve a suitable balance between effectiveness and efficiency in lightweight semantic segmentation networks in recent years. The goal of this work is to present an efficient and reliable semantic segmentation method called EBUNet, which is aimed at achieving a favorable trade-off between inference speed and prediction accuracy. Initially, we develop an Efficient Bottleneck Unit (EBU) that employs depth-wise convolution and depth-wise dilated convolution to obtain adequate features with moderate computation costs. Then, we developed a novel Image Partition Attention Module (IPAM), which divides feature maps into subregions and generates attention weights based on them. As a third step, we developed a novel lightweight attention decoder with which to retrieve spatial information effectively. Extensive experiments show that our EBUNet achieves 73.4% mIou and 152 FPS on the Cityscapes dataset and 72.2% mIoU and 147 FPS on the Camvid dataset with only 1.57\u00a0M parameters. The results of the experiment confirm that the proposed model is capable of making a decent trade-off in terms of accuracy, inference, and model size. The source code of our EBUNet is available at (<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/Skybird1101\/EBUNet\">https:\/\/github.com\/Skybird1101\/EBUNet<\/jats:ext-link>).<\/jats:p>","DOI":"10.1007\/s40747-023-01054-y","type":"journal-article","created":{"date-parts":[[2023,4,17]],"date-time":"2023-04-17T02:01:55Z","timestamp":1681696915000},"page":"5975-5990","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["EBUNet: a fast and accurate semantic segmentation network with lightweight efficient bottleneck unit"],"prefix":"10.1007","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2142-784X","authenticated-orcid":false,"given":"Siyuan","family":"Shen","sequence":"first","affiliation":[]},{"given":"Zhengjun","family":"Zhai","sequence":"additional","affiliation":[]},{"given":"Guanfeng","family":"Yu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9555-8888","authenticated-orcid":false,"given":"Youyu","family":"Yan","sequence":"additional","affiliation":[]},{"given":"Wei","family":"Dai","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,4,17]]},"reference":[{"key":"1054_CR1","doi-asserted-by":"crossref","unstructured":"Lianos K-N, Schonberger JL, Pollefeys M, Sattler T (2018) Vso: visual semantic odometry. In: Proceedings of the European conference on computer vision (ECCV), pp 234\u2013250","DOI":"10.1007\/978-3-030-01225-0_15"},{"key":"1054_CR2","doi-asserted-by":"crossref","unstructured":"Garcia-Garcia A, Orts-Escolano S, Oprea S, Villena-Martinez V, Garcia-Rodriguez J (2017) A review on deep learning techniques applied to semantic segmentation. arXiv:1704.06857","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"1054_CR3","doi-asserted-by":"crossref","unstructured":"Ess A, M\u00fcller T, Grabner H, Van\u00a0Gool L (2009) Segmentation-based urban traffic scene understanding. In: BMVC, vol 1. Citeseer, p\u00a02","DOI":"10.5244\/C.23.84"},{"key":"1054_CR4","doi-asserted-by":"crossref","unstructured":"Tao H, Cheng L, Qiu J, Stojanovic V (2022) Few shot cross equipment fault diagnosis method based on parameter optimization and feature mertic. Meas Sci Technol 33:115005","DOI":"10.1088\/1361-6501\/ac8368"},{"key":"1054_CR5","doi-asserted-by":"crossref","unstructured":"Djordjevic V, Stojanovic V, Tao H, Song X, He S, Gao W (2022) Data-driven control of hydraulic servo actuator based on adaptive dynamic programming. Discrete Contin Dyn Syst Ser S 15","DOI":"10.3934\/dcdss.2021145"},{"key":"1054_CR6","doi-asserted-by":"crossref","unstructured":"Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881\u20132890","DOI":"10.1109\/CVPR.2017.660"},{"key":"1054_CR7","doi-asserted-by":"crossref","unstructured":"Lin G, Milan A, Shen C, Reid I (2017) Refinenet: multi-path refinement networks for high-resolution semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1925\u20131934","DOI":"10.1109\/CVPR.2017.549"},{"key":"1054_CR8","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"L-C Chen","year":"2017","unstructured":"Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40:834\u2013848","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1054_CR9","doi-asserted-by":"crossref","unstructured":"Xia M, Zhong Z, Chen D (2022) Structured pruning learns compact and accurate models. arXiv:2204.00408","DOI":"10.18653\/v1\/2022.acl-long.107"},{"key":"1054_CR10","doi-asserted-by":"crossref","unstructured":"Zhao B, Cui Q, Song R, Qiu Y, Liang J (2022) Decoupled knowledge distillation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 11953\u201311962","DOI":"10.1109\/CVPR52688.2022.01165"},{"key":"1054_CR11","doi-asserted-by":"crossref","unstructured":"Hou Y, Zhu X, Ma Y, Loy CC, Li Y (2022) Point-to-voxel knowledge distillation for lidar semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 8479\u20138488","DOI":"10.1109\/CVPR52688.2022.00829"},{"key":"1054_CR12","unstructured":"Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861"},{"key":"1054_CR13","unstructured":"Li G, Yun I, Kim J, Kim J (2019) Dabnet: depth-wise asymmetric bottleneck for real-time semantic segmentation. arXiv:1907.11357"},{"key":"1054_CR14","doi-asserted-by":"publisher","unstructured":"Shi Min, Shen Jialin, Yi Qingming, Weng Jian, Huang Zunkai, Luo Aiwen, Zhou Yicong (2022) LMFFNet: a well-balanced lightweight network for fast and accurate semantic segmentation. IEEE Trans Neural Netw Learn Syst 1\u201315. https:\/\/doi.org\/10.1109\/TNNLS.2022.3176493","DOI":"10.1109\/TNNLS.2022.3176493"},{"key":"1054_CR15","first-page":"1","volume":"60","author":"C Peng","year":"2021","unstructured":"Peng C, Zhang K, Ma Y, Ma J (2021) Cross fusion net: a fast semantic segmentation network for small-scale semantic information capturing in aerial scenes. IEEE Trans Geosci Remote Sens 60:1\u201313","journal-title":"IEEE Trans Geosci Remote Sens"},{"key":"1054_CR16","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"1054_CR17","doi-asserted-by":"crossref","unstructured":"Zhang X, Zhou X, Lin M, Sun J (2018) Shufflenet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6848\u20136856","DOI":"10.1109\/CVPR.2018.00716"},{"key":"1054_CR18","doi-asserted-by":"crossref","unstructured":"Ma N, Zhang X, Zheng H-T, Sun J (2018) Shufflenet v2: practical guidelines for efficient cnn architecture design. In: Proceedings of the European conference on computer vision (ECCV), pp 116\u2013131","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"1054_CR19","doi-asserted-by":"crossref","unstructured":"Wang Y, Zhou Q, Liu J, Xiong J, Gao G, Wu X, Latecki LJ (2019) Lednet: a lightweight encoder-decoder network for real-time semantic segmentation. In: 2019 IEEE international conference on image processing (ICIP), IEEE, pp 1860\u20131864","DOI":"10.1109\/ICIP.2019.8803154"},{"key":"1054_CR20","doi-asserted-by":"crossref","unstructured":"Gao G, Xu G, Li J, Yu Y, Lu H, Yang J (2022) Fbsnet: a fast bilateral symmetrical network for real-time semantic segmentation. IEEE Trans Multim","DOI":"10.1109\/TMM.2022.3157995"},{"key":"1054_CR21","doi-asserted-by":"crossref","unstructured":"Gao G, Xu G, Yu Y, Xie J, Yang J, Yue D (2021) Mscfnet: a lightweight network with multi-scale context fusion for real-time semantic segmentation. IEEE Trans Intell Transp Syst","DOI":"10.1109\/TITS.2021.3098355"},{"key":"1054_CR22","unstructured":"Paszke A, Chaurasia A, Kim S, Culurciello E (2016) Enet: a deep neural network architecture for real-time semantic segmentation. arXiv:1606.02147"},{"key":"1054_CR23","doi-asserted-by":"crossref","unstructured":"Yu C, Wang J, Peng C, Gao C, Yu G, Sang N (2018) Bisenet: bilateral segmentation network for real-time semantic segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 325\u2013341","DOI":"10.1007\/978-3-030-01261-8_20"},{"key":"1054_CR24","doi-asserted-by":"crossref","unstructured":"Fan M, Lai S, Huang J, Wei X, Chai Z, Luo J, Wei X (2021) Rethinking bisenet for real-time semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 9716\u20139725","DOI":"10.1109\/CVPR46437.2021.00959"},{"key":"1054_CR25","doi-asserted-by":"publisher","first-page":"3051","DOI":"10.1007\/s11263-021-01515-2","volume":"129","author":"C Yu","year":"2021","unstructured":"Yu C, Gao C, Wang J, Yu G, Shen C, Sang N (2021) Bisenet v2: bilateral network with guided aggregation for real-time semantic segmentation. Int J Comput Vis 129:3051\u20133068","journal-title":"Int J Comput Vis"},{"key":"1054_CR26","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1007\/s10489-021-02446-8","volume":"52","author":"X Hu","year":"2022","unstructured":"Hu X, Jing L, Sehar U (2022) Joint pyramid attention network for real-time semantic segmentation of urban scenes. Appl Intell 52:580\u2013594","journal-title":"Appl Intell"},{"key":"1054_CR27","doi-asserted-by":"publisher","first-page":"3319","DOI":"10.1007\/s10489-021-02603-z","volume":"52","author":"Y Wu","year":"2022","unstructured":"Wu Y, Jiang J, Huang Z, Tian Y (2022) Fpanet: feature pyramid aggregation network for real-time semantic segmentation. Appl Intell 52:3319\u20133336","journal-title":"Appl Intell"},{"key":"1054_CR28","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1016\/j.neucom.2021.12.003","volume":"474","author":"J Liu","year":"2022","unstructured":"Liu J, Xu X, Shi Y, Deng C, Shi M (2022) Relaxnet: residual efficient learning and attention expected fusion network for real-time semantic segmentation. Neurocomputing 474:115\u2013127","journal-title":"Neurocomputing"},{"key":"1054_CR29","doi-asserted-by":"publisher","first-page":"1454","DOI":"10.1016\/j.jfranklin.2022.11.004","volume":"360","author":"H Tao","year":"2023","unstructured":"Tao H, Qiu J, Chen Y, Stojanovic V, Cheng L (2023) Unsupervised cross-domain rolling bearing fault diagnosis based on time-frequency information fusion. J Frankl Inst 360:1454\u20131477","journal-title":"J Frankl Inst"},{"key":"1054_CR30","unstructured":"Poudel RP, Bonde U, Liwicki S, Zach C (2018) Contextnet: exploring context and detail for semantic segmentation in real-time. arXiv:1805.04554"},{"key":"1054_CR31","unstructured":"Poudel RP, Liwicki S, Cipolla R (2019) Fast-scnn: fast semantic segmentation network. arXiv:1902.04502"},{"key":"1054_CR32","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1016\/j.isprsjprs.2021.09.005","volume":"181","author":"R Li","year":"2021","unstructured":"Li R, Zheng S, Zhang C, Duan C, Wang L, Atkinson PM (2021) Abcnet: attentive bilateral contextual network for efficient semantic segmentation of fine-resolution remotely sensed imagery. ISPRS J Photogramm Remote Sens 181:84\u201398","journal-title":"ISPRS J Photogramm Remote Sens"},{"key":"1054_CR33","doi-asserted-by":"crossref","unstructured":"Li H, Xiong P, Fan H, Sun J (2019) Dfanet: deep feature aggregation for real-time semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 9522\u20139531","DOI":"10.1109\/CVPR.2019.00975"},{"key":"1054_CR34","doi-asserted-by":"crossref","unstructured":"Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213\u20133223","DOI":"10.1109\/CVPR.2016.350"},{"key":"1054_CR35","doi-asserted-by":"crossref","unstructured":"Brostow GJ, Shotton J, Fauqueur J, Cipolla R (2008) Segmentation and recognition using structure from motion point clouds. In: European conference on computer vision. Springer, pp 44\u201357","DOI":"10.1007\/978-3-540-88682-2_5"},{"key":"1054_CR36","doi-asserted-by":"crossref","unstructured":"Bottou L (2010) Large-scale machine learning with stochastic gradient descent. Springer, pp 177\u2013186","DOI":"10.1007\/978-3-7908-2604-3_16"},{"key":"1054_CR37","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1109\/TITS.2017.2750080","volume":"19","author":"E Romera","year":"2017","unstructured":"Romera E, Alvarez JM, Bergasa LM, Arroyo R (2017) Erfnet: efficient residual factorized convnet for real-time semantic segmentation. IEEE Trans Intell Transport Syst 19:263\u2013272","journal-title":"IEEE Trans Intell Transport Syst"},{"key":"1054_CR38","doi-asserted-by":"crossref","unstructured":"Mehta S, Rastegari M, Caspi A, Shapiro L, Hajishirzi H (2018) Espnet: efficient spatial pyramid of dilated convolutions for semantic segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 552\u2013568","DOI":"10.1007\/978-3-030-01249-6_34"},{"key":"1054_CR39","doi-asserted-by":"publisher","first-page":"1169","DOI":"10.1109\/TIP.2020.3042065","volume":"30","author":"T Wu","year":"2020","unstructured":"Wu T, Tang S, Zhang R, Cao J, Zhang Y (2020) Cgnet: a light-weight context guided network for semantic segmentation. IEEE Trans Image Process 30:1169\u20131179","journal-title":"IEEE Trans Image Process"},{"key":"1054_CR40","doi-asserted-by":"crossref","unstructured":"Yang C, Gao F (2019) Eda-net: dense aggregation of deep and shallow information achieves quantitative photoacoustic blood oxygenation imaging deep in human breast. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 246\u2013254","DOI":"10.1007\/978-3-030-32239-7_28"},{"key":"1054_CR41","doi-asserted-by":"crossref","unstructured":"Zhao H, Qi X, Shen X, Shi J, Jia J (2018) Icnet for real-time semantic segmentation on high-resolution images. In: Proceedings of the European conference on computer vision (ECCV), pp 405\u2013420","DOI":"10.1007\/978-3-030-01219-9_25"},{"key":"1054_CR42","doi-asserted-by":"publisher","first-page":"1041","DOI":"10.1109\/TITS.2019.2962094","volume":"22","author":"H-Y Han","year":"2020","unstructured":"Han H-Y, Chen Y-C, Hsiao P-Y, Fu L-C (2020) Using channel-wise attention for deep cnn based real-time semantic segmentation with class-aware edge information. IEEE Trans Intell Transp Syst 22:1041\u20131051","journal-title":"IEEE Trans Intell Transp Syst"},{"key":"1054_CR43","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","volume":"39","author":"V Badrinarayanan","year":"2017","unstructured":"Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39:2481\u20132495","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1054_CR44","unstructured":"Treml M, Arjona-Medina J, Unterthiner T, Durgesh R, Friedmann F, Schuberth P, Mayr A, Heusel M, Hofmarcher M, Widrich M et\u00a0al (2016) Speeding up semantic segmentation for autonomous driving"},{"key":"1054_CR45","doi-asserted-by":"crossref","unstructured":"Lo S-Y, Hang H-M, Chan S-W, Lin J-J (2019) Efficient dense modules of asymmetric convolution for real-time semantic segmentation. In: Proceedings of the ACM multimedia Asia, pp 1\u20136","DOI":"10.1145\/3338533.3366558"},{"key":"1054_CR46","doi-asserted-by":"crossref","unstructured":"Lyu H, Fu H, Hu X, Liu L (2019) Esnet: edge-based segmentation network for real-time semantic segmentation in traffic scenes. In: 2019 IEEE international conference on image processing (ICIP), IEEE, pp 1855\u20131859","DOI":"10.1109\/ICIP.2019.8803132"},{"key":"1054_CR47","doi-asserted-by":"publisher","first-page":"106682","DOI":"10.1016\/j.asoc.2020.106682","volume":"96","author":"Quan Zhou","year":"2020","unstructured":"Zhou Quan, Wang Yu, Fan Yawen, Wu Xiaofu, Zhang Suofei, Kang Bin, Latecki Longin Jan (2020) AGLNet: towards real-time semantic segmentation of self-driving images via attention-guided lightweight network. Appl Soft Comput 96:106682. https:\/\/doi.org\/10.1016\/j.asoc.2020.106682","journal-title":"Appl Soft Comput"},{"key":"1054_CR48","doi-asserted-by":"crossref","unstructured":"Pohlen T, Hermans A, Mathias M, Leibe B (2017) Full-resolution residual networks for semantic segmentation in street scenes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4151\u20134160","DOI":"10.1109\/CVPR.2017.353"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01054-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-023-01054-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01054-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T17:28:16Z","timestamp":1695403696000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-023-01054-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,17]]},"references-count":48,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,10]]}},"alternative-id":["1054"],"URL":"https:\/\/doi.org\/10.1007\/s40747-023-01054-y","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,4,17]]},"assertion":[{"value":"21 December 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 March 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 April 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}