{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T01:59:44Z","timestamp":1772503184691,"version":"3.50.1"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,4,17]],"date-time":"2024-04-17T00:00:00Z","timestamp":1713312000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,4,17]],"date-time":"2024-04-17T00:00:00Z","timestamp":1713312000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61602157"],"award-info":[{"award-number":["61602157"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Henan Science and Technology Planning Program","award":["202102210167"],"award-info":[{"award-number":["202102210167"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Currently, many real-time semantic segmentation networks aim for heightened accuracy, inevitably leading to increased computational complexity and reduced inference speed. Therefore, striking a balance between accuracy and speed has emerged as a crucial concern in this domain. To address these challenges, this study proposes a dual-branch fusion network with multiscale atrous pyramid pooling aggregate contextual features for real-time semantic segmentation (MAFNet). The first key component, the semantics guide spatial-details module (SGSDM) not only facilitates precise boundary extraction and fine-grained classification, but also provides semantic-based feature representation, thereby enhancing support for spatial analysis and decision boundaries. The second component, the multiscale atrous pyramid pooling module (MSAPPM), is designed by combining dilation convolution with feature pyramid pooling operations at various dilation rates. This design not only expands the receptive field, but also aggregates rich contextual information more effectively. To further improve the fusion of feature information generated by the dual-branch, a bilateral fusion module (BFM) is introduced. This module employs cross-fusion by calculating weights generated by the dual-branch to balance the weight relationship between the dual branches, thereby achieving effective feature information fusion. To validate the effectiveness of the proposed network, experiments are conducted on a single A100 GPU. MAFNet achieves a mean intersection over union (mIoU) of 77.4% at 70.9 FPS on the Cityscapes test dataset and 77.6% mIoU at 192.5 FPS on the CamVid test dataset. The experimental results conclusively demonstrated that MAFNet effectively strikes a balance between accuracy and speed.<\/jats:p>","DOI":"10.1007\/s40747-024-01428-w","type":"journal-article","created":{"date-parts":[[2024,4,17]],"date-time":"2024-04-17T04:02:20Z","timestamp":1713326540000},"page":"5107-5126","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["MAFNet: dual-branch fusion network with multiscale atrous pyramid pooling aggregate contextual features for real-time semantic segmentation"],"prefix":"10.1007","volume":"10","author":[{"given":"Shan","family":"Zhao","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0007-1819-5212","authenticated-orcid":false,"given":"Yunlei","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Xuan","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Fukai","family":"Zhang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,4,17]]},"reference":[{"key":"1428_CR1","doi-asserted-by":"crossref","unstructured":"Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213\u20133223","DOI":"10.1109\/CVPR.2016.350"},{"key":"1428_CR2","doi-asserted-by":"crossref","unstructured":"Zhou Z, Rahman\u00a0Siddiquee MM, Tajbakhsh N, Liang J (2018) Unet++: a nested u-net architecture for medical image segmentation. In: Deep learning in medical image analysis and multimodal learning for clinical decision support: 4th international workshop, DLMIA 2018, and 8th international workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, September 20, 2018, Proceedings 4, Springer, pp 3\u201311","DOI":"10.1007\/978-3-030-00889-5_1"},{"key":"1428_CR3","doi-asserted-by":"crossref","unstructured":"Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431\u20133440","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"1428_CR4","doi-asserted-by":"crossref","unstructured":"Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention\u2013MICCAI 2015: 18th international conference, Munich, Germany, October 5\u20139, 2015, Proceedings, Part III 18, Springer, pp 234\u2013241","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"1428_CR5","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"1428_CR6","doi-asserted-by":"crossref","unstructured":"Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 801\u2013818","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"1428_CR7","doi-asserted-by":"publisher","DOI":"10.1016\/j.jprocont.2023.103112","volume":"132","author":"H Tao","year":"2023","unstructured":"Tao H, Zheng J, Wei J, Paszke W, Rogers E, Stojanovic V (2023) Repetitive process based indirect-type iterative learning control for batch processes with model uncertainty and input delay. J Process Control 132:103112","journal-title":"J Process Control"},{"key":"1428_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2023.126498","volume":"550","author":"X Song","year":"2023","unstructured":"Song X, Wu N, Song S, Zhang Y, Stojanovic V (2023) Bipartite synchronization for cooperative-competitive neural networks with reaction-diffusion terms via dual event-triggered mechanism. Neurocomputing 550:126498","journal-title":"Neurocomputing"},{"issue":"6","key":"1428_CR9","doi-asserted-by":"publisher","first-page":"7451","DOI":"10.1007\/s40747-023-01135-y","volume":"9","author":"Z Peng","year":"2023","unstructured":"Peng Z, Song X, Song S, Stojanovic V (2023) Hysteresis quantified control for switched reaction-diffusion systems and its application. Complex Intell Syst 9(6):7451\u20137460","journal-title":"Complex Intell Syst"},{"key":"1428_CR10","unstructured":"Paszke A, Chaurasia A, Kim S, Culurciello E (2016) Enet: a deep neural network architecture for real-time semantic segmentation. arXiv preprint arXiv:1606.02147"},{"key":"1428_CR11","doi-asserted-by":"crossref","unstructured":"Zhao H, Qi X, Shen X, Shi J, Jia J (2018) Icnet for real-time semantic segmentation on high-resolution images. In: Proceedings of the European conference on computer vision (ECCV), pp 405\u2013420","DOI":"10.1007\/978-3-030-01219-9_25"},{"key":"1428_CR12","doi-asserted-by":"crossref","unstructured":"Fan M, Lai S, Huang J, Wei X, Chai Z, Luo J, Wei X (2021) Rethinking bisenet for real-time semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 9716\u20139725","DOI":"10.1109\/CVPR46437.2021.00959"},{"key":"1428_CR13","unstructured":"Elhassan MA, Yang C, Huang C, Legesse\u00a0Munea T, Hong X (2022) $$\\rm s^2$$-fpn: scale-ware strip attention guided feature pyramid network for real-time semantic segmentation"},{"key":"1428_CR14","doi-asserted-by":"crossref","unstructured":"Mehta S, Rastegari M, Caspi A, Shapiro L, Hajishirzi H (2018) Espnet: Efficient spatial pyramid of dilated convolutions for semantic segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 552\u2013568","DOI":"10.1007\/978-3-030-01249-6_34"},{"key":"1428_CR15","unstructured":"Tan M, Le Q (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning, pp 6105\u20136114"},{"key":"1428_CR16","unstructured":"Poudel RP, Liwicki S, Cipolla R (2019) Fast-scnn: fast semantic segmentation network. arXiv preprint arXiv:1902.04502"},{"key":"1428_CR17","unstructured":"Hong Y, Pan H, Sun W, Jia Y (2021) Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes. arXiv preprint arXiv:2101.06085"},{"issue":"2","key":"1428_CR18","doi-asserted-by":"publisher","first-page":"652","DOI":"10.1109\/TPAMI.2019.2938758","volume":"43","author":"S-H Gao","year":"2019","unstructured":"Gao S-H, Cheng M-M, Zhao K, Zhang X-Y, Yang M-H, Torr P (2019) Res2net: a new multi-scale backbone architecture. IEEE Trans Pattern Anal Mach Intell 43(2):652\u2013662","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"6","key":"1428_CR19","doi-asserted-by":"publisher","first-page":"3258","DOI":"10.1109\/TITS.2020.2980426","volume":"22","author":"G Dong","year":"2020","unstructured":"Dong G, Yan Y, Shen C, Wang H (2020) Real-time high-performance semantic image segmentation of urban street scenes. IEEE Trans Intell Transport Syst 22(6):3258\u20133274","journal-title":"IEEE Trans Intell Transport Syst"},{"key":"1428_CR20","doi-asserted-by":"crossref","unstructured":"Liu S, Huang D et\u00a0al (2018) Receptive field block net for accurate and fast object detection. In: Proceedings of the European conference on computer vision (ECCV), pp 385\u2013400","DOI":"10.1007\/978-3-030-01252-6_24"},{"key":"1428_CR21","unstructured":"Peng J, Liu Y, Tang S, Hao Y, Chu L, Chen G, Wu Z, Chen Z, Yu Z, Du Y et al (2022) Pp-liteseg: a superior real-time semantic segmentation model. arXiv preprint arXiv:2204.02681"},{"key":"1428_CR22","doi-asserted-by":"crossref","unstructured":"Yu C, Wang J, Peng C, Gao C, Yu G, Sang N (2018) Bisenet: bilateral segmentation network for real-time semantic segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 325\u2013341","DOI":"10.1007\/978-3-030-01261-8_20"},{"key":"1428_CR23","doi-asserted-by":"publisher","first-page":"3051","DOI":"10.1007\/s11263-021-01515-2","volume":"129","author":"C Yu","year":"2021","unstructured":"Yu C, Gao C, Wang J, Yu G, Shen C, Sang N (2021) Bisenet v2: bilateral network with guided aggregation for real-time semantic segmentation. Int J Comput Vis 129:3051\u20133068","journal-title":"Int J Comput Vis"},{"key":"1428_CR24","first-page":"7423","volume":"35","author":"J Wang","year":"2022","unstructured":"Wang J, Gou C, Wu Q, Feng H, Han J, Ding E, Wang J (2022) Rtformer: efficient design for real-time semantic segmentation with transformer. Adv Neural Inf Process Syst 35:7423\u20137436","journal-title":"Adv Neural Inf Process Syst"},{"key":"1428_CR25","unstructured":"Sun K, Zhao Y, Jiang B, Cheng T, Xiao B, Liu D, Mu Y, Wang X, Liu W, Wang J (2019) High-resolution representations for labeling pixels and regions. arXiv preprint arXiv:1904.04514"},{"issue":"2","key":"1428_CR26","doi-asserted-by":"publisher","first-page":"181","DOI":"10.4103\/crst.crst_332_22","volume":"6","author":"R Thukral","year":"2023","unstructured":"Thukral R, Aggarwal AK, Arora AS, Dora T, Sancheti S (2023) Artificial intelligence-based prediction of oral mucositis in patients with head-and-neck cancer: a prospective observational study utilizing a thermographic approach. Cancer Res Stat Treat 6(2):181\u2013190","journal-title":"Cancer Res Stat Treat"},{"key":"1428_CR27","first-page":"199","volume":"10","author":"D Maini","year":"2018","unstructured":"Maini D, Aggarwal AK (2018) Camera position estimation using 2d image dataset. Int J Innov Eng Technol 10:199\u2013203","journal-title":"Int J Innov Eng Technol"},{"key":"1428_CR28","doi-asserted-by":"crossref","unstructured":"Brostow GJ, Shotton J, Fauqueur J, Cipolla R (2008) Segmentation and recognition using structure from motion point clouds. In: Computer vision\u2013ECCV 2008: 10th European conference on computer vision, Marseille, France, October 12-18, 2008, Proceedings, Part I 10. Springer, pp 44\u201357","DOI":"10.1007\/978-3-540-88682-2_5"},{"key":"1428_CR29","unstructured":"Roland G (2021) Rethink dilated convolution for real-time semantic segmentation. arXiv:2111.09957"},{"key":"1428_CR30","unstructured":"Goyal P, Doll\u00e1r P, Girshick R, Noordhuis P, Wesolowski L, Kyrola A, Tulloch A, Jia Y, He K (2017) Accurate, large minibatch sgd: training imagenet in 1 hour. arXiv preprint arXiv:1706.02677"},{"key":"1428_CR31","first-page":"40","volume":"7","author":"AK Aggarwal","year":"2022","unstructured":"Aggarwal AK, Jaidka P (2022) Segmentation of crop images for crop yield prediction. Int J Biol Biomed 7:40\u201344","journal-title":"Int J Biol Biomed"},{"key":"1428_CR32","doi-asserted-by":"crossref","unstructured":"Cubuk ED, Zoph B, Shlens J, Le QV (2020) Randaugment: practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops, pp 702\u2013703","DOI":"10.1109\/CVPRW50498.2020.00359"},{"key":"1428_CR33","doi-asserted-by":"publisher","DOI":"10.1016\/j.foohum.2023.11.017","volume":"2","author":"DS Brar","year":"2024","unstructured":"Brar DS, Aggarwal AK, Nanda V, Kaur S, Saxena S, Gautam S (2024) Detection of sugar syrup adulteration in unifloral honey using deep learning framework: an effective quality analysis technique. Food Hum 2:100190","journal-title":"Food Hum"},{"key":"1428_CR34","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","volume":"115","author":"O Russakovsky","year":"2015","unstructured":"Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115:211\u2013252","journal-title":"Int J Comput Vis"},{"key":"1428_CR35","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32"},{"key":"1428_CR36","doi-asserted-by":"crossref","unstructured":"Shrivastava A, Gupta A, Girshick R (2016) Training region-based object detectors with online hard example mining. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 761\u2013769","DOI":"10.1109\/CVPR.2016.89"},{"key":"1428_CR37","unstructured":"Chen W, Gong X, Liu X, Zhang Q, Li Y, Wang Z (2019) Fasterseg: searching for faster real-time semantic segmentation. arXiv preprint arXiv:1912.10917"},{"key":"1428_CR38","doi-asserted-by":"crossref","unstructured":"Brar DS, Aggarwal AK, Nanda V, Saxena S, Gautam S (2024) Ai and cv based 2d-cnn algorithm: botanical authentication of Indian honey. Sustain Food Technol","DOI":"10.1039\/D3FB00170A"},{"key":"1428_CR39","doi-asserted-by":"crossref","unstructured":"Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-cam: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, pp 618\u2013626","DOI":"10.1109\/ICCV.2017.74"},{"key":"1428_CR40","doi-asserted-by":"crossref","unstructured":"Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881\u20132890","DOI":"10.1109\/CVPR.2017.660"},{"key":"1428_CR41","doi-asserted-by":"crossref","unstructured":"Xu J, Xiong Z, Bhattacharyya SP (2023) Pidnet: a real-time semantic segmentation network inspired by pid controllers. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 19529\u201319539","DOI":"10.1109\/CVPR52729.2023.01871"},{"issue":"4","key":"1428_CR42","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"L-C Chen","year":"2017","unstructured":"Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834\u2013848","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"1","key":"1428_CR43","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1109\/TITS.2017.2750080","volume":"19","author":"E Romera","year":"2017","unstructured":"Romera E, Alvarez JM, Bergasa LM, Arroyo R (2017) Erfnet: efficient residual factorized convnet for real-time semantic segmentation. IEEE Trans Intell Transport Syst 19(1):263\u2013272","journal-title":"IEEE Trans Intell Transport Syst"},{"key":"1428_CR44","doi-asserted-by":"crossref","unstructured":"Li H, Xiong P, Fan H, Sun J (2019) Dfanet: deep feature aggregation for real-time semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 9522\u20139531","DOI":"10.1109\/CVPR.2019.00975"},{"key":"1428_CR45","doi-asserted-by":"crossref","unstructured":"Hu P, Caba F, Wang O, Lin Z, Sclaroff S, Perazzi F (2020) Temporally distributed networks for fast video semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 8818\u20138827","DOI":"10.1109\/CVPR42600.2020.00884"},{"key":"1428_CR46","doi-asserted-by":"crossref","unstructured":"Orsic M, Kreso I, Bevandic P, Segvic S (2019) In defense of pre-trained imagenet architectures for real-time semantic segmentation of road-driving images. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 12607\u201312616","DOI":"10.1109\/CVPR.2019.01289"},{"key":"1428_CR47","doi-asserted-by":"crossref","unstructured":"Nirkin Y, Wolf L, Hassner T (2021) Hyperseg: Patch-wise hypernetwork for real-time semantic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 4061\u20134070","DOI":"10.1109\/CVPR46437.2021.00405"},{"key":"1428_CR48","doi-asserted-by":"crossref","unstructured":"Li X, You A, Zhu Z, Zhao H, Yang M, Yang K, Tan S, Tong Y (2020) Semantic flow for fast and accurate scene parsing. In: Computer Vision\u2013ECCV 2020: 16th European conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part I 16. Springer, pp 775\u2013793","DOI":"10.1007\/978-3-030-58452-8_45"},{"key":"1428_CR49","doi-asserted-by":"crossref","unstructured":"Chandra S, Couprie C, Kokkinos I (2018) Deep spatio-temporal random fields for efficient video segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8915\u20138924","DOI":"10.1109\/CVPR.2018.00929"},{"key":"1428_CR50","unstructured":"Si H, Zhang Z, Lv F, Yu G, Lu F (2019) Real-time semantic segmentation via multiply spatial fusion network. arXiv preprint arXiv:1911.07217"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01428-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01428-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01428-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T17:20:38Z","timestamp":1721236838000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01428-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,17]]},"references-count":50,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,8]]}},"alternative-id":["1428"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01428-w","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,17]]},"assertion":[{"value":"2 January 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 March 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 April 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"All authors agreed to participate in this paper.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Not applicable","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}}]}}