{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T10:16:57Z","timestamp":1775470617644,"version":"3.50.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,5,20]],"date-time":"2024-05-20T00:00:00Z","timestamp":1716163200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,5,20]],"date-time":"2024-05-20T00:00:00Z","timestamp":1716163200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>To address the challenges posed by diverse pattern-background elements, intricate details, and complex textures in the semantic segmentation of ethnic clothing patterns, this research introduces a novel semantic segmentation network model called MST-Unet (Mixed Swin Transformer U-net). The proposed model combines a U-shaped network structure with multiple attention mechanisms. The upper layers of the model employ classical convolutional operations, focusing on local relationships in the initial layers containing high-resolution details. In deeper layers, Swin Transformer modules are utilized, capable of efficient feature extraction with smaller spatial dimensions, maintaining performance while reducing computational burden. An attention gate mechanism is integrated into the decoder, contributing to enhanced performance in ethnic clothing pattern segmentation tasks by allowing the model to better capture crucial image features and achieve precise segmentation results. In visual comparisons of segmentation results, our proposed model demonstrates superior performance. 
The segmentation results exhibit more complete preservation of edge contours and fewer misclassifications in irrelevant regions within the images. In qualitative and quantitative experiments conducted on the ethnic clothing pattern dataset, our model achieves the highest Dice score for segmentation results in all four subclasses of ethnic clothing patterns. The average Dice score of our model reaches 89.80%, surpassing other algorithms in the same category. Compared to the Deeplab_V3+, ResUnet, SwinUnet, and Unet networks, our model outperforms them by 7.72%, 5.09%, 5.05%, and 0.67%, respectively.<\/jats:p>","DOI":"10.1007\/s40747-024-01457-5","type":"journal-article","created":{"date-parts":[[2024,5,20]],"date-time":"2024-05-20T07:01:46Z","timestamp":1716188506000},"page":"5759-5770","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Segmentation of ethnic clothing patterns with fusion of multiple attention mechanisms"],"prefix":"10.1007","volume":"10","author":[{"given":"Tao","family":"Ning","sequence":"first","affiliation":[]},{"given":"Yuan","family":"Gao","sequence":"additional","affiliation":[]},{"given":"Yumeng","family":"Han","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,5,20]]},"reference":[{"key":"1457_CR1","doi-asserted-by":"crossref","unstructured":"Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention\u2013MICCAI 2015: 18th international conference, Munich, October 5\u20139, 2015, Proceedings, Part III 18. 
Springer International Publishing, pp 234\u2013241","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"1457_CR2","doi-asserted-by":"publisher","DOI":"10.3389\/fbioe.2020.605132","volume":"8","author":"Q Jin","year":"2020","unstructured":"Jin Q, Meng Z, Sun C et al (2020) RA-Unet: a hybrid deep attention-aware network to extract liver and tumor in CT scans. Front Bioeng Biotechnol 8:605132","journal-title":"Front Bioeng Biotechnol"},{"issue":"2","key":"1457_CR3","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","volume":"18","author":"F Isensee","year":"2021","unstructured":"Isensee F, Jaeger PF, Kohl SAA et al (2021) nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 18(2):203\u2013211","journal-title":"Nat Methods"},{"key":"1457_CR4","doi-asserted-by":"crossref","unstructured":"\u00c7i\u00e7ek \u00d6, Abdulkadir A, Lienkamp SS et al (2016) 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Medical Image computing and computer-assisted intervention\u2013MICCAI 2016: 19th international conference, Athens, October 17\u201321, 2016, Proceedings, Part II 19. Springer International Publishing, pp 424\u2013432","DOI":"10.1007\/978-3-319-46723-8_49"},{"key":"1457_CR5","doi-asserted-by":"crossref","unstructured":"Xiao X, Lian S, Luo Z et al (2018) Weighted res-Unet for high-quality retina vessel segmentation. In: 2018 9th international conference on information technology in medicine and education (ITME). IEEE, pp 327\u2013331","DOI":"10.1109\/ITME.2018.00080"},{"key":"1457_CR6","doi-asserted-by":"crossref","unstructured":"Zhou Z, Rahman Siddiquee M M, Tajbakhsh N et al (2018) Unet++: a nested u-net architecture for medical image segmentation. 
In: Deep learning in medical image analysis and multimodal learning for clinical decision support: 4th international workshop, DLMIA 2018, and 8th international workshop, ML-CDS 2018, held in conjunction with MICCAI 2018, Granada, September 20, 2018, Proceedings 4. Springer International Publishing, pp 3\u201311","DOI":"10.1007\/978-3-030-00889-5_1"},{"key":"1457_CR7","doi-asserted-by":"crossref","unstructured":"Huang H, Lin L, Tong R et al (2020) Unet 3+: a full-scale connected Unet for medical image segmentation. In: ICASSP 2020\u20132020 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 1055\u20131059","DOI":"10.1109\/ICASSP40776.2020.9053405"},{"key":"1457_CR8","unstructured":"Chen J, Lu Y, Yu Q et al (2021) TransUnet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306"},{"key":"1457_CR9","doi-asserted-by":"crossref","unstructured":"Chen LC, Zhu Y, Papandreou G et al (2018) Encoder\u2013decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 801\u2013818","DOI":"10.1007\/978-3-030-01234-2_49"},{"issue":"10","key":"1457_CR10","doi-asserted-by":"publisher","first-page":"2281","DOI":"10.1109\/TMI.2019.2903562","volume":"38","author":"Z Gu","year":"2019","unstructured":"Gu Z, Cheng J, Fu H et al (2019) Ce-net: context encoder network for 2d medical image segmentation. IEEE Trans Med Imaging 38(10):2281\u20132292","journal-title":"IEEE Trans Med Imaging"},{"key":"1457_CR11","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1016\/j.media.2019.01.012","volume":"53","author":"J Schlemper","year":"2019","unstructured":"Schlemper J, Oktay O, Schaap M et al (2019) Attention gated networks: learning to leverage salient regions in medical images. 
Med Image Anal 53:197\u2013207","journal-title":"Med Image Anal"},{"key":"1457_CR12","doi-asserted-by":"crossref","unstructured":"Wang X, Girshick R, Gupta A et al (2018) Non-local neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7794\u20137803","DOI":"10.1109\/CVPR.2018.00813"},{"key":"1457_CR13","doi-asserted-by":"crossref","unstructured":"Zhao H, Shi J, Qi X et al (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881\u20132890","DOI":"10.1109\/CVPR.2017.660"},{"key":"1457_CR14","unstructured":"Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. Advances in neural information processing systems, vol 30"},{"key":"1457_CR15","doi-asserted-by":"crossref","unstructured":"Carion N, Massa F, Synnaeve G et al (2020) End-to-end object detection with transformers. In: European conference on computer vision. Springer International Publishing, Cham, pp 213\u2013229","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"1457_CR16","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A et al (2020) An image is worth 16\u00d716 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929"},{"key":"1457_CR17","unstructured":"Touvron H, Cord M, Douze M et al (2021) Training data-efficient image transformers & distillation through attention. In: International conference on machine learning. PMLR, pp 10347\u201310357"},{"key":"1457_CR18","doi-asserted-by":"crossref","unstructured":"Liu Z, Lin Y, Cao Y et al (2021) Swin transformer: Hierarchical vision transformer using shifted windows. 
In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 10012\u201310022","DOI":"10.1109\/ICCV48922.2021.00986"},{"issue":"2","key":"1457_CR19","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1109\/TMI.2002.808355","volume":"22","author":"A Tsai","year":"2003","unstructured":"Tsai A, Yezzi A, Wells W et al (2003) A shape-based approach to the segmentation of medical imagery using level sets. IEEE Trans Med Imaging 22(2):137\u2013154","journal-title":"IEEE Trans Med Imaging"},{"issue":"6","key":"1457_CR20","doi-asserted-by":"publisher","first-page":"878","DOI":"10.1109\/42.650883","volume":"16","author":"K Held","year":"1997","unstructured":"Held K, Kops ER, Krause BJ et al (1997) Markov random field segmentation of brain MR images. IEEE Trans Med Imaging 16(6):878\u2013886","journal-title":"IEEE Trans Med Imaging"},{"issue":"12","key":"1457_CR21","doi-asserted-by":"publisher","first-page":"2663","DOI":"10.1109\/TMI.2018.2845918","volume":"37","author":"X Li","year":"2018","unstructured":"Li X, Chen H, Qi X et al (2018) H-DenseUnet: hybrid densely connected Unet for liver and tumor segmentation from CT volumes. IEEE Trans Med Imaging 37(12):2663\u20132674","journal-title":"IEEE Trans Med Imaging"},{"issue":"4","key":"1457_CR22","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"LC Chen","year":"2017","unstructured":"Chen LC, Papandreou G, Kokkinos I et al (2017) Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834\u2013848","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1457_CR23","doi-asserted-by":"crossref","unstructured":"Milletari F, Navab N, Ahmadi SA (2016) V-net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 fourth international conference on 3D vision (3DV). 
IEEE, pp 565\u2013571","DOI":"10.1109\/3DV.2016.79"},{"key":"1457_CR24","doi-asserted-by":"crossref","unstructured":"Xie S, Girshick R, Doll\u00e1r P et al (2017) Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1492\u20131500","DOI":"10.1109\/CVPR.2017.634"},{"issue":"07","key":"1457_CR25","doi-asserted-by":"publisher","first-page":"2135","DOI":"10.11834\/jig.220154","volume":"28","author":"SY Liu","year":"2023","unstructured":"Liu SY, Chi JN, Wu CD et al (2023) Recurrent slice networks-based 3D point cloud-relevant integrated segmentation of semantic and instances. J Image Graph 28(07):2135\u20132150","journal-title":"J Image Graph"},{"issue":"5","key":"1457_CR26","doi-asserted-by":"publisher","first-page":"7099","DOI":"10.1109\/TII.2022.3209672","volume":"19","author":"J Gou","year":"2022","unstructured":"Gou J, Sun L, Yu B et al (2022) Multilevel attention-based sample correlations for knowledge distillation. IEEE Trans Ind Inf 19(5):7099\u20137109","journal-title":"IEEE Trans Ind Inf"},{"key":"1457_CR27","unstructured":"Devlin J, Chang MW, Lee K et al (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805"},{"key":"1457_CR28","doi-asserted-by":"crossref","unstructured":"Wang W, Xie E, Li X et al (2021) Pyramid vision transformer: a versatile backbone for dense prediction without convolutions. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 568\u2013578","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"1457_CR29","first-page":"15908","volume":"34","author":"K Han","year":"2021","unstructured":"Han K, Xiao A, Wu E et al (2021) Transformer in transformer. 
Adv Neural Inf Process Syst 34:15908\u201315919","journal-title":"Adv Neural Inf Process Syst"},{"key":"1457_CR30","doi-asserted-by":"crossref","unstructured":"Valanarasu JMJ, Oza P, Hacihaliloglu I et al (2021) Medical transformer: Gated axial-attention for medical image segmentation. In: Medical image computing and computer assisted intervention\u2014MICCAI 2021: 24th international conference, Strasbourg, September 27\u2013October 1, 2021, Proceedings, Part I 24. Springer International Publishing, pp 36\u201346","DOI":"10.1007\/978-3-030-87193-2_4"},{"key":"1457_CR31","doi-asserted-by":"crossref","unstructured":"Zhang Y, Liu H, Hu Q (2021) Transfuse: fusing transformers and cnns for medical image segmentation. In: Medical image computing and computer assisted intervention\u2014MICCAI 2021: 24th international conference, Strasbourg, September 27\u2013October 1, 2021, Proceedings, Part I 24. Springer International Publishing, pp 14\u201324","DOI":"10.1007\/978-3-030-87193-2_2"},{"key":"1457_CR32","doi-asserted-by":"crossref","unstructured":"Wang W, Chen C, Ding M et al (2021) Transbts: Multimodal brain tumor segmentation using transformer. In: Medical image computing and computer assisted intervention\u2014MICCAI 2021: 24th international conference, Strasbourg, September 27\u2013October 1, 2021, Proceedings, Part I 24. Springer International Publishing, pp 109\u2013119","DOI":"10.1007\/978-3-030-87193-2_11"},{"key":"1457_CR33","doi-asserted-by":"crossref","unstructured":"Xie Y, Zhang J, Shen C et al (2021) Cotr: efficiently bridging cnn and transformer for 3d medical image segmentation. In: Medical image computing and computer assisted intervention\u2014MICCAI 2021: 24th international conference, Strasbourg, September 27\u2013October 1, 2021, Proceedings, Part III 24. 
Springer International Publishing, pp 171\u2013180","DOI":"10.1007\/978-3-030-87199-4_16"},{"key":"1457_CR34","doi-asserted-by":"crossref","unstructured":"Hatamizadeh A, Tang Y, Nath V et al (2022) Unetr: Transformers for 3d medical image segmentation. In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision, pp 574\u2013584","DOI":"10.1109\/WACV51458.2022.00181"},{"key":"1457_CR35","doi-asserted-by":"crossref","unstructured":"Wang H, Xie S, Lin L et al (2022) Mixed transformer u-net for medical image segmentation. In: ICASSP 2022\u20132022 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 2390\u20132394","DOI":"10.1109\/ICASSP43922.2022.9746172"},{"key":"1457_CR36","doi-asserted-by":"crossref","unstructured":"Cao H, Wang Y, Chen J et al (2022) Swin-Unet: Unet-like pure transformer for medical image segmentation. In: European conference on computer vision. Springer Nature Switzerland, Cham, pp 205\u2013218","DOI":"10.1007\/978-3-031-25066-8_9"},{"key":"1457_CR37","unstructured":"Oktay O, Schlemper J, Folgoc LL et al (2018) Attention u-net: learning where to look for the pancreas. IMIDL conference"},{"key":"1457_CR38","doi-asserted-by":"crossref","unstructured":"Xiao T, Liu Y, Zhou B et al (2018) Unified perceptual parsing for scene understanding. In: Proceedings of the European conference on computer vision (ECCV), pp 418\u2013434","DOI":"10.1007\/978-3-030-01228-1_26"},{"key":"1457_CR39","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2023.102797","volume":"86","author":"F Bougourzi","year":"2023","unstructured":"Bougourzi F, Distante C, Dornaika F et al (2023) PDAtt-Unet: pyramid dual-decoder attention Unet for COVID-19 infection segmentation from CT-scans. 
Med Image Anal 86:102797","journal-title":"Med Image Anal"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01457-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01457-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01457-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T17:29:47Z","timestamp":1721237387000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01457-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,20]]},"references-count":39,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,8]]}},"alternative-id":["1457"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01457-5","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,20]]},"assertion":[{"value":"27 December 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 April 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 May 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}