{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T06:55:26Z","timestamp":1775717726466,"version":"3.50.1"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,2,14]],"date-time":"2023-02-14T00:00:00Z","timestamp":1676332800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,2,14]],"date-time":"2023-02-14T00:00:00Z","timestamp":1676332800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>\n                      Semantic segmentation of brain tumors plays a critical role in clinical treatment, especially for three-dimensional (3D) magnetic resonance imaging, which is often used in clinical practice. Automatic segmentation of the 3D structure of brain tumors can quickly help physicians understand the properties of tumors, such as the shape and size, thus improving the efficiency of preoperative planning and the odds of successful surgery. In past decades, 3D convolutional neural networks (CNNs) have dominated automatic segmentation methods for 3D medical images, and these network structures have achieved good results. 
However, to limit the number of network parameters, practitioners generally keep the kernel size of 3D convolutions at or below\n                      <jats:inline-formula>\n                        <jats:alternatives>\n                          <jats:tex-math>$$7 \\times 7 \\times 7$$<\/jats:tex-math>\n                          <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                            <mml:mrow>\n                              <mml:mn>7<\/mml:mn>\n                              <mml:mo>\u00d7<\/mml:mo>\n                              <mml:mn>7<\/mml:mn>\n                              <mml:mo>\u00d7<\/mml:mo>\n                              <mml:mn>7<\/mml:mn>\n                            <\/mml:mrow>\n                          <\/mml:math>\n                        <\/jats:alternatives>\n                      <\/jats:inline-formula>\n                      , which in turn limits the ability of CNNs to learn long-range dependency information. The Vision Transformer (ViT) excels at learning long-range dependency information in images, but it suffers from a large number of parameters. Worse, when training data are insufficient, the ViT cannot learn local dependency information in its earlier layers. In the image segmentation task, however, learning this local dependency information in the early layers has a large impact on model performance.\n                    <\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>This paper proposes the Swin Unet3D model, which represents voxel segmentation on medical images as a sequence-to-sequence prediction. 
The feature extraction sub-module in the model is designed as a parallel structure of Convolution and ViT so that every layer of the model can adequately learn both global and local dependency information in the image.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>On the validation dataset of Brats2021, our proposed model achieves Dice coefficients of 0.840, 0.874, and 0.911 on the ET channel, TC channel, and WT channel, respectively. On the validation dataset of Brats2018, our model achieves Dice coefficients of 0.716, 0.761, and 0.874 on the corresponding channels, respectively.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>\n                      We propose a new segmentation model that combines the advantages of Vision Transformer and Convolution and achieves a better balance between the number of model parameters and segmentation accuracy. 
The code can be found at\n                      <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/1152545264\/SwinUnet3D\">https:\/\/github.com\/1152545264\/SwinUnet3D<\/jats:ext-link>\n                      .\n                    <\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12911-023-02129-z","type":"journal-article","created":{"date-parts":[[2023,2,14]],"date-time":"2023-02-14T05:04:16Z","timestamp":1676351056000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":92,"title":["Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution"],"prefix":"10.1186","volume":"23","author":[{"given":"Yimin","family":"Cai","sequence":"first","affiliation":[]},{"given":"Yuqing","family":"Long","sequence":"additional","affiliation":[]},{"given":"Zhenggong","family":"Han","sequence":"additional","affiliation":[]},{"given":"Mingkun","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Yuchen","family":"Zheng","sequence":"additional","affiliation":[]},{"given":"Wei","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Liming","family":"Chen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,2,14]]},"reference":[{"key":"2129_CR1","unstructured":"Board PATE. Adult central nervous system tumors treatment (PDQ\u00ae): Health Professional Version. Website. 2022. https:\/\/www.cancer.gov\/types\/brain\/hp\/adult-brain-treatment-pdq."},{"issue":"1","key":"2129_CR2","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1007\/s10462-020-09854-1","volume":"54","author":"SA Taghanaki","year":"2021","unstructured":"Taghanaki SA, Abhishek K, Cohen JP, Cohen-Adad J, Hamarneh G. Deep semantic segmentation of natural and medical images: a review. Artif Intell Rev. 
2021;54(1):137\u201378.","journal-title":"Artif Intell Rev"},{"issue":"12","key":"2129_CR3","first-page":"234","volume":"3","author":"K Bhargavi","year":"2014","unstructured":"Bhargavi K, Jyothi S. A survey on threshold based segmentation technique in image processing. Int J Innov Res Dev. 2014;3(12):234\u20139.","journal-title":"Int J Innov Res Dev"},{"key":"2129_CR4","unstructured":"Kaganami HG, Beiji Z. In: 2009 Fifth international conference on intelligent information hiding and multimedia signal processing (IEEE). 2009; p. 1217\u201321."},{"issue":"11","key":"2129_CR5","doi-asserted-by":"publisher","first-page":"1549","DOI":"10.1109\/83.469936","volume":"4","author":"M Unser","year":"1995","unstructured":"Unser M. Texture classification and segmentation using wavelet frames. IEEE Trans Image Process. 1995;4(11):1549\u201360.","journal-title":"IEEE Trans Image Process"},{"issue":"5","key":"2129_CR6","doi-asserted-by":"publisher","first-page":"478","DOI":"10.1109\/34.134046","volume":"13","author":"B Manjunath","year":"1991","unstructured":"Manjunath B, Chellappa R. Unsupervised texture segmentation using Markov random field models. IEEE Trans Pattern Anal Mach Intell. 1991;13(5):478\u201382.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"3","key":"2129_CR7","first-page":"66","volume":"36","author":"M Paulinas","year":"2007","unstructured":"Paulinas M, U\u0161inskas A. A survey of genetic algorithms applications for image enhancement and segmentation. Inf Technol Control. 2007;36(3):66.","journal-title":"Inf Technol Control"},{"key":"2129_CR8","doi-asserted-by":"crossref","unstructured":"Ronneberger O, Fischer P, Brox T. In: International conference on medical image computing and computer-assisted intervention. Springer; 2015;pp. 234\u201341.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"2129_CR9","doi-asserted-by":"crossref","unstructured":"\u00c7i\u00e7ek \u00d6, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O. 
In: International conference on medical image computing and computer-assisted intervention. Springer; 2016; pp. 424\u201332.","DOI":"10.1007\/978-3-319-46723-8_49"},{"issue":"10","key":"2129_CR10","doi-asserted-by":"publisher","DOI":"10.23915\/distill.00003","volume":"1","author":"A Odena","year":"2016","unstructured":"Odena A, Dumoulin V, Olah C. Deconvolution and checkerboard artifacts. Distill. 2016;1(10): e3.","journal-title":"Distill"},{"key":"2129_CR11","unstructured":"Dumoulin V, Visin F. A guide to convolution arithmetic for deep learning. arXiv preprint arXiv:1603.07285. 2016."},{"key":"2129_CR12","unstructured":"Long J, Shelhamer E, Darrell T. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2015. p. 3431\u201340."},{"issue":"2","key":"2129_CR13","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","volume":"18","author":"F Isensee","year":"2021","unstructured":"Isensee F, Jaeger PF, Kohl SA, Petersen J, Maier-Hein KH. nnu-net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods. 2021;18(2):203\u201311.","journal-title":"Nat Methods"},{"key":"2129_CR14","unstructured":"Milletari F, Navab N, Ahmadi SA. In: 2016 Fourth international conference on 3D vision (3DV). IEEE; 2016. p. 565\u201371."},{"key":"2129_CR15","doi-asserted-by":"crossref","unstructured":"Zhou Z, Siddiquee MMR, Tajbakhsh N, Liang J. In: Deep learning in medical image analysis and multimodal learning for clinical decision support. Springer; 2018. p. 3\u201311.","DOI":"10.1007\/978-3-030-00889-5_1"},{"key":"2129_CR16","unstructured":"Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et\u00a0al. Attention u-net: learning where to look for the pancreas. arXiv preprint arXiv:1804.03999. 2018."},{"key":"2129_CR17","unstructured":"Huang H, Lin L, Tong R, Hu H, Zhang Q, Iwamoto Y, Han X, Chen YW, Wu J. 
In: ICASSP 2020\u20142020 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE; 2020. p. 1055\u20139."},{"key":"2129_CR18","doi-asserted-by":"crossref","unstructured":"Yu L, Cheng J, Dou Q, Yang X, Chen H, Qin J, Heng P. Automatic 3d cardiovascular MR segmentation with densely-connected volumetric convnets. CoRR. 2017. arXiv:http:\/\/arxiv.org\/abs\/1708.00573.","DOI":"10.1007\/978-3-319-66185-8_33"},{"key":"2129_CR19","doi-asserted-by":"crossref","unstructured":"Huang C, Han H, Yao Q, Zhu S, Zhou SK. In: MICCAI; 2019.","DOI":"10.1155\/2019\/1693746"},{"key":"2129_CR20","unstructured":"Nikolaos A. Deep learning in medical image analysis: a comparative analysis of multi-modal brain-mri segmentation with 3d deep neural networks. Master\u2019s thesis, University of Patras; 2019. https:\/\/github.com\/black0017\/MedicalZooPytorch."},{"key":"2129_CR21","unstructured":"Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, Lu L, Yuille AL, Zhou Y. Transunet: transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306. 2021."},{"key":"2129_CR22","first-page":"12116","volume":"34","author":"M Raghu","year":"2021","unstructured":"Raghu M, Unterthiner T, Kornblith S, Zhang C, Dosovitskiy A. Do vision transformers see like convolutional neural networks? Adv Neural Inf Process Syst. 2021;34:12116\u201328.","journal-title":"Adv Neural Inf Process Syst"},{"key":"2129_CR23","unstructured":"Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122. 2015."},{"key":"2129_CR24","doi-asserted-by":"crossref","unstructured":"Woo S, Park J, Lee JY, Kweon IS. In: Proceedings of the European conference on computer vision (ECCV); 2018. p. 3\u201319.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"2129_CR25","unstructured":"Zhao H, Shi J, Qi X, Wang X, Jia J. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 
2881\u201390."},{"key":"2129_CR26","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I. In: Advances in neural information processing systems; 2017. p. 5998\u20136008."},{"key":"2129_CR27","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et\u00a0al. An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929. 2020."},{"key":"2129_CR28","doi-asserted-by":"crossref","unstructured":"Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B. Swin transformer: hierarchical vision transformer using shifted windows. arXiv preprint arXiv:2103.14030. 2021.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"2129_CR29","unstructured":"Cao H, Wang Y, Chen J, Jiang D, Zhang X, Tian Q, Wang M. Swin-unet: Unet-like pure transformer for medical image segmentation. arXiv preprint arXiv:2105.05537. 2021."},{"key":"2129_CR30","doi-asserted-by":"crossref","unstructured":"Wang W, Chen C, Ding M, Yu H, Zha S, Li J. In: International conference on medical image computing and computer-assisted intervention. Springer; 2021. p. 109\u201319.","DOI":"10.1007\/978-3-030-87193-2_11"},{"key":"2129_CR31","unstructured":"Hatamizadeh A, Tang Y, Nath V, Yang D, Myronenko A, Landman B, Roth HR, Xu D. In: Proceedings of the IEEE\/CVF Winter conference on applications of computer vision; 2022. p. 574\u201384."},{"issue":"6","key":"2129_CR32","doi-asserted-by":"publisher","first-page":"66","DOI":"10.3390\/brainsci12060797","volume":"12","author":"Y Jiang","year":"2022","unstructured":"Jiang Y, Zhang Y, Lin X, Dong J, Cheng T, Liang J. Swinbts: a method for 3d multimodal brain tumor segmentation using Swin Transformer. Brain Sci. 2022;12(6):66. 
https:\/\/doi.org\/10.3390\/brainsci12060797.","journal-title":"Brain Sci"},{"key":"2129_CR33","unstructured":"Baid U, Ghodasara S, Mohan S, Bilello M, Calabrese E, Colak E, Farahani K, Kalpathy-Cramer J, Kitamura FC, Pati S, et\u00a0al. The rsna\u2013asnr\u2013miccai brats 2021 benchmark on brain tumor segmentation and radiogenomic classification. arXiv preprint arXiv:2107.02314. 2021."},{"issue":"1","key":"2129_CR34","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2017.117","volume":"4","author":"S Bakas","year":"2017","unstructured":"Bakas S, Akbari H, Sotiras A, Bilello M, Rozycki M, Kirby JS, Freymann JB, Farahani K, Davatzikos C. Advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features. Sci Data. 2017;4(1):1\u201313.","journal-title":"Sci Data"},{"issue":"10","key":"2129_CR35","doi-asserted-by":"publisher","first-page":"1993","DOI":"10.1109\/TMI.2014.2377694","volume":"34","author":"BH Menze","year":"2014","unstructured":"Menze BH, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, Burren Y, Porz N, Slotboom J, Wiest R, et al. The multimodal brain tumor image segmentation benchmark (brats). IEEE Trans Med Imaging. 2014;34(10):1993\u20132024.","journal-title":"IEEE Trans Med Imaging"},{"key":"2129_CR36","doi-asserted-by":"crossref","unstructured":"Liu Z, Ning J, Cao Y, Wei Y, Zhang Z, Lin S, Hu H. Video Swin Transformer. arXiv preprint arXiv:2106.13230. 2021.","DOI":"10.1109\/CVPR52688.2022.00320"},{"key":"2129_CR37","unstructured":"Ba JL, Kiros JR, Hinton GE. Layer normalization. arXiv preprint arXiv:1607.06450. 2016."},{"key":"2129_CR38","unstructured":"He K, Zhang X, Ren S, Sun J. In: Proceedings of the IEEE international conference on computer vision. 2015; p. 1026\u201334."},{"key":"2129_CR39","unstructured":"Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H. Mobilenets: efficient convolutional neural networks for mobile vision applications. 
arXiv preprint arXiv:1704.04861. 2017."},{"key":"2129_CR40","unstructured":"Guo MH, Lu CZ, Liu ZN, Cheng MM, Hu SM. Visual attention network. arXiv preprint arXiv:2202.09741. 2022."},{"key":"2129_CR41","unstructured":"Rogozhnikov A. In: International conference on learning representations; 2022. https:\/\/openreview.net\/forum?id=oapKSVM2bcj."},{"key":"2129_CR42","unstructured":"He K, Zhang X, Ren S, Sun J. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 770\u20138."},{"key":"2129_CR43","first-page":"66","volume":"32","author":"A Paszke","year":"2019","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, et al. Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst. 2019;32:66.","journal-title":"Adv Neural Inf Process Syst"},{"key":"2129_CR44","doi-asserted-by":"publisher","unstructured":"Falcon W, team TPL. Pytorch lightning; 2019. https:\/\/doi.org\/10.5281\/zenodo.3828935. https:\/\/www.pytorchlightning.ai","DOI":"10.5281\/zenodo.3828935"},{"key":"2129_CR45","doi-asserted-by":"publisher","unstructured":"Consortium M. Monai: medical open network for AI; 2020. https:\/\/doi.org\/10.5281\/zenodo.4323058. https:\/\/github.com\/Project-MONAI\/MONAI","DOI":"10.5281\/zenodo.4323058"},{"key":"2129_CR46","doi-asserted-by":"crossref","unstructured":"Yushkevich PA, Piven J, Cody\u00a0Hazlett H, Gimpel\u00a0Smith R, Ho S, Gee JC, Gerig G. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage. 2006;31(3), 1116\u201328.","DOI":"10.1016\/j.neuroimage.2006.01.015"},{"key":"2129_CR47","unstructured":"Loshchilov I, Hutter F. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101. 
2017."},{"key":"2129_CR48","doi-asserted-by":"publisher","unstructured":"Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, van der Walt SJ, Brett M, Wilson J, Millman KJ, Mayorov N, Nelson ARJ, Jones E, Kern R, Larson E, Carey CJ, Polat \u0130, Feng Y, Moore EW, VanderPlas J, Laxalde D, Perktold J, Cimrman R, Henriksen I, Quintero EA, Harris CR, Archibald AM, Ribeiro AH, Pedregosa F, van Mulbregt P SciPy 1.0 Contributors, SciPy 1.0: fundamental algorithms for scientific computing in python. Nat Methods. 2020;17:261\u201372. https:\/\/doi.org\/10.1038\/s41592-019-0686-2.","DOI":"10.1038\/s41592-019-0686-2"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-023-02129-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-023-02129-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-023-02129-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,14]],"date-time":"2023-02-14T05:08:47Z","timestamp":1676351327000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-023-02129-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,14]]},"references-count":48,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["2129"],"URL":"https:\/\/doi.org\/10.1186\/s12911-023-02129-z","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-2102490\/v1","asserted-by":"object"}]},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"elect
ronic"}],"subject":[],"published":{"date-parts":[[2023,2,14]]},"assertion":[{"value":"26 September 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 February 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 February 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"33"}}