{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T14:00:11Z","timestamp":1774965611664,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2023,3,24]],"date-time":"2023-03-24T00:00:00Z","timestamp":1679616000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U19B2030"],"award-info":[{"award-number":["U19B2030"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61976167"],"award-info":[{"award-number":["61976167"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["82230065"],"award-info":[{"award-number":["82230065"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["2021GY-082"],"award-info":[{"award-number":["2021GY-082"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["20JG031"],"award-info":[{"award-number":["20JG031"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Key Research and Development Program in the Shaanxi Province of China","award":["U19B2030"],"award-info":[{"award-number":["U19B2030"]}]},{"name":"Key Research and Development Program in the Shaanxi Province of China","award":["61976167"],"award-info":[{"award-number":["61976167"]}]},{"name":"Key Research and Development Program in the Shaanxi Province of China","award":["82230065"],"award-info":[{"award-number":["82230065"]}]},{"name":"Key Research and Development Program in the Shaanxi Province of China","award":["2021GY-082"],"award-info":[{"award-number":["2021GY-082"]}]},{"name":"Key Research and Development Program in the Shaanxi Province of China","award":["20JG031"],"award-info":[{"award-number":["20JG031"]}]},{"name":"Scientific Research Program of the Education Department of Shaanxi Provincial Government","award":["U19B2030"],"award-info":[{"award-number":["U19B2030"]}]},{"name":"Scientific Research Program of the Education Department of Shaanxi Provincial Government","award":["61976167"],"award-info":[{"award-number":["61976167"]}]},{"name":"Scientific Research Program of the Education Department of Shaanxi Provincial Government","award":["82230065"],"award-info":[{"award-number":["82230065"]}]},{"name":"Scientific Research Program of the Education Department of Shaanxi Provincial Government","award":["2021GY-082"],"award-info":[{"award-number":["2021GY-082"]}]},{"name":"Scientific Research Program of the Education Department of Shaanxi Provincial Government","award":["20JG031"],"award-info":[{"award-number":["20JG031"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The resolution of feature maps is a critical factor for accurate medical image segmentation. Most of the existing Transformer-based networks for medical image segmentation adopt a U-Net-like architecture, which contains an encoder that converts the high-resolution input image into low-resolution feature maps using a sequence of Transformer blocks and a decoder that gradually generates high-resolution representations from low-resolution feature maps. However, the procedure of recovering high-resolution representations from low-resolution representations may harm the spatial precision of the generated segmentation masks. Unlike previous studies, in this study, we utilized the high-resolution network (HRNet) design style by replacing the convolutional layers with Transformer blocks, continuously exchanging feature map information with different resolutions generated by the Transformer blocks. The proposed Transformer-based network is named the high-resolution Swin Transformer network (HRSTNet). Extensive experiments demonstrated that the HRSTNet can achieve performance comparable with that of the state-of-the-art Transformer-based U-Net-like architecture on the 2021 Brain Tumor Segmentation dataset, the Medical Segmentation Decathlon\u2019s liver dataset, and the BTCV multi-organ segmentation dataset.<\/jats:p>","DOI":"10.3390\/s23073420","type":"journal-article","created":{"date-parts":[[2023,3,24]],"date-time":"2023-03-24T06:34:07Z","timestamp":1679639647000},"page":"3420","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":70,"title":["High-Resolution Swin Transformer for Automatic Medical Image Segmentation"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0486-3084","authenticated-orcid":false,"given":"Chen","family":"Wei","sequence":"first","affiliation":[{"name":"College of Economics and Management, Xi\u2019an University of Posts & Telecommunications, Xi\u2019an 710061, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3319-7885","authenticated-orcid":false,"given":"Shenghan","family":"Ren","sequence":"additional","affiliation":[{"name":"School of Life Science and Technology, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Kaitai","family":"Guo","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Haihong","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1428-5804","authenticated-orcid":false,"given":"Jimin","family":"Liang","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,24]]},"reference":[{"key":"ref_1","unstructured":"Radford, A., and Narasimhan, K. (2018). Improving Language Understanding by Generative Pre-Training."},{"key":"ref_2","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017, January 4\u20139). Attention is All you Need. Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA."},{"key":"ref_3","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021, January 3\u20137). An Image is Worth 16 \u00d7 16 Words: Transformers for Image Recognition at Scale. Proceedings of the 9th International Conference on Learning Representations, ICLR 2021, Virtual Event."},{"key":"ref_4","first-page":"10347","article-title":"Training data-efficient image transformers & distillation through attention","volume":"Volume 139","author":"Meila","year":"2021","journal-title":"Proceedings of the 38th International Conference on Machine Learning, ICML 2021"},{"key":"ref_5","unstructured":"Beal, J., Kim, E., Tzeng, E., Park, D.H., Zhai, A., and Kislyuk, D. (2020). Toward Transformer-Based Object Detection. arXiv."},{"key":"ref_6","unstructured":"Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M., and Luo, P. (2021, January 6\u201314). SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, Virtual."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Caron, M., Touvron, H., Misra, I., J\u00e9gou, H., Mairal, J., Bojanowski, P., and Joulin, A. (2021, January 10\u201317). Emerging Properties in Self-Supervised Vision Transformers. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00951"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Hatamizadeh, A., Yang, D., Roth, H.R., and Xu, D. (2022, January 3\u20138). UNETR: Transformers for 3D Medical Image Segmentation. Proceedings of the 2022 IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA.","DOI":"10.1109\/WACV51458.2022.00181"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Peiris, H., Hayat, M., Chen, Z., Egan, G.F., and Harandi, M. (2021). A Volumetric Transformer for Accurate 3D Tumor Segmentation. arXiv.","DOI":"10.1007\/978-3-031-16443-9_16"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Hatamizadeh, A., Nath, V., Tang, Y., Yang, D., Roth, H., and Xu, D. (2022). Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images. arXiv.","DOI":"10.1007\/978-3-031-08999-2_22"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"\u00c7i\u00e7ek, \u00d6., Abdulkadir, A., Lienkamp, S.S., Brox, T., and Ronneberger, O. (2016, January 17\u201321). 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2016\u201419th International Conference, Athens, Greece.","DOI":"10.1007\/978-3-319-46723-8_49"},{"key":"ref_12","unstructured":"Zhou, H.Y., Guo, J., Zhang, Y., Yu, L., Wang, L., and Yu, Y. (2021). nnFormer: Interleaved Transformer for Volumetric Segmentation. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3349","DOI":"10.1109\/TPAMI.2020.2983686","article-title":"Deep High-Resolution Representation Learning for Visual Recognition","volume":"43","author":"Wang","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_15","unstructured":"Baid, U., Ghodasara, S., Bilello, M., Mohan, S., Calabrese, E., Colak, E., Farahani, K., Kalpathy-Cramer, J., Kitamura, F.C., and Pati, S. (2021). The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1109\/TMI.2014.2377694","article-title":"The multimodal brain tumor image segmentation benchmark (BRATS)","volume":"34","author":"Menze","year":"2014","journal-title":"IEEE Trans. Med Imaging"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"170117","DOI":"10.1038\/sdata.2017.117","article-title":"Segmentation labels and radiomic features for the pre-operative scans of the TCGA-GBM collection. The Cancer Imaging Archive","volume":"4","author":"Bakas","year":"2017","journal-title":"Nat. Sci. Data"},{"key":"ref_18","unstructured":"Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J., Freymann, J., Farahani, K., and Davatzikos, C. (2017). Segmentation labels and radiomic features for the pre-operative scans of the TCGA-LGG collection. Cancer Imaging Arch., 286."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"170117","DOI":"10.1038\/sdata.2017.117","article-title":"Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features","volume":"4","author":"Bakas","year":"2017","journal-title":"Sci. Data"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"4128","DOI":"10.1038\/s41467-022-30695-9","article-title":"The Medical Segmentation Decathlon","volume":"13","author":"Antonelli","year":"2022","journal-title":"Nat. Commun."},{"key":"ref_21","unstructured":"Simpson, A.L., Antonelli, M., Bakas, S., Bilello, M., Farahani, K., Van Ginneken, B., Kopp-Schneider, A., Landman, B.A., Litjens, G., and Menze, B. (2019). A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv."},{"key":"ref_22","unstructured":"(2022, September 15). Multi-Organ Abdominal CT Reference Standard Segmentations. Available online: https:\/\/zenodo.org\/record\/1169361#.ZBv-IvZBxPY."},{"key":"ref_23","unstructured":"Landman, B., Xu, Z., Igelsias, J., Styner, M., Langerak, T., and Klein, A. (2015, January 9). Miccai Multi-Atlas Labeling beyond the Cranial Vault\u2014Workshop and Challenge. Proceedings of the MICCAI Multi-Atlas Labeling Beyond Cranial Vault\u2014Workshop Challenge, Munich, Germany."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1822","DOI":"10.1109\/TMI.2018.2806309","article-title":"Automatic Multi-Organ Segmentation on Abdominal CT With Dense V-Networks","volume":"37","author":"Gibson","year":"2018","journal-title":"IEEE Trans. Med Imaging"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Roth, H.R., Lu, L., Farag, A., Shin, H.C., Liu, J., Turkbey, E.B., and Summers, R.M. (2015, January 5\u20139). DeepOrgan: Multi-level Deep Convolutional Networks for Automated Pancreas Segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015, Munich, Germany.","DOI":"10.1007\/978-3-319-24553-9_68"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1045","DOI":"10.1007\/s10278-013-9622-7","article-title":"The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository","volume":"26","author":"Clark","year":"2013","journal-title":"J. Digit. Imaging"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1563","DOI":"10.1109\/TBME.2016.2574816","article-title":"Evaluation of Six Registration Methods for the Human Abdomen on Clinically Acquired CT","volume":"63","author":"Xu","year":"2016","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_28","unstructured":"Data From Pancreas-CT (2016). The Cancer Imaging Archive. IEEE Trans. Image Process."},{"key":"ref_29","unstructured":"Yuan, Y., Fu, R., Huang, L., Lin, W., Zhang, C., Chen, X., and Wang, J. (2021, January 6\u201314). HRFormer: High-Resolution Vision Transformer for Dense Predict. Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, Virtual."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gu, J., Kwon, H., Wang, D., Ye, W., Li, M., Chen, Y., Lai, L., Chandra, V., and Pan, D.Z. (2021). Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation. arXiv.","DOI":"10.1109\/CVPR52688.2022.01178"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1016\/j.media.2004.06.007","article-title":"A brain tumor segmentation framework based on outlier detection","volume":"8","author":"Prastawa","year":"2004","journal-title":"Med Image Anal."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"2314","DOI":"10.1016\/j.patcog.2011.01.007","article-title":"Segmentation of retinal blood vessels using the radial projection and semi-supervised approach","volume":"44","author":"You","year":"2011","journal-title":"Pattern Recognit."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015\u201418th International Conference, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_34","unstructured":"Futrega, M., Milesi, A., Marcinkiewicz, M., and Ribalta, P. (2019, January 17). Optimized U-Net for Brain Tumor Segmentation. Proceedings of the Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Shenzhen, China."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Crimi, A., and Bakas, S. (2019, January 17). Extending nn-UNet for Brain Tumor Segmentation. Proceedings of the Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Shenzhen, China.","DOI":"10.1007\/978-3-030-46643-5"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Xie, Y., Zhang, J., Shen, C., and Xia, Y. (October, January 27). CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation. Proceedings of the Medical Image Computing and Computer Assisted Intervention\u2014MICCAI 2021, Virtual.","DOI":"10.1007\/978-3-030-87199-4_16"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Wang, W., Chen, C., Ding, M., Yu, H., Zha, S., and Li, J. (October, January 27). TransBTS: Multimodal Brain Tumor Segmentation Using Transformer. Proceedings of the Medical Image Computing and Computer Assisted Intervention\u2014MICCAI 2021, Virtual.","DOI":"10.1007\/978-3-030-87193-2_11"},{"key":"ref_38","first-page":"267","article-title":"U-Net Transformer: Self and Cross Attention for Medical Image Segmentation","volume":"Volume 12966","author":"Petit","year":"2021","journal-title":"Proceedings of the Machine Learning in Medical Imaging\u201412th International Workshop, MLMI 2021, Held in Conjunction with MICCAI 2021"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","article-title":"nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation","volume":"18","author":"Isensee","year":"2021","journal-title":"Nat. Methods"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Liu, Z., Ning, J., Cao, Y., Wei, Y., Zhang, Z., Lin, S., and Hu, H. (2021). Video Swin Transformer. arXiv.","DOI":"10.1109\/CVPR52688.2022.00320"},{"key":"ref_41","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019, January 8\u201314). PyTorch: An Imperative Style, High-Performance Deep Learning Library. Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, Vancouver, BC, Canada."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1323","DOI":"10.1016\/j.mri.2012.05.001","article-title":"3D Slicer as an image computing platform for the Quantitative Imaging Network","volume":"30","author":"Fedorov","year":"2012","journal-title":"Magn. Reson. Imaging"},{"key":"ref_43","unstructured":"(2022, July 24). 3D Slicer. Available online: https:\/\/www.slicer.org."},{"key":"ref_44","unstructured":"Loshchilov, I., and Hutter, F. (2017, January 24\u201326). SGDR: Stochastic Gradient Descent with Warm Restarts. Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France."},{"key":"ref_45","unstructured":"Contributors, M. (2022, April 06). MMCV: OpenMMLab Computer Vision Foundation. Available online: https:\/\/github.com\/open-mmlab\/mmcv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/7\/3420\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:02:13Z","timestamp":1760122933000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/7\/3420"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,24]]},"references-count":45,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2023,4]]}},"alternative-id":["s23073420"],"URL":"https:\/\/doi.org\/10.3390\/s23073420","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,24]]}}}