{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T15:31:02Z","timestamp":1762183862508,"version":"build-2065373602"},"reference-count":57,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T00:00:00Z","timestamp":1761868800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation for Theoretical Physics special fund \u201ccooperation program\u201d","award":["11547039"],"award-info":[{"award-number":["11547039"]}]},{"name":"Shaanxi Provincial Natural Science Research Funding Project","award":["2024SF-YBXM-587"],"award-info":[{"award-number":["2024SF-YBXM-587"]}]},{"name":"Shaanxi Institute of Scientific Research Plan projects","award":["SLGKYQD2-05"],"award-info":[{"award-number":["SLGKYQD2-05"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Medical image segmentation has been a central research focus in deep learning, but methods based on convolutions have limitations in modeling the long-range validity of images. To overcome this issue, hybrid CNN-Transformer architectures have been explored, with SwinUNet being a classic approach. However, SwinUNet still faces challenges such as insufficient modeling of relative position information, limited feature fusion capabilities in skip connections, and the loss of translational invariance caused by Patch Merging. To overcome these limitations, the architecture RE-XswinUnet is presented as a novel solution for medical image segmentation. In our design, relative position biases are replaced with rotary position embedding to enhance the model\u2019s ability to extract detailed information. During the decoding stage, XskipNet is designed to improve cross-scale feature fusion and learning capabilities. Additionally, an SCAR Block downsampling module is incorporated to preserve translational invariance more effectively. The experimental results demonstrate that RE-XswinUnet achieves improvements of 2.65% and 0.95% in Dice coefficients on the Synapse multi-organ and ACDC datasets, respectively, validating its superiority in medical image segmentation tasks.<\/jats:p>","DOI":"10.3390\/bdcc9110274","type":"journal-article","created":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T13:55:22Z","timestamp":1762178122000},"page":"274","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["RE-XswinUnet: Rotary Positional Encoding and Inter-Slice Contextual Connections for Multi-Organ Segmentation"],"prefix":"10.3390","volume":"9","author":[{"given":"Hang","family":"Yang","sequence":"first","affiliation":[{"name":"School of Physics and Telecommunication Engineering, Shaanxi University of Technology, Hanzhong 723001, China"}]},{"given":"Chuanghua","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Physics and Telecommunication Engineering, Shaanxi University of Technology, Hanzhong 723001, China"}]},{"given":"Dan","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Physics and Telecommunication Engineering, Shaanxi University of Technology, Hanzhong 723001, China"}]},{"given":"Xiaojing","family":"Hang","sequence":"additional","affiliation":[{"name":"School of Communications and Information Engineering, Xi\u2019an University of Posts and Telecommunications, Xi\u2019an 710061, China"}]},{"given":"Wu","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Physics and Telecommunication Engineering, Shaanxi University of Technology, Hanzhong 723001, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Lv, C., Li, B., Wang, X., Cai, P., Yang, B., Sun, G., and Yan, J. (2025). ECM-TransUNet: Edge-enhanced multi-scale attention and convolutional Mamba for medical image segmentation. Biomed. Signal Process. Control, 107.","DOI":"10.1016\/j.bspc.2025.107845"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"128740","DOI":"10.1016\/j.neucom.2024.128740","article-title":"A comprehensive review of deep learning for medical image segmentation","volume":"613","author":"Xia","year":"2025","journal-title":"Neurocomputing"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Etehadtavakol, M., Etehadtavakol, M., and Ng, E.Y.K. (2024). Enhanced thyroid nodule segmentation through U-Net and VGG16 fusion with feature engineering: A comprehensive study. Comput. Methods Programs Biomed., 251.","DOI":"10.1016\/j.cmpb.2024.108209"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"e17005","DOI":"10.7717\/peerj.17005","article-title":"Enhancing medical image segmentation with a multi-transformer U-Net","volume":"12","author":"Dan","year":"2024","journal-title":"PeerJ"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Gao, Y., Jiang, Y., Peng, Y., Yuan, F., Zhang, X., and Wang, J. (2025). Medical Image Segmentation: A Comprehensive Review of Deep Learning-Based Methods. Tomography, 11.","DOI":"10.3390\/tomography11050052"},{"key":"ref_6","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet classification with deep convolutional neural networks. Proceedings of the 26th International Conference on Neural Information Processing Systems\u2014Volume 1, Lake Tahoe, NV, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1529","DOI":"10.1007\/s10278-024-00981-7","article-title":"From CNN to Transformer: A Review of Medical Image Segmentation Models","volume":"37","author":"Yao","year":"2024","journal-title":"J. Imaging Inform. Med."},{"key":"ref_8","unstructured":"Jiang, J., Zhang, J., Liu, W., Gao, M., Hu, X., Xue, Z., Liu, Y., and Yan, S. (2025). Rwkv-unet: Improving unet with long-range cooperation for effective medical image segmentation. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015, Cham, Switzerland.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2352\/J.ImagingSci.Technol.2020.64.2.020508","article-title":"Medical image segmentation based on U-net: A review","volume":"64","author":"Du","year":"2020","journal-title":"J. Imaging Sci. Technol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"10076","DOI":"10.1109\/TPAMI.2024.3435571","article-title":"Medical image segmentation review: The success of u-net","volume":"46","author":"Azad","year":"2024","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Huang, L., Miron, A., Hone, K., and Li, Y. (2024, January 26\u201328). Segmenting Medical Images: From UNet to Res-UNet and nnUNet. Proceedings of the 2024 IEEE 37th International Symposium on Computer-Based Medical Systems (CBMS), Guadalajara, Mexico.","DOI":"10.1109\/CBMS61543.2024.00086"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Milletari, F., Navab, N., and Ahmadi, S.-A. (2016, January 25\u201328). V-net: Fully convolutional neural networks for volumetric medical image segmentation. Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.79"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"\u00c7i\u00e7ek, \u00d6., Abdulkadir, A., Lienkamp, S.S., Brox, T., and Ronneberger, O. (2016, January 17\u201321). 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2016, Cham, Switzerland.","DOI":"10.1007\/978-3-319-46723-8_49"},{"key":"ref_15","unstructured":"Oktay, O., Schlemper, J., Le Folgoc, L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention U-Net: Learning Where to Look for the Pancreas. arXiv."},{"key":"ref_16","unstructured":"Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., and Zhou, Y. (2021). Transunet: Transformers make strong encoders for medical image segmentation. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., and Liang, J. (2018, January 20). Unet++: A nested u-net architecture for medical image segmentation. Proceedings of the International Workshop on Deep Learning in Medical Image Analysis, Granada, Spain.","DOI":"10.1007\/978-3-030-00889-5_1"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"105264","DOI":"10.1016\/j.dsp.2025.105264","article-title":"AO-TransUNet: A multi-attention optimization network for COVID-19 and medical image segmentation","volume":"164","author":"Qi","year":"2025","journal-title":"Digit. Signal Process."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","article-title":"nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation","volume":"18","author":"Isensee","year":"2021","journal-title":"Nat. Methods"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.isprsjprs.2020.01.013","article-title":"ResUNet-a: A deep learning framework for semantic segmentation of remotely sensed data","volume":"162","author":"Diakogiannis","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.21037\/qims-19-1090","article-title":"Dense-UNet: A novel multiphoton in vivo cellular image segmentation model based on a convolutional neural network","volume":"10","author":"Cai","year":"2020","journal-title":"Quant Imaging Med. Surg."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"110099","DOI":"10.1016\/j.compeleceng.2025.110099","article-title":"Advancements in medical image segmentation: A review of transformer models","volume":"123","author":"Kumar","year":"2025","journal-title":"Comput. Electr. Eng."},{"key":"ref_23","unstructured":"Ashish, V., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. arXiv."},{"key":"ref_24","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021, January 18\u201324). Training data-efficient image transformers & distillation through attention. Proceedings of the International Conference on Machine Learning, Virtual Event."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Krishna, M.S., Machado, P., Otuka, R.I., Yahaya, S.W., Neves dos Santos, F., and Ihianle, I.K. (2025). Plant Leaf Disease Detection Using Deep Learning: A Multi-Dataset Approach. J, 8.","DOI":"10.3390\/j8010004"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Pu, Q., Xi, Z., Yin, S., Zhao, Z., and Zhao, L. (2024). Advantages of transformer and its application for medical image segmentation: A survey. Biomed. Eng. Online, 23.","DOI":"10.1186\/s12938-024-01212-4"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liu, Q., Kaul, C., Wang, J., Anagnostopoulos, C., Murray-Smith, R., and Deligianni, F. (2023, January 4\u201310). Optimizing vision transformers for medical image segmentation. Proceedings of the ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes, Greece.","DOI":"10.1109\/ICASSP49357.2023.10096379"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1109\/TCSVT.2022.3212434","article-title":"Hybrid CNN-transformer features for visual place recognition","volume":"33","author":"Wang","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_29","first-page":"1","article-title":"Hybrid CNN-Transformer Network With a Weighted MSE Loss for Global Sea Surface Wind Speed Retrieval From GNSS-R Data","volume":"63","author":"Qiao","year":"2025","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"122055","DOI":"10.1016\/j.renene.2024.122055","article-title":"Wind and solar power forecasting based on hybrid CNN-ABiLSTM, CNN-transformer-MLP models","volume":"239","author":"Bashir","year":"2025","journal-title":"Renew. Energy"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Tang, H., Chen, Y., Wang, T., Zhou, Y., Zhao, L., Gao, Q., Du, M., Tan, T., Zhang, X., and Tong, T. (2024). HTC-Net: A hybrid CNN-transformer framework for medical image segmentation. Biomed. Signal Process. Control., 88.","DOI":"10.1016\/j.bspc.2023.105605"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2021, January 19\u201325). Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual Event.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_33","unstructured":"Xie, Y., Zhang, J., Shen, C., and Xia, Y. (October, January 27). Cotr: Efficiently bridging cnn and transformer for 3d medical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France."},{"key":"ref_34","unstructured":"Wang, W., Chen, C., Ding, M., Yu, H., Zha, S., and Li, J. (October, January 27). Transbts: Multimodal brain tumor segmentation using transformer. Proceedings of the International conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Hatamizadeh, A., Tang, Y., Nath, V., Yang, D., Myronenko, A., Landman, B., Roth, H.R., and Xu, D. (2022, January 3\u20138). Unetr: Transformers for 3d medical image segmentation. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV51458.2022.00181"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Cao, H., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q., and Wang, M. (2022, January 23\u201327). Swin-unet: Unet-like pure transformer for medical image segmentation. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-25066-8_9"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"109248","DOI":"10.1016\/j.compeleceng.2024.109248","article-title":"Grey Wolf optimized SwinUNet based transformer framework for liver segmentation from CT images","volume":"117","author":"Kumar","year":"2024","journal-title":"Comput. Electr. Eng."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"127063","DOI":"10.1016\/j.neucom.2023.127063","article-title":"Roformer: Enhanced transformer with rotary position embedding","volume":"568","author":"Su","year":"2024","journal-title":"Neurocomputing"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Shaw, P., Uszkoreit, J., and Vaswani, A. (2018). Self-attention with relative position representations. arXiv.","DOI":"10.18653\/v1\/N18-2074"},{"key":"ref_41","first-page":"1","article-title":"2DSegFormer: 2-D Transformer Model for Semantic Segmentation on Aerial Images","volume":"60","author":"Li","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wu, K., Peng, H., Chen, M., Fu, J., and Chao, H. (2021, January 10\u201317). Rethinking and improving relative position encoding for vision transformer. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00988"},{"key":"ref_43","unstructured":"Liutkus, A., C\u0131fka, O., Wu, S.-L., Simsekli, U., Yang, Y.-H., and Richard, G. (2021, January 18\u201324). Relative positional encoding for transformers with linear complexity. Proceedings of the International Conference on Machine Learning, Virtual."},{"key":"ref_44","unstructured":"Heo, B., Park, S., Han, D., and Yun, S. (October, January 29). Rotary position embedding for vision transformer. Proceedings of the European Conference on Computer Vision, Milan, Italy."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"100004","DOI":"10.1016\/j.metrad.2023.100004","article-title":"Vision transformers in multi-modal brain tumor MRI segmentation: A review","volume":"1","author":"Wang","year":"2023","journal-title":"Meta-Radiology"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.-Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"114245","DOI":"10.1016\/j.knosys.2025.114245","article-title":"HFA-UNet: Hybrid and full attention UNet for thyroid nodule segmentation","volume":"328","author":"Li","year":"2025","journal-title":"Knowl.-Based Syst."},{"key":"ref_48","first-page":"474","article-title":"ACMS-TransNet: Polyp Segmentation Network Based on Adaptive Convolution and Multi-Scale Global Context","volume":"52","author":"Sun","year":"2025","journal-title":"IAENG Int. J. Comput. Sci."},{"key":"ref_49","first-page":"2131","article-title":"Merging Context Clustering with Visual State Space Models for Medical Image Segmentation","volume":"44","author":"Zhu","year":"2025","journal-title":"Qeios"},{"key":"ref_50","first-page":"1","article-title":"SwinSUNet: Pure transformer network for remote sensing image change detection","volume":"60","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_52","unstructured":"(2024, September 17). Segmentation Outside the Cranial Vault Challenge. Available online: https:\/\/repo-prod.prod.sagebase.org\/repo\/v1\/doi\/locate?id=syn3193805&type=ENTITY."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"2514","DOI":"10.1109\/TMI.2018.2837502","article-title":"Deep Learning Techniques for Automatic MRI Cardiac Multi-Structures Segmentation and Diagnosis: Is the Problem Solved?","volume":"37","author":"Bernard","year":"2018","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Fu, S., Lu, Y., Wang, Y., Zhou, Y., Shen, W., Fishman, E., and Yuille, A. (2020, January 4\u20138). Domain adaptive relational reasoning for 3d multi-organ segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Lima, Peru.","DOI":"10.1007\/978-3-030-59710-8_64"},{"key":"ref_55","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Jha, A., Kumar, A., Pande, S., Banerjee, B., and Chaudhuri, S. (2020, January 25\u201328). Mt-unet: A novel u-net based multi-task architecture for visual scene understanding. Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Virtual.","DOI":"10.1109\/ICIP40778.2020.9190695"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Yu, J., Qin, J., Xiang, J., He, X., Zhang, W., and Zhao, W. (2023, January 5\u20138). Trans-UNeter: A new Decoder of TransUNet for Medical Image Segmentation. Proceedings of the 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Istanbul, Turkey.","DOI":"10.1109\/BIBM58861.2023.10385407"}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/11\/274\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T14:43:39Z","timestamp":1762181019000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/11\/274"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,31]]},"references-count":57,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["bdcc9110274"],"URL":"https:\/\/doi.org\/10.3390\/bdcc9110274","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,31]]}}}