{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T10:13:30Z","timestamp":1770459210605,"version":"3.49.0"},"reference-count":37,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2022,5,18]],"date-time":"2022-05-18T00:00:00Z","timestamp":1652832000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Guangdong Key Laboratory of Advanced IntelliSense Technology","award":["2019B121203006"],"award-info":[{"award-number":["2019B121203006"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The latest medical image segmentation methods use UNet and transformer structures with great success. Multiscale feature fusion is one of the important factors affecting the accuracy of medical image segmentation. Existing transformer-based UNet methods do not comprehensively explore multiscale feature fusion, and there is still much room for improvement. In this paper, we propose a novel multiresolution aggregation transformer UNet (MRA-TUNet) based on multiscale input and coordinate attention for medical image segmentation. It realizes multiresolution aggregation from the following two aspects: (1) On the input side, a multiresolution aggregation module is used to fuse the input image information of different resolutions, which enhances the input features of the network. (2) On the output side, an output feature selection module is used to fuse the output information of different scales to better extract coarse-grained information and fine-grained information. We try to introduce a coordinate attention structure for the first time to further improve the segmentation performance. We compare our method with state-of-the-art medical image segmentation methods on the automated cardiac diagnosis challenge and the 2018 atrial segmentation challenge. 
Our method achieved an average Dice score of 0.911 for the right ventricle (RV), 0.890 for the myocardium (Myo), 0.961 for the left ventricle (LV), and 0.923 for the left atrium (LA). The experimental results on two datasets show that our method outperforms eight state-of-the-art medical image segmentation methods in Dice score, precision, and recall.<\/jats:p>","DOI":"10.3390\/s22103820","type":"journal-article","created":{"date-parts":[[2022,5,18]],"date-time":"2022-05-18T23:14:26Z","timestamp":1652915666000},"page":"3820","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Multiresolution Aggregation Transformer UNet Based on Multiscale Input and Coordinate Attention for Medical Image Segmentation"],"prefix":"10.3390","volume":"22","author":[{"given":"Shaolong","family":"Chen","sequence":"first","affiliation":[{"name":"School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen 518000, China"}]},{"given":"Changzhen","family":"Qiu","sequence":"additional","affiliation":[{"name":"School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen 518000, China"}]},{"given":"Weiping","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen 518000, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0638-5434","authenticated-orcid":false,"given":"Zhiyong","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen 518000, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,5,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.joca.2020.12.019","article-title":"Osteoarthritis year in review 2020: Imaging","volume":"29","author":"Eckstein","year":"2021","journal-title":"Osteoarthr. 
Cartil."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1038\/nrrheum.2010.197","article-title":"The bone-cartilage unit in osteoarthritis","volume":"7","author":"Lories","year":"2010","journal-title":"Nat. Rev. Rheumatol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1148\/radiol.2021204587","article-title":"The QIBA profile for MRI-based compositional imaging of knee cartilage","volume":"301","author":"Chalian","year":"2021","journal-title":"Radiology"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"7653","DOI":"10.1007\/s00330-021-07853-6","article-title":"Automated cartilage segmentation and quantification using 3D ultrashort echo time (UTE) cones MR imaging with deep convolutional neural networks","volume":"31","author":"Xue","year":"2021","journal-title":"Eur. Radiol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1002\/jmri.21437","article-title":"Quantitative assessment of bone marrow edema-like lesion and overlying cartilage in knees with osteoarthritis and anterior cruciate ligament tear using MR imaging and spectroscopic imaging at 3 tesla","volume":"28","author":"Li","year":"2008","journal-title":"J. Magn. Reson. Imaging"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"543","DOI":"10.1016\/j.media.2009.05.004","article-title":"Statistical shape models for 3d medical image segmentation: A review","volume":"13","author":"Heimann","year":"2009","journal-title":"Med. Image Anal."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1422","DOI":"10.1002\/jmri.22188","article-title":"Segmentation of the quadratus lumborum muscle using statistical shape modeling","volume":"33","author":"Engstrom","year":"2011","journal-title":"J. Magn. Reson. 
Imaging"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1663","DOI":"10.1109\/TMI.2015.2443912","article-title":"Statistical interspace models (SIMs): Application to robust 3D spine segmentation","volume":"34","author":"Pozo","year":"2015","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1109\/TMI.2013.2290491","article-title":"Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration","volume":"33","author":"Candemir","year":"2014","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1007\/s11517-011-0838-8","article-title":"A fully automated human knee 3D MRI bone segmentation using the ray casting technique","volume":"49","author":"Dodin","year":"2011","journal-title":"Med. Biol. Eng. Comput."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Hwang, J., and Hwang, S. (2021). Exploiting global structure information to improve medical image segmentation. Sensors, 21.","DOI":"10.3390\/s21093249"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Li, Q.Y., Yu, Z.B., Wang, Y.B., and Zheng, H.Y. (2020). TumorGAN: A multi-modal data augmentation framework for brain tumor segmentation. Sensors, 20.","DOI":"10.3390\/s20154203"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Ullah, F., Ansari, S.U., Hanif, M., Ayari, M.A., Chowdhury, M.E.H., Khandakar, A.A., and Khan, M.S. (2021). Brain MR image enhancement for tumor segmentation using 3D U-Net. Sensors, 21.","DOI":"10.3390\/s21227528"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Awan, M.J., Rahim, M.S.M., Salim, N., Rehman, A., and Garcia-Zapirain, B. (2022). Automated knee MR images segmentation of anterior cruciate ligament tears. 
Sensors, 22.","DOI":"10.3390\/s22041552"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Jalali, Y., Fateh, M., Rezvani, M., Abolghasemi, V., and Anisi, M.H. (2021). ResBCDU-Net: A deep learning framework for lung CT image segmentation. Sensors, 21.","DOI":"10.3390\/s21010268"},{"key":"ref_16","first-page":"129","article-title":"PM-Net: Pyramid multi-label network for joint optic disc and cup segmentation","volume":"11764","author":"Yin","year":"2019","journal-title":"Int. Conf. Med. Image Comput. Comput. Assist. Interv."},{"key":"ref_17","first-page":"234","article-title":"U-Net: Convolutional networks for biomedical image segmentation","volume":"9351","author":"Ronneberger","year":"2015","journal-title":"Int. Conf. Med. Image Comput. Comput. Assist. Interv."},{"key":"ref_18","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention U-Net: Learning where to look for the pancreas. arXiv, Available online: https:\/\/arxiv.org\/abs\/1804.03999."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2663","DOI":"10.1109\/TMI.2018.2845918","article-title":"H-DenseUNet: Hybrid densely connected UNet for liver and tumor segmentation from CT volumes","volume":"37","author":"Li","year":"2018","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Alom, M.Z., Yakopcic, C., Taha, T.M., and Asari, V.K. (2018, January 23\u201326). Nuclei segmentation with recurrent residual convolutional neural networks based U-Net (R2U-Net). 
Proceedings of the IEEE National Aerospace and Electronics Conference (NAECON), Dayton, OH, USA.","DOI":"10.1109\/NAECON.2018.8556686"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1856","DOI":"10.1109\/TMI.2019.2959609","article-title":"UNet plus plus: Redesigning skip connections to exploit multiscale features in image segmentation","volume":"39","author":"Zhou","year":"2020","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_22","first-page":"797","article-title":"Attention guided network for retinal image segmentation","volume":"11764","author":"Zhang","year":"2019","journal-title":"Int. Conf. Med. Image Comput. Comput. Assist. Interv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Huang, H.M., Lin, L.F., Tong, R.F., Hu, H.J., Zhang, Q.W., Iwamoto, Y., Han, X.H., Chen, Y.W., and Wu, J. (2020, January 4\u20138). UNet 3+: A full-scale connected UNet for medical image segmentation. Proceedings of the International Conference on Acoustics Speech and Signal Processing (ICASSP), Barcelona, Spain.","DOI":"10.1109\/ICASSP40776.2020.9053405"},{"key":"ref_24","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Processing Syst."},{"key":"ref_25","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X.H., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16 \u00d7 16 words: Transformers for image recognition at scale. arXiv, Available online: https:\/\/arxiv.org\/abs\/2010.11929."},{"key":"ref_26","unstructured":"Zhou, D.Q., Kang, B.Y., Jin, X.J., Yang, L.J., Lian, X.C., Jiang, Z.H., Hou, Q.B., and Feng, J.S. (2021). DeepViT: Towards deeper vision transformer. arXiv, Available online: https:\/\/arxiv.org\/abs\/2103.11886."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Touvron, H., Cord, M., Sablayrolles, A., Synnaeve, G., and J\u00e9gou, H. (2021). 
Going deeper with image transformers. arXiv, Available online: https:\/\/arxiv.org\/abs\/2103.17239.","DOI":"10.1109\/ICCV48922.2021.00010"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Chen, C.F., Fan, Q.F., and Panda, R. (2021). CrossViT: Cross-attention multi-scale vision transformer for image classification. arXiv, Available online: https:\/\/arxiv.org\/abs\/2103.14899.","DOI":"10.1109\/ICCV48922.2021.00041"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wu, H.P., Xiao, B., Codella, N., Liu, M.C., Dai, X.Y., Yuan, L., and Zhang, L. (2021). CvT: Introducing convolutions to vision transformers. arXiv, Available online: https:\/\/arxiv.org\/abs\/2111.03940.","DOI":"10.1109\/ICCV48922.2021.00009"},{"key":"ref_30","unstructured":"Chen, J.N., Lu, Y.Y., Yu, Q.H., Luo, X.D., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., and Zhou, Y.Y. (2021). TransUNet: Transformers make strong encoders for medical image segmentation. arXiv, Available online: https:\/\/arxiv.org\/abs\/2102.04306."},{"key":"ref_31","unstructured":"Cao, H., Wang, Y.Y., Chen, J., Jiang, D.S., Zhang, X.P., Tian, Q., and Wang, M.N. (2021). Swin-Unet: Unet-like pure transformer for medical image segmentation. arXiv, Available online: https:\/\/arxiv.org\/abs\/2105.05537."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hatamizadeh, A., Tang, Y.C., Nath, V., Yang, D., Myronenko, A., Landman, B., Roth, H., and Xu, D.G. (2021). UNETR: Transformers for 3D medical image segmentation. arXiv, Available online: https:\/\/doi.org\/10.48550\/arXiv.2201.01266.","DOI":"10.1109\/WACV51458.2022.00181"},{"key":"ref_33","unstructured":"Wang, H.N., Cao, P., Wang, J.Q., and Zaiane, O.R. (2021). UCTransNet: Rethinking the skip connections in U-Net from a channel-wise perspective with transformer. arXiv, Available online: https:\/\/arxiv.org\/abs\/2109.04335."},{"key":"ref_34","unstructured":"Zhou, H.Y., Guo, J.S., Zhang, Y.H., Yu, L.Q., Wang, L.S., and Yu, Y.Z. (2021). 
nnFormer: Interleaved transformer for volumetric segmentation. arXiv, Available online: https:\/\/arxiv.org\/abs\/2109.03201."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Hou, Q.B., Zhou, D.Q., and Feng, J.S. (2021, January 20\u201325). Coordinate attention for efficient mobile network design. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01350"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"2514","DOI":"10.1109\/TMI.2018.2837502","article-title":"Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved?","volume":"37","author":"Bernard","year":"2018","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"101832","DOI":"10.1016\/j.media.2020.101832","article-title":"A global benchmark of algorithms for segmenting the left atrium from late gadolinium-enhanced cardiac magnetic resonance imaging","volume":"67","author":"Xiong","year":"2021","journal-title":"Med. Image Anal."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/10\/3820\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:13:59Z","timestamp":1760138039000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/10\/3820"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,18]]},"references-count":37,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2022,5]]}},"alternative-id":["s22103820"],"URL":"https:\/\/doi.org\/10.3390\/s22103820","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,18]]}}}