{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T13:20:30Z","timestamp":1767878430685,"version":"3.49.0"},"reference-count":51,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,1,6]],"date-time":"2023-01-06T00:00:00Z","timestamp":1672963200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01HD094381"],"award-info":[{"award-number":["R01HD094381"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01HD104822"],"award-info":[{"award-number":["R01HD104822"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000861","name":"Burroughs Wellcome Fund","doi-asserted-by":"publisher","award":["NGP10119"],"award-info":[{"award-number":["NGP10119"]}],"id":[{"id":"10.13039\/100000861","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000865","name":"Bill and Melinda Gates Foundation","doi-asserted-by":"publisher","award":["INV-037302"],"award-info":[{"award-number":["INV-037302"]}],"id":[{"id":"10.13039\/100000865","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000865","name":"Bill and Melinda Gates Foundation","doi-asserted-by":"publisher","award":["INV-005417"],"award-info":[{"award-number":["INV-005417"]}],"id":[{"id":"10.13039\/100000865","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000865","name":"Bill and Melinda Gates 
Foundation","doi-asserted-by":"publisher","award":["INV-035476"],"award-info":[{"award-number":["INV-035476"]}],"id":[{"id":"10.13039\/100000865","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Big Data"],"abstract":"<jats:p>As one of the popular deep learning methods, deep convolutional neural networks (DCNNs) have been widely adopted in segmentation tasks and have received positive feedback. However, in segmentation tasks, DCNN-based frameworks are known for their incompetence in dealing with global relations within imaging features. Although several techniques have been proposed to enhance the global reasoning of DCNN, these models are either not able to gain satisfying performances compared with traditional fully-convolutional structures or not capable of utilizing the basic advantages of CNN-based networks (namely the ability of local reasoning). In this study, compared with current attempts to combine FCNs and global reasoning methods, we fully extracted the ability of self-attention by designing a novel attention mechanism for 3D computation and proposed a new segmentation framework (named 3DTU) for three-dimensional medical image segmentation tasks. This new framework processes images in an end-to-end manner and executes 3D computation on both the encoder side (which contains a 3D transformer) and the decoder side (which is based on a 3D DCNN). We tested our framework on two independent datasets that consist of 3D MRI and CT images. 
Experimental results clearly demonstrate that our method outperforms several state-of-the-art segmentation methods in various metrics.<\/jats:p>","DOI":"10.3389\/fdata.2022.1080715","type":"journal-article","created":{"date-parts":[[2023,1,6]],"date-time":"2023-01-06T19:17:35Z","timestamp":1673032655000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["3D bi-directional transformer U-Net for medical image segmentation"],"prefix":"10.3389","volume":"5","author":[{"given":"Xiyao","family":"Fu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhexian","family":"Sun","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoteng","family":"Tang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eric M.","family":"Zou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Heng","family":"Huang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yong","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Liang","family":"Zhan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2023,1,6]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: a deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern. Anal. Mach. 
Intell"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2105.05537","article-title":"Swin-Unet: unet-like pure transformer for medical image segmentation","author":"Cao","year":"2021","journal-title":"arXiv preprint"},{"key":"B3","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2102.04306","article-title":"TransUnet: transformers make strong encoders for medical image segmentation","author":"Chen","year":"2021","journal-title":"arXiv preprint"},{"key":"B4","first-page":"801","article-title":"\u201cEncoder-decoder with atrous separable convolution for semantic image segmentation,\u201d","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV)","author":"Chen","year":"2018"},{"key":"B5","first-page":"433","article-title":"\u201cGraph-based global reasoning networks,\u201d","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Chen","year":"2019"},{"key":"B6","first-page":"7103","article-title":"\u201cCascaded pyramid network for multi-person pose estimation,\u201d","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Chen","year":"2018"},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-4012","article-title":"On the properties of neural machine translation: encoder-decoder approaches","author":"Cho","year":"2014","journal-title":"arXiv preprint"},{"key":"B8","first-page":"424","article-title":"\u201c3D U-Net: learning dense volumetric segmentation from sparse annotation,\u201d","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"\u00c7i\u00e7ek","year":"2016"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1810.04805","article-title":"Bert: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2018","journal-title":"arXiv 
preprint"},{"key":"B10","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1016\/j.isprsjprs.2020.01.013","article-title":"Resunet-a: a deep learning framework for semantic segmentation of remotely sensed data","volume":"162","author":"Diakogiannis","year":"2020","journal-title":"ISPRS J. Photogram. Remote Sens"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2010.11929","article-title":"An image is worth 16x16 words: transformers for image recognition at scale","author":"Dosovitskiy","year":"2020","journal-title":"arXiv preprint"},{"key":"B12","first-page":"61","article-title":"\u201cUtnet: a hybrid transformer architecture for medical image segmentation,\u201d","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Gao","year":"2021"},{"key":"B13","article-title":"\u201cGenerative adversarial nets,\u201d","volume-title":"Neural Information Processing Systems 27","author":"Goodfellow","year":"2014"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-08999-2_22","article-title":"Swin unetr: Swin transformers for semantic segmentation of brain tumors in mri images","author":"Hatamizadeh","year":"","journal-title":"arXiv preprint"},{"key":"B15","first-page":"574","article-title":"\u201cUNETR: transformers for 3d medical image segmentation,\u201d","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Hatamizadeh","year":""},{"key":"B16","first-page":"770","article-title":"\u201cDeep residual learning for image recognition,\u201d","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"He","year":"2016"},{"key":"B17","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural 
Comput"},{"key":"B18","doi-asserted-by":"crossref","first-page":"1055","DOI":"10.1109\/ICASSP40776.2020.9053405","article-title":"\u201cUnet 3+: a full-scale connected unet for medical image segmentation,\u201d","volume-title":"ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Huang","year":"2020"},{"key":"B19","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","article-title":"NNU-Net: a self-configuring method for deep learning-based biomedical image segmentation","volume":"18","author":"Isensee","year":"2021","journal-title":"Nat. Methods"},{"key":"B20","first-page":"142","article-title":"\u201cProgressively normalized self-attention network for video polyp segmentation,\u201d","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Ji","year":"2021"},{"key":"B21","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2108.03809","article-title":"PSGR: pixel-wise sparse graph reasoning for covid-19 pneumonia segmentation in ct images","author":"Jia","year":"2021","journal-title":"arXiv preprint"},{"key":"B22","doi-asserted-by":"crossref","first-page":"1197","DOI":"10.1002\/mp.14676","article-title":"Toward data-efficient learning: A benchmark for COVID-19 CT lung and infection segmentation","volume":"48","author":"Jun","year":"2021","journal-title":"Med. Phys."},{"key":"B23","article-title":"\u201cImagenet classification with deep convolutional neural networks,\u201d","author":"Krizhevsky","year":"2012","journal-title":"Advances in Neural Information Processing Systems 25"},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2206.01136","article-title":"Transforming medical imaging with transformers? 
a comparative review of key properties, current progresses, and future perspectives","author":"Li","year":"2022","journal-title":"arXiv preprint"},{"key":"B25","first-page":"8950","article-title":"\u201cSpatial pyramid based graph reasoning for semantic segmentation,\u201d","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Li","year":"2020"},{"key":"B26","article-title":"\u201cBeyond grids: learning graph representations for visual recognition,\u201d","author":"Li","year":"2018","journal-title":"Advances in Neural Information Processing Systems 31"},{"key":"B27","doi-asserted-by":"publisher","DOI":"10.1109\/TIM.2022.3178991","article-title":"Ds-transunet: dual swin transformer u-net for medical image segmentation","author":"Lin","year":"2022","journal-title":"IEEE Trans. Instrum. Meas"},{"key":"B28","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1007\/s12021-018-9370-4","article-title":"Multi-modality cascaded convolutional neural networks for alzheimer's disease diagnosis","volume":"16","author":"Liu","year":"2018","journal-title":"Neuroinformatics"},{"key":"B29","first-page":"313","article-title":"\u201cStyle transfer using generative adversarial networks for multi-site mri harmonization,\u201d","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Liu","year":"2021"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.1101\/2022.09.12.506445","article-title":"Style transfer generative adversarial networks to harmonize multi-site mri to a single reference image to avoid over-correction","author":"Liu","year":"2022","journal-title":"bioRxiv"},{"key":"B31","first-page":"3431","article-title":"\u201cFully convolutional networks for semantic segmentation,\u201d","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern 
Recognition","author":"Long","year":"2015"},{"key":"B32","first-page":"1520","article-title":"\u201cLearning deconvolution network for semantic segmentation,\u201d","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Noh","year":"2015"},{"key":"B33","doi-asserted-by":"publisher","first-page":"8460493","DOI":"10.1155\/2020\/8460493","article-title":"Fully automated bone age assessment on large-scale hand x-ray dataset","volume":"2020","author":"Pan","year":"2020","journal-title":"Int. J. Biomed. Imaging"},{"key":"B34","first-page":"4055","article-title":"\u201cImage transformer,\u201d","volume-title":"International Conference on Machine Learning","author":"Parmar","year":"2018"},{"key":"B35","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1007\/978-3-030-87589-3_28","article-title":"\u201cU-net transformer: self and cross attention for medical image segmentation,\u201d","volume-title":"International Workshop on Machine Learning in Medical Imaging","author":"Petit","year":"2021"},{"key":"B36","doi-asserted-by":"publisher","first-page":"4846","DOI":"10.1609\/aaai.v35i6.16617","article-title":"Miniseg: an extremely minimum network for efficient COVID-19 segmentation","volume":"35","author":"Qiu","year":"2021","journal-title":"Proc. AAAI Conf. Artif. 
Intell"},{"key":"B37","first-page":"234","article-title":"\u201cU-Net: convolutional networks for biomedical image segmentation,\u201d","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Ronneberger","year":"2015"},{"key":"B38","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1409.1556","article-title":"Very deep convolutional networks for large-scale image recognition","author":"Simonyan","year":"2014","journal-title":"arXiv preprint"},{"key":"B39","doi-asserted-by":"publisher","DOI":"10.1002\/uog.24959","article-title":"Dual-contrast mri reveals intraplacental oxygenation patterns, detects placental abnormalities and fetal brain oxygenation","author":"Sun","year":"2022","journal-title":"Ultrasound Obstetr. Gynecol"},{"key":"B40","first-page":"12597","article-title":"\u201cAdaptive weighting multi-field-of-view cnn for semantic segmentation in pathology,\u201d","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Tokunaga","year":"2019"},{"key":"B41","first-page":"36","article-title":"\u201cMedical transformer: gated axial-attention for medical image segmentation,\u201d","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Valanarasu","year":"2021"},{"key":"B42","article-title":"\u201cAttention is all you need,\u201d","volume-title":"Advances in Neural Information Processing Systems 30","author":"Vaswani","year":"2017"},{"key":"B43","first-page":"109","article-title":"\u201cTransbts: multimodal brain tumor segmentation using transformer,\u201d","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Wang","year":"2021"},{"key":"B44","first-page":"171","article-title":"\u201cCOTR: efficiently bridging cnn and transformer for 3D medical image segmentation,\u201d","volume-title":"International Conference on Medical Image 
Computing and Computer-Assisted Intervention","author":"Xie","year":"2021"},{"key":"B45","first-page":"3881","article-title":"\u201cImproved variational autoencoders for text modeling using dilated convolutions,\u201d","volume-title":"International Conference on Machine Learning","author":"Yang","year":"2017"},{"key":"B46","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1511.07122","article-title":"Multi-scale context aggregation by dilated convolutions","author":"Yu","year":"2015","journal-title":"arXiv preprint"},{"key":"B47","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2103.03604","article-title":"Spectr: spectral transformer for hyperspectral pathology image segmentation","author":"Yun","year":"2021","journal-title":"arXiv preprint"},{"key":"B48","doi-asserted-by":"publisher","first-page":"2814","DOI":"10.1109\/TMI.2022.3170701","article-title":"Diffusion kernel attention network for brain disorder classification","volume":"41","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Med. Imaging"},{"key":"B49","first-page":"1","article-title":"\u201cDilated convolution neural network with leakyrelu for environmental sound classification,\u201d","volume-title":"2017 22nd International Conference on Digital Signal Processing (DSP)","author":"Zhang","year":"2017"},{"key":"B50","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2109.03201","article-title":"nnFormer: Interleaved transformer for volumetric segmentation","author":"Zhou","year":"2021","journal-title":"arXiv preprint"},{"key":"B51","doi-asserted-by":"publisher","first-page":"1856","DOI":"10.1109\/TMI.2019.2959609","article-title":"Unet++: redesigning skip connections to exploit multiscale features in image segmentation","volume":"39","author":"Zhou","year":"2019","journal-title":"IEEE Trans. Med. 
Imaging"}],"container-title":["Frontiers in Big Data"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdata.2022.1080715\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,6]],"date-time":"2023-01-06T19:18:00Z","timestamp":1673032680000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdata.2022.1080715\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,6]]},"references-count":51,"alternative-id":["10.3389\/fdata.2022.1080715"],"URL":"https:\/\/doi.org\/10.3389\/fdata.2022.1080715","relation":{},"ISSN":["2624-909X"],"issn-type":[{"value":"2624-909X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,6]]},"article-number":"1080715"}}