{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T10:54:27Z","timestamp":1777892067015,"version":"3.51.4"},"reference-count":43,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2024,4,10]],"date-time":"2024-04-10T00:00:00Z","timestamp":1712707200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Shenzhen International Science and Technology Information Center and the Shenzhen Bay Laboratory","award":["KCXFZ20211020163813019"],"award-info":[{"award-number":["KCXFZ20211020163813019"]}]},{"name":"Shenzhen International Science and Technology Information Center and the Shenzhen Bay Laboratory","award":["JCYJ20230807094803007"],"award-info":[{"award-number":["JCYJ20230807094803007"]}]},{"name":"Shenzhen Sustainable Development Project","award":["KCXFZ20211020163813019"],"award-info":[{"award-number":["KCXFZ20211020163813019"]}]},{"name":"Shenzhen Sustainable Development Project","award":["JCYJ20230807094803007"],"award-info":[{"award-number":["JCYJ20230807094803007"]}]},{"name":"Shenzhen Basic Research Project (Natural Science Fund)","award":["KCXFZ20211020163813019"],"award-info":[{"award-number":["KCXFZ20211020163813019"]}]},{"name":"Shenzhen Basic Research Project (Natural Science Fund)","award":["JCYJ20230807094803007"],"award-info":[{"award-number":["JCYJ20230807094803007"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The accurate segmentation and quantification of retinal fluid in Optical Coherence Tomography (OCT) images are crucial for the diagnosis and treatment of ophthalmic diseases such as age-related macular degeneration. However, the accurate segmentation of retinal fluid is challenging due to significant variations in the size, position, and shape of fluid, as well as their complex, curved boundaries. To address these challenges, we propose a novel multi-scale feature fusion attention network (FNeXter), based on ConvNeXt and Transformer, for OCT fluid segmentation. In FNeXter, we introduce a novel global multi-scale hybrid encoder module that integrates ConvNeXt, Transformer, and region-aware spatial attention. This module can capture long-range dependencies and non-local similarities while also focusing on local features. Moreover, this module possesses the spatial region-aware capabilities, enabling it to adaptively focus on the lesions regions. Additionally, we propose a novel self-adaptive multi-scale feature fusion attention module to enhance the skip connections between the encoder and the decoder. The inclusion of this module elevates the model\u2019s capacity to learn global features and multi-scale contextual information effectively. Finally, we conduct comprehensive experiments to evaluate the performance of the proposed FNeXter. Experimental results demonstrate that our proposed approach outperforms other state-of-the-art methods in the task of fluid segmentation.<\/jats:p>","DOI":"10.3390\/s24082425","type":"journal-article","created":{"date-parts":[[2024,4,10]],"date-time":"2024-04-10T10:55:28Z","timestamp":1712746528000},"page":"2425","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["FNeXter: A Multi-Scale Feature Fusion Network Based on ConvNeXt and Transformer for Retinal OCT Fluid Segmentation"],"prefix":"10.3390","volume":"24","author":[{"given":"Zhiyuan","family":"Niu","sequence":"first","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhuo","family":"Deng","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weihao","family":"Gao","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shurui","family":"Bai","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6220-3984","authenticated-orcid":false,"given":"Zheng","family":"Gong","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chucheng","family":"Chen","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-5653-5530","authenticated-orcid":false,"given":"Fuju","family":"Rong","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fang","family":"Li","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6610-5871","authenticated-orcid":false,"given":"Lan","family":"Ma","sequence":"additional","affiliation":[{"name":"Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,4,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.survophthal.2008.10.001","article-title":"Diabetic macular edema: Pathogenesis and treatment","volume":"54","author":"Bhagat","year":"2009","journal-title":"Surv. Ophthalmol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1178","DOI":"10.1126\/science.1957169","article-title":"Optical coherence tomography","volume":"254","author":"Huang","year":"1991","journal-title":"Science"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"4661","DOI":"10.1364\/BOE.6.004661","article-title":"Advanced image processing for optical coherence tomographic angiography of macular diseases","volume":"6","author":"Zhang","year":"2015","journal-title":"Biomed. Opt. Express"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1109\/TBME.2017.2695461","article-title":"Automatic subretinal fluid segmentation of retinal SD-OCT images with neurosensory retinal detachment guided by enface fundus imaging","volume":"65","author":"Wu","year":"2017","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1874","DOI":"10.1364\/BOE.8.001874","article-title":"Joint retinal layer and fluid segmentation in OCT scans of eyes with severe macular edema using unsupervised representation and auto-context","volume":"8","author":"Montuoro","year":"2017","journal-title":"Biomed. Opt. Express"},{"key":"ref_6","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). Medical Image Computing and Computer-Assisted Intervention, Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2015: 18th International Conference, Munich, Germany, 5\u20139 October 2015, Springer. Proceedings, Part III 18."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1856","DOI":"10.1109\/TMI.2019.2959609","article-title":"Unet++: Redesigning skip connections to exploit multiscale features in image segmentation","volume":"39","author":"Zhou","year":"2019","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_11","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention u-net: Learning where to look for the pancreas. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1109\/LGRS.2018.2802944","article-title":"Road extraction by deep residual u-net","volume":"15","author":"Zhang","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","article-title":"nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation","volume":"18","author":"Isensee","year":"2021","journal-title":"Nat. Methods"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/j.media.2019.02.011","article-title":"Deep-learning based multiclass retinal fluid segmentation and detection in optical coherence tomography images using a fully convolutional neural network","volume":"54","author":"Lu","year":"2019","journal-title":"Med. Image Anal."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhu, W., Zhang, L., Shi, F., Xiang, D., Wang, L., Guo, J., Yang, X., Chen, H., and Chen, X. (2017). Automated framework for intraretinal cystoid macular edema segmentation in three-dimensional optical coherence tomography images with macular hole. J. Biomed. Opt., 22.","DOI":"10.1117\/1.JBO.22.7.076014"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1109\/JBHI.2018.2793534","article-title":"Segmentation of retinal cysts from optical coherence tomography volumes via selective enhancement","volume":"23","author":"Gopinath","year":"2018","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1016\/j.compbiomed.2018.12.015","article-title":"Deep structure tensor graph search framework for automated extraction and characterization of retinal layers and fluid pathology in retinal SD-OCT scans","volume":"105","author":"Hassan","year":"2019","journal-title":"Comput. Biol. Med."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1016\/j.media.2019.05.002","article-title":"Automated segmentation of macular edema in OCT using deep neural networks","volume":"55","author":"Hu","year":"2019","journal-title":"Med. Image Anal."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3008","DOI":"10.1109\/TMI.2020.2983721","article-title":"CPFNet: Context pyramid fusion network for medical image segmentation","volume":"39","author":"Feng","year":"2020","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.neucom.2020.07.143","article-title":"Automatic fluid segmentation in retinal optical coherence tomography images using attention based deep learning","volume":"452","author":"Liu","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1547","DOI":"10.1109\/TMI.2022.3142048","article-title":"Multi-scale pathological fluid segmentation in OCT with a novel curvature loss in convolutional neural network","volume":"41","author":"Xing","year":"2022","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"4645","DOI":"10.1109\/JBHI.2022.3187103","article-title":"Rformer: Transformer-based generative adversarial network for real fundus image restoration on a new clinical benchmark","volume":"26","author":"Deng","year":"2022","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_23","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_24","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_26","unstructured":"Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., and Zhou, Y. (2021). Transunet: Transformers make strong encoders for medical image segmentation. arXiv."},{"key":"ref_27","unstructured":"Cao, H., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q., and Wang, M. (2022). Computer Vision, Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23\u201327 October 2022, Springer."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1484","DOI":"10.1109\/TMI.2022.3230943","article-title":"Missformer: An effective transformer for 2d medical image segmentation","volume":"42","author":"Huang","year":"2023","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_29","unstructured":"Wang, H., Cao, P., Wang, J., and Zaiane, O.R. (March, January 22). Uctransnet: Rethinking the skip connections in u-net from a channel-wise perspective with transformer. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., and Xie, S. (2022, January 18\u201324). A convnet for the 2020s. Proceedings of the IEEE\/CVF conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01167"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Woo, S., Debnath, S., Hu, R., Chen, X., Liu, Z., Kweon, I.S., and Xie, S. (2023, January 17\u201324). Convnext v2: Co-designing and scaling convnets with masked autoencoders. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01548"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.neunet.2023.11.033","article-title":"Hierarchical attention network with progressive feature fusion for facial expression recognition","volume":"170","author":"Tao","year":"2024","journal-title":"Neural Netw."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1858","DOI":"10.1109\/TMI.2019.2901398","article-title":"RETOUCH: The retinal OCT fluid detection and segmentation benchmark and challenge","volume":"38","author":"Venhuizen","year":"2019","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_34","unstructured":"Loshchilov, I., and Hutter, F. (May, January 30). Decoupled Weight Decay Regularization. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Brodersen, K.H., Ong, C.S., Stephan, K.E., and Buhmann, J.M. (2010, January 23\u201326). The balanced accuracy and its posterior distribution. Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey.","DOI":"10.1109\/ICPR.2010.764"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"394","DOI":"10.1109\/TMI.2021.3112716","article-title":"MsTGANet: Automatic drusen segmentation from retinal OCT images","volume":"41","author":"Wang","year":"2021","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2763","DOI":"10.1109\/TMI.2023.3264513","article-title":"H2Former: An Efficient Hierarchical Hybrid Transformer for Medical Image Segmentation","volume":"42","author":"He","year":"2023","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"e11581","DOI":"10.7717\/peerj.11581","article-title":"iSUMOK-PseAAC: Prediction of lysine sumoylation sites using statistical moments and Chou\u2019s PseAAC","volume":"9","author":"Khan","year":"2021","journal-title":"PeerJ"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Liu, T., Huang, J., Luo, D., Ren, L., Ning, L., Huang, J., Lin, H., and Zhang, Y. (2024). Cm-siRPred: Predicting chemically modified siRNA efficiency based on multi-view learning strategy. Int. J. Biol. Macromol., 264.","DOI":"10.1016\/j.ijbiomac.2024.130638"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1172","DOI":"10.1364\/BOE.6.001172","article-title":"Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema","volume":"6","author":"Chiu","year":"2015","journal-title":"Biomed. Opt. Express"},{"key":"ref_42","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018). Computer Vision, Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8\u201314 September 2018, Springer."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/8\/2425\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:25:52Z","timestamp":1760106352000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/8\/2425"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,10]]},"references-count":43,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2024,4]]}},"alternative-id":["s24082425"],"URL":"https:\/\/doi.org\/10.3390\/s24082425","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,10]]}}}