{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T13:20:23Z","timestamp":1770816023468,"version":"3.50.1"},"reference-count":43,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T00:00:00Z","timestamp":1770681600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Shanghai Science and Technology Innovation Action Medical Innovation Research Project","award":["23Y11910100"],"award-info":[{"award-number":["23Y11910100"]}]},{"name":"Fudan University Medical Engineering Integration Project","award":["IDH2310166"],"award-info":[{"award-number":["IDH2310166"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Standard SAM-based approaches in medical imaging typically rely on explicit geometric prompts, such as bounding boxes or points. However, these rigid spatial constraints are often insufficient for capturing the complex, deformable boundaries of medical structures, where localization noise easily propagates into segmentation errors. To overcome this, we propose the Localization Distillation-Enhanced Feature Prompting SAM (LDFSAM), a novel framework that shifts from discrete coordinate inputs to a latent feature prompting paradigm. We employ a lightweight prompt generator, refined via Localization Distillation (LD), to inject multi-scale features into the SAM decoder as complementary Dense Feature Prompts (DFPs) and Sparse Feature Prompts (SFPs). This effectively guides segmentation without explicit box constraints. Extensive experiments on four public benchmarks (3D CBCT Tooth, ISIC 2018, MMOTU, and Kvasir-SEG) demonstrate that LDFSAM outperforms both prior SAM-based baselines and conventional networks, achieving Dice scores exceeding 0.91. Further validation on an in-house cohort demonstrates its robust generalization capabilities. Overall, our method outperforms both prior SAM-based baselines and conventional networks, with particularly strong gains in low-data regimes, providing a reliable solution for automated medical image segmentation.<\/jats:p>","DOI":"10.3390\/jimaging12020074","type":"journal-article","created":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T09:16:08Z","timestamp":1770801368000},"page":"74","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["LDFSAM: Localization Distillation-Enhanced Feature Prompting SAM for Medical Image Segmentation"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-6244-1644","authenticated-orcid":false,"given":"Xuanbo","family":"Zhao","sequence":"first","affiliation":[{"name":"College of Intelligent Robotics and Advanced Manufacturing, College of Future Information Technology, College of Biomedical Engineering, Fudan University, Shanghai 200433, China"}]},{"given":"Cheng","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Intelligent Robotics and Advanced Manufacturing, College of Future Information Technology, College of Biomedical Engineering, Fudan University, Shanghai 200433, China"},{"name":"School of Computer and Information, Anhui Normal University, Wuhu 241002, China"}]},{"given":"Huaxing","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Endodontics, Shanghai Stomatological Hospital, Fudan University, Shanghai 200001, China"}]},{"given":"Hong","family":"Zhou","sequence":"additional","affiliation":[{"name":"College of Intelligent Robotics and Advanced Manufacturing, College of Future Information Technology, College of Biomedical Engineering, Fudan University, Shanghai 200433, China"}]},{"given":"Zekuan","family":"Yu","sequence":"additional","affiliation":[{"name":"College of Intelligent Robotics and Advanced Manufacturing, College of Future Information Technology, College of Biomedical Engineering, Fudan University, Shanghai 200433, China"}]},{"given":"Tao","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Intelligent Robotics and Advanced Manufacturing, College of Future Information Technology, College of Biomedical Engineering, Fudan University, Shanghai 200433, China"}]},{"given":"Xiaoling","family":"Wei","sequence":"additional","affiliation":[{"name":"Department of Endodontics, Shanghai Stomatological Hospital, Fudan University, Shanghai 200001, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1798-9922","authenticated-orcid":false,"given":"Rongjun","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Intelligent Robotics and Advanced Manufacturing, College of Future Information Technology, College of Biomedical Engineering, Fudan University, Shanghai 200433, China"}]}],"member":"1968","published-online":{"date-parts":[[2026,2,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.media.2017.07.005","article-title":"A survey on deep learning in medical image analysis","volume":"42","author":"Litjens","year":"2017","journal-title":"Med. Image Anal."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kebaili, A., Lapuyade-Lahorgue, J., and Ruan, S. (2023). Deep learning approaches for data augmentation in medical imaging: A review. J. Imaging, 9.","DOI":"10.3390\/jimaging9040081"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/s41592-020-01008-z","article-title":"nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation","volume":"18","author":"Isensee","year":"2021","journal-title":"Nat. Methods"},{"key":"ref_5","unstructured":"Dosovitskiy, A. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., and Lo, W.-Y. (2023). Segment anything. Proceedings of the IEEE\/CVF International Conference on Computer Vision, IEEE.","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"654","DOI":"10.1038\/s41467-024-44824-z","article-title":"Segment anything in medical images","volume":"15","author":"Ma","year":"2024","journal-title":"Nat. Commun."},{"key":"ref_8","unstructured":"Cheng, J., Ye, J., Deng, Z., Chen, J., Li, T., Wang, H., Su, Y., Huang, Z., Chen, J., and Jiang, L. (2023). SAM-Med2D. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Rahman, M.M., Munir, M., Jha, D., Bagci, U., and Marculescu, R. (2024). Pp-sam: Perturbed prompts for robust adaption of segment anything model for polyp segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, IEEE.","DOI":"10.1109\/CVPRW63382.2024.00504"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"\u00c7i\u00e7ek, \u00d6., Abdulkadir, A., Lienkamp, S.S., Brox, T., and Ronneberger, O. (2016). 3D U-Net: Learning dense volumetric segmentation from sparse annotation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer.","DOI":"10.1007\/978-3-319-46723-8_49"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Milletari, F., Navab, N., and Ahmadi, S.-A. (2016). V-net: Fully convolutional neural networks for volumetric medical image segmentation. Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), IEEE.","DOI":"10.1109\/3DV.2016.79"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, IEEE.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bolya, D., Zhou, C., Xiao, F., and Lee, Y.J. (2019). Yolact: Real-time instance segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, IEEE.","DOI":"10.1109\/ICCV.2019.00925"},{"key":"ref_15","first-page":"17721","article-title":"Solov2: Dynamic and fast instance segmentation","volume":"33","author":"Wang","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_16","unstructured":"Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., and Zhou, Y. (2021). Transunet: Transformers make strong encoders for medical image segmentation. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"107803","DOI":"10.1016\/j.compbiomed.2023.107803","article-title":"CFATransUnet: Channel-wise cross fusion attention and transformer for 2D medical image segmentation","volume":"168","author":"Wang","year":"2024","journal-title":"Comput. Biol. Med."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Nikulins, A., Edelmers, E., Sudars, K., and Polaka, I. (2025). Adapting Classification Neural Network Architectures for Medical Image Segmentation Using Explainable AI. J. Imaging, 11.","DOI":"10.3390\/jimaging11020055"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"103061","DOI":"10.1016\/j.media.2023.103061","article-title":"Segment anything model for medical images?","volume":"92","author":"Huang","year":"2024","journal-title":"Med. Image Anal."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"102473","DOI":"10.1016\/j.compmedimag.2024.102473","article-title":"A review of the segment anything model (sam) for medical image analysis: Accomplishments and perspectives","volume":"119","author":"Ali","year":"2024","journal-title":"Comput. Med. Imaging Graph."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhang, K., and Liu, D. (2023). Customized segment anything model for medical image segmentation. arXiv.","DOI":"10.2139\/ssrn.4495221"},{"key":"ref_22","unstructured":"Xu, Q., Li, J., He, X., Liu, Z., Chen, Z., Duan, W., Li, C., He, M.M., Tesema, F.B., and Cheah, W.P. (2024). Esp-medsam: Efficient self-prompting sam for universal domain-generalized medical image segmentation. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"437","DOI":"10.3390\/jimaging11120437","article-title":"Domain-Adaptive Segment Anything Model for Cross-Domain Water Body Segmentation in Satellite Imagery","volume":"11","author":"Yang","year":"2025","journal-title":"J. Imaging"},{"key":"ref_24","unstructured":"Jocher, G., Chaurasia, A., and Qiu, J. (2023). Ultralytics YOLOv8, Ultralytics."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"G\u00fcl, S., Cetinel, G., Aydin, B.M., Akg\u00fcn, D., and \u00d6zta\u015f Kara, R. (2025). YOLOSAMIC: A Hybrid Approach to Skin Cancer Segmentation with the Segment Anything Model and YOLOv8. Diagnostics, 15.","DOI":"10.3390\/diagnostics15040479"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Pandey, S., Chen, K.-F., and Dam, E.B. (2023). Comprehensive multimodal segmentation in medical imaging: Combining yolov8 with sam and hq-sam models. Proceedings of the IEEE\/CVF International Conference on Computer Vision, IEEE.","DOI":"10.1109\/ICCVW60793.2023.00273"},{"key":"ref_27","unstructured":"Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Ye, R., Wang, P., Ren, D., Zuo, W., Hou, Q., and Cheng, M.-M. (2022). Localization distillation for dense object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, IEEE.","DOI":"10.1109\/CVPR52688.2022.00919"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2096","DOI":"10.1038\/s41467-022-29637-2","article-title":"A fully automatic AI system for tooth and alveolar bone segmentation from cone-beam CT images","volume":"13","author":"Cui","year":"2022","journal-title":"Nat. Commun."},{"key":"ref_30","unstructured":"Codella, N., Rotemberg, V., Tschandl, P., Celebi, M.E., Dusza, S., Gutman, D., Helba, B., Kalloo, A., Liopyris, K., and Marchetti, M. (2019). Skin lesion analysis toward melanoma detection 2018: A challenge hosted by the international skin imaging collaboration (isic). arXiv."},{"key":"ref_31","unstructured":"Zhao, Q., Lyu, S., Bai, W., Cai, L., Liu, B., Cheng, G., Wu, M., Sang, X., Yang, M., and Chen, L. (2022). MMOTU: A multi-modality ovarian tumor ultrasound image dataset for unsupervised cross-domain semantic segmentation. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Jha, D., Smedsrud, P.H., Riegler, M.A., Halvorsen, P., De Lange, T., Johansen, D., and Johansen, H.D. (2019). Kvasir-seg: A segmented polyp dataset. Proceedings of the International Conference on Multimedia Modeling, Springer.","DOI":"10.1007\/978-3-030-37734-2_37"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1822","DOI":"10.1109\/TMI.2018.2806309","article-title":"Automatic multi-organ segmentation on abdominal CT with dense V-networks","volume":"37","author":"Gibson","year":"2018","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_34","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention u-net: Learning where to look for the pancreas. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Hatamizadeh, A., Tang, Y., Nath, V., Yang, D., Myronenko, A., Landman, B., Roth, H.R., and Xu, D. (2022). Unetr: Transformers for 3d medical image segmentation. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, IEEE.","DOI":"10.1109\/WACV51458.2022.00181"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Hatamizadeh, A., Nath, V., Tang, Y., Yang, D., Roth, H.R., and Xu, D. (2021). Swin unetr: Swin transformers for semantic segmentation of brain tumors in mri images. Proceedings of the International MICCAI Brainlesion Workshop, Springer.","DOI":"10.1007\/978-3-031-08999-2_22"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"4036","DOI":"10.1109\/TIP.2023.3293771","article-title":"nnformer: Volumetric medical image segmentation via a 3d transformer","volume":"32","author":"Zhou","year":"2023","journal-title":"IEEE Trans. Image Process."},{"key":"ref_38","unstructured":"Lee, H.H., Bao, S., Huo, Y., and Landman, B.A. (2022). 3d ux-net: A large kernel volumetric convnet modernizing hierarchical transformer for medical image segmentation. arXiv."},{"key":"ref_39","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1109\/TMI.2020.3035253","article-title":"CA-Net: Comprehensive attention convolutional neural networks for explainable medical image segmentation","volume":"40","author":"Gu","year":"2020","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"2281","DOI":"10.1109\/TMI.2019.2903562","article-title":"Ce-net: Context encoder network for 2d medical image segmentation","volume":"38","author":"Gu","year":"2019","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"3008","DOI":"10.1109\/TMI.2020.2983721","article-title":"CPFNet: Context pyramid fusion network for medical image segmentation","volume":"39","author":"Feng","year":"2020","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"106881","DOI":"10.1016\/j.asoc.2020.106881","article-title":"Cascade knowledge diffusion network for skin lesion diagnosis and segmentation","volume":"99","author":"Jin","year":"2021","journal-title":"Appl. Soft Comput."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/12\/2\/74\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T10:07:13Z","timestamp":1770804433000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/12\/2\/74"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,10]]},"references-count":43,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,2]]}},"alternative-id":["jimaging12020074"],"URL":"https:\/\/doi.org\/10.3390\/jimaging12020074","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,10]]}}}