{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,13]],"date-time":"2026-06-13T04:31:42Z","timestamp":1781325102485,"version":"3.54.1"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2025,4,25]],"date-time":"2025-04-25T00:00:00Z","timestamp":1745539200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Healthcare"],"published-print":{"date-parts":[[2025,4,30]]},"abstract":"<jats:p>The segmentation of glioma is crucial for early diagnosis, according to a World Health Organization (WHO) 2021 report. For glioma diagnosis, 3D multi-modal brain MRI\/CT imaging has become an essential tool, offering detailed information. Nowadays, deep learning frameworks have been applied to various medical imaging problems, including brain glioma segmentation. Recently, foundation models like Segment Anything Model (SAM) have emerged as pivotal tools in computer vision tasks. These models are trained using large (real-world) datasets, offering a generalized understanding of visual data and semantic key features. Therefore, the effective utilization of foundation models in medical imaging is a significant area of current research. However, the differences in data distribution between multi-modal medical images and real-world images present challenges in directly applying foundation models to medical imaging. Additionally, utilizing multi-modal images to extract crucial information and its fusion poses further challenges. To address these issues, we propose a framework using foundation model and novel strategies for multi-modal fusion. Our fusion adapters effectively integrate the information from different modalities to enhance glioma segmentation in multi-modal MRI scans. Our method outperforms current state-of-the-art methods for accurate segmentation of the glioma using private and publicly available brain MRI datasets, proving the effectiveness of our approach across different datasets and imaging modalities.<\/jats:p>","DOI":"10.1145\/3712297","type":"journal-article","created":{"date-parts":[[2025,3,20]],"date-time":"2025-03-20T17:41:36Z","timestamp":1742492496000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Multi-modal Medical SAM: An Adaptation Method of Segment Anything Model (SAM) for Glioma Segmentation Using Multi-modal MR Images"],"prefix":"10.1145","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5971-808X","authenticated-orcid":false,"given":"Xiaoyu","family":"Shi","sequence":"first","affiliation":[{"name":"Graduate School of Information Science and Engineering, Ritsumeikan University, Ibaraki, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0768-2193","authenticated-orcid":false,"given":"Rahul Kumar","family":"Jain","sequence":"additional","affiliation":[{"name":"Graduate School of Information Science and Engineering, Ritsumeikan University, Ibaraki, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8924-5279","authenticated-orcid":false,"given":"Yinhao","family":"Li","sequence":"additional","affiliation":[{"name":"Graduate School of Information Science and Engineering, Ritsumeikan University, Ibaraki, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2252-6786","authenticated-orcid":false,"given":"Shurong","family":"Chai","sequence":"additional","affiliation":[{"name":"Graduate School of Information Science and Engineering, Ritsumeikan University, Ibaraki, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6996-329X","authenticated-orcid":false,"given":"Jingliang","family":"Cheng","sequence":"additional","affiliation":[{"name":"The Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7744-1117","authenticated-orcid":false,"given":"Jie","family":"Bai","sequence":"additional","affiliation":[{"name":"The Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8813-8527","authenticated-orcid":false,"given":"Guohua","family":"Zhao","sequence":"additional","affiliation":[{"name":"The Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4098-588X","authenticated-orcid":false,"given":"Lanfen","family":"Lin","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou, China and Zhejiang Key Laboratory of Multi-omics Precision Diagnosis and Treatment of Liver Diseases, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5952-0188","authenticated-orcid":false,"given":"Yen-Wei","family":"Chen","sequence":"additional","affiliation":[{"name":"Graduate School of Information Science and Engineering, Ritsumeikan University, Ibaraki, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,4,25]]},"reference":[{"issue":"3","key":"e_1_3_1_2_2","first-page":"304","article-title":"Handwriting Arabic character recognition LeNet using neural network","volume":"6","author":"Al-Jawfi Rashad","year":"2009","unstructured":"Rashad Al-Jawfi. 2009. Handwriting Arabic character recognition LeNet using neural network. The International Arab Journal of Information Technology 6, 3 (2009), 304\u2013309.","journal-title":"The International Arab Journal of Information Technology"},{"key":"e_1_3_1_3_2","unstructured":"Rishi Bommasani Drew A. Hudson Ehsan Adeli Russ Altman Simran Arora Sydney von Arx Michael S. Bernstein Jeannette Bohg Antoine Bosselut Emma Brunskill et al. 2022. On the opportunities and risks of foundation models. arXiv:2108.07258. Retrieved from https:\/\/arxiv.org\/abs\/2108.07258"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3616961.3616992"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-25066-8_9"},{"key":"e_1_3_1_6_2","unstructured":"Shurong Chai Rahul Kumar Jain Shiyu Teng Jiaqing Liu Yinhao Li Tomoko Tateyama and Yen-Wei Chen. 2023. Ladder Fine-tuning approach for SAM integrating complementary network. arXiv:2306.12737. Retrieved from https:\/\/arxiv.org\/abs\/2306.12737"},{"key":"e_1_3_1_7_2","unstructured":"Jieneng Chen Yongyi Lu Qihang Yu Xiangde Luo Ehsan Adeli Yan Wang Le Lu Alan L. Yuille and Yuyin Zhou. 2021. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv:2102.04306. Retrieved from https:\/\/arxiv.org\/abs\/2102.04306"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1155\/2018\/2512037"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2007.912817"},{"key":"e_1_3_1_10_2","unstructured":"Steven De Vleeschouwer. 2017. Glioblastoma [Internet]. Retrieved February 12 2024 from https:\/\/pubmed.ncbi.nlm.nih.gov\/29251853\/"},{"key":"e_1_3_1_11_2","unstructured":"Jacob Devlin Ming-Wei Chang Kenton Lee and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805. Retrieved from https:\/\/arxiv.org\/abs\/1810.04805"},{"key":"e_1_3_1_12_2","unstructured":"Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai Thomas Unterthiner Mostafa Dehghani Matthias Minderer Georg Heigold Sylvain Gelly et al. 2021. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv:2010.11929. Retrieved from https:\/\/arxiv.org\/abs\/2010.11929"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2019.2912935"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_15_2","first-page":"2790","volume-title":"Proceedings of the 36th International Conference on Machine Learning","author":"Houlsby Neil","year":"2019","unstructured":"Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin De Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. 2019. Parameter-efficient transfer learning for NLP. In Proceedings of the 36th International Conference on Machine Learning. PMLR, 2790\u20132799. Retrieved February 12, 2024 from https:\/\/proceedings.mlr.press\/v97\/houlsby19a.html"},{"key":"e_1_3_1_16_2","unstructured":"Edward J. Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. arXiv:2106.09685. Retrieved from https:\/\/arxiv.org\/abs\/2106.09685"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-020-01008-z"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-46640-4_25"},{"key":"e_1_3_1_19_2","doi-asserted-by":"crossref","unstructured":"Alexander Kirillov Eric Mintun Nikhila Ravi Hanzi Mao Chloe Rolland Laura Gustafson Tete Xiao Spencer Whitehead Alexander C. Berg Wan-Yen Lo et al. 2023. Segment anything. arXiv:2304.02643. Retrieved from https:\/\/arxiv.org\/abs\/2304.02643","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3065386"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10404-1_95"},{"key":"e_1_3_1_22_2","doi-asserted-by":"crossref","unstructured":"Brian Lester Rami Al-Rfou and Noah Constant. 2021. The power of scale for parameter-efficient prompt tuning. arXiv:2104.08691. Retrieved from https:\/\/arxiv.org\/abs\/2104.08691","DOI":"10.18653\/v1\/2021.emnlp-main.243"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-32245-8_7"},{"key":"e_1_3_1_24_2","first-page":"9543","article-title":"Glioma segmentation with a unified algorithm in multimodal MRI images","volume":"6","author":"Li Qingneng","year":"2018","unstructured":"Qingneng Li, Zhifan Gao, Qiuyu Wang, Jun Xia, Heye Zhang, Huailing Zhang, Huafeng Liu, and Shuo Li. 2018. Glioma segmentation with a unified algorithm in multimodal MRI images. IEEE Access 6 (2018), 9543\u20139553.","journal-title":"IEEE Access"},{"key":"e_1_3_1_25_2","unstructured":"Xiang Lisa Li and Percy Liang. 2021. Prefix-tuning: Optimizing continuous prompts for generation. arXiv:2101.00190. Retrieved from https:\/\/arxiv.org\/abs\/2101.00190"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00401-016-1545-1"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2016.79"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.3174\/ajnr.A7462"},{"issue":"5","key":"e_1_3_1_29_2","doi-asserted-by":"crossref","first-page":"v1","DOI":"10.1093\/neuonc\/now207","article-title":"CBTRUS statistical report: Primary brain and other central nervous system tumors diagnosed in the United States in 2009\u20132013","volume":"18","author":"Ostrom Quinn T.","year":"2016","unstructured":"Quinn T. Ostrom, Haley Gittleman, Jordan Xu, Courtney Kromer, Yingli Wolinsky, Carol Kruchko, and Jill S. Barnholtz-Sloan. 2016. CBTRUS statistical report: Primary brain and other central nervous system tumors diagnosed in the United States in 2009\u20132013. Neuro-oncology 18, Suppl 5 (2016), v1\u2013v75.","journal-title":"Neuro-oncology"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/EMBC.2015.7319032"},{"key":"e_1_3_1_31_2","first-page":"8748","volume-title":"International Conference on Machine Learning","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, and Jack Clark. 2021. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning. PMLR, 8748\u20138763. Retrieved March 8, 2024 from http:\/\/proceedings.mlr.press\/v139\/radford21a"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00247-021-05042-7"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","unstructured":"Xiaoyu Shi Yinhao Li Jingliang Cheng Jie Bai Guohua Zhao and Yen-Wei Chen. 2024. Medical SAM: A Glioma Segmentation Fine-Tuning Method for SAM Using Brain MR Images. In 2024 IEEE International Conference on Consumer Electronics (ICCE) Las Vegas NV USA 1\u20134 DOI: 10.1109\/ICCE59016.2024.10444252","DOI":"10.1109\/ICCE59016.2024.10444252"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1029\/2021JB022027"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1259\/bjr\/65711810"},{"key":"e_1_3_1_37_2","doi-asserted-by":"crossref","unstructured":"Elena Voita David Talbot Fedor Moiseev Rico Sennrich and Ivan Titov. 2019. Analyzing multi-head self-attention: Specialized heads do the heavy lifting the rest can be pruned. arXiv:1905.09418. Retrieved from https:\/\/arxiv.org\/abs\/1905.09418","DOI":"10.18653\/v1\/P19-1580"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0009-9260(03)00268-X"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1038\/nrdp.2015.17"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-85988-8_9"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1155\/2020\/6789306"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/ITME.2018.00080"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1002\/mp.14922"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10278-017-0037-8"},{"key":"e_1_3_1_45_2","doi-asserted-by":"crossref","unstructured":"Kaidong Zhang and Dong Liu. 2023. Customized segment anything model for medical image segmentation. arXiv:2304.13785. Retrieved from https:\/\/arxiv.org\/abs\/2304.13785","DOI":"10.2139\/ssrn.4495221"},{"key":"e_1_3_1_46_2","unstructured":"Frontiers. The Design of SimpleITK. Retrieved February 12 2024 from https:\/\/www.frontiersin.org\/articles\/10.3389\/fninf.2013.00045\/full"},{"key":"e_1_3_1_47_2","doi-asserted-by":"crossref","unstructured":"Tianrun Chen Lanyun Zhu Chaotao Ding Runlong Cao Yan Wang Shangzhan Zhang Zejian Li Lingyun Sun Ying Zang and Papa Mao. 2023. SAM-adapter: Adapting segment anything in underperformed scenes. In Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) Workshops 3367\u20133375.","DOI":"10.1109\/ICCVW60793.2023.00361"},{"key":"e_1_3_1_48_2","doi-asserted-by":"crossref","unstructured":"Tianrun Chen Lanyun Zhu Chaotao Ding Runlong Cao Yan Wang Zejian Li Lingyun Sun Papa Mao and Ying Zang. 2023. SAM fails to segment anything? \u2013 SAM-adapter: Adapting SAM in underperformed scenes: Camouflage shadow medical image segmentation and more. arXiv:2304.09148. Retrieved from https:\/\/arxiv.org\/abs\/2304.09148","DOI":"10.1109\/ICCVW60793.2023.00361"},{"key":"e_1_3_1_49_2","doi-asserted-by":"crossref","unstructured":"Lee R. Dice. 1945. Measures of the amount of ecologic association between species. Ecology 26 3 (1945) 297\u2013302.","DOI":"10.2307\/1932409"},{"key":"e_1_3_1_50_2","doi-asserted-by":"crossref","unstructured":"Abdel Aziz Taha and Allan Hanbury. 2015. Metrics for evaluating 3D medical image segmentation: Analysis selection and tool. BMC Medical Imaging 15 (2015) 1\u201328.","DOI":"10.1186\/s12880-015-0068-x"}],"container-title":["ACM Transactions on Computing for Healthcare"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3712297","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3712297","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:10:29Z","timestamp":1750295429000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3712297"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,25]]},"references-count":49,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,4,30]]}},"alternative-id":["10.1145\/3712297"],"URL":"https:\/\/doi.org\/10.1145\/3712297","relation":{},"ISSN":["2637-8051"],"issn-type":[{"value":"2637-8051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,25]]},"assertion":[{"value":"2024-03-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-14","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-25","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}