{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T18:24:29Z","timestamp":1773771869299,"version":"3.50.1"},"reference-count":65,"publisher":"Association for Computing Machinery (ACM)","issue":"2","funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"crossref","award":["1R01EB031910-01"],"award-info":[{"award-number":["1R01EB031910-01"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Healthcare"],"published-print":{"date-parts":[[2026,4,30]]},"abstract":"<jats:p>\n                    Infections in\n                    <jats:bold>Diabetic Foot Ulcers (DFUs)<\/jats:bold>\n                    can cause severe complications, including tissue death and limb amputation, highlighting the need for accurate, timely diagnosis. Previous machine learning methods have focused on identifying infections by analyzing wound images alone, without utilizing additional metadata such as medical notes. In this study, we aim to improve infection classification by introducing\n                    <jats:bold>Synthetic Caption Augmented Retrieval for Wound Infection Detection (SCARWID)<\/jats:bold>\n                    , a novel multimodal deep learning framework that leverages synthetic textual descriptions to augment DFU images. SCARWID consists of two components: (1) Wound-BLIP, a\n                    <jats:bold>Vision-Language Model (VLM)<\/jats:bold>\n                    fine-tuned on GPT-4o-generated descriptions to synthesize consistent captions from images; and (2) an Image\u2013Text Fusion module that uses cross-attention to extract cross-modal embeddings from an image and its corresponding Wound-BLIP caption. 
Infection status is determined by retrieving the top-\n                    <jats:inline-formula content-type=\"math\/tex\">\n                      <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\( k \\)<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    similar items from a labeled support set. To enhance the diversity of training data, we utilized a latent diffusion model to generate additional wound images. As a result, SCARWID outperformed state-of-the-art models, achieving average accuracy, sensitivity, and specificity of 0.814, 0.845, and 0.783, respectively, for wound infection classification.\n                  <\/jats:p>","DOI":"10.1145\/3793535","type":"journal-article","created":{"date-parts":[[2026,1,27]],"date-time":"2026-01-27T13:36:24Z","timestamp":1769520984000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Explainable, Multimodal Wound Infection Classification from Images Augmented with Generated Captions"],"prefix":"10.1145","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7247-7551","authenticated-orcid":false,"given":"Palawat","family":"Busaranuvong","sequence":"first","affiliation":[{"name":"Department of Data Science, Worcester Polytechnic Institute, Worcester, Massachusetts, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3361-4952","authenticated-orcid":false,"given":"Emmanuel","family":"Agu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Worcester Polytechnic Institute, Worcester, Massachusetts, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-9065-3084","authenticated-orcid":false,"given":"Reza","family":"Saadati Fard","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Worcester Polytechnic Institute, Worcester, Massachusetts, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2650-9636","authenticated-orcid":false,"given":"Deepak","family":"Kumar","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Worcester Polytechnic Institute, Worcester, Massachusetts, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-2143-6318","authenticated-orcid":false,"given":"Shefalika","family":"Gautam","sequence":"additional","affiliation":[{"name":"Department of Data Science, Worcester Polytechnic Institute, Worcester, Massachusetts, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7226-1830","authenticated-orcid":false,"given":"Bengisu","family":"Tulu","sequence":"additional","affiliation":[{"name":"Business School, Worcester Polytechnic Institute, Worcester, Massachusetts, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1756-0464","authenticated-orcid":false,"given":"Diane","family":"Strong","sequence":"additional","affiliation":[{"name":"Business School, Worcester Polytechnic Institute, Worcester, Massachusetts, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-7774-6568","authenticated-orcid":false,"given":"Lorraine","family":"Loretz","sequence":"additional","affiliation":[{"name":"UMass Memorial Medical Center, Worcester, Massachusetts, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,3,17]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"crossref","unstructured":"Samira Abnar and Willem Zuidema. 2020. Quantifying attention flow in transformers. arXiv:2005.00928. 
Retrieved from https:\/\/arxiv.org\/abs\/2005.00928","DOI":"10.18653\/v1\/2020.acl-main.385"},{"key":"e_1_3_4_3_2","unstructured":"Josh Achiam Steven Adler Sandhini Agarwal Lama Ahmad Ilge Akkaya Florencia Leoni Aleman Diogo Almeida Janko Altenschmidt Sam Altman Shyamal Anadkat et al. 2023. GPT-4 technical report. arXiv:2303.08774. Retrieved from https:\/\/arxiv.org\/abs\/2303.08774"},{"key":"e_1_3_4_4_2","first-page":"99","volume-title":"Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Akrout Mohamed","year":"2023","unstructured":"Mohamed Akrout, B\u00e1lint Gyepesi, P\u00e9ter Holl\u00f3, Adrienn Po\u00f3r, Bl\u00e1ga Kincs\u0151, Stephen Solis, Katrina Cirone, Jeremy Kawahara, Dekker Slade, Latif Abid, et al. 2023. Diffusion-based data augmentation for skin disease classification: Impact across original medical datasets to fully synthetic images. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 99\u2013109."},{"key":"e_1_3_4_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2021.105055"},{"key":"e_1_3_4_6_2","doi-asserted-by":"crossref","first-page":"23716","DOI":"10.52202\/068431-1723","article-title":"Flamingo: A visual language model for few-shot learning","volume":"35","author":"Alayrac Jean-Baptiste","year":"2022","unstructured":"Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katherine Millican, Malcolm Reynolds, et al. 2022. Flamingo: A visual language model for few-shot learning. In Advances in Neural Information Processing Systems, Vol. 35, 23716\u201323736.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_7_2","unstructured":"Shuai Bai Keqin Chen Xuejing Liu Jialin Wang Wenbin Ge Sibo Song Kai Dang Peng Wang Shijie Wang Jun Tang et al. 2025. Qwen2.5-VL technical report. arXiv:2502.13923. 
Retrieved from https:\/\/arxiv.org\/abs\/2502.13923"},{"key":"e_1_3_4_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01442"},{"key":"e_1_3_4_9_2","first-page":"1","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Boecking Benedikt","year":"2022","unstructured":"Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie Hyland, Maria Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez-Valle, et al. 2022. Making the most of text semantics to improve biomedical vision\u2013language processing. In Proceedings of the European Conference on Computer Vision. Springer, 1\u201321."},{"key":"e_1_3_4_10_2","unstructured":"Thomas Buckley James A. Diao Adam Rodman and Arjun K. Manrai. 2023. Accuracy of a vision-language model on challenging medical cases. arXiv:2311.05591. Retrieved from https:\/\/arxiv.org\/abs\/2311.05591"},{"key":"e_1_3_4_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/OJEMB.2024.3453060"},{"issue":"2","key":"e_1_3_4_12_2","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1097\/01.ASW.0000426717.59326.5f","article-title":"Telemedicine in wound care: A review","volume":"26","author":"Chanussot-Deprez Caroline","year":"2013","unstructured":"Caroline Chanussot-Deprez and Jos\u00e9 Contreras-Ruiz. 2013. Telemedicine in wound care: A review. Advances in Skin & Wound Care 26, 2 (2013), 78\u201382.","journal-title":"Advances in Skin & Wound Care"},{"key":"e_1_3_4_13_2","unstructured":"Pengcheng Chen Ziyan Huang Zhongying Deng Tianbin Li Yanzhou Su Haoyu Wang Jin Ye Yu Qiao and Junjun He. 2023. Enhancing medical task performance in GPT-4V: A comprehensive study on prompt engineering strategies. arXiv:2312.04344. 
Retrieved from https:\/\/arxiv.org\/abs\/2312.04344"},{"key":"e_1_3_4_14_2","unstructured":"Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai Thomas Unterthiner Mostafa Dehghani Matthias Minderer Georg Heigold Sylvain Gelly et al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv:2010.11929. Retrieved from https:\/\/arxiv.org\/abs\/2010.11929"},{"issue":"6","key":"e_1_3_4_15_2","doi-asserted-by":"crossref","first-page":"S1","DOI":"10.12968\/jowc.2016.25.Sup6.S1","article-title":"Management of patients with venous leg ulcers: Challenges and current best practice","volume":"25","author":"Franks Peter J.","year":"2016","unstructured":"Peter J. Franks, Judith Barker, Mark Collier, Georgina Gethin, Emily Haesler, Arkadiusz Jawien, Severin Laeuchli, Giovanni Mosti, Sebastian Probst, and Carolina Weller. 2016. Management of patients with venous leg ulcers: Challenges and current best practice. Journal of Wound Care 25, Sup6 (2016), S1\u2013S67.","journal-title":"Journal of Wound Care"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-94907-5_2"},{"key":"e_1_3_4_17_2","unstructured":"Google. 2025. MedGemma: A Gemma 3 Variant Optimized for Medical Text and Image Comprehension. Retrieved from https:\/\/deepmind.google\/models\/gemma\/medgemma\/"},{"issue":"1","key":"e_1_3_4_18_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/wrr.12245","article-title":"Chronic wound repair and healing in older adults: Current status and future research","volume":"23","author":"Gould Lisa","year":"2015","unstructured":"Lisa Gould, Peter Abadir, Harold Brem, Marissa Carter, Teresa Conner-Kerr, Jeff Davidson, Luisa DiPietro, Vincent Falanga, Caroline Fife, Sue Gardner, et al. 2015. Chronic wound repair and healing in older adults: Current status and future research. 
Wound Repair and Regeneration 23, 1 (2015), 1\u201313.","journal-title":"Wound Repair and Regeneration"},{"key":"e_1_3_4_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2020.103616"},{"key":"e_1_3_4_20_2","first-page":"6626","article-title":"GANs trained by a two time-scale update rule converge to a local Nash equilibrium","volume":"30","author":"Heusel Martin","year":"2017","unstructured":"Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems, Vol. 30, 6626\u20136637.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_21_2","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho Jonathan","year":"2020","unstructured":"Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, Vol. 33, 6840\u20136851.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_22_2","unstructured":"Jonathan Ho and Tim Salimans. 2022. Classifier-free diffusion guidance. arXiv:2207.12598. Retrieved from https:\/\/arxiv.org\/abs\/2207.12598"},{"issue":"1","key":"e_1_3_4_23_2","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/s13643-016-0400-8","article-title":"The humanistic and economic burden of chronic wounds: A protocol for a systematic review","volume":"6","author":"J\u00e4rbrink K.","unstructured":"K. J\u00e4rbrink, G. Ni, H. S\u00f6nnergren, A. Schmidtchen, C. Pang, R. Bajpai, and J. Car. [n.d.]. The humanistic and economic burden of chronic wounds: A protocol for a systematic review. Systematic Reviews 6, 1 (2017), 15.","journal-title":"Systematic Reviews"},{"key":"e_1_3_4_24_2","doi-asserted-by":"crossref","unstructured":"Qiao Jin Fangyuan Chen Yiliang Zhou Ziyang Xu Justin M. 
Cheung Robert Chen Ronald M. Summers Justin F. Rousseau Peiyun Ni Marc J. Landsman et al. 2024. Hidden flaws behind expert-level accuracy of GPT-4 vision in medicine. arXiv:2401.08396. Retrieved from https:\/\/arxiv.org\/abs\/2401.08396","DOI":"10.1038\/s41746-024-01185-7"},{"key":"e_1_3_4_25_2","unstructured":"Kaplan. [n.d.]. USMLE Passing Scores. Retrieved April 29 2024 from https:\/\/www.kaptest.com\/study\/usmle\/passing-scores\/"},{"key":"e_1_3_4_26_2","first-page":"9459","article-title":"Retrieval-augmented generation for knowledge-intensive NLP tasks","volume":"33","author":"Lewis Patrick","year":"2020","unstructured":"Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen-Tau Yih, Tim Rockt\u00e4schel, et al. 2020. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems, Vol. 33, 9459\u20139474.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_27_2","first-page":"19730","volume-title":"Proceedings of the ICML","author":"Li Junnan","year":"2023","unstructured":"Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. 2023. BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In Proceedings of the ICML. PMLR, 19730\u201319742."},{"key":"e_1_3_4_28_2","first-page":"12888","volume-title":"Proceedings of the ICML","author":"Li Junnan","year":"2022","unstructured":"Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. 2022. BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In Proceedings of the ICML. 
PMLR, 12888\u201312900."},{"key":"e_1_3_4_29_2","first-page":"12934","article-title":"EfficientFormer: Vision transformers at MobileNet speed","volume":"35","author":"Li Yanyu","year":"2022","unstructured":"Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, and Jian Ren. 2022. EfficientFormer: Vision transformers at MobileNet speed. In Advances in Neural Information Processing Systems, Vol. 35, 12934\u201312949.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"11","key":"e_1_3_4_30_2","doi-asserted-by":"crossref","first-page":"3026","DOI":"10.1093\/jac\/dkw287","article-title":"Antimicrobial stewardship in wound care: A position paper from the British society for antimicrobial chemotherapy and European wound management association","volume":"71","author":"Lipsky Benjamin A.","year":"2016","unstructured":"Benjamin A. Lipsky, Matthew Dryden, Finn Gottrup, Dilip Nathwani, Ronald Andrew Seaton, and Jan Stryja. 2016. Antimicrobial stewardship in wound care: A position paper from the British society for antimicrobial chemotherapy and European wound management association. The Journal of Antimicrobial Chemotherapy 71, 11 (2016), 3026\u20133035.","journal-title":"The Journal of Antimicrobial Chemotherapy"},{"key":"e_1_3_4_31_2","first-page":"34892","article-title":"Visual instruction tuning","volume":"36","author":"Liu Haotian","year":"2024","unstructured":"Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. 2024. Visual instruction tuning. In Advances in Neural Information Processing Systems, Vol. 36 (2024), 34892\u201334916.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_32_2","unstructured":"Hanchao Liu Wenyuan Xue Yifei Chen Dapeng Chen Xiutian Zhao Ke Wang Liping Hou Rongjun Li and Wei Peng. 2024. A survey on hallucination in large vision-language models. arXiv:2402.00253. 
Retrieved from https:\/\/arxiv.org\/abs\/2402.00253"},{"key":"e_1_3_4_33_2","first-page":"12009","volume-title":"Proceedings of the IEEE\/CVF CVPR","author":"Liu Ze","year":"2022","unstructured":"Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, et al. 2022. Swin transformer V2: Scaling up capacity and resolution. In Proceedings of the IEEE\/CVF CVPR, 12009\u201312019."},{"issue":"3","key":"e_1_3_4_34_2","first-page":"97","article-title":"Designing the future in wound care: The role of the nurse practitioner","volume":"10","author":"MacLellan Lorna","year":"2002","unstructured":"Lorna MacLellan, G. Gardner, and Anne Gardner. 2002. Designing the future in wound care: The role of the nurse practitioner. Primary Intention: The Australian Journal of Wound Management 10, 3 (2002), 97\u201399.","journal-title":"Primary Intention: The Australian Journal of Wound Management"},{"key":"e_1_3_4_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvs.2013.08.003"},{"key":"e_1_3_4_36_2","unstructured":"Mehdi Mirza and Simon Osindero. 2014. Conditional generative adversarial nets. arXiv:1411.1784. Retrieved from https:\/\/arxiv.org\/abs\/1411.1784"},{"key":"e_1_3_4_37_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.smhl.2020.100139"},{"key":"e_1_3_4_38_2","unstructured":"Harsha Nori Yin Tat Lee Sheng Zhang Dean Carignan Richard Edgar Nicolo Fusi Nicholas King Jonathan Larson Yuanzhi Li Weishung Liu et al. 2023. Can generalist foundation models outcompete special-purpose tuning? Case study in medicine. arXiv:2311.16452. Retrieved from https:\/\/arxiv.org\/abs\/2311.16452"},{"issue":"1","key":"e_1_3_4_39_2","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.jval.2017.07.007","article-title":"An economic evaluation of the impact, cost, and medicare policy implications of chronic nonhealing wounds","volume":"21","author":"Nussbaum Samuel R.","year":"2018","unstructured":"Samuel R. Nussbaum, Marissa J. 
Carter, Caroline E. Fife, Joan DaVanzo, Randall Haught, Marcia Nusgart, and Donna Cartwright. 2018. An economic evaluation of the impact, cost, and medicare policy implications of chronic nonhealing wounds. Value in Health 21, 1 (2018), 27\u201332.","journal-title":"Value in Health"},{"issue":"1","key":"e_1_3_4_40_2","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1111\/wrr.12683","article-title":"The humanistic and economic burden of chronic wounds: A systematic review","volume":"27","author":"Olsson Maja","year":"2019","unstructured":"Maja Olsson, Krister J\u00e4rbrink, Ushashree Divakar, Ram Bajpai, Zee Upton, Artur Schmidtchen, and Josip Car. 2019. The humanistic and economic burden of chronic wounds: A systematic review. Wound Repair and Regeneration 27, 1 (2019), 114\u2013125.","journal-title":"Wound Repair and Regeneration"},{"key":"e_1_3_4_41_2","unstructured":"Jiazhen Pan Che Liu Junde Wu Fenglin Liu Jiayuan Zhu Hongwei Bran Li Chen Chen Cheng Ouyang and Daniel Rueckert. 2025. MedVLM-R1: Incentivizing medical reasoning capability of vision-language models (VLMs) via reinforcement learning. arXiv:2502.19634. Retrieved from https:\/\/arxiv.org\/abs\/2502.19634"},{"key":"e_1_3_4_42_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-94907-5_5"},{"key":"e_1_3_4_43_2","first-page":"8748","volume-title":"Proceedings of the ICML","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the ICML. 
PMLR, 8748\u20138763."},{"key":"e_1_3_4_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"issue":"6","key":"e_1_3_4_45_2","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1111\/iwj.12172","article-title":"Prevalence of chronic wounds and structural quality indicators of chronic wound care in Dutch nursing homes","volume":"12","author":"Rondas Armand A. L. M.","year":"2015","unstructured":"Armand A. L. M. Rondas, Jos M. G. A. Schols, Ellen E. Stobberingh, and Ruud J. G. Halfens. 2015. Prevalence of chronic wounds and structural quality indicators of chronic wound care in Dutch nursing homes. International Wound Journal 12, 6 (2015), 630\u2013635.","journal-title":"International Wound Journal"},{"key":"e_1_3_4_46_2","unstructured":"Khaled Saab Tao Tu Wei-Hung Weng Ryutaro Tanno David Stutz Ellery Wulczyn Fan Zhang Tim Strother Chunjong Park Elahe Vedadi et al. 2024. Capabilities of Gemini models in medicine. arXiv:2404.18416. Retrieved from https:\/\/arxiv.org\/abs\/2404.18416"},{"key":"e_1_3_4_47_2","first-page":"815","volume-title":"Proceedings of the IEEE CVPR","author":"Schroff Florian","year":"2015","unstructured":"Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE CVPR, 815\u2013823."},{"key":"e_1_3_4_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.74"},{"issue":"6","key":"e_1_3_4_49_2","doi-asserted-by":"crossref","first-page":"763","DOI":"10.1111\/j.1524-475X.2009.00543.x","article-title":"Human skin wounds: A major and snowballing threat to public health and the economy","volume":"17","author":"Sen Chandan K.","year":"2009","unstructured":"Chandan K. Sen, Gayle M. Gordillo, Sashwati Roy, Robert Kirsner, Lynn Lambert, Thomas K. Hunt, Finn Gottrup, Geoffrey C. Gurtner, and Michael T. Longaker. 2009. Human skin wounds: A major and snowballing threat to public health and the economy. 
Wound Repair and Regeneration 17, 6 (2009), 763\u2013771.","journal-title":"Wound Repair and Regeneration"},{"key":"e_1_3_4_50_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-023-06291-2"},{"key":"e_1_3_4_51_2","first-page":"1","article-title":"Toward expert-level medical question answering with large language models","volume":"31","author":"Singhal Karan","year":"2025","unstructured":"Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Mohamed Amin, Le Hou, Kevin Clark, Stephen R. Pfohl, Heather Cole-Lewis, et al. 2025. Toward expert-level medical question answering with large language models. Nature Medicine 31 (2025), 1\u20138.","journal-title":"Nature Medicine"},{"key":"e_1_3_4_52_2","unstructured":"Jiaming Song Chenlin Meng and Stefano Ermon. 2020. Denoising diffusion implicit models. arXiv:2010.02502. Retrieved from https:\/\/arxiv.org\/abs\/2010.02502"},{"issue":"2","key":"e_1_3_4_53_2","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1097\/WON.0000000000000414","article-title":"When and how to perform cultures on chronic wounds","volume":"45","author":"Stallard Yvonne","year":"2018","unstructured":"Yvonne Stallard. 2018. When and how to perform cultures on chronic wounds? Journal of Wound, Ostomy, and Continence Nursing 45, 2 (2018), 179\u2013186.","journal-title":"Journal of Wound, Ostomy, and Continence Nursing"},{"key":"e_1_3_4_54_2","first-page":"6105","volume-title":"Proceedings of the ICML","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan and Quoc Le. 2019. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the ICML. PMLR, 6105\u20136114."},{"key":"e_1_3_4_55_2","first-page":"10347","volume-title":"Proceedings of the ICML","author":"Touvron Hugo","year":"2021","unstructured":"Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Herv\u00e9 J\u00e9gou. 2021. 
Training data-efficient image transformers & distillation through attention. In Proceedings of the ICML. PMLR, 10347\u201310357."},{"key":"e_1_3_4_56_2","unstructured":"Brandon Trabucco Kyle Doherty Max Gurinas and Ruslan Salakhutdinov. 2023. Effective data augmentation with diffusion models. arXiv:2302.07944. Retrieved from https:\/\/arxiv.org\/abs\/2302.07944"},{"key":"e_1_3_4_57_2","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, Vol. 30, 5998\u20136002.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_58_2","first-page":"2415","volume-title":"Proceedings of the International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)","author":"Wang Changhan","year":"2015","unstructured":"Changhan Wang, Xinchen Yan, Max Smith, Kanika Kochhar, Marcie Rubin, Stephen M. Warren, James Wrobel, and Honglak Lee. 2015. A unified framework for automatic wound segmentation and analysis with deep convolutional neural networks. In Proceedings of the International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 2415\u20132418."},{"key":"e_1_3_4_59_2","first-page":"3876","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","volume":"2022","author":"Wang Zifeng","year":"2022","unstructured":"Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, and Jimeng Sun. 2022. MedCLIP: Contrastive learning from unpaired medical images and text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Vol. 
2022, 3876."},{"key":"e_1_3_4_60_2","doi-asserted-by":"publisher","DOI":"10.1097\/00129334-200406000-00012"},{"key":"e_1_3_4_61_2","unstructured":"Jinge Wu Yunsoo Kim and Honghan Wu. 2024. Hallucination benchmark in medical visual question answering. arXiv:2401.05827. Retrieved from https:\/\/arxiv.org\/abs\/2401.05827"},{"key":"e_1_3_4_62_2","unstructured":"Zhiling Yan Kai Zhang Rong Zhou Lifang He Xiang Li and Lichao Sun. 2023. Multimodal ChatGPT for medical applications: An experimental study of GPT-4V. arXiv:2310.19061. Retrieved from https:\/\/arxiv.org\/abs\/2310.19061"},{"key":"e_1_3_4_63_2","doi-asserted-by":"crossref","unstructured":"Zhichao Yang Zonghai Yao Mahbuba Tasmin Parth Vashisht Won Seok Jang Feiyun Ouyang Beining Wang Dan Berlowitz and Hong Yu. 2023. Performance of multimodal GPT-4V on USMLE with image: Potential for imaging diagnostic support with explanations. medRxiv (2023) 2023\u20132010.","DOI":"10.1101\/2023.10.26.23297629"},{"key":"e_1_3_4_64_2","first-page":"1","volume-title":"Proceedings of the 2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)","author":"Hoon Yap Moi","year":"2021","unstructured":"Moi Hoon Yap, Bill Cassidy, Joseph M. Pappachan, Claire O\u2019Shea, David Gillespie, and Neil D. Reeves. 2021. Analysis towards classification of infection and ischaemia of diabetic foot ulcers. In Proceedings of the 2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI). IEEE, 1\u20134."},{"key":"e_1_3_4_65_2","unstructured":"Shukang Yin Chaoyou Fu Sirui Zhao Ke Li Xing Sun Tong Xu and Enhong Chen. 2023. A survey on multimodal large language models. arXiv:2306.13549. 
Retrieved from https:\/\/arxiv.org\/abs\/2306.13549"},{"key":"e_1_3_4_66_2","first-page":"592","volume-title":"Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Yu Xinyi","year":"2023","unstructured":"Xinyi Yu, Guanbin Li, Wei Lou, Siqi Liu, Xiang Wan, Yan Chen, and Haofeng Li. 2023. Diffusion-based data augmentation for nuclei image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 592\u2013602."}],"container-title":["ACM Transactions on Computing for Healthcare"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3793535","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T15:11:03Z","timestamp":1773760263000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3793535"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,17]]},"references-count":65,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,4,30]]}},"alternative-id":["10.1145\/3793535"],"URL":"https:\/\/doi.org\/10.1145\/3793535","relation":{},"ISSN":["2691-1957","2637-8051"],"issn-type":[{"value":"2691-1957","type":"print"},{"value":"2637-8051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,17]]},"assertion":[{"value":"2025-02-11","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-03-17","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}