{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T04:15:21Z","timestamp":1765340121791,"version":"3.46.0"},"publisher-location":"New York, NY, USA","reference-count":66,"publisher":"ACM","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62402120,62372117,62472102"],"award-info":[{"award-number":["62402120,62372117,62472102"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007219","name":"Natural Science Foundation of Shanghai","doi-asserted-by":"publisher","award":["24ZR1490400"],"award-info":[{"award-number":["24ZR1490400"]}],"id":[{"id":"10.13039\/100007219","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,10,27]]},"DOI":"10.1145\/3746027.3755867","type":"proceedings-article","created":{"date-parts":[[2025,10,25]],"date-time":"2025-10-25T07:38:54Z","timestamp":1761377934000},"page":"11160-11169","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["TabiMed: Tabularizing Medical Images for Few-Shot In-Context Diagnosis"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-3741-5541","authenticated-orcid":false,"given":"Wanying","family":"Zhou","sequence":"first","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7179-5045","authenticated-orcid":false,"given":"Yuqi","family":"Sun","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Shanghai Key 
Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3329-3359","authenticated-orcid":false,"given":"Yu","family":"Ling","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1176-6987","authenticated-orcid":false,"given":"Zhen","family":"Xing","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5577-5773","authenticated-orcid":false,"given":"Chenxi","family":"Ma","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7677-4772","authenticated-orcid":false,"given":"Weimin","family":"Tan","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0256-9682","authenticated-orcid":false,"given":"Bo","family":"Yan","sequence":"additional","affiliation":[{"name":"College of Computer Science and Artificial Intelligence, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,10,27]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Exploring visual prompts for adapting large-scale models. arXiv preprint arXiv:2203.17274","author":"Bahng Hyojin","year":"2022","unstructured":"Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, and Phillip Isola. 2022. Exploring visual prompts for adapting large-scale models. arXiv preprint arXiv:2203.17274 (2022)."},{"key":"e_1_3_2_1_2_1","first-page":"25005","article-title":"Visual prompting via image inpainting","volume":"35","author":"Bar Amir","year":"2022","unstructured":"Amir Bar, Yossi Gandelsman, Trevor Darrell, Amir Globerson, and Alexei Efros. 2022. Visual prompting via image inpainting. Advances in Neural Information Processing Systems, Vol. 35 (2022), 25005-25017.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_3_1","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. Advances in neural information processing systems Vol. 33 (2020) 1877-1901."},{"key":"e_1_3_2_1_4_1","volume-title":"Language models can exploit cross-task in-context learning for data-scarce novel tasks. arXiv preprint arXiv:2405.10548","author":"Chatterjee Anwoy","year":"2024","unstructured":"Anwoy Chatterjee, Eshaan Tanwar, Subhabrata Dutta, and Tanmoy Chakraborty. 2024. Language models can exploit cross-task in-context learning for data-scarce novel tasks. arXiv preprint arXiv:2405.10548 (2024)."},{"key":"e_1_3_2_1_5_1","volume-title":"Improving in-context few-shot learning via self-supervised training. arXiv preprint arXiv:2205.01703","author":"Chen Mingda","year":"2022","unstructured":"Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, and Zornitsa Kozareva. 2022. 
Improving in-context few-shot learning via self-supervised training. arXiv preprint arXiv:2205.01703 (2022)."},{"key":"e_1_3_2_1_6_1","volume-title":"Fine-tune language models to approximate unbiased in-context learning. arXiv preprint arXiv:2310.03331","author":"Chu Timothy","year":"2023","unstructured":"Timothy Chu, Zhao Song, and Chiwun Yang. 2023. Fine-tune language models to approximate unbiased in-context learning. arXiv preprint arXiv:2310.03331 (2023)."},{"key":"e_1_3_2_1_7_1","volume-title":"Sft memorizes, rl generalizes: A comparative study of foundation model post-training. arXiv preprint arXiv:2501.17161","author":"Chu Tianzhe","year":"2025","unstructured":"Tianzhe Chu, Yuexiang Zhai, Jihan Yang, Shengbang Tong, Saining Xie, Dale Schuurmans, Quoc V Le, Sergey Levine, and Yi Ma. 2025. Sft memorizes, rl generalizes: A comparative study of foundation model post-training. arXiv preprint arXiv:2501.17161 (2025)."},{"key":"e_1_3_2_1_8_1","volume-title":"CausalLM is not optimal for in-context learning. arXiv preprint arXiv:2308.06912","author":"Ding Nan","year":"2023","unstructured":"Nan Ding, Tomer Levinboim, Jialin Wu, Sebastian Goodman, and Radu Soricut. 2023. CausalLM is not optimal for in-context learning. arXiv preprint arXiv:2308.06912 (2023)."},{"key":"e_1_3_2_1_9_1","unstructured":"Qiang Dong et al. 2022. A Survey on In-context Learning. arXiv preprint arXiv:2209.10775 (2022)."},{"key":"e_1_3_2_1_10_1","volume-title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. 
ICLR (2021)."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i6.20610"},{"key":"e_1_3_2_1_12_1","volume-title":"International conference on machine learning. PMLR, 1126-1135","author":"Finn Chelsea","year":"2017","unstructured":"Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In International conference on machine learning. PMLR, 1126-1135."},{"key":"e_1_3_2_1_13_1","unstructured":"Deqing Fu CHEN Tianqi Robin Jia and Vatsal Sharan. 2023. Transformers learn higher-order optimization methods for in-context learning: A study with linear models. (2023)."},{"key":"e_1_3_2_1_14_1","volume-title":"In-context learning for attention scheme: from single softmax regression to multiple softmax regression via a tensor trick. arXiv preprint arXiv:2307.02419","author":"Gao Yeqi","year":"2023","unstructured":"Yeqi Gao, Zhao Song, and Shenghao Xie. 2023. In-context learning for attention scheme: from single softmax regression to multiple softmax regression via a tensor trick. arXiv preprint arXiv:2307.02419 (2023)."},{"key":"e_1_3_2_1_15_1","volume-title":"Few-shot learning with graph neural networks. arXiv preprint arXiv:1711.04043","author":"Garcia Victor","year":"2017","unstructured":"Victor Garcia and Joan Bruna. 2017. Few-shot learning with graph neural networks. arXiv preprint arXiv:1711.04043 (2017)."},{"key":"e_1_3_2_1_16_1","unstructured":"Shivam Garg et al. 2022. What can transformers learn in context? A case study of simple function classes. In NeurIPS."},{"key":"e_1_3_2_1_17_1","volume-title":"Oct, 2024.","author":"Research Google","year":"2024","unstructured":"Google Research. 2024a. CT Foundation: Embedding Tool for CT Volumes in Medical Imaging. Google Research Blog. 
https:\/\/research.google\/blog\/taking-medical-imaging-embeddings-3d\/ Accessed 21, Oct, 2024."},{"key":"e_1_3_2_1_18_1","volume-title":"Mar, 2024.","author":"Research Google","year":"2024","unstructured":"Google Research. 2024b. Derm Foundation: Embedding Tool for Dermatology Images. Health AI Developer Foundations Blog. https:\/\/research.google\/blog\/health-specific-embedding-tools-for-dermatology-and-pathology\/ Accessed 08, Mar, 2024."},{"key":"e_1_3_2_1_19_1","volume-title":"Path Foundation: Embedding Tool for Digital Pathology Images. Health AI Developer Foundations Blog. https:\/\/research.google\/blog\/health-specific-embedding-tools-for-dermatology-and-pathology\/ Accessed","author":"Research Google","year":"2024","unstructured":"Google Research. 2024c. Path Foundation: Embedding Tool for Digital Pathology Images. Health AI Developer Foundations Blog. https:\/\/research.google\/blog\/health-specific-embedding-tools-for-dermatology-and-pathology\/ Accessed 08, Mar, 2024."},{"key":"e_1_3_2_1_20_1","volume-title":"Jul, 2025.","author":"Research Google","year":"2025","unstructured":"Google Research. 2025. CXR Foundation: Embedding Tool for Chest X-Ray Images. Health AI Developer Foundations Blog. https:\/\/developers.google.com\/health-ai-developer-foundations\/cxr-foundation Accessed 09, Jul, 2025."},{"key":"e_1_3_2_1_21_1","volume-title":"Pre-training to learn in context. arXiv preprint arXiv:2305.09137","author":"Gu Yuxian","year":"2023","unstructured":"Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang. 2023. Pre-training to learn in context. arXiv preprint arXiv:2305.09137 (2023)."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ssmqr.2023.100240"},{"key":"e_1_3_2_1_23_1","volume-title":"Explaining emergent in-context learning as kernel regression. arXiv preprint arXiv:2305.12766","author":"Han Chi","year":"2023","unstructured":"Chi Han, Ziqi Wang, Han Zhao, and Heng Ji. 2023c. 
Explaining emergent in-context learning as kernel regression. arXiv preprint arXiv:2305.12766 (2023)."},{"key":"e_1_3_2_1_24_1","volume-title":"Understanding in-context learning via supportive pretraining data. arXiv preprint arXiv:2306.15091","author":"Han Xiaochuang","year":"2023","unstructured":"Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, and Tianlu Wang. 2023a. Understanding in-context learning via supportive pretraining data. arXiv preprint arXiv:2306.15091 (2023)."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2022.109076"},{"key":"e_1_3_2_1_26_1","volume-title":"Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods. arXiv preprint arXiv:2506.01901","author":"Hao Yifan","year":"2025","unstructured":"Yifan Hao, Xingyuan Pan, Hanning Zhang, Chenlu Ye, Rui Pan, and Tong Zhang. 2025. Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods. arXiv preprint arXiv:2506.01901 (2025)."},{"key":"e_1_3_2_1_27_1","volume-title":"Language models are general-purpose interfaces. arXiv preprint arXiv:2206.06336","author":"Hao Yaru","year":"2022","unstructured":"Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, and Furu Wei. 2022a. Language models are general-purpose interfaces. arXiv preprint arXiv:2206.06336 (2022)."},{"key":"e_1_3_2_1_28_1","volume-title":"Structured prompting: Scaling in-context learning to 1,000 examples. arXiv preprint arXiv:2212.06713","author":"Hao Yaru","year":"2022","unstructured":"Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, and Furu Wei. 2022b. Structured prompting: Scaling in-context learning to 1,000 examples. arXiv preprint arXiv:2212.06713 (2022)."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Clyde Highmore et al. 2024. In-context learning in large language models: A comprehensive survey. 
arXiv preprint arXiv:2401.00001 (2024).","DOI":"10.20944\/preprints202407.0926.v1"},{"key":"e_1_3_2_1_30_1","volume-title":"Tabpfn: A transformer that solves small tabular classification problems in a second. arXiv preprint arXiv:2207.01848","author":"Hollmann Noah","year":"2022","unstructured":"Noah Hollmann, Samuel M\u00fcller, Katharina Eggensperger, and Frank Hutter. 2022. Tabpfn: A transformer that solves small tabular classification problems in a second. arXiv preprint arXiv:2207.01848 (2022)."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-024-08328-6"},{"key":"e_1_3_2_1_32_1","volume-title":"Exploring in-context learning capabilities of foundation models for generating knowledge graphs from text. arXiv preprint arXiv:2305.08804","author":"Khorashadizadeh Hanieh","year":"2023","unstructured":"Hanieh Khorashadizadeh, Nandana Mihindukulasooriya, Sanju Tiwari, Jinghua Groppe, and Sven Groppe. 2023. Exploring in-context learning capabilities of foundation models for generating knowledge graphs from text. arXiv preprint arXiv:2305.08804 (2023)."},{"key":"e_1_3_2_1_33_1","volume-title":"Kang Min Yoo, and Sang-goo Lee","author":"Kim Hyuhng Joon","year":"2022","unstructured":"Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, and Sang-goo Lee. 2022. Self-generated in-context learning: Leveraging auto-regressive language models as a demonstration generator. arXiv preprint arXiv:2206.08082 (2022)."},{"key":"e_1_3_2_1_34_1","first-page":"1","article-title":"Siamese neural networks for one-shot image recognition. In ICML deep learning workshop, Vol. 2","author":"Koch Gregory","year":"2015","unstructured":"Gregory Koch, Richard Zemel, Ruslan Salakhutdinov, et al., 2015. Siamese neural networks for one-shot image recognition. In ICML deep learning workshop, Vol. 2. 
Lille, 1-30.","journal-title":"Lille"},{"key":"e_1_3_2_1_35_1","volume-title":"Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa.","author":"Kojima Takeshi","year":"2022","unstructured":"Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa. 2022. Large language models are zero-shot reasoners. Advances in neural information processing systems, Vol. 35 (2022), 22199-22213."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2022.104227"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01091"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018642"},{"key":"e_1_3_2_1_39_1","volume-title":"NeurIPS 2024 Workshop on Fine-Tuning in Modern Machine Learning: Principles and Scalability.","author":"Li Ziniu","year":"2024","unstructured":"Ziniu Li, Congliang Chen, Tian Xu, Zeyu Qin, Jiancong Xiao, Ruoyu Sun, and Zhi-Quan Luo. 2024. Entropic distribution matching for supervised fine-tuning of LLMs: Less overfitting and better diversity. In NeurIPS 2024 Workshop on Fine-Tuning in Modern Machine Learning: Principles and Scalability."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00585"},{"key":"e_1_3_2_1_41_1","volume-title":"Unifying image processing as visual prompting question answering. arXiv preprint arXiv:2310.10513","author":"Liu Yihao","year":"2023","unstructured":"Yihao Liu, Xiangyu Chen, Xianzheng Ma, Xintao Wang, Jiantao Zhou, Yu Qiao, and Chao Dong. 2023a. Unifying image processing as visual prompting question answering. arXiv preprint arXiv:2310.10513 (2023)."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"crossref","unstructured":"Xiaohao Mao Yu Huang Ye Jin Lun Wang Xuanzhong Chen Honghong Liu Xinglin Yang Haopeng Xu Xiaodong Luan Ying Xiao et al. 2025. A phenotype-based AI pipeline outperforms human experts in differentially diagnosing rare diseases using EHRs. npj Digital Medicine Vol. 
8 1 (2025) 68.","DOI":"10.1038\/s41746-025-01452-1"},{"key":"e_1_3_2_1_43_1","first-page":"462","article-title":"Generating training data with language models: Towards zero-shot language understanding","volume":"35","author":"Meng Yu","year":"2022","unstructured":"Yu Meng, Jiaxin Huang, Yu Zhang, and Jiawei Han. 2022. Generating training data with language models: Towards zero-shot language understanding. Advances in Neural Information Processing Systems, Vol. 35 (2022), 462-477.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41582-023-00841-y"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13023-024-03352-1"},{"key":"e_1_3_2_1_46_1","volume-title":"International Conference on Machine Learning. PMLR, 17138-17155","author":"\u00d6zt\u00fcrk Ekrem","year":"2022","unstructured":"Ekrem \u00d6zt\u00fcrk, Fabio Ferreira, Hadi Jomaa, Lars Schmidt-Thieme, Josif Grabocka, and Frank Hutter. 2022. Zero-shot automl with pretrained models. In International Conference on Machine Learning. PMLR, 17138-17155."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2021.108111"},{"key":"e_1_3_2_1_48_1","first-page":"4077","article-title":"Prototypical networks for few-shot learning","author":"Snell Jake","year":"2017","unstructured":"Jake Snell, Kevin Swersky, and Richard S Zemel. 2017. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems. 
4077-4087.","journal-title":"Advances in Neural Information Processing Systems."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2025.3554410"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00131"},{"key":"e_1_3_2_1_51_1","first-page":"2252","article-title":"Few-shot learning through an information retrieval lens","author":"Triantafillou Eleni","year":"2017","unstructured":"Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. 2017. Few-shot learning through an information retrieval lens. In Advances in Neural Information Processing Systems. 2252-2262.","journal-title":"Advances in Neural Information Processing Systems."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","unstructured":"Bas S. Veeling Jeroen Linmans Jakob Winkens Taco Cohen and Max Welling. 2018. PatchCamelyon (PCam): A Histopathology Patch-Level Dataset for Metastasis Classification. https:\/\/doi.org\/10.5281\/zenodo.1494286. doi:10.5281\/zenodo.1494286 Dataset derived from Camelyon16 challenge; 327 680 color patches (96\u00d796px) of lymph node histopathology scans.","DOI":"10.5281\/zenodo.1494286"},{"key":"e_1_3_2_1_53_1","unstructured":"Oriol Vinyals Charles Blundell Timothy Lillicrap Daan Wierstra et al. 2016. Matching networks for one shot learning. Advances in neural information processing systems Vol. 29 (2016)."},{"key":"e_1_3_2_1_54_1","volume-title":"International Conference on Machine Learning. PMLR, 22964-22984","author":"Wang Thomas","year":"2022","unstructured":"Thomas Wang, Adam Roberts, Daniel Hesslow, Teven Le Scao, Hyung Won Chung, Iz Beltagy, Julien Launay, and Colin Raffel. 2022. What language model architecture and pretraining objective works best for zero-shot generalization?. In International Conference on Machine Learning. 
PMLR, 22964-22984."},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00660"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00110"},{"key":"e_1_3_2_1_57_1","volume-title":"Brian Lester, Nan Du, Andrew M Dai, and Quoc V Le.","author":"Wei Jason","year":"2021","unstructured":"Jason Wei, Maarten Bosma, Vincent Y Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M Dai, and Quoc V Le. 2021. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652 (2021)."},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00780"},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41597-022-01721-8"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00375"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV57701.2024.00258"},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41591-024-03185-2"},{"key":"e_1_3_2_1_63_1","unstructured":"Sheng Zhang Yanbo Xu Naoto Usuyama Hanwen Xu Jaspreet Bagga Robert Tinn Sam Preston Rajesh Rao Mu Wei Naveen Valluri et al. 2023a. Biomedclip: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs. arXiv preprint arXiv:2303.00915 (2023)."},{"key":"e_1_3_2_1_64_1","first-page":"17773","article-title":"What makes good examples for visual in-context learning","volume":"36","author":"Zhang Yuanhan","year":"2023","unstructured":"Yuanhan Zhang, Kaiyang Zhou, and Ziwei Liu. 2023b. What makes good examples for visual in-context learning? Advances in Neural Information Processing Systems, Vol. 36 (2023), 17773-17794.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_65_1","volume-title":"Pre-trained language models can be fully zero-shot learners. 
arXiv preprint arXiv:2212.06950","author":"Zhao Xuandong","year":"2022","unstructured":"Xuandong Zhao, Siqi Ouyang, Zhiguo Yu, Ming Wu, and Lei Li. 2022. Pre-trained language models can be fully zero-shot learners. arXiv preprint arXiv:2212.06950 (2022)."},{"key":"e_1_3_2_1_66_1","volume-title":"Adapting language models for zero-shot learning by meta-tuning on dataset and prompt collections. arXiv preprint arXiv:2104.04670","author":"Zhong Ruiqi","year":"2021","unstructured":"Ruiqi Zhong, Kristy Lee, Zheng Zhang, and Dan Klein. 2021. Adapting language models for zero-shot learning by meta-tuning on dataset and prompt collections. arXiv preprint arXiv:2104.04670 (2021)."}],"event":{"name":"MM '25: The 33rd ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Dublin Ireland","acronym":"MM '25"},"container-title":["Proceedings of the 33rd ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3746027.3755867","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T04:12:10Z","timestamp":1765339930000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3746027.3755867"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,27]]},"references-count":66,"alternative-id":["10.1145\/3746027.3755867","10.1145\/3746027"],"URL":"https:\/\/doi.org\/10.1145\/3746027.3755867","relation":{},"subject":[],"published":{"date-parts":[[2025,10,27]]},"assertion":[{"value":"2025-10-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}