{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,9]],"date-time":"2026-05-09T17:22:48Z","timestamp":1778347368498,"version":"3.51.4"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2025,8,4]],"date-time":"2025-08-04T00:00:00Z","timestamp":1754265600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"name":"National Institutes of Health\u2019s"},{"DOI":"10.13039\/100008460","name":"National Center for Complementary and Integrative Health","doi-asserted-by":"publisher","award":["R01AT009457"],"award-info":[{"award-number":["R01AT009457"]}],"id":[{"id":"10.13039\/100008460","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100008460","name":"National Center for Complementary and Integrative Health","doi-asserted-by":"publisher","award":["U01AT012871"],"award-info":[{"award-number":["U01AT012871"]}],"id":[{"id":"10.13039\/100008460","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000049","name":"National Institute on Aging","doi-asserted-by":"publisher","award":["R01AG078154"],"award-info":[{"award-number":["R01AG078154"]}],"id":[{"id":"10.13039\/100000049","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["R01CA287413"],"award-info":[{"award-number":["R01CA287413"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000062","name":"National Institute of Diabetes and Digestive and Kidney Diseases","doi-asserted-by":"publisher","award":["R01DK115629"],"award-info":[{"award-number":["R01DK115629"]}],"id":[{"id":"10.13039\/100000062","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006545","name":"National Institute on Minority Health and Health 
Disparities","doi-asserted-by":"publisher","award":["1R21MD019134-01"],"award-info":[{"award-number":["1R21MD019134-01"]}],"id":[{"id":"10.13039\/100006545","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objectives<\/jats:title>\n                  <jats:p>To optimize in-context learning in biomedical natural language processing by improving example selection.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We introduce a novel multi-mode retrieval-augmented generation (MMRAG) framework, which integrates 4 retrieval strategies: (1) Random Mode, selecting examples arbitrarily; (2) Top Mode, retrieving the most relevant examples based on similarity; (3) Diversity Mode, ensuring variation in selected examples; and (4) Class Mode, selecting category-representative examples. This study evaluates MMRAG on 3 core biomedical NLP tasks: Named Entity Recognition (NER), Relation Extraction (RE), and Text Classification (TC). The datasets used include BC2GM for gene and protein mention recognition (NER), DDI for drug-drug interaction extraction (RE), GIT for general biomedical information extraction (RE), and HealthAdvice for health-related text classification (TC). The framework is tested with 2 large language models (Llama-2-7B and Llama-3-8B) and 3 retrievers (Contriever, MedCPT, and BGE-Large) to assess performance across different retrieval strategies.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The results from the Random Mode indicate that providing more examples in the prompt improves the model\u2019s generation performance. 
Meanwhile, Top Mode and Diversity Mode significantly outperform Random Mode on the RE (DDI) task, achieving an F1 score of 0.9669\u2014a 26.4% improvement. Among the 3 retrievers tested, Contriever outperformed the other 2 in a greater number of experiments. Additionally, Llama 2 and Llama 3 demonstrated varying capabilities across different tasks, with Llama 3 showing a clear advantage in handling NER tasks.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>MMRAG effectively enhances biomedical in-context learning by refining example selection, mitigating data scarcity issues, and demonstrating superior adaptability for NLP-driven healthcare applications.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocaf128","type":"journal-article","created":{"date-parts":[[2025,7,17]],"date-time":"2025-07-17T10:05:16Z","timestamp":1752746716000},"page":"1505-1516","source":"Crossref","is-referenced-by-count":7,"title":["MMRAG: multi-mode retrieval-augmented generation with large language models for biomedical in-context learning"],"prefix":"10.1093","volume":"32","author":[{"given":"Zaifu","family":"Zhan","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Minnesota , Minneapolis, MN 55455,","place":["United States"]}]},{"given":"Jun","family":"Wang","sequence":"additional","affiliation":[{"name":"Division of Computational Health Sciences, Department of Surgery, University of Minnesota , Minneapolis, MN 55455,","place":["United States"]}]},{"given":"Shuang","family":"Zhou","sequence":"additional","affiliation":[{"name":"Division of Computational Health Sciences, Department of Surgery, University of Minnesota , Minneapolis, MN 55455,","place":["United States"]}]},{"given":"Jiawen","family":"Deng","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University of 
Minnesota , Minneapolis, MN 55455,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8258-3585","authenticated-orcid":false,"given":"Rui","family":"Zhang","sequence":"additional","affiliation":[{"name":"Division of Computational Health Sciences, Department of Surgery, University of Minnesota , Minneapolis, MN 55455,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,8,4]]},"reference":[{"key":"2025092208423063800_ocaf128-B1","author":"Radford","year":"2018"},{"key":"2025092208423063800_ocaf128-B2","author":"Radford","year":"2019"},{"key":"2025092208423063800_ocaf128-B3","author":"Touvron","year":"2023"},{"key":"2025092208423063800_ocaf128-B4","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1038\/s44387-025-00011-z","article-title":"Large language models for disease diagnosis: a scoping review","volume":"1","author":"Zhou","year":"2025","journal-title":"NPJ Artif Intell"},{"key":"2025092208423063800_ocaf128-B5","doi-asserted-by":"publisher","first-page":"1589","DOI":"10.1109\/JBHI.2017.2767063","article-title":"Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis","volume":"22","author":"Shickel","year":"2018","journal-title":"IEEE J Biomed Health Inform"},{"key":"2025092208423063800_ocaf128-B6","first-page":"5373","author":"Zhan","year":"2025"},{"key":"2025092208423063800_ocaf128-B7","author":"Li","year":"2024"},{"key":"2025092208423063800_ocaf128-B8","doi-asserted-by":"publisher","first-page":"2054","DOI":"10.1093\/jamia\/ocae079","article-title":"Large language models leverage external knowledge to extend clinical insight beyond language boundaries","volume":"31","author":"Wu","year":"2024","journal-title":"J Am Med Inform Assoc"},{"key":"2025092208423063800_ocaf128-B9","doi-asserted-by":"publisher","first-page":"127079","DOI":"10.1016\/j.neucom.2023.127079","article-title":"A survey of the recent trends in deep learning for literature based 
discovery in the biomedical domain","volume":"568","author":"Cesario","year":"2024","journal-title":"Neurocomputing"},{"key":"2025092208423063800_ocaf128-B10","doi-asserted-by":"crossref","first-page":"2300163","DOI":"10.1002\/gch2.202300163","article-title":"Biomedical big data technologies, applications, and challenges for precision medicine: a review","volume":"8","author":"Yang","year":"2024","journal-title":"Glob Chall"},{"key":"2025092208423063800_ocaf128-B11","doi-asserted-by":"publisher","first-page":"3615","DOI":"10.18653\/v1\/D19-1371","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Beltagy","year":"2019"},{"key":"2025092208423063800_ocaf128-B12","doi-asserted-by":"publisher","first-page":"1754","DOI":"10.18653\/v1\/2022.emnlp-main.115","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Mitchell","year":"2022"},{"key":"2025092208423063800_ocaf128-B13","first-page":"10767","author":"Hong","year":"2022"},{"key":"2025092208423063800_ocaf128-B14","first-page":"493","article-title":"Classifying supplement use status in clinical notes","volume":"2017","author":"Fan","year":"2017","journal-title":"AMIA Jt Summits Transl Sci Proc"},{"key":"2025092208423063800_ocaf128-B15","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/s12911-018-0626-6","article-title":"Using natural language processing methods to classify use status of dietary supplements in clinical notes","volume":"18","author":"Fan","year":"2018","journal-title":"BMC Med Inform Decision Making"},{"key":"2025092208423063800_ocaf128-B16","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1038\/s41746-025-01429-0","article-title":"Privacy preserving strategies for electronic health records in the era of large language 
models","volume":"8","author":"Jonnagaddala","year":"2025","journal-title":"NPJ Digit Med"},{"key":"2025092208423063800_ocaf128-B17","doi-asserted-by":"crossref","first-page":"e17984","DOI":"10.2196\/17984","article-title":"Clinical text data in machine learning: systematic review","volume":"8","author":"Spasic","year":"2020","journal-title":"JMIR Med Inform"},{"key":"2025092208423063800_ocaf128-B18","first-page":"540","author":"Chapman","year":"2011"},{"key":"2025092208423063800_ocaf128-B19","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1186\/s13000-024-01464-7","article-title":"Challenges and barriers of using large language models (LLM) such as ChatGPT for diagnostic medicine with a focus on digital pathology\u2014a recent scoping review","volume":"19","author":"Ullah","year":"2024","journal-title":"Diagn Pathol"},{"key":"2025092208423063800_ocaf128-B20","first-page":"4850","author":"Chen","year":"2024"},{"key":"2025092208423063800_ocaf128-B21","doi-asserted-by":"crossref","first-page":"108734","DOI":"10.1016\/j.compbiomed.2024.108734","article-title":"Current strategies to address data scarcity in artificial intelligence-based drug discovery: a comprehensive review","volume":"179","author":"Gangwal","year":"2024","journal-title":"Comput Biol Med"},{"key":"2025092208423063800_ocaf128-B22","doi-asserted-by":"crossref","first-page":"1701","DOI":"10.1007\/s11831-023-10028-9","article-title":"Advances in deep learning models for resolving medical image segmentation data scarcity problem: a topical review","volume":"31","author":"Upadhyay","year":"2024","journal-title":"Arch Computat Methods Eng"},{"key":"2025092208423063800_ocaf128-B23","author":"Jin","year":"2024"},{"key":"2025092208423063800_ocaf128-B24","doi-asserted-by":"publisher","first-page":"1107","DOI":"10.18653\/v1\/2024.emnlp-main.64","volume-title":"Proceedings of the 2024 Conference on Empirical Methods in Natural Language 
Processing","author":"Dong","year":"2024"},{"key":"2025092208423063800_ocaf128-B25","doi-asserted-by":"publisher","first-page":"9134","DOI":"10.18653\/v1\/2022.emnlp-main.622","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Zhang","year":"2022"},{"key":"2025092208423063800_ocaf128-B26","first-page":"1","article-title":"Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing","volume":"55","author":"Liu","year":"2023","journal-title":"ACM Comput Surv"},{"key":"2025092208423063800_ocaf128-B27","doi-asserted-by":"publisher","first-page":"1423","DOI":"10.18653\/v1\/2023.acl-long.79","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Wu","year":"2023"},{"key":"2025092208423063800_ocaf128-B28","doi-asserted-by":"publisher","first-page":"100","DOI":"10.18653\/v1\/2022.deelio-1.10","volume-title":"Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures","author":"Liu","year":"2022"},{"key":"2025092208423063800_ocaf128-B29","author":"Xiong","year":"2024"},{"key":"2025092208423063800_ocaf128-B30","author":"Mavromatis","year":"2023"},{"key":"2025092208423063800_ocaf128-B31","first-page":"3293","author":"Kumari","year":"2024"},{"key":"2025092208423063800_ocaf128-B32","author":"Li"},{"key":"2025092208423063800_ocaf128-B33","first-page":"6491","author":"Fan","year":"2024"},{"key":"2025092208423063800_ocaf128-B34","first-page":"9459","volume-title":"Advances in Neural Information Processing Systems","author":"Lewis","year":"2020"},{"key":"2025092208423063800_ocaf128-B35","first-page":"102","author":"Yu","year":"2024"},{"key":"2025092208423063800_ocaf128-B36","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1093\/jamia\/ocaf002","article-title":"RAMIE: retrieval-augmented multi-task 
information extraction with large language models on dietary supplements","volume":"32","author":"Zhan","year":"2025","journal-title":"J Am Med Inform Assoc"},{"key":"2025092208423063800_ocaf128-B37","author":"Touvron","year":"2023"},{"key":"2025092208423063800_ocaf128-B38","author":"Grattafiori","year":"2024"},{"key":"2025092208423063800_ocaf128-B39","author":"Izacard","year":"2022"},{"key":"2025092208423063800_ocaf128-B40","doi-asserted-by":"crossref","first-page":"btad651","DOI":"10.1093\/bioinformatics\/btad651","article-title":"MedCPT: contrastive pre-trained transformers with large-scale PubMed search logs for zero-shot biomedical information retrieval","volume":"39","author":"Jin","year":"2023","journal-title":"Bioinformatics"},{"key":"2025092208423063800_ocaf128-B41","author":"Xiao","year":"2023"},{"key":"2025092208423063800_ocaf128-B42","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/gb-2008-9-s2-s2","article-title":"Overview of BioCreative II gene mention recognition","volume":"9 Suppl 2","author":"Smith","year":"2008","journal-title":"Genome Biol"},{"key":"2025092208423063800_ocaf128-B43","first-page":"341","volume-title":"Second Joint Conference on Lexical and Computational Semantics (SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)","author":"Segura-Bedmar","year":"2013"},{"key":"2025092208423063800_ocaf128-B44","author":"Li","year":"2024"},{"key":"2025092208423063800_ocaf128-B45","first-page":"4664","author":"Yu","year":"2019"},{"key":"2025092208423063800_ocaf128-B46","author":"Hu","year":"2022"},{"key":"2025092208423063800_ocaf128-B47","author":"Zhan"},{"key":"2025092208423063800_ocaf128-B48","author":"Zhan"},{"key":"2025092208423063800_ocaf128-B49","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1038\/s41746-024-01126-4","article-title":"An in-depth evaluation of federated learning on biomedical natural language processing for information 
extraction","volume":"7","author":"Peng","year":"2024","journal-title":"NPJ Digit Med"},{"key":"2025092208423063800_ocaf128-B50","doi-asserted-by":"crossref","first-page":"104739","DOI":"10.1016\/j.jbi.2024.104739","article-title":"based learning for few-shot biomedical named entity recognition under machine reading comprehension","volume":"159","author":"Su","year":"2024","journal-title":"J Biomed Inform"},{"key":"2025092208423063800_ocaf128-B51","doi-asserted-by":"crossref","first-page":"102153","DOI":"10.1016\/j.artmed.2021.102153","article-title":"TP-DDI: transformer-based pipeline for the extraction of drug-drug interactions","volume":"119","author":"Zaikis","year":"2021","journal-title":"Artif Intell Med"},{"key":"2025092208423063800_ocaf128-B52","doi-asserted-by":"crossref","first-page":"13743","DOI":"10.1007\/s10462-023-10484-6","article-title":"Multi-task learning for few-shot biomedical relation extraction","volume":"56","author":"Moscato","year":"2023","journal-title":"Artif Intell Rev"}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/32\/10\/1505\/63948576\/ocaf128.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/32\/10\/1505\/63948576\/ocaf128.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,22]],"date-time":"2025-09-22T12:42:39Z","timestamp":1758544959000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/32\/10\/1505\/8222125"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,4]]},"references-count":52,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,8,4]]},"published-print":{"date-parts":[[2025,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaf128","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,10]]},"published":{"date-parts":[[2025,8,4]]}}}