{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T15:26:15Z","timestamp":1773329175635,"version":"3.50.1"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T00:00:00Z","timestamp":1760918400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T00:00:00Z","timestamp":1760918400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J. King Saud Univ. Comput. Inf. Sci."],"published-print":{"date-parts":[[2025,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Retrieving relevant medical cases or documents is a critical information retrieval (IR) task in clinical decision support, particularly in cardiology, yet traditional search methods struggle with complex semantic queries in healthcare. Recent advances in large language models offer powerful language understanding, but LLMs alone cannot reliably retrieve factual cases due to knowledge cutoffs and hallucinations. We focus on a case retrieval task \u2013 given a textual description of a patient case, find similar prior cases or pertinent literature \u2013 formulated as a general IR problem rather than a purely medical study. Conventional lexical methods often miss semantic similarities, while static dense retrievers falter on out-of-domain medical vocabularies. LLMs can comprehend queries and context, but without external knowledge they may produce inaccurate or non-transparent results. 
We propose a novel LLM-augmented multi-agent retrieval framework that marries an LLM with dedicated retrieval agents in an iterative cooperation mechanism. Our method uses an LLM as a \u201cplanner\u201d agent to reformulate queries and integrate medical (e.g., cardiology) context, and a retrieval agent (with a knowledge index) to fetch candidate cases; the agents interact iteratively, refining search and reranking results via a retrieval-augmented generation (RAG) loop. This multi-agent design contributes three innovations: (1) an iterative query refinement strategy guided by LLM reasoning chains; (2) a cooperative retrieval architecture where an LLM agent and a search agent exchange information to improve relevance; (3) an LLM-based relevance estimator that grounds the LLM with retrieved evidence to mitigate hallucinations. Experiments on three open medical text datasets show our method outperforms baseline models by 5.3\u20136.1 percentage points in Recall@10 and NDCG, with statistically significant gains. We also observe improved generalization to novel conditions and robustness to query noise compared to baselines. 
The proposed framework, while validated on medical text, is broadly applicable to other knowledge-intensive retrieval tasks (legal case search, technical support archives), providing a foundation for intelligent IR systems that leverage both learning-based understanding and explicit retrieval for transparency and up-to-date knowledge.<\/jats:p>","DOI":"10.1007\/s44443-025-00311-z","type":"journal-article","created":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T12:21:09Z","timestamp":1760962869000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["LLM-augmented multi-agent cooperative framework for medical case retrieval in cardiology"],"prefix":"10.1007","volume":"37","author":[{"given":"Lang","family":"Deng","sequence":"first","affiliation":[]},{"given":"Huanhuan","family":"Hu","sequence":"additional","affiliation":[]},{"given":"Kongjie","family":"Lu","sequence":"additional","affiliation":[]},{"given":"Ping","family":"He","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,10,20]]},"reference":[{"key":"311_CR1","unstructured":"Achiam J, Adler S, Agarwal S, Ahmad L, Akkaya I, Aleman FL, Almeida D, Altenschmidt J, Altman S, Anadkat S et al (2023) Gpt-4 technical report. arXiv:2303.08774"},{"key":"311_CR2","doi-asserted-by":"crossref","unstructured":"Alsentzer E, Murphy JR, Boag W, Weng W-H, Jin D, Naumann T, McDermott M (2019) Publicly available clinical bert embeddings. arXiv:1904.03323","DOI":"10.18653\/v1\/W19-1909"},{"key":"311_CR3","unstructured":"Brown T, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al (2020) Language models are few-shot learners. NeurIPS"},{"key":"311_CR4","unstructured":"Bubeck S, Chandrasekaran V, Eldan R, Gehrke J, Horvitz E, Kamar E, Lee P, Lee YT, Li Y, Lundberg S et al (2023) Sparks of artificial general intelligence: early experiments with gpt-4. 
arXiv:2303.12712"},{"key":"311_CR5","doi-asserted-by":"crossref","unstructured":"Chen D, Fisch A, Weston J, Bordes A (2017) Reading wikipedia to answer open-domain questions. arXiv:1704.00051","DOI":"10.18653\/v1\/P17-1171"},{"key":"311_CR6","unstructured":"Dai Z, Zhao VY, Ma J, Luan Y, Ni J, Lu J, Bakalov A, Guu K, Hall KB, Chang M-W (2022) Promptagator: few-shot dense retrieval from 8 examples. arXiv:2209.11755"},{"key":"311_CR7","doi-asserted-by":"crossref","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2019) Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), pp 4171\u20134186","DOI":"10.18653\/v1\/N19-1423"},{"key":"311_CR8","doi-asserted-by":"crossref","unstructured":"Formal T, Lassance C, Piwowarski B, Clinchant S (2021) Splade v2: sparse lexical and expansion model for information retrieval. arXiv:2109.10086","DOI":"10.1145\/3404835.3463098"},{"key":"311_CR9","doi-asserted-by":"crossref","unstructured":"Gao L, Ma X, Lin J, Callan J (2023) Precise zero-shot dense retrieval without relevance labels. In: Proceedings of the 61st annual meeting of the association for computational linguistics (volume 1: long papers), pp 1762\u20131777","DOI":"10.18653\/v1\/2023.acl-long.99"},{"key":"311_CR10","unstructured":"Gao Y, Xiong Y, Gao X, Jia K, Pan J, Bi Y, Dai Y, Sun J, Wang H, Wang H (2023) Retrieval-augmented generation for large language models: a survey. arXiv:2312.10997, 2"},{"key":"311_CR11","first-page":"1","volume":"3","author":"Y Gu","year":"2021","unstructured":"Gu Y, Tinn R, Cheng H, Lucas M, Usuyama N, Liu X, Naumann T, Gao J, Poon H (2021) Domain-specific language model pretraining for biomedical natural language processing. 
ACM Trans Comput Healthcare (HEALTH) 3:1\u201323","journal-title":"ACM Trans Comput Healthcare (HEALTH)"},{"key":"311_CR12","doi-asserted-by":"crossref","unstructured":"Guo J, Fan Y, Ai Q, Croft WB (2016) A deep relevance matching model for ad-hoc retrieval. In: Proceedings of the 25th ACM international on conference on information and knowledge management, pp 55\u201364","DOI":"10.1145\/2983323.2983769"},{"key":"311_CR13","doi-asserted-by":"publisher","first-page":"76581","DOI":"10.1109\/ACCESS.2023.3295776","volume":"11","author":"KA Hambarde","year":"2023","unstructured":"Hambarde KA, Proenca H (2023) Information retrieval: recent advances and beyond. IEEE Access 11:76581\u201376604","journal-title":"IEEE Access"},{"key":"311_CR14","doi-asserted-by":"crossref","unstructured":"He B, He X, Chen M, Xue X, Zhu Y, Ling Z (2025) Rise: reasoning enhancement via iterative self-exploration in multi-hop question answering. arXiv:2505.21940","DOI":"10.18653\/v1\/2025.findings-acl.772"},{"key":"311_CR15","doi-asserted-by":"crossref","unstructured":"Jeong M, Sohn J, Sung M, Kang J (2024) Improving medical reasoning through retrieval and self-reflection with retrieval-augmented large language models. Bioinformatics 40:i119\u2013i129","DOI":"10.1093\/bioinformatics\/btae238"},{"key":"311_CR16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3571730","volume":"55","author":"Z Ji","year":"2023","unstructured":"Ji Z, Lee N, Frieske R, Yu T, Su D, Xu Y, Ishii E, Bang YJ, Madotto A, Fung P (2023) Survey of hallucination in natural language generation. ACM Comput Surv 55:1\u201338","journal-title":"ACM Comput Surv"},{"key":"311_CR17","doi-asserted-by":"crossref","unstructured":"Johnson AE, Pollard TJ, Shen L, Lehman L-wH, Feng M, Ghassemi M, Moody B, Szolovits P, Anthony\u00a0Celi L, Mark RG (2016) Mimic-iii, a freely accessible critical care database. 
Sci Data 3:1\u20139","DOI":"10.1038\/sdata.2016.35"},{"key":"311_CR18","doi-asserted-by":"crossref","unstructured":"Karpukhin V, Oguz B, Min S, Lewis PS, Wu L, Edunov S, Chen D, Yih W-t (2020) Dense passage retrieval for open-domain question answering. In: EMNLP (1), pp 6769\u20136781","DOI":"10.18653\/v1\/2020.emnlp-main.550"},{"key":"311_CR19","doi-asserted-by":"crossref","unstructured":"Khattab O, Zaharia M (2020) Colbert: efficient and effective passage search via contextualized late interaction over bert. In: Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval, pp 39\u201348","DOI":"10.1145\/3397271.3401075"},{"key":"311_CR20","unstructured":"Kirk R, Dina D, Voorhees E, William RH (2016) Overview of the trec 2016 clinical decision support track. In: Proceedings of the 15th text retrieval conference"},{"key":"311_CR21","unstructured":"Lazaridou A, Gribovskaya E, Stokowiec W, Grigorev N (2022) Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv:2203.05115"},{"key":"311_CR22","unstructured":"Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning. PMLR, pp 1188\u20131196"},{"key":"311_CR23","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","volume":"36","author":"J Lee","year":"2020","unstructured":"Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, Kang J (2020) Biobert: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36:1234\u20131240","journal-title":"Bioinformatics"},{"key":"311_CR24","unstructured":"Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, K\u00fcttler H, Lewis M, Yih W-t, Rockt\u00e4schel T et al (2020) Retrieval-augmented generation for knowledge-intensive nlp tasks. 
Adv Neural Inf Process Syst 33:9459\u20139474"},{"key":"311_CR25","unstructured":"Mialon G, Dess\u00ec R, Lomeli M, Nalmpantis C, Pasunuru R, Raileanu R, Rozi\u00e8re B, Schick T, Dwivedi-Yu J, Celikyilmaz A et al (2023) Augmented language models: a survey. arXiv:2302.07842"},{"key":"311_CR26","doi-asserted-by":"publisher","first-page":"26094","DOI":"10.1038\/srep26094","volume":"6","author":"R Miotto","year":"2016","unstructured":"Miotto R, Li L, Kidd BA, Dudley JT (2016) Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep 6:26094","journal-title":"Sci Rep"},{"key":"311_CR27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1561\/1500000061","volume":"13","author":"B Mitra","year":"2018","unstructured":"Mitra B, Craswell N et al (2018) An introduction to neural information retrieval. Found Trends\u00ae Inf Retriev 13:1\u2013126","journal-title":"Found Trends\u00ae Inf Retriev"},{"key":"311_CR28","unstructured":"Nachimovsky H, Tennenholtz M, Kurland O (2025) A multi-agent perspective on modern information retrieval. arXiv:2502.14796"},{"key":"311_CR29","unstructured":"Nogueira R, Cho K (2019) Passage re-ranking with bert. arXiv:1901.04085"},{"key":"311_CR30","unstructured":"Nogueira R, Yang W, Lin J, Cho K (2019) Document expansion by query prediction. arXiv:1904.08375"},{"key":"311_CR31","doi-asserted-by":"crossref","unstructured":"Pal A, Umapathi LK, Sankarasubbu M (2023) Med-halt: medical domain hallucination test for large language models. arXiv:2307.15343","DOI":"10.18653\/v1\/2023.conll-1.21"},{"key":"311_CR32","doi-asserted-by":"crossref","unstructured":"Qu Y, Ding Y, Liu J, Liu K, Ren R, Zhao WX, Dong D, Wu H, Wang H (2020) Rocketqa: an optimized training approach to dense passage retrieval for open-domain question answering. 
arXiv:2010.08191","DOI":"10.18653\/v1\/2021.naacl-main.466"},{"key":"311_CR33","doi-asserted-by":"crossref","unstructured":"Reimers N, Gurevych I (2019) Sentence-bert: sentence embeddings using siamese bert-networks. arXiv:1908.10084","DOI":"10.18653\/v1\/D19-1410"},{"key":"311_CR34","doi-asserted-by":"publisher","first-page":"1431","DOI":"10.1093\/jamia\/ocaa091","volume":"27","author":"K Roberts","year":"2020","unstructured":"Roberts K, Alam T, Bedrick S, Demner-Fushman D, Lo K, Soboroff I, Voorhees E, Wang LL, Hersh WR (2020) Trec-covid: rationale and structure of an information retrieval shared task for covid-19. J Am Med Inform Assoc 27:1431\u20131436","journal-title":"J Am Med Inform Assoc"},{"key":"311_CR35","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1561\/1500000019","volume":"3","author":"S Robertson","year":"2009","unstructured":"Robertson S, Zaragoza H et al (2009) The probabilistic relevance framework: Bm25 and beyond. Found Trends\u00ae Inf Retriev 3:333\u2013389","journal-title":"Found Trends\u00ae Inf Retriev"},{"key":"311_CR36","doi-asserted-by":"crossref","unstructured":"Santhanam K, Khattab O, Saad-Falcon J, Potts C, Zaharia M (2021) Colbertv2: effective and efficient retrieval via lightweight late interaction. arXiv:2112.01488","DOI":"10.18653\/v1\/2022.naacl-main.272"},{"key":"311_CR37","first-page":"68539","volume":"36","author":"T Schick","year":"2023","unstructured":"Schick T, Dwivedi-Yu J, Dess\u00ec R, Raileanu R, Lomeli M, Hambro E, Zettlemoyer L, Cancedda N, Scialom T (2023) Toolformer: language models can teach themselves to use tools. Adv Neural Inf Process Syst 36:68539\u201368551","journal-title":"Adv Neural Inf Process Syst"},{"key":"311_CR38","first-page":"38154","volume":"36","author":"Y Shen","year":"2023","unstructured":"Shen Y, Song K, Tan X, Li D, Lu W, Zhuang Y (2023) Hugginggpt: solving ai tasks with chatgpt and its friends in hugging face. 
Adv Neural Inf Process Syst 36:38154\u201338180","journal-title":"Adv Neural Inf Process Syst"},{"key":"311_CR39","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1038\/s41586-023-06291-2","volume":"620","author":"K Singhal","year":"2023","unstructured":"Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S et al (2023) Large language models encode clinical knowledge. Nature 620:172\u2013180","journal-title":"Nature"},{"key":"311_CR40","unstructured":"Thakur N, Reimers N, R\u00fcckl\u00e9 A, Srivastava A, Gurevych I (2021) BEIR: a heterogeneous benchmark for zero-shot evaluation of information retrieval models"},{"key":"311_CR41","unstructured":"Touvron H, Martin L, Stone K, Albert P, Almahairi A, Babaei Y, Bashlykov N, Batra S, Bhargava P, Bhosale S et al (2023) Llama 2: open foundation and fine-tuned chat models. arXiv:2307.09288"},{"key":"311_CR42","doi-asserted-by":"crossref","unstructured":"Tsatsaronis G, Balikas G, Malakasiotis P, Partalas I, Zschunke M, Alvers MR, Weissenborn D, Krithara A, Petridis S, Polychronopoulos D et al (2015) An overview of the bioasq large-scale biomedical semantic indexing and question answering competition. BMC Bioinform 16:138","DOI":"10.1186\/s12859-015-0564-6"},{"key":"311_CR43","doi-asserted-by":"publisher","first-page":"1115","DOI":"10.1007\/s10439-023-03327-6","volume":"52","author":"C Wang","year":"2024","unstructured":"Wang C, Ong J, Wang C, Ong H, Cheng R, Ong D (2024) Potential for gpt technology to optimize future clinical decision-making using retrieval-augmented generation. Ann Biomed Eng 52:1115\u20131118","journal-title":"Ann Biomed Eng"},{"key":"311_CR44","first-page":"24824","volume":"35","author":"J Wei","year":"2022","unstructured":"Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, Zhou D et al (2022) Chain-of-thought prompting elicits reasoning in large language models. 
Adv Neural Inf Process Syst 35:24824\u201324837","journal-title":"Adv Neural Inf Process Syst"},{"key":"311_CR45","doi-asserted-by":"crossref","unstructured":"Xiong G, Jin Q, Lu Z, Zhang A (2024) Benchmarking retrieval-augmented generation for medicine. In: Findings of the association for computational linguistics ACL 2024, pp 6233\u20136251","DOI":"10.18653\/v1\/2024.findings-acl.372"},{"key":"311_CR46","unstructured":"Xiong L, Xiong C, Li Y, Tang K-F, Liu J, Bennett P, Ahmed J, Overwijk A (2020) Approximate nearest neighbor negative contrastive learning for dense text retrieval. arXiv:2007.00808"},{"key":"311_CR47","unstructured":"Yao S, Zhao J, Yu D, Du N, Shafran I, Narasimhan K, Cao Y (2023) React: synergizing reasoning and acting in language models. In: International Conference on Learning Representations (ICLR)"},{"key":"311_CR48","doi-asserted-by":"publisher","first-page":"AIoa2300068","DOI":"10.1056\/AIoa2300068","volume":"1","author":"C Zakka","year":"2024","unstructured":"Zakka C, Shad R, Chaurasia A, Dalal AR, Kim JL, Moor M, Fong R, Phillips C, Alexander K, Ashley E et al (2024) Almanac\u2014retrieval-augmented language models for clinical medicine. Nejm ai 1:AIoa2300068","journal-title":"Nejm ai"},{"key":"311_CR49","unstructured":"Zhao P, Zhang H, Yu Q, Wang Z, Geng Y, Fu F, Yang L, Zhang W, Jiang J, Cui B (2024) Retrieval-augmented generation for ai-generated content: a survey. arXiv:2402.19473"},{"key":"311_CR50","doi-asserted-by":"publisher","first-page":"909","DOI":"10.1038\/s41597-023-02814-8","volume":"10","author":"Z Zhao","year":"2023","unstructured":"Zhao Z, Jin Q, Chen F, Peng T, Yu S (2023) A large-scale dataset of patient summaries for retrieval-based clinical decision support systems. 
Sci Data 10:909","journal-title":"Sci Data"}],"container-title":["Journal of King Saud University Computer and Information Sciences"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44443-025-00311-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44443-025-00311-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44443-025-00311-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T15:45:35Z","timestamp":1763567135000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44443-025-00311-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,20]]},"references-count":50,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2025,11]]}},"alternative-id":["311"],"URL":"https:\/\/doi.org\/10.1007\/s44443-025-00311-z","relation":{},"ISSN":["1319-1578","2213-1248"],"issn-type":[{"value":"1319-1578","type":"print"},{"value":"2213-1248","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,20]]},"assertion":[{"value":"19 August 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 September 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 October 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that 
could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"267"}}