{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,22]],"date-time":"2026-05-22T23:06:21Z","timestamp":1779491181618,"version":"3.53.1"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T00:00:00Z","timestamp":1775001600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"funder":[{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["UL1TR001873"],"award-info":[{"award-number":["UL1TR001873"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["UL1TR002384"],"award-info":[{"award-number":["UL1TR002384"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"National Library of Medicine","doi-asserted-by":"publisher","award":["R01LM014344"],"award-info":[{"award-number":["R01LM014344"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"National Library of Medicine","doi-asserted-by":"publisher","award":["R01LM014573"],"award-info":[{"award-number":["R01LM014573"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"National Library of 
Medicine","doi-asserted-by":"publisher","award":["T15LM007079"],"award-info":[{"award-number":["T15LM007079"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objective<\/jats:title>\n                    <jats:p>To evaluate the effectiveness of generative query expansion for biomedical literature retrieval.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Materials and Methods<\/jats:title>\n                    <jats:p>We thoroughly examined eight generative query expansion methods using three large language models across five datasets for biomedical literature retrieval. We further performed a quantitative analysis, including performance comparisons, rank transition analysis, and article-type effect analysis. We also conducted a qualitative examination of representative cases, from which we derived an error taxonomy.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>On BioASQ-Y\/N, GPT-4o-based query expansion shifts Recall@10 to 0.417-0.512 and nDCG@10 to 0.358-0.479, relative to a baseline of 0.491 and 0.456. For PubMedQA, Precision@1 ranges from 0.764 to 0.876 and nDCG@10 from 0.847 to 0.931, compared with baseline values of 0.893 and 0.935. For 2019-TREC-PM, query expansion yields Recall@100 of 0.217-0.256 and nDCG@100 of 0.272-0.312, versus a baseline of 0.227 and 0.274. Similarly, for 2018-TREC-PM, Recall@100 spans 0.169-0.227 and nDCG@100 spans 0.195-0.250, relative to baseline scores of 0.164 and 0.191. 
For 2017-TREC-PM, Recall@100 and nDCG@100 fall within 0.111-0.139 and 0.154-0.191 under query expansion, compared with baseline metrics of 0.102 and 0.147. Both general-purpose and domain-specific Llama-based models demonstrate similar performance to GPT-4o.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion and Conclusion<\/jats:title>\n                    <jats:p>The impact of query expansion varies significantly by the expansion methods and type of evidence, but is relatively agnostic to backbone model choice. Notably, query expansion primarily affects article ranking but has a limited impact on the screening stage. Our findings underscore the unique challenges of biomedical literature retrieval and highlight the need to develop domain-specific information retrieval techniques.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocag037","type":"journal-article","created":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T22:25:22Z","timestamp":1775082322000},"page":"1121-1133","source":"Crossref","is-referenced-by-count":0,"title":["A critical evaluation of generative query expansion on biomedical literature retrieval"],"prefix":"10.1093","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2681-1931","authenticated-orcid":false,"given":"Yilu","family":"Fang","sequence":"first","affiliation":[{"name":"Department of Biomedical Informatics, Columbia University , New York, NY,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-0077-3615","authenticated-orcid":false,"given":"Gongbo","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Columbia University , New York, NY,","place":["United 
States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2926-1063","authenticated-orcid":false,"given":"Fangyi","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Columbia University , New York, NY,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9309-8331","authenticated-orcid":false,"given":"Yifan","family":"Peng","sequence":"additional","affiliation":[{"name":"Department of Population Health Sciences, Weill Cornell Medicine , New York, NY,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9624-0214","authenticated-orcid":false,"given":"Chunhua","family":"Weng","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Columbia University , New York, NY,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2026,4,1]]},"reference":[{"key":"2026052218581991600_ocag037-B1","doi-asserted-by":"crossref","first-page":"104988","DOI":"10.1016\/j.ebiom.2024.104988","article-title":"PubMed and beyond: biomedical literature search in the age of artificial intelligence","volume":"100","author":"Jin","year":"2024","journal-title":"EBioMedicine."},{"key":"2026052218581991600_ocag037-B2","doi-asserted-by":"publisher","author":"Xiong","year":"2024","DOI":"10.18653\/v1\/2024.findings-acl.372"},{"key":"2026052218581991600_ocag037-B3","doi-asserted-by":"publisher","author":"Jin","year":"2023","DOI":"10.1145\/3539618.3592005"},{"key":"2026052218581991600_ocag037-B4","doi-asserted-by":"crossref","first-page":"btad651","DOI":"10.1093\/bioinformatics\/btad651","article-title":"MedCPT: Contrastive Pre-trained Transformers with large-scale PubMed search logs for zero-shot biomedical information 
retrieval","volume":"39","author":"Jin","year":"2023","journal-title":"Bioinforma. Oxf. Engl"},{"key":"2026052218581991600_ocag037-B5","doi-asserted-by":"publisher","author":"Xu","year":"2024","DOI":"10.18653\/v1\/2024.emnlp-main.1241"},{"key":"2026052218581991600_ocag037-B6","doi-asserted-by":"crossref","first-page":"1698","DOI":"10.1016\/j.ipm.2019.05.009","article-title":"Query expansion techniques for information retrieval: A survey","volume":"56","author":"Azad","year":"2019","journal-title":"Inf. Process. Manag"},{"key":"2026052218581991600_ocag037-B7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2071389.2071390","article-title":"A Survey of automatic query expansion in information retrieval","volume":"44","author":"Carpineto","year":"2012","journal-title":"ACM Comput Surv."},{"key":"2026052218581991600_ocag037-B8","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The Unified Medical Language System (UMLS): integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2026052218581991600_ocag037-B9","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/s0933-3657(98)00045-1","article-title":"Representation of change in controlled medical terminologies","volume":"15","author":"Oliver","year":"1999","journal-title":"Artif Intell Med"},{"key":"2026052218581991600_ocag037-B10","first-page":"125","article-title":"Management of dynamic biomedical terminologies: current status and future challenges","volume":"10","author":"Da Silveira","year":"2015","journal-title":"Yearb Med Inform"},{"key":"2026052218581991600_ocag037-B11","doi-asserted-by":"crossref","first-page":"4592","DOI":"10.1093\/eurheartj\/ehaa650","article-title":"Nomenclature for kidney function and disease-executive summary and glossary from a Kidney Disease: Improving Global Outcomes (KDIGO) consensus 
conference","volume":"41","author":"Levey","year":"2020","journal-title":"Eur Heart J"},{"key":"2026052218581991600_ocag037-B12","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1016\/j.jbi.2004.08.004","article-title":"Term identification in the biomedical literature","volume":"37","author":"Krauthammer","year":"2004","journal-title":"J Biomed Inform"},{"key":"2026052218581991600_ocag037-B13","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1186\/s12859-016-1092-8","article-title":"Improving biomedical information retrieval by linear combinations of different query expansion techniques","volume":"17","author":"Abdulla","year":"2016","journal-title":"BMC Bioinformat"},{"key":"2026052218581991600_ocag037-B14","doi-asserted-by":"crossref","first-page":"45448","DOI":"10.1109\/ACCESS.2018.2861869","article-title":"Semantic sequential query expansion for biomedical article search","volume":"6","author":"Fang","year":"2018","journal-title":"IEEE Access"},{"key":"2026052218581991600_ocag037-B15","doi-asserted-by":"crossref","first-page":"100247","DOI":"10.1016\/j.smhl.2021.100247","article-title":"A hybrid query expansion framework for the optimal retrieval of the biomedical literature","volume":"23","author":"Malik","year":"2022","journal-title":"Smart Health"},{"key":"2026052218581991600_ocag037-B16","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1186\/1471-2105-11-212","article-title":"Concept-based query expansion for retrieving gene related publications from Medline","volume":"11","author":"Matos","year":"2010","journal-title":"BMC Bioinformatics."},{"key":"2026052218581991600_ocag037-B17","doi-asserted-by":"crossref","first-page":"132158","DOI":"10.1155\/2014\/132158","article-title":"Study of query expansion techniques and their application in the biomedical information 
retrieval","volume":"2014","author":"Rivas","year":"2014","journal-title":"ScientificWorldJournal."},{"key":"2026052218581991600_ocag037-B18","doi-asserted-by":"publisher","author":"Gao","year":"2023","DOI":"10.18653\/v1\/2023.acl-long.99"},{"key":"2026052218581991600_ocag037-B19","doi-asserted-by":"publisher","author":"Jia","year":"2024","DOI":"10.18653\/v1\/2024.naacl-long.138"},{"key":"2026052218581991600_ocag037-B20","doi-asserted-by":"publisher","author":"Lei","year":"2024","DOI":"10.18653\/v1\/2024.eacl-short.34"},{"key":"2026052218581991600_ocag037-B21","doi-asserted-by":"publisher","author":"Mackie","year":"2023","DOI":"10.1145\/3539618.3591992"},{"key":"2026052218581991600_ocag037-B22","doi-asserted-by":"publisher","author":"Shen","year":"2024","DOI":"10.18653\/v1\/2024.findings-acl.943"},{"key":"2026052218581991600_ocag037-B23","doi-asserted-by":"publisher","author":"Wang","year":"2023","DOI":"10.18653\/v1\/2023.emnlp-main.585"},{"key":"2026052218581991600_ocag037-B24","first-page":"1","article-title":"Large language models for information retrieval: A survey","volume":"44","author":"Zhu","year":"2026","journal-title":"ACM Trans. Inf. 
Syst"},{"key":"2026052218581991600_ocag037-B25","first-page":"24824","author":"Wei","year":"2022"},{"key":"2026052218581991600_ocag037-B26","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1038\/s41746-025-01840-7","article-title":"Accelerating clinical evidence synthesis with large language models","volume":"8","author":"Wang","year":"2025","journal-title":"NPJ Digit Med"},{"key":"2026052218581991600_ocag037-B27","doi-asserted-by":"publisher","author":"Jagerman","year":"2023","DOI":"10.48550\/arXiv.2305.03653"},{"key":"2026052218581991600_ocag037-B28","doi-asserted-by":"publisher","author":"OpenAI","year":"2024","DOI":"10.48550\/arXiv.2410.21276"},{"key":"2026052218581991600_ocag037-B29","doi-asserted-by":"publisher","author":"Grattafiori","DOI":"10.48550\/arXiv.2407.21783"},{"key":"2026052218581991600_ocag037-B30","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2408.06142","article-title":"Med42-v2: A Suite of Clinical LLMs","author":"Christophe","year":"2024"},{"key":"2026052218581991600_ocag037-B31","first-page":"333","article-title":"The probabilistic relevance framework: BM25 and beyond","volume":"3","author":"Robertson","year":"2009","journal-title":"FNT in Information Retrieval"},{"key":"2026052218581991600_ocag037-B32","doi-asserted-by":"crossref","first-page":"1250930","DOI":"10.3389\/frma.2023.1250930","article-title":"The road from manual to automatic semantic indexing of biomedical literature: a 10 years journey","volume":"8","author":"Krithara","year":"2023","journal-title":"Front Res Metr Anal"},{"key":"2026052218581991600_ocag037-B33","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1038\/s41597-023-02068-4","article-title":"BioASQ-QA: A manually curated corpus for Biomedical Question Answering","volume":"10","author":"Krithara","year":"2023","journal-title":"Sci Data."},{"key":"2026052218581991600_ocag037-B34","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1186\/s12859-015-0564-6","article-title":"An overview of 
the BIOASQ large-scale biomedical semantic indexing and question answering competition","volume":"16","author":"Tsatsaronis","year":"2015","journal-title":"BMC Bioinformatics."},{"key":"2026052218581991600_ocag037-B35","author":"Jin","year":"2019"},{"key":"2026052218581991600_ocag037-B36","author":"Roberts","year":"2019"},{"key":"2026052218581991600_ocag037-B37","author":"Roberts"},{"key":"2026052218581991600_ocag037-B38","author":"Roberts","year":"2017"},{"key":"2026052218581991600_ocag037-B39","doi-asserted-by":"publisher","author":"Lin","year":"2021","DOI":"10.1145\/3404835.3463238"},{"key":"2026052218581991600_ocag037-B40","doi-asserted-by":"publisher","author":"Bajaj","year":"2018","DOI":"10.48550\/arXiv.1611.09268"},{"key":"2026052218581991600_ocag037-B41","doi-asserted-by":"publisher","author":"Ounis","year":"2005","DOI":"10.1007\/978-3-540-31865-1_37"},{"key":"2026052218581991600_ocag037-B42","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1037\/h0061470","article-title":"The critical incident technique","volume":"51","author":"Flanagan","year":"1954","journal-title":"Psychol Bull"},{"key":"2026052218581991600_ocag037-B43","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1186\/1752-1947-7-239","article-title":"A guide to writing case reports for the Journal of Medical Case Reports and BioMed central research notes","volume":"7","author":"Rison","year":"2013","journal-title":"J Med Case Rep."},{"key":"2026052218581991600_ocag037-B44","first-page":"104","article-title":"Guidelines to writing a clinical case report","volume":"18","year":"2017","journal-title":"Heart Views Off. J. Gulf Heart Assoc"},{"key":"2026052218581991600_ocag037-B45","first-page":"1","article-title":"Basics of writing review articles","volume":"59","author":"Erol","year":"2022","journal-title":"Arch. 
Neuropsychiatry"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/33\/6\/1121\/67719245\/ocag037.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/33\/6\/1121\/67719245\/ocag037.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,22]],"date-time":"2026-05-22T22:58:30Z","timestamp":1779490710000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/33\/6\/1121\/8571780"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,1]]},"references-count":45,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2026,4,1]]},"published-print":{"date-parts":[[2026,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocag037","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,6]]},"published":{"date-parts":[[2026,4,1]]}}}