{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T09:13:08Z","timestamp":1770714788438,"version":"3.49.0"},"reference-count":49,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,5,6]],"date-time":"2025-05-06T00:00:00Z","timestamp":1746489600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,5,6]],"date-time":"2025-05-06T00:00:00Z","timestamp":1746489600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Clinical trials are an essential component of drug development for new cancer treatments, yet the information required to determine a patient\u2019s eligibility for enrollment is scattered in large amounts of unstructured text. Genomic biomarkers are especially important in precision medicine and targeted therapies, making them essential for matching patients to appropriate trials. Large language models (LLMs) offer a promising solution for extracting this information from clinical trial study descriptions (e.g., brief summary, eligibility criteria), aiding in identifying suitable patient matches in downstream applications. In this study, we explore various strategies for extracting genetic biomarkers from oncology trials. Therefore, our focus is on structuring unstructured clinical trial data, not processing individual patient records. Our results show that open-source language models, when applied out-of-the-box, effectively capture complex logical expressions and structure genomic biomarkers, outperforming closed-source models such as GPT-4. Furthermore, fine-tuning these open-source models with additional data significantly enhances their performance.<\/jats:p>","DOI":"10.1038\/s41746-025-01673-4","type":"journal-article","created":{"date-parts":[[2025,5,5]],"date-time":"2025-05-05T23:11:56Z","timestamp":1746486716000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Enhancing biomarker based oncology trial matching using large language models"],"prefix":"10.1038","volume":"8","author":[{"given":"Nour","family":"Alkhoury","sequence":"first","affiliation":[]},{"given":"Maqsood","family":"Shaik","sequence":"additional","affiliation":[]},{"given":"Ricardo","family":"Wurmus","sequence":"additional","affiliation":[]},{"given":"Altuna","family":"Akalin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,5,6]]},"reference":[{"key":"1673_CR1","doi-asserted-by":"crossref","unstructured":"Padma, V. V. An overview of targeted cancer therapy. Biomedicine 5, 1\u20136 (2015).","DOI":"10.7603\/s40681-015-0019-4"},{"key":"1673_CR2","doi-asserted-by":"publisher","first-page":"229","DOI":"10.3322\/caac.21834","volume":"74","author":"F Bray","year":"2024","unstructured":"Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229\u2013263 (2024).","journal-title":"CA Cancer J. Clin."},{"key":"1673_CR3","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1016\/j.flm.2017.06.001","volume":"1","author":"K Xing","year":"2017","unstructured":"Xing, K. & Shen, L. Molecular targeted therapy of cancer: the progress and future prospect. Front. Lab. Med. 1, 69\u201375 (2017).","journal-title":"Front. Lab. Med."},{"key":"1673_CR4","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1038\/s41392-021-00572-w","volume":"6","author":"L Zhong","year":"2021","unstructured":"Zhong, L. et al. Small molecules in targeted cancer therapy: advances, challenges, and future perspectives. Signal Transduc. Target. Ther. 6, 201 (2021).","journal-title":"Signal Transduc. Target. Ther."},{"key":"1673_CR5","doi-asserted-by":"publisher","first-page":"9663","DOI":"10.1007\/s11033-023-08809-3","volume":"50","author":"R Kaur","year":"2023","unstructured":"Kaur, R., Bhardwaj, A. & Gupta, S. Cancer treatment therapies: traditional to modern approaches to combat cancers. Mol. Biol. Rep. 50, 9663\u20139676 (2023).","journal-title":"Mol. Biol. Rep."},{"key":"1673_CR6","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1038\/bjc.2017.197","volume":"117","author":"G Pirovano","year":"2017","unstructured":"Pirovano, G. et al. TOPK modulates tumour-specific radiosensitivity and correlates with recurrence after prostate radiotherapy. Br. J. Cancer 117, 503\u2013512 (2017).","journal-title":"Br. J. Cancer"},{"key":"1673_CR7","doi-asserted-by":"publisher","first-page":"351","DOI":"10.1038\/s41578-020-00269-6","volume":"6","author":"MT Manzari","year":"2021","unstructured":"Manzari, M. T. et al. Targeted drug delivery strategies for precision medicines. Nat. Rev. Mater. 6, 351\u2013370 (2021).","journal-title":"Nat. Rev. Mater."},{"key":"1673_CR8","doi-asserted-by":"publisher","first-page":"694","DOI":"10.1377\/hlthaff.2017.1624","volume":"37","author":"GS Ginsburg","year":"2018","unstructured":"Ginsburg, G. S. & Phillips, K. A. Precision medicine: from science to value. Health Aff. 37, 694\u2013701 (2018).","journal-title":"Health Aff."},{"key":"1673_CR9","doi-asserted-by":"publisher","first-page":"13618","DOI":"10.3390\/ijms241713618","volume":"24","author":"HY Choi","year":"2023","unstructured":"Choi, H. Y. & Chang, J.-E. Targeted therapy for cancers: from ongoing clinical trials to FDA-Approved drugs. Int. J. Mol. Sci. 24, 13618 (2023).","journal-title":"Int. J. Mol. Sci."},{"key":"1673_CR10","doi-asserted-by":"publisher","first-page":"783","DOI":"10.1056\/NEJM200103153441101","volume":"344","author":"DJ Slamon","year":"2001","unstructured":"Slamon, D. J. et al. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. N. Engl. J. Med. 344, 783\u2013792 (2001).","journal-title":"N. Engl. J. Med."},{"key":"1673_CR11","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1093\/biostatistics\/kxx069","volume":"20","author":"CH Wong","year":"2018","unstructured":"Wong, C. H., Siah, K. W. & Lo, A. W. Estimation of clinical trial success rates and related parameters. Biostatistics 20, 273\u2013286 (2018).","journal-title":"Biostatistics"},{"key":"1673_CR12","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1158\/2159-8290.CD-23-0467","volume":"14","author":"SP Suehnholz","year":"2023","unstructured":"Suehnholz, S. P. et al. Quantifying the expanding landscape of clinical actionability for patients with cancer. Cancer Discov. 14, 49\u201365 (2023).","journal-title":"Cancer Discov."},{"key":"1673_CR13","doi-asserted-by":"publisher","unstructured":"Unger, J. M., Cook, E., Tai, E. & Bleyer, A. The role of clinical trial participation in cancer research: barriers, evidence, and strategies. Am. Soc. Clin. Oncol. Educ. Book 185\u2013198 https:\/\/doi.org\/10.1200\/edbk_156686 (2016).","DOI":"10.1200\/edbk_156686"},{"key":"1673_CR14","doi-asserted-by":"publisher","first-page":"dju229","DOI":"10.1093\/jnci\/dju229","volume":"106","author":"KD Stensland","year":"2014","unstructured":"Stensland, K. D. et al. Adult cancer clinical trials that fail to complete: an Epidemic? JNCI J. Natl Cancer Inst. 106, dju229 (2014).","journal-title":"JNCI J. Natl Cancer Inst."},{"key":"1673_CR15","doi-asserted-by":"publisher","first-page":"875","DOI":"10.1001\/jamaoncol.2015.6487","volume":"2","author":"JM Unger","year":"2016","unstructured":"Unger, J. M. et al. The scientific impact of positive and negative phase 3 cancer clinical trials. JAMA Oncol. 2, 875 (2016).","journal-title":"JAMA Oncol."},{"key":"1673_CR16","doi-asserted-by":"publisher","first-page":"5557","DOI":"10.1158\/1078-0432.CCR-10-0133","volume":"16","author":"SK Cheng","year":"2010","unstructured":"Cheng, S. K., Dietrich, M. S. & Dilts, D. M. A sense of urgency: evaluating the link between clinical trial development time and the accrual performance of cancer therapy evaluation program (NCI-CTEP) sponsored studies. Clin. Cancer Res. 16, 5557\u20135563 (2010).","journal-title":"Clin. Cancer Res."},{"key":"1673_CR17","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1093\/jnci\/djy221","volume":"111","author":"JM Unger","year":"2019","unstructured":"Unger, J. M., Vaidya, R., Hershman, D. L., Minasian, L. M. & Fleury, M. E. Systematic review and meta-analysis of the magnitude of structural, clinical, and physician and patient barriers to cancer clinical trial participation. J. Natl Cancer Inst. 111, 245\u2013255 (2019).","journal-title":"J. Natl Cancer Inst."},{"key":"1673_CR18","doi-asserted-by":"publisher","first-page":"e275","DOI":"10.1200\/JOP.2013.001120","volume":"9","author":"CP Somkin","year":"2013","unstructured":"Somkin, C. P. et al. Effect of medical oncologists\u2019 attitudes on accrual to clinical trials in a community setting. J. Oncol. Pract. 9, e275\u2013e283 (2013).","journal-title":"J. Oncol. Pract."},{"key":"1673_CR19","unstructured":"Organizational barriers to physician participation in cancer clinical trials. PubMed https:\/\/pubmed.ncbi.nlm.nih.gov\/16044978\/ (2005)."},{"key":"1673_CR20","doi-asserted-by":"publisher","first-page":"2067","DOI":"10.1200\/JCO.1991.9.11.2067","volume":"9","author":"AB Benson","year":"1991","unstructured":"Benson, A. B. et al. Oncologists\u2019 reluctance to accrue patients onto clinical trials: an Illinois Cancer Center study. J. Clin. Oncol. 9, 2067\u20132075 (1991).","journal-title":"J. Clin. Oncol."},{"key":"1673_CR21","doi-asserted-by":"publisher","first-page":"1203","DOI":"10.1200\/JCO.2000.18.6.1203","volume":"18","author":"LA Siminoff","year":"2000","unstructured":"Siminoff, L. A., Zhang, A., Colabianchi, N., Sturm, C. M. S. & Shen, Q. Factors that predict the referral of breast cancer patients onto clinical trials by their surgeons and medical oncologists. J. Clin. Oncol. 18, 1203\u20131211 (2000).","journal-title":"J. Clin. Oncol."},{"key":"1673_CR22","doi-asserted-by":"publisher","first-page":"521","DOI":"10.1093\/jamiaopen\/ooz041","volume":"2","author":"G Karystianis","year":"2019","unstructured":"Karystianis, G., Florez-Vargas, O., Butler, T. & Nenadic, G. A rule-based approach to identify patient eligibility criteria for clinical trials from narrative longitudinal records. JAMIA Open 2, 521\u2013527 (2019).","journal-title":"JAMIA Open"},{"key":"1673_CR23","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1016\/j.jbi.2010.09.007","volume":"44","author":"SW Tu","year":"2010","unstructured":"Tu, S. W. et al. A practical method for transforming free-text eligibility criteria into computable criteria. J. Biomed. Inform. 44, 239\u2013250 (2010).","journal-title":"J. Biomed. Inform."},{"key":"1673_CR24","unstructured":"Wong, C. et al. Scaling clinical trial matching using large language models: a case study in oncology. arXiv.org https:\/\/arxiv.org\/abs\/2308.02180 (2023)."},{"key":"1673_CR25","doi-asserted-by":"publisher","first-page":"e27767","DOI":"10.2196\/27767","volume":"9","author":"T Haddad","year":"2021","unstructured":"Haddad, T. et al. Accuracy of an artificial intelligence system for cancer clinical trial eligibility screening: Retrospective pilot study. JMIR Med. Inform. 9, e27767 (2021).","journal-title":"JMIR Med. Inform."},{"key":"1673_CR26","unstructured":"Brown, T. B. et al. Language Models are Few-Shot Learners. arXiv.org https:\/\/arxiv.org\/abs\/2005.14165 (2020)."},{"key":"1673_CR27","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","volume":"36","author":"J Lee","year":"2019","unstructured":"Lee, J. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234\u20131240 (2019).","journal-title":"Bioinformatics"},{"key":"1673_CR28","unstructured":"Hamer, D. M. D., Schoor, P., Polak, T. B. & Kapitan, D. Improving patient pre-screening for clinical trials: assisting physicians with large language models. arXiv.org https:\/\/arxiv.org\/abs\/2304.07396 (2023)."},{"key":"1673_CR29","doi-asserted-by":"crossref","unstructured":"Jin, Q. et al. Matching patients to clinical trials with large language models. Nat. Commun. 15, 9074 (2024).","DOI":"10.1038\/s41467-024-53081-z"},{"key":"1673_CR30","doi-asserted-by":"crossref","unstructured":"Nievas, M., Basu, A., Wang, Y. & Singh, H. Distilling large language models for matching patients to clinical trials. J. American Med. Inform. Assoc. 31, 1953\u20131963 (2024).","DOI":"10.1093\/jamia\/ocae073"},{"key":"1673_CR31","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1093\/jamia\/ocad218","volume":"31","author":"S Datta","year":"2023","unstructured":"Datta, S. et al. AutoCriteria: a generalizable clinical trial eligibility criteria extraction system powered by large language models. J. Am. Med. Inform. Assoc. 31, 375\u2013385 (2023).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"1673_CR32","unstructured":"Jiang, A. Q. et al. Mistral 7B. arXiv.org https:\/\/arxiv.org\/abs\/2310.06825 (2023)."},{"key":"1673_CR33","unstructured":"Minaee, S. et al. Large Language Models: a survey. arXiv.org https:\/\/arxiv.org\/abs\/2402.06196 (2024)."},{"key":"1673_CR34","unstructured":"Touvron, H. et al. LLAMA: Open and efficient foundation language models. arXiv.org https:\/\/arxiv.org\/abs\/2302.13971 (2023)."},{"key":"1673_CR35","unstructured":"Ouyang, L. et al. Training language models to follow instructions with human feedback. arXiv.org https:\/\/arxiv.org\/abs\/2203.02155 (2022)."},{"key":"1673_CR36","unstructured":"OpenAI et al. GPT-4 Technical Report. arXiv.org https:\/\/arxiv.org\/abs\/2303.08774 (2023)."},{"key":"1673_CR37","unstructured":"Ye, J. et al. A comprehensive capability analysis of GPT-3 and GPT-3.5 series models. arXiv.org https:\/\/arxiv.org\/abs\/2303.10420 (2023)."},{"key":"1673_CR38","doi-asserted-by":"crossref","unstructured":"Wu, T. et al. PromptChainer: Chaining Large Language Model Prompts through Visual Programming. arXiv.org https:\/\/arxiv.org\/abs\/2203.06566 (2022).","DOI":"10.1145\/3491101.3519729"},{"key":"1673_CR39","unstructured":"Kang, K., Wallace, E., Tomlin, C., Kumar, A. & Levine, S. Unfamiliar finetuning examples control how language models hallucinate. arXiv.org https:\/\/arxiv.org\/abs\/2403.05612 (2024)."},{"key":"1673_CR40","unstructured":"An OCR Post-Correction approach using deep learning for processing medical reports. IEEE Journals & Magazine | IEEE Xplore https:\/\/ieeexplore.ieee.org\/document\/9448197 (2022)."},{"key":"1673_CR41","unstructured":"Kaufmann, B. et al. Validation of a Zero-Shot learning natural language processing tool for data abstraction from unstructured healthcare data. arXiv.org https:\/\/arxiv.org\/abs\/2308.00107 (2023)."},{"key":"1673_CR42","doi-asserted-by":"publisher","first-page":"1208","DOI":"10.1093\/jamia\/ocac040","volume":"29","author":"S Zhou","year":"2022","unstructured":"Zhou, S., Wang, N., Wang, L., Liu, H. & Zhang, R. CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records. J. Am. Med. Inform. Assoc. 29, 1208\u20131216 (2022).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"1673_CR43","unstructured":"Lewis, P. et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP tasks. arXiv.org https:\/\/arxiv.org\/abs\/2005.11401 (2020)."},{"key":"1673_CR44","unstructured":"Rafailov, R. et al. Direct preference optimization: your language model is secretly a reward model. arXiv.org https:\/\/arxiv.org\/abs\/2305.18290 (2023)."},{"key":"1673_CR45","first-page":"8","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford, A. et al. Language models are unsupervised multitask learners. OpenAI Blog 1, 8\u20139 (2019).","journal-title":"OpenAI Blog"},{"key":"1673_CR46","unstructured":"Liu, P. et al. Pre-train, Prompt, and Predict: A systematic survey of prompting methods in natural language processing. arXiv.org https:\/\/arxiv.org\/abs\/2107.13586 (2021)."},{"key":"1673_CR47","doi-asserted-by":"crossref","unstructured":"Wu, T., Terry, M. & Cai, C. J. AI chains: Transparent and controllable Human-AI interaction by chaining large language model prompts. arXiv.org https:\/\/arxiv.org\/abs\/2110.01691 (2021).","DOI":"10.1145\/3491102.3517582"},{"key":"1673_CR48","unstructured":"Dettmers, T., Pagnoni, A., Holtzman, A. & Zettlemoyer, L. QLORA: efficient finetuning of quantized LLMS. arXiv.org https:\/\/arxiv.org\/abs\/2305.14314 (2023)."},{"key":"1673_CR49","unstructured":"Hu, E. J. et al. LORA: Low-Rank adaptation of Large Language Models. arXiv.org https:\/\/arxiv.org\/abs\/2106.09685 (2021)."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-025-01673-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-025-01673-4","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-025-01673-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,5]],"date-time":"2025-05-05T23:12:15Z","timestamp":1746486735000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-025-01673-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,6]]},"references-count":49,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["1673"],"URL":"https:\/\/doi.org\/10.1038\/s41746-025-01673-4","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,6]]},"assertion":[{"value":"11 September 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 April 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 May 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"250"}}