{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T16:12:19Z","timestamp":1775059939071,"version":"3.50.1"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T00:00:00Z","timestamp":1693612800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T00:00:00Z","timestamp":1693612800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"U.S. Department of Defense, CDMRP Award","award":["HT9425-23-1-0023"],"award-info":[{"award-number":["HT9425-23-1-0023"]}]},{"DOI":"10.13039\/100000892","name":"Prostate Cancer Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000892","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000862","name":"Doris Duke Charitable Foundation","doi-asserted-by":"publisher","award":["2020080"],"award-info":[{"award-number":["2020080"]}],"id":[{"id":"10.13039\/100000862","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000005","name":"U.S. 
Department of Defense","doi-asserted-by":"publisher","award":["W81XWH-21-PCRP-DSA"],"award-info":[{"award-number":["W81XWH-21-PCRP-DSA"]}],"id":[{"id":"10.13039\/100000005","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Mark Foundation Emerging Leader Award"},{"name":"National Cancer Institute","award":["R00CA245899"],"award-info":[{"award-number":["R00CA245899"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Longitudinal data on key cancer outcomes for clinical research, such as response to treatment and disease progression, are not captured in standard cancer registry reporting. Manual extraction of such outcomes from unstructured electronic health records is a slow, resource-intensive process. Natural language processing (NLP) methods can accelerate outcome annotation, but they require substantial labeled data. Transfer learning based on language modeling, particularly using the Transformer architecture, has achieved improvements in NLP performance. However, there has been no systematic evaluation of NLP model training strategies on the extraction of cancer outcomes from unstructured text.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We evaluated the performance of nine NLP models at the two tasks of identifying cancer response and cancer progression within imaging reports at a single academic center among patients with non-small cell lung cancer. We trained the classification models under different conditions, including training sample size, classification architecture, and language model pre-training. The training involved a labeled dataset of 14,218 imaging reports for 1112 patients with lung cancer. 
A subset of models was based on a pre-trained language model, DFCI-ImagingBERT, created by further pre-training a BERT-based model using an unlabeled dataset of 662,579 reports from 27,483 patients with cancer from our center. A classifier based on our DFCI-ImagingBERT, trained on more than 200 patients, achieved the best results in most experiments; however, these results were marginally better than simpler \u201cbag of words\u201d or convolutional neural network models.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>When developing AI models to extract outcomes from imaging reports for clinical cancer research, if computational resources are plentiful but labeled training data are limited, large language models can be used for zero- or few-shot learning to achieve reasonable performance. When computational resources are more limited but labeled training data are readily available, even simple machine learning architectures can achieve good performance for such tasks.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-023-05439-1","type":"journal-article","created":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T01:02:09Z","timestamp":1693616529000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Empirical evaluation of language modeling to ascertain cancer outcomes from clinical text reports"],"prefix":"10.1186","volume":"24","author":[{"given":"Haitham A.","family":"Elmarakeby","sequence":"first","affiliation":[]},{"given":"Pavel S.","family":"Trukhanov","sequence":"additional","affiliation":[]},{"given":"Vidal M.","family":"Arroyo","sequence":"additional","affiliation":[]},{"given":"Irbaz Bin","family":"Riaz","sequence":"additional","affiliation":[]},{"given":"Deborah","family":"Schrag","sequence":"additional","affiliation":[]},{"given":"Eliezer M.","family":"Van 
Allen","sequence":"additional","affiliation":[]},{"given":"Kenneth L.","family":"Kehl","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,9,2]]},"reference":[{"issue":"15","key":"5439_CR1","doi-asserted-by":"publisher","first-page":"1803","DOI":"10.1200\/JCO.2013.49.4799","volume":"31","author":"LA Garraway","year":"2013","unstructured":"Garraway LA, Verweij J, Ballman KV. Precision oncology: an overview. J Clin Oncol Off J Am Soc Clin Oncol. 2013;31(15):1803\u20135.","journal-title":"J Clin Oncol Off J Am Soc Clin Oncol"},{"issue":"8","key":"5439_CR2","doi-asserted-by":"publisher","first-page":"818","DOI":"10.1158\/2159-8290.CD-17-0151","volume":"7","author":"AACR Project GENIE Consortium","year":"2017","unstructured":"AACR Project GENIE Consortium. AACR Project GENIE: powering precision medicine through an international consortium. Cancer Discov. 2017;7(8):818\u201331.","journal-title":"Cancer Discov"},{"issue":"10","key":"5439_CR3","doi-asserted-by":"publisher","first-page":"1421","DOI":"10.1001\/jamaoncol.2019.1800","volume":"5","author":"KL Kehl","year":"2019","unstructured":"Kehl KL, Elmarakeby H, Nishino M, Van Allen EM, Lepisto EM, Hassett MJ, et al. Assessment of deep natural language processing in ascertaining oncologic outcomes from radiology reports. JAMA Oncol. 2019;5(10):1421\u20139.","journal-title":"JAMA Oncol"},{"issue":"1","key":"5439_CR4","doi-asserted-by":"publisher","first-page":"7304","DOI":"10.1038\/s41467-021-27358-6","volume":"12","author":"KL Kehl","year":"2021","unstructured":"Kehl KL, Xu W, Gusev A, Bakouny Z, Choueiri TK, Riaz IB, et al. Artificial intelligence-aided clinical annotation of a large multi-cancer genomic dataset. Nat Commun. 
2021;12(1):7304.","journal-title":"Nat Commun"},{"key":"5439_CR5","doi-asserted-by":"publisher","first-page":"680","DOI":"10.1200\/CCI.20.00020","volume":"4","author":"KL Kehl","year":"2020","unstructured":"Kehl KL, Xu W, Lepisto E, Elmarakeby H, Hassett MJ, Van Allen EM, et al. Natural language processing to ascertain cancer outcomes from medical oncologist notes. JCO Clin Cancer Inform. 2020;4:680\u201390.","journal-title":"JCO Clin Cancer Inform"},{"key":"5439_CR6","unstructured":"Dai AM, Le QV. Semi-supervised sequence learning. arXiv; 2015 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1511.01432"},{"key":"5439_CR7","doi-asserted-by":"crossref","unstructured":"Howard J, Ruder S. Universal language model fine-tuning for text classification. arXiv; 2018 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1801.06146","DOI":"10.18653\/v1\/P18-1031"},{"key":"5439_CR8","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN et al. Attention is all you need. arXiv; 2017 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1706.03762"},{"key":"5439_CR9","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv; 2019 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1810.04805"},{"key":"5439_CR10","unstructured":"Huang K, Altosaar J, Ranganath R. ClinicalBERT: modeling clinical notes and predicting hospital readmission. arXiv; 2020 Nov [cited 2022 May 31]. Report No. http:\/\/arxiv.org\/abs\/1904.05342"},{"key":"5439_CR11","doi-asserted-by":"crossref","unstructured":"Dai Z, Yang Z, Yang Y, Carbonell J, Le QV, Salakhutdinov R. Transformer-XL: Attentive language models beyond a fixed-length context. arXiv; 2019 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1901.02860","DOI":"10.18653\/v1\/P19-1285"},{"key":"5439_CR12","unstructured":"Kitaev N, Kaiser \u0141, Levskaya A. Reformer: the efficient transformer. arXiv; 2020 [cited 2022 Sep 6]. 
http:\/\/arxiv.org\/abs\/2001.04451"},{"key":"5439_CR13","unstructured":"Beltagy I, Peters ME, Cohan A. Longformer: the long-document transformer. arXiv; 2020 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/2004.05150"},{"key":"5439_CR14","doi-asserted-by":"publisher","first-page":"106304","DOI":"10.1016\/j.cmpb.2021.106304","volume":"208","author":"AW Olthof","year":"2021","unstructured":"Olthof AW, Shouche P, Fennema EM, IJpma FFA, Koolstra RHC, Stirler VMA, et al. Machine learning based natural language processing of radiology reports in orthopaedic trauma. Comput Methods Programs Biomed. 2021;208:106304.","journal-title":"Comput Methods Programs Biomed"},{"issue":"4","key":"5439_CR15","doi-asserted-by":"publisher","DOI":"10.1148\/ryai.210185","volume":"4","author":"GR Chaudhari","year":"2022","unstructured":"Chaudhari GR, Liu T, Chen TL, Joseph GB, Vella M, Lee YJ, et al. Application of a domain-specific BERT for detection of speech recognition errors in radiology reports. Radiol Artif Intell. 2022;4(4): e210185.","journal-title":"Radiol Artif Intell"},{"issue":"1","key":"5439_CR16","doi-asserted-by":"publisher","first-page":"262","DOI":"10.1186\/s12911-021-01623-6","volume":"21","author":"Y Nakamura","year":"2021","unstructured":"Nakamura Y, Hanaoka S, Nomura Y, Nakao T, Miki S, Watadani T, et al. Automatic detection of actionable radiology reports using bidirectional encoder representations from transformers. BMC Med Inform Decis Mak. 2021;21(1):262.","journal-title":"BMC Med Inform Decis Mak"},{"issue":"10","key":"5439_CR17","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1007\/s10916-021-01761-4","volume":"45","author":"AW Olthof","year":"2021","unstructured":"Olthof AW, van Ooijen PMA, Cornelissen LJ. Deep learning-based natural language processing in radiology: the impact of report complexity, disease prevalence, dataset size, and algorithm type on model performance. J Med Syst. 
2021;45(10):91.","journal-title":"J Med Syst"},{"key":"5439_CR18","unstructured":"Wei J, Bosma M, Zhao VY, Guu K, Yu AW, Lester B et al. Finetuned language models are zero-shot learners. arXiv; 2022 [cited 2023 May 26]. http:\/\/arxiv.org\/abs\/2109.01652"},{"key":"5439_CR19","unstructured":"Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, et al. Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv; 2020 [cited 2023 May 22]. http:\/\/arxiv.org\/abs\/1910.10683"},{"key":"5439_CR20","unstructured":"Chung HW, Hou L, Longpre S, Zoph B, Tay Y, Fedus W, et al. Scaling instruction-finetuned language models. arXiv; 2022 [cited 2023 May 22]. http:\/\/arxiv.org\/abs\/2210.11416"},{"key":"5439_CR21","unstructured":"Guti\u00e9rrez BJ, McNeal N, Washington C, Chen Y, Li L, Sun H, et al. Thinking about GPT-3 in-context learning for biomedical IE? Think again. arXiv; 2022 [cited 2023 May 26]. http:\/\/arxiv.org\/abs\/2203.08410"},{"key":"5439_CR22","doi-asserted-by":"crossref","unstructured":"Kim Y. Convolutional neural networks for sentence classification. arXiv; 2014 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1408.5882","DOI":"10.3115\/v1\/D14-1181"},{"key":"5439_CR23","doi-asserted-by":"crossref","unstructured":"Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder\u2013decoder for statistical machine translation. arXiv; 2014 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1406.1078","DOI":"10.3115\/v1\/D14-1179"},{"key":"5439_CR24","unstructured":"Huang XS, Perez F, Ba J, Volkovs M. Improving transformer optimization through better initialization. In: Proceedings of the 37th international conference on machine learning. PMLR; 2020 [cited 2022 Sep 6]. p. 4475\u201383. https:\/\/proceedings.mlr.press\/v119\/huang20f.html"},{"key":"5439_CR25","doi-asserted-by":"crossref","unstructured":"Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, et al. 
BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2019;btz682.","DOI":"10.1093\/bioinformatics\/btz682"},{"key":"5439_CR26","doi-asserted-by":"crossref","unstructured":"Lehman E, Jain S, Pichotta K, Goldberg Y, Wallace BC. Does BERT pretrained on clinical notes reveal sensitive data? arXiv; 2021 Apr [cited 2022 Jun 2]. Report No. http:\/\/arxiv.org\/abs\/2104.07762","DOI":"10.18653\/v1\/2021.naacl-main.73"},{"issue":"19","key":"5439_CR27","doi-asserted-by":"publisher","DOI":"10.1172\/jci.insight.87062","volume":"1","author":"LM Sholl","year":"2016","unstructured":"Sholl LM, Do K, Shivdasani P, Cerami E, Dubuc AM, Kuo FC, et al. Institutional implementation of clinical tumor profiling on an unselected cancer population. JCI Insight. 2016;1(19): e87062.","journal-title":"JCI Insight"},{"issue":"5","key":"5439_CR28","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1016\/0306-4573(88)90021-0","volume":"24","author":"G Salton","year":"1988","unstructured":"Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Inf Process Manag. 1988;24(5):513\u201323.","journal-title":"Inf Process Manag"},{"key":"5439_CR29","doi-asserted-by":"crossref","unstructured":"Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, et al. HuggingFace\u2019s transformers: state-of-the-art natural language processing. arXiv; 2020 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1910.03771","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"5439_CR30","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. PyTorch: an imperative style, high-performance deep learning library. arXiv; 2019 [cited 2022 Sep 6]. http:\/\/arxiv.org\/abs\/1912.01703"},{"key":"5439_CR31","unstructured":"Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems. arXiv; 2016 [cited 2022 Sep 6]. 
http:\/\/arxiv.org\/abs\/1603.04467"},{"key":"5439_CR32","unstructured":"Zhang S, Roller S, Goyal N, Artetxe M, Chen M, Chen S, et al. OPT: open pre-trained transformer language models. arXiv; 2022 [cited 2023 May 30]. http:\/\/arxiv.org\/abs\/2205.01068"},{"key":"5439_CR33","unstructured":"Sanh V, Webson A, Raffel C, Bach SH, Sutawika L, Alyafeai Z, et al. Multitask prompted training enables zero-shot task generalization. arXiv; 2022 [cited 2023 May 30]. http:\/\/arxiv.org\/abs\/2110.08207"},{"key":"5439_CR34","doi-asserted-by":"crossref","unstructured":"Lu Q, Dou D, Nguyen T. ClinicalT5: a generative language model for clinical text. In: Findings of the association for computational linguistics: EMNLP 2022. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics; 2022 [cited 2023 May 30]. p. 5436\u201343. https:\/\/aclanthology.org\/2022.findings-emnlp.398","DOI":"10.18653\/v1\/2022.findings-emnlp.398"},{"key":"5439_CR35","unstructured":"Lehman E, Hernandez E, Mahajan D, Wulff J, Smith MJ, Ziegler Z, et al. Do we still need clinical language models? arXiv; 2023 [cited 2023 May 30]. http:\/\/arxiv.org\/abs\/2302.08091"},{"key":"5439_CR36","unstructured":"Phan LN, Anibal JT, Tran H, Chanana S, Bahadroglu E, Peltekian A, et al. SciFive: a text-to-text transformer model for biomedical literature. arXiv; 2021 [cited 2023 May 30]. http:\/\/arxiv.org\/abs\/2106.03598"},{"key":"5439_CR37","unstructured":"Loshchilov I, Hutter F. Decoupled weight decay regularization. 2017 Nov 14 [cited 2022 Sep 6]; https:\/\/arxiv.org\/abs\/1711.05101v3"},{"issue":"1","key":"5439_CR38","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1186\/s12864-019-6413-7","volume":"21","author":"D Chicco","year":"2020","unstructured":"Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 
2020;21(1):6.","journal-title":"BMC Genomics"},{"issue":"1","key":"5439_CR39","doi-asserted-by":"publisher","first-page":"160035","DOI":"10.1038\/sdata.2016.35","volume":"3","author":"AEW Johnson","year":"2016","unstructured":"Johnson AEW, Pollard TJ, Shen L, Lehman L, Wei H, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3(1):160035.","journal-title":"Sci Data."},{"key":"5439_CR40","unstructured":"Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, et al. RoBERTa: a robustly optimized BERT pretraining approach. arXiv; 2019 [cited 2023 Jun 5]. http:\/\/arxiv.org\/abs\/1907.11692"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05439-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-023-05439-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05439-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,18]],"date-time":"2023-11-18T00:04:16Z","timestamp":1700265856000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-023-05439-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,2]]},"references-count":40,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["5439"],"URL":"https:\/\/doi.org\/10.1186\/s12859-023-05439-1","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,2]]},"assertion":[{"value":"2 February 
2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 August 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 September 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The data for this analysis were derived from the EHRs of patients with lung cancer who had genomic profiling performed through the Dana-Farber Cancer Institute (DFCI) PROFILE precision medicine effort or as a standard of care clinical test from June 26, 2013, to July 2, 2018. PROFILE participants consented to medical records review and genomic profiling of their tumor tissue. PROFILE was approved by the Dana-Farber\/Harvard Cancer Center Institutional Review Board (protocols #11-104 and #17-000); this supplemental retrospective analysis was declared exempt from review, and informed consent was waived for the standard of care genotyping patients given the minimal risk of data analysis, also by the Dana-Farber\/Harvard Cancer Center Institutional Review Board (protocol #16-360). All methods were performed in accordance with the Declaration of Helsinki and approved by the Institutional Review Board at Dana-Farber Cancer Institute.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Dr. Kehl reports serving as a consultant\/advisor to Aetion, receiving funding from the American Association for Cancer Research related to this work, and receiving honoraria from Roche and IBM. Dr. Schrag reports compensation from JAMA for serving as an Associate Editor and from Pfizer for giving a talk at a symposium. 
She has received research funding from the American Association for Cancer Research related to this work and research funding from GRAIL for serving as the site-PI of a clinical trial. Unrelated to this work, Dr. Van Allen reports serving in advisory\/consulting roles to Tango Therapeutics, Genome Medical, Invitae, Enara Bio, Janssen, Manifold Bio, and Monte Rosa; receiving research support from Novartis and BMS; holding equity in Tango Therapeutics, Genome Medical, Syapse, Enara Bio, Manifold Bio, Microsoft, and Monte Rosa; and receiving travel reimbursement from Roche\/Genentech. Pavel Trukhanov reports ownership interest in NoRD Bio. The remaining authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"328"}}