{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,16]],"date-time":"2026-06-16T09:46:50Z","timestamp":1781603210466,"version":"3.54.5"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2022,4,27]],"date-time":"2022-04-27T00:00:00Z","timestamp":1651017600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Health Data & Evidence Network"},{"name":"Innovative Medicines Initiative 2 Joint Undertaking","award":["806968"],"award-info":[{"award-number":["806968"]}]},{"name":"European Union\u2019s Horizon 2020 research and innovation program and EFPIA"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,6,14]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objective<\/jats:title>\n                    <jats:p>This systematic review aims to assess how information from unstructured text is used to develop and validate clinical prognostic prediction models. We summarize the prediction problems and methodological landscape and determine whether using text data in addition to more commonly used structured data improves the prediction performance.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Materials and Methods<\/jats:title>\n                    <jats:p>We searched Embase, MEDLINE, Web of Science, and Google Scholar to identify studies that developed prognostic prediction models using information extracted from unstructured text in a data-driven manner, published in the period from January 2005 to March 2021. 
Data items were extracted and analyzed, and a meta-analysis of the model performance was carried out to assess the added value of text to structured-data models.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We identified 126 studies that described 145 clinical prediction problems. Combining text and structured data improved model performance, compared with using only text or only structured data. In these studies, a wide variety of dense and sparse numeric text representations were combined with both deep learning and more traditional machine learning methods. External validation, public availability, and attention to the explainability of the developed models were limited.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>The use of unstructured text in the development of prognostic prediction models has been found beneficial in addition to structured data in most studies. The text data are a source of valuable information for prediction model development and should not be neglected. 
We suggest a future focus on explainability and external validation of the developed models, promoting robust and trustworthy prediction models in clinical practice.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocac058","type":"journal-article","created":{"date-parts":[[2022,4,14]],"date-time":"2022-04-14T07:11:57Z","timestamp":1649920317000},"page":"1292-1302","source":"Crossref","is-referenced-by-count":76,"title":["Use of unstructured text in prognostic clinical prediction models: a systematic review"],"prefix":"10.1093","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5369-8260","authenticated-orcid":false,"given":"Tom M","family":"Seinen","sequence":"first","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Egill A","family":"Fridgeirsson","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Solomon","family":"Ioannou","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Daniel","family":"Jeannetot","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Luis H","family":"John","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jan A","family":"Kors","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , 
Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Aniek F","family":"Markus","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Victor","family":"Pera","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5352-943X","authenticated-orcid":false,"given":"Alexandros","family":"Rekkas","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7723-417X","authenticated-orcid":false,"given":"Ross D","family":"Williams","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6769-3153","authenticated-orcid":false,"given":"Cynthia","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Erik M","family":"van Mulligen","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Peter R","family":"Rijnbeek","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The 
Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2022,4,27]]},"reference":[{"issue":"8","key":"2022061415521000900_ocac058-B1","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1093\/jamia\/ocy032","article-title":"Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data","volume":"25","author":"Reps","year":"2018","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2022061415521000900_ocac058-B2","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1093\/jamia\/ocw042","article-title":"Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review","volume":"24","author":"Goldstein","year":"2017","journal-title":"J Am Med Inform Assoc"},{"key":"2022061415521000900_ocac058-B3","doi-asserted-by":"crossref","first-page":"106394","DOI":"10.1016\/j.cmpb.2021.106394","article-title":"A standardized analytics pipeline for reliable and rapid development and validation of prediction models using observational health data","volume":"211","author":"Khalid","year":"2021","journal-title":"Comput Methods Programs Biomed"},{"issue":"5","key":"2022061415521000900_ocac058-B4","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1093\/jamia\/ocv180","article-title":"Extracting information from the text of electronic medical records to improve case detection: a systematic review","volume":"23","author":"Ford","year":"2016","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2022061415521000900_ocac058-B5","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1055\/s-0040-1702001","article-title":"Medical information extraction in the age of deep learning","volume":"29","author":"Hahn","year":"2020","journal-title":"Yearb Med 
Inform"},{"issue":"3","key":"2022061415521000900_ocac058-B6","doi-asserted-by":"crossref","first-page":"e17984","DOI":"10.2196\/17984","article-title":"Clinical text data in machine learning: systematic review","volume":"8","author":"Spasic","year":"2020","journal-title":"JMIR Med Inform"},{"key":"2022061415521000900_ocac058-B7","doi-asserted-by":"crossref","first-page":"66","DOI":"10.3389\/fmed.2019.00066","article-title":"The revival of the notes field: leveraging the unstructured content in electronic health records","volume":"6","author":"Assale","year":"2019","journal-title":"Front Med (Lausanne)"},{"key":"2022061415521000900_ocac058-B8","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.jbi.2018.10.005","article-title":"Using clinical natural language processing for health outcomes research: overview and actionable suggestions for future advances","volume":"88","author":"Velupillai","year":"2018","journal-title":"J Biomed Inform"},{"issue":"2","key":"2022061415521000900_ocac058-B9","doi-asserted-by":"crossref","first-page":"e12239","DOI":"10.2196\/12239","article-title":"Natural language processing of clinical notes on chronic diseases: systematic review","volume":"7","author":"Sheikhalishahi","year":"2019","journal-title":"JMIR Med Inform"},{"issue":"4","key":"2022061415521000900_ocac058-B10","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1093\/jamia\/ocy173","article-title":"Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review","volume":"26","author":"Koleck","year":"2019","journal-title":"J Am Med Inform Assoc"},{"key":"2022061415521000900_ocac058-B11","doi-asserted-by":"crossref","first-page":"103526","DOI":"10.1016\/j.jbi.2020.103526","article-title":"Clinical concept extraction: a methodology review","volume":"109","author":"Fu","year":"2020","journal-title":"J Biomed 
Inform"},{"key":"2022061415521000900_ocac058-B12","doi-asserted-by":"crossref","first-page":"494","DOI":"10.1016\/j.eswa.2018.09.034","article-title":"Clinical text classification research trends: systematic literature review and open issues","volume":"116","author":"Mujtaba","year":"2019","journal-title":"Expert Syst Appl"},{"key":"2022061415521000900_ocac058-B13","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1093\/jamia\/ocac002","article-title":"Trends in the conduct and reporting of clinical prediction model development and validation: a systematic review","volume":"29","author":"Yang","year":"2022","journal-title":"J Am Med Inform Assoc"},{"key":"2022061415521000900_ocac058-B14","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1093\/jamia\/ocab236","article-title":"Sepsis prediction, early detection, and identification using clinical text for machine learning: a systematic review","volume":"29","author":"Yan","year":"2022","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2022061415521000900_ocac058-B15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/2046-4053-4-1","article-title":"Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement","volume":"4","author":"Moher","year":"2015","journal-title":"Syst Rev"},{"issue":"2","key":"2022061415521000900_ocac058-B16","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1161\/CIRCULATIONAHA.114.014508","article-title":"Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) the TRIPOD statement","volume":"131","author":"Collins","year":"2015","journal-title":"Circulation"},{"issue":"10","key":"2022061415521000900_ocac058-B17","doi-asserted-by":"crossref","first-page":"e1001744","DOI":"10.1371\/journal.pmed.1001744","article-title":"Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS 
checklist","volume":"11","author":"Moons","year":"2014","journal-title":"PLoS Med"},{"key":"2022061415521000900_ocac058-B18","doi-asserted-by":"crossref","first-page":"103655","DOI":"10.1016\/j.jbi.2020.103655","article-title":"The role of explainability in creating trustworthy artificial intelligence for health care: a comprehensive survey of the terminology, design choices, and evaluation strategies","volume":"113","author":"Markus","year":"2021","journal-title":"J Biomed Inform"},{"key":"2022061415521000900_ocac058-B19","first-page":"80","author":"Gilpin","year":"2018"},{"issue":"5","key":"2022061415521000900_ocac058-B20","doi-asserted-by":"crossref","first-page":"952","DOI":"10.1097\/CCM.0b013e31820a92c6","article-title":"Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II): a public-access intensive care unit database","volume":"39","author":"Saeed","year":"2011","journal-title":"Crit Care Med"},{"issue":"1","key":"2022061415521000900_ocac058-B21","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci Data"},{"key":"2022061415521000900_ocac058-B22","doi-asserted-by":"crossref","first-page":"S67","DOI":"10.1016\/j.jbi.2015.07.001","article-title":"Identifying risk factors for heart disease over time: overview of 2014 i2b2\/UTHealth shared task Track 2","volume":"58","author":"Stubbs","year":"2015","journal-title":"J Biomed Inform"},{"key":"2022061415521000900_ocac058-B23","author":"Aronson","year":"2001"},{"key":"2022061415521000900_ocac058-B24","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The unified medical language system (UMLS): integrating biomedical terminology","volume":"32 (Database issue","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids 
Res"},{"key":"2022061415521000900_ocac058-B25","first-page":"279","article-title":"SNOMED-CT: The advanced terminology and coding system for eHealth","volume":"121","author":"Donnelly","year":"2006","journal-title":"Stud Health Technol Inform"},{"issue":"8","key":"2022061415521000900_ocac058-B26","doi-asserted-by":"crossref","first-page":"e185097","DOI":"10.1001\/jamanetworkopen.2018.5097","article-title":"Validation of prediction models for critical care outcomes using natural language processing of electronic health record data","volume":"1","author":"Marafino","year":"2018","journal-title":"JAMA Netw Open"},{"issue":"7","key":"2022061415521000900_ocac058-B27","doi-asserted-by":"crossref","first-page":"e196709","DOI":"10.1001\/jamanetworkopen.2019.6709","article-title":"Machine learning approach to inpatient violence risk assessment using routinely collected clinical notes in electronic health records","volume":"2","author":"Menger","year":"2019","journal-title":"JAMA Netw Open"},{"key":"2022061415521000900_ocac058-B28","first-page":"491","volume-title":"Recent Advances in Intelligent Systems and Smart Applications. 
Studies in Systems, Decision and Control","author":"AlShuweihi","year":"2021"},{"issue":"1","key":"2022061415521000900_ocac058-B29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13326-018-0179-8","article-title":"Clinical natural language processing in languages other than English: opportunities and challenges","volume":"9","author":"N\u00e9v\u00e9ol","year":"2018","journal-title":"J Biomed Semant"},{"issue":"6","key":"2022061415521000900_ocac058-B30","doi-asserted-by":"crossref","DOI":"10.1097\/CCE.0000000000000450","article-title":"Impact of different approaches to preparing notes for analysis with natural language processing on the performance of prediction models in intensive care","volume":"3","author":"Mahendra","year":"2021","journal-title":"Crit Care Explor"},{"issue":"6","key":"2022061415521000900_ocac058-B31","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.1093\/bib\/bbx044","article-title":"Deep learning for healthcare: review, opportunities and challenges","volume":"19","author":"Miotto","year":"2018","journal-title":"Brief Bioinform"},{"key":"2022061415521000900_ocac058-B32","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/j.jclinepi.2015.04.005","article-title":"Prediction models need appropriate internal, internal-external, and external validation","volume":"69","author":"Steyerberg","year":"2016","journal-title":"J Clin Epidemiol"},{"key":"2022061415521000900_ocac058-B33","first-page":"574","article-title":"Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers","volume":"216","author":"Hripcsak","year":"2015","journal-title":"Stud Health Technol Inform"}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/7\/1292\/44062122\/ocac058.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/7\/1292\/44062122\/ocac058.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,14]],"date-time":"2022-06-14T12:42:38Z","timestamp":1655210558000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/29\/7\/1292\/6574714"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,27]]},"references-count":33,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2022,4,27]]},"published-print":{"date-parts":[[2022,6,14]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocac058","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.01.17.22269400","asserted-by":"object"}]},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,7,1]]},"published":{"date-parts":[[2022,4,27]]}}}