{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T01:57:42Z","timestamp":1774058262888,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2023,8,17]],"date-time":"2023-08-17T00:00:00Z","timestamp":1692230400000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Health Data & Evidence Network"},{"name":"Innovative Medicines Initiative 2 Joint Undertaking","award":["806968"],"award-info":[{"award-number":["806968"]}]},{"name":"European Union\u2019s Horizon 2020"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,11,17]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>This work aims to explore the value of Dutch unstructured data, in combination with structured data, for the development of prognostic prediction models in a general practitioner (GP) setting.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and methods<\/jats:title>\n                  <jats:p>We trained and validated prediction models for 4 common clinical prediction problems using various sparse text representations, common prediction algorithms, and observational GP electronic health record (EHR) data. We trained and validated 84 models internally and externally on data from different EHR systems.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>On average, over all the different text representations and prediction algorithms, models only using text data performed better or similar to models using structured data alone in 2 prediction tasks. Additionally, in these 2 tasks, the combination of structured and text data outperformed models using structured or text data alone. No large performance differences were found between the different text representations and prediction algorithms.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>Our findings indicate that the use of unstructured data alone can result in well-performing prediction models for some clinical prediction problems. Furthermore, the performance improvement achieved by combining structured and text data highlights the added value. Additionally, we demonstrate the significance of clinical natural language processing research in languages other than English and the possibility of validating text-based prediction models across various EHR systems.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>Our study highlights the potential benefits of incorporating unstructured data in clinical prediction models in a GP setting. Although the added value of unstructured data may vary depending on the specific prediction task, our findings suggest that it has the potential to enhance patient care.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocad160","type":"journal-article","created":{"date-parts":[[2023,8,17]],"date-time":"2023-08-17T02:35:53Z","timestamp":1692239753000},"page":"1973-1984","source":"Crossref","is-referenced-by-count":11,"title":["The added value of text from Dutch general practitioner notes in predictive modeling"],"prefix":"10.1093","volume":"30","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5369-8260","authenticated-orcid":false,"given":"Tom M","family":"Seinen","sequence":"first","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}]},{"given":"Jan A","family":"Kors","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}]},{"given":"Erik M","family":"van Mulligen","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}]},{"given":"Egill","family":"Fridgeirsson","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}]},{"given":"Peter R","family":"Rijnbeek","sequence":"additional","affiliation":[{"name":"Department of Medical Informatics, Erasmus University Medical Center , Rotterdam, The Netherlands"}]}],"member":"286","published-online":{"date-parts":[[2023,8,16]]},"reference":[{"issue":"1","key":"2023111709551666500_ocad160-B1","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1093\/jamia\/ocw042","article-title":"Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review","volume":"24","author":"Goldstein","year":"2017","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2023111709551666500_ocad160-B2","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1093\/jamia\/ocac002","article-title":"Trends in the conduct and reporting of clinical prediction model development and validation: a systematic review","volume":"29","author":"Yang","year":"2022","journal-title":"J Am Med Inform Assoc"},{"issue":"8","key":"2023111709551666500_ocad160-B3","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1093\/jamia\/ocy032","article-title":"Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data","volume":"25","author":"Reps","year":"2018","journal-title":"J Am Med Inform Assoc"},{"key":"2023111709551666500_ocad160-B4","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.jbi.2017.07.012","article-title":"Natural language processing systems for capturing and standardizing unstructured clinical information: a systematic review","volume":"73","author":"Kreimeyer","year":"2017","journal-title":"J Biomed Inform"},{"issue":"5","key":"2023111709551666500_ocad160-B5","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1093\/jamia\/ocv180","article-title":"Extracting information from the text of electronic medical records to improve case detection: a systematic review","volume":"23","author":"Ford","year":"2016","journal-title":"J Am Med Inform Assoc"},{"issue":"7","key":"2023111709551666500_ocad160-B6","doi-asserted-by":"crossref","first-page":"1292","DOI":"10.1093\/jamia\/ocac058","article-title":"Use of unstructured text in prognostic clinical prediction models: a systematic review","volume":"29","author":"Seinen","year":"2022","journal-title":"J Am Med Inform Assoc"},{"key":"2023111709551666500_ocad160-B7","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/j.jclinepi.2015.04.005","article-title":"Prediction models need appropriate internal, internal\u2013external, and external validation","volume":"69","author":"Steyerberg","year":"2016","journal-title":"J Clin Epidemiol"},{"issue":"1","key":"2023111709551666500_ocad160-B8","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1093\/ckj\/sfaa188","article-title":"External validation of prognostic models: what, why, how, when and where?","volume":"14","author":"Ramspek","year":"2021","journal-title":"Clin Kidney J"},{"issue":"1","key":"2023111709551666500_ocad160-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13326-018-0179-8","article-title":"Clinical natural language processing in languages other than English: opportunities and challenges","volume":"9","author":"N\u00e9v\u00e9ol","year":"2018","journal-title":"J Biomed Semant"},{"issue":"1","key":"2023111709551666500_ocad160-B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12911-019-0775-2","article-title":"Predicting life expectancy with a long short-term memory recurrent neural network using electronic medical records","volume":"19","author":"Beeksma","year":"2019","journal-title":"BMC Med Inform Decis Mak"},{"key":"2023111709551666500_ocad160-B11","doi-asserted-by":"crossref","first-page":"103544","DOI":"10.1016\/j.jbi.2020.103544","article-title":"Clinical information extraction for preterm birth risk prediction","volume":"110","author":"Sterckx","year":"2020","journal-title":"J Biomed Inform"},{"issue":"7","key":"2023111709551666500_ocad160-B12","doi-asserted-by":"crossref","first-page":"e196709","DOI":"10.1001\/jamanetworkopen.2019.6709","article-title":"Machine learning approach to inpatient violence risk assessment using routinely collected clinical notes in electronic health records","volume":"2","author":"Menger","year":"2019","journal-title":"JAMA Netw Open"},{"issue":"1-2","key":"2023111709551666500_ocad160-B13","doi-asserted-by":"crossref","first-page":"44","DOI":"10.2991\/jaims.d.210225.001","article-title":"Machine learning for violence risk assessment using Dutch clinical notes","volume":"2","author":"Mosteiro","year":"2021","journal-title":"JoAIMS"},{"issue":"6","key":"2023111709551666500_ocad160-B14","doi-asserted-by":"crossref","first-page":"981","DOI":"10.3390\/app8060981","article-title":"Comparing deep learning and classical machine learning approaches for predicting inpatient violence incidents from clinical text","volume":"8","author":"Menger","year":"2018","journal-title":"Appl Sci"},{"key":"2023111709551666500_ocad160-B15","doi-asserted-by":"crossref","first-page":"846930","DOI":"10.3389\/fdata.2022.846930","article-title":"Topic modeling for interpretable text classification From EHRs","volume":"5","author":"Rijcken","year":"2022","journal-title":"Front Big Data"},{"key":"2023111709551666500_ocad160-B16","first-page":"193","author":"Elfrink","year":"2023"},{"issue":"4","key":"2023111709551666500_ocad160-B17","doi-asserted-by":"crossref","first-page":"afad046","DOI":"10.1093\/ageing\/afad046","article-title":"Predicting future falls in older people using natural language processing of general practitioners\u2019 clinical notes","volume":"52","author":"Dormosh","year":"2023","journal-title":"Age Ageing"},{"key":"2023111709551666500_ocad160-B18","first-page":"245","volume":"180","author":"Cornet","year":"2012","journal-title":"Stud Health Technol Inform"},{"issue":"4","key":"2023111709551666500_ocad160-B19","doi-asserted-by":"crossref","first-page":"1002","DOI":"10.1007\/s10278-020-00327-z","article-title":"Natural language processing in Dutch free text radiology reports: challenges in a small language area staging pulmonary oncology","volume":"33","author":"Nobel","year":"2020","journal-title":"J Digit Imaging"},{"key":"2023111709551666500_ocad160-B20","first-page":"4577","author":"Kim","year":"2022"},{"key":"2023111709551666500_ocad160-B21","first-page":"141","article-title":"MedRoBERTa.nl: a language model for Dutch electronic health records","volume":"11","author":"Verkijk","year":"2021"},{"key":"2023111709551666500_ocad160-B22","first-page":"1098","author":"Verkijk","year":"2022"},{"issue":"3","key":"2023111709551666500_ocad160-B23","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1111\/j.1468-0009.2005.00409.x","article-title":"Contribution of primary care to health systems and health","volume":"83","author":"Starfield","year":"2005","journal-title":"Milbank Q"},{"issue":"15","key":"2023111709551666500_ocad160-B24","doi-asserted-by":"crossref","first-page":"E463","DOI":"10.1503\/cmaj.170784","article-title":"Why strengthening primary health care is essential to achieving universal health coverage","volume":"190","author":"Van Weel","year":"2018","journal-title":"CMAJ"},{"issue":"12","key":"2023111709551666500_ocad160-B25","doi-asserted-by":"crossref","first-page":"1645","DOI":"10.1038\/bjc.2015.409","article-title":"Risk prediction tools for cancer in primary care","volume":"113","author":"Usher-Smith","year":"2015","journal-title":"Br J Cancer"},{"issue":"1","key":"2023111709551666500_ocad160-B26","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci Data"},{"issue":"4","key":"2023111709551666500_ocad160-B27","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1016\/S1532-0464(03)00012-1","article-title":"Two biomedical sublanguages: a description based on the theories of Zellig Harris","volume":"35","author":"Friedman","year":"2002","journal-title":"J Biomed Inform"},{"issue":"6","key":"2023111709551666500_ocad160-B28","doi-asserted-by":"crossref","first-page":"e314","DOI":"10.1093\/ije\/dyac026","article-title":"Data resource profile: the Integrated Primary Care Information (IPCI) database, the Netherlands","volume":"51","author":"de Ridder","year":"2022","journal-title":"Int J Epidemiol"},{"issue":"1","key":"2023111709551666500_ocad160-B29","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1136\/amiajnl-2011-000376","article-title":"Validation of a common data model for active safety surveillance research","volume":"19","author":"Overhage","year":"2012","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2023111709551666500_ocad160-B30","doi-asserted-by":"crossref","first-page":"1747","DOI":"10.1016\/j.chest.2020.12.051","article-title":"Novel machine learning can predict acute asthma exacerbation","volume":"159","author":"Zein","year":"2021","journal-title":"Chest"},{"issue":"7","key":"2023111709551666500_ocad160-B31","doi-asserted-by":"crossref","first-page":"e16981","DOI":"10.2196\/16981","article-title":"Asthma exacerbation prediction and risk factor analysis based on a time-sensitive, attentive neural network: retrospective cohort study","volume":"22","author":"Xiang","year":"2020","journal-title":"J Med Internet Res"},{"issue":"7","key":"2023111709551666500_ocad160-B32","doi-asserted-by":"crossref","first-page":"e028375","DOI":"10.1136\/bmjopen-2018-028375","article-title":"Predicting asthma attacks in primary care: protocol for developing a machine learning-based prediction model","volume":"9","author":"Tibble","year":"2019","journal-title":"BMJ Open"},{"key":"2023111709551666500_ocad160-B33","first-page":"438","author":"Eyre","year":"2021"},{"issue":"1","key":"2023111709551666500_ocad160-B34","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1186\/s12859-014-0373-3","article-title":"ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus","volume":"15","author":"Afzal","year":"2014","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"2023111709551666500_ocad160-B35","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/s12859-022-05130-x","article-title":"Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods","volume":"24","author":"van Es","year":"2023","journal-title":"BMC Bioinformatics"},{"key":"2023111709551666500_ocad160-B36","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1016\/j.jclinepi.2019.02.004","article-title":"A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models","volume":"110","author":"Christodoulou","year":"2019","journal-title":"J Clin Epidemiol"},{"issue":"1","key":"2023111709551666500_ocad160-B37","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1038\/s41746-023-00772-4","article-title":"Complex modeling with detailed temporal predictors does not improve health records-based suicide risk prediction","volume":"6","author":"Shortreed","year":"2023","journal-title":"NPJ Digit Med"},{"key":"2023111709551666500_ocad160-B38","first-page":"6765","author":"Marx","year":"2020"},{"issue":"9","key":"2023111709551666500_ocad160-B39","first-page":"10306","article-title":"Predictive Multiplicity in Probabilistic Classification","volume":"37","author":"Watson-Daniels","year":"2023"},{"key":"2023111709551666500_ocad160-B40","doi-asserted-by":"crossref","first-page":"103655","DOI":"10.1016\/j.jbi.2020.103655","article-title":"The role of explainability in creating trustworthy artificial intelligence for health care: a comprehensive survey of the terminology, design choices, and evaluation strategies","volume":"113","author":"Markus","year":"2021","journal-title":"J Biomed Inform"},{"key":"2023111709551666500_ocad160-B41","doi-asserted-by":"crossref","first-page":"e2100166","DOI":"10.1200\/CCI.21.00166","article-title":"Simple linear cancer risk prediction models with novel features outperform complex approaches","volume":"6","author":"Kulm","year":"2022","journal-title":"JCO Clin Cancer Inform"},{"key":"2023111709551666500_ocad160-B42","first-page":"1135","author":"Ribeiro","year":"2016"},{"key":"2023111709551666500_ocad160-B43","article-title":"A unified approach to interpreting model predictions","volume":"30","author":"Lundberg","year":"2017","journal-title":"Adv Neur In"},{"issue":"3","key":"2023111709551666500_ocad160-B44","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1007\/s12525-021-00475-2","article-title":"Machine learning and deep learning","volume":"31","author":"Janiesch","year":"2021","journal-title":"Electron Mark"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/12\/1973\/53477592\/ocad160.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/12\/1973\/53477592\/ocad160.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T13:28:46Z","timestamp":1700227726000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/30\/12\/1973\/7243430"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,16]]},"references-count":44,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2023,8,16]]},"published-print":{"date-parts":[[2023,11,17]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocad160","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,12,1]]},"published":{"date-parts":[[2023,8,16]]}}}