{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:20:08Z","timestamp":1772166008431,"version":"3.50.1"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T00:00:00Z","timestamp":1615161600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T00:00:00Z","timestamp":1615161600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100008810","name":"NSW Ministry of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100008810","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Sydney Health Partners"},{"DOI":"10.13039\/501100000925","name":"National Health and Medical Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000925","id-type":"DOI","asserted-by":"publisher"}]},{"name":"NSW Agency for Clinical Innovation"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>There have been few studies describing how production EMR systems can be systematically queried to identify clinically-defined populations and limited studies utilising free-text in this process. 
The aim of this study is to provide a generalisable methodology for constructing clinically-defined EMR-derived patient cohorts using structured and unstructured data in EMRs.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>Patients with possible acute coronary syndrome (ACS) were used as an exemplar. Cardiologists defined clinical criteria for patients presenting with possible ACS. These were mapped to data tables within the production EMR system, creating seven inclusion criteria comprising structured data fields (orders and investigations, procedures, scanned electrocardiogram (ECG) images, and diagnostic codes) and unstructured clinical documentation. Data were extracted from two local health districts (LHD) in Sydney, Australia. Outcome measures included examination of the relative contribution of individual inclusion criteria to the identification of eligible encounters, comparisons between inclusion criteria and evaluation of consistency of data extracts across years and LHDs.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Among 802,742 encounters in a 5\u00a0year dataset (1\/1\/13\u201330\/12\/17), the presence of an ECG image (54.8% of encounters) and symptoms and keywords in clinical documentation (41.4\u201364.0%) were used most often to identify presentations of possible ACS. Orders and investigations (27.3%) and procedures (1.4%) were less often present for identified presentations. Relevant ICD-10\/SNOMED CT codes were present for 3.7% of identified encounters. 
Similar trends were seen when the two LHDs were examined separately, and across years.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>Clinically-defined EMR-derived cohorts combining structured and unstructured data during cohort identification is a necessary prerequisite for critical validation work required for development of real-time clinical decision support and learning health systems.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12911-021-01441-w","type":"journal-article","created":{"date-parts":[[2021,3,8]],"date-time":"2021-03-08T06:03:50Z","timestamp":1615183430000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Combining structured and unstructured data in EMRs to create clinically-defined EMR-derived cohorts"],"prefix":"10.1186","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4034-9923","authenticated-orcid":false,"given":"Charmaine S.","family":"Tam","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Janice","family":"Gullick","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Aldo","family":"Saavedra","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stephen T.","family":"Vernon","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gemma A.","family":"Figtree","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Clara K.","family":"Chow","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michelle","family":"Cretikos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Richard 
W.","family":"Morris","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maged","family":"William","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jonathan","family":"Morris","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Brieger","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,3,8]]},"reference":[{"key":"1441_CR1","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1146\/annurev-publhealth-032315-021353","volume":"37","author":"JA Casey","year":"2016","unstructured":"Casey JA, et al. Using electronic health records for population health research: a review of methods and applications. Annu Rev Public Health. 2016;37:61\u201381.","journal-title":"Annu Rev Public Health"},{"issue":"15","key":"1441_CR2","doi-asserted-by":"publisher","first-page":"1452","DOI":"10.1056\/NEJMra1615014","volume":"379","author":"MA Haendel","year":"2018","unstructured":"Haendel MA, Chute CG, Robinson PN. Classification, ontology, and precision medicine. N Engl J Med. 2018;379(15):1452\u201362.","journal-title":"N Engl J Med"},{"issue":"1","key":"1441_CR3","first-page":"8","volume":"6","author":"EB Devine","year":"2018","unstructured":"Devine EB, et al. Automating electronic clinical data capture for quality improvement and research: the CERTAIN validation project of real world evidence. EGEMS (Wash DC). 2018;6(1):8.","journal-title":"EGEMS (Wash DC)"},{"key":"1441_CR4","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1186\/1472-6963-6-77","volume":"6","author":"C De Coster","year":"2006","unstructured":"De Coster C, et al. Identifying priorities in methodological research using ICD-9-CM and ICD-10 administrative data: report from an international consortium. BMC Health Serv Res. 
2006;6:77.","journal-title":"BMC Health Serv Res"},{"issue":"1","key":"1441_CR5","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1016\/j.juro.2013.04.048","volume":"190","author":"EK Johnson","year":"2013","unstructured":"Johnson EK, Nelson CP. Values and pitfalls of the use of administrative databases for outcomes assessment. J Urol. 2013;190(1):17\u20138.","journal-title":"J Urol"},{"key":"1441_CR6","doi-asserted-by":"publisher","first-page":"c4226","DOI":"10.1136\/bmj.c4226","volume":"341","author":"DG Manuel","year":"2010","unstructured":"Manuel DG, Rosella LC, Stukel TA. Importance of accurately identifying disease in studies using electronic health records. BMJ. 2010;341:c4226.","journal-title":"BMJ"},{"issue":"1","key":"1441_CR7","doi-asserted-by":"publisher","first-page":"138","DOI":"10.1093\/pubmed\/fdr054","volume":"34","author":"EM Burns","year":"2012","unstructured":"Burns EM, et al. Systematic review of discharge coding accuracy. J Public Health (Oxf). 2012;34(1):138\u201348.","journal-title":"J Public Health (Oxf)"},{"key":"1441_CR8","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1038\/s41746-020-0221-y","volume":"3","author":"RT Sutton","year":"2020","unstructured":"Sutton RT, et al. An overview of clinical decision support systems: benefits, risks, and strategies for success. NPJ Digit Med. 2020;3:17.","journal-title":"NPJ Digit Med"},{"issue":"2","key":"1441_CR9","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1136\/amiajnl-2013-001935","volume":"21","author":"C Shivade","year":"2014","unstructured":"Shivade C, et al. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc. 2014;21(2):221\u201330.","journal-title":"J Am Med Inform Assoc"},{"issue":"4","key":"1441_CR10","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1016\/j.ajic.2018.10.009","volume":"47","author":"KL Colborn","year":"2019","unstructured":"Colborn KL, et al. 
Identification of urinary tract infections using electronic health record data. Am J Infect Control. 2019;47(4):371\u20135.","journal-title":"Am J Infect Control"},{"key":"1441_CR11","unstructured":"Botsis TH, Chen F, Weng C. Secondary use of EHR: data quality issues and informatics opportunities. Summit on Translational Bioinformatics, 2010: p. 1\u20135."},{"key":"1441_CR12","first-page":"1564","volume":"2011","author":"H Xu","year":"2011","unstructured":"Xu H, et al. Extracting and integrating data from entire electronic health records for detecting colorectal cancer cases. AMIA Annu Symp Proc. 2011;2011:1564\u201372.","journal-title":"AMIA Annu Symp Proc"},{"issue":"5","key":"1441_CR13","doi-asserted-by":"publisher","first-page":"943","DOI":"10.1016\/j.kint.2016.04.010","volume":"90","author":"HI McDonald","year":"2016","unstructured":"McDonald HI, et al. Methodological challenges when carrying out research on CKD and AKI using routine electronic health records. Kidney Int. 2016;90(5):943\u20139.","journal-title":"Kidney Int"},{"issue":"6","key":"1441_CR14","doi-asserted-by":"publisher","first-page":"1700204","DOI":"10.1183\/13993003.00204-2017","volume":"49","author":"MA Al Sallakh","year":"2017","unstructured":"Al Sallakh MA, et al. Defining asthma and assessing asthma outcomes using electronic health record data: a systematic scoping review. Eur Respir J. 2017;49(6):1700204.","journal-title":"Eur Respir J"},{"key":"1441_CR15","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1016\/j.npbr.2020.02.002","volume":"36","author":"WM Ingram","year":"2020","unstructured":"Ingram WM, et al. Defining major depressive disorder cohorts using the EHR: multiple phenotypes based on ICD-9 codes and medication orders. Neurol Psychiatry Brain Res. 
2020;36:18\u201326.","journal-title":"Neurol Psychiatry Brain Res"},{"issue":"Suppl","key":"1441_CR16","doi-asserted-by":"publisher","first-page":"S11","DOI":"10.1097\/MLR.0b013e318258530f","volume":"50","author":"E Holve","year":"2012","unstructured":"Holve E, Segal C, Hamilton Lopez M. Opportunities and challenges for comparative effectiveness research (CER) with electronic clinical data: a perspective from the EDM forum. Med Care. 2012;50(Suppl):S11\u20138.","journal-title":"Med Care"},{"key":"1441_CR17","doi-asserted-by":"publisher","first-page":"4302425","DOI":"10.1155\/2018\/4302425","volume":"2018","author":"W Sun","year":"2018","unstructured":"Sun W, et al. Data processing and text mining technologies on electronic medical records: a review. J Healthc Eng. 2018;2018:4302425.","journal-title":"J Healthc Eng"},{"issue":"5","key":"1441_CR18","doi-asserted-by":"publisher","first-page":"801","DOI":"10.1136\/amiajnl-2013-001915","volume":"21","author":"S Abhyankar","year":"2014","unstructured":"Abhyankar S, et al. Combining structured and unstructured data to identify a cohort of ICU patients who received dialysis. J Am Med Inform Assoc. 2014;21(5):801\u20137.","journal-title":"J Am Med Inform Assoc"},{"issue":"e1","key":"1441_CR19","doi-asserted-by":"publisher","first-page":"e162","DOI":"10.1136\/amiajnl-2011-000583","volume":"19","author":"RJ Carroll","year":"2012","unstructured":"Carroll RJ, et al. Portability of an algorithm to identify rheumatoid arthritis in electronic health records. J Am Med Inform Assoc. 2012;19(e1):e162\u20139.","journal-title":"J Am Med Inform Assoc"},{"key":"1441_CR20","doi-asserted-by":"publisher","first-page":"188","DOI":"10.1016\/j.jbi.2014.10.010","volume":"53","author":"M Kreuzthaler","year":"2015","unstructured":"Kreuzthaler M, Schulz S, Berghold A. Secondary use of electronic health records for building cohort studies through top-down information extraction. J Biomed Inform. 
2015;53:188\u201395.","journal-title":"J Biomed Inform"},{"issue":"e2","key":"1441_CR21","doi-asserted-by":"publisher","first-page":"e288","DOI":"10.1136\/amiajnl-2013-001923","volume":"20","author":"JT Fernandez-Breis","year":"2013","unstructured":"Fernandez-Breis JT, et al. Leveraging electronic healthcare record standards and semantic web technologies for the identification of patient cohorts. J Am Med Inform Assoc. 2013;20(e2):e288\u201396.","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"1441_CR22","doi-asserted-by":"publisher","first-page":"797","DOI":"10.1016\/j.jacl.2019.08.002","volume":"13","author":"SS Virani","year":"2019","unstructured":"Virani SS, et al. The use of structured data elements to identify ASCVD patients with statin-associated side effects: insights from the Department of Veterans Affairs. J Clin Lipidol. 2019;13(5):797-803e1.","journal-title":"J Clin Lipidol"},{"issue":"5","key":"1441_CR23","doi-asserted-by":"publisher","first-page":"1007","DOI":"10.1093\/jamia\/ocv180","volume":"23","author":"E Ford","year":"2016","unstructured":"Ford E, et al. Extracting information from the text of electronic medical records to improve case detection: a systematic review. J Am Med Inform Assoc. 2016;23(5):1007\u201315.","journal-title":"J Am Med Inform Assoc"},{"key":"1441_CR24","unstructured":"Healthstats, NSW. http:\/\/www.healthstats.nsw.gov.au\/Indicator\/dem_pop_age\/dem_pop_lhn_snap 2020 1\/2\/20."},{"issue":"10","key":"1441_CR25","doi-asserted-by":"publisher","first-page":"e1001885","DOI":"10.1371\/journal.pmed.1001885","volume":"12","author":"EI Benchimol","year":"2015","unstructured":"Benchimol EI, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 
2015;12(10):e1001885.","journal-title":"PLoS Med"},{"issue":"6","key":"1441_CR26","doi-asserted-by":"publisher","first-page":"1046","DOI":"10.1093\/jamia\/ocv202","volume":"23","author":"JC Kirby","year":"2016","unstructured":"Kirby JC, et al. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc. 2016;23(6):1046\u201352.","journal-title":"J Am Med Inform Assoc"},{"issue":"18","key":"1441_CR27","doi-asserted-by":"publisher","first-page":"2938","DOI":"10.1093\/bioinformatics\/btx364","volume":"33","author":"JR Conway","year":"2017","unstructured":"Conway JR, Lex A, Gehlenborg N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics. 2017;33(18):2938\u201340.","journal-title":"Bioinformatics"},{"issue":"1","key":"1441_CR28","doi-asserted-by":"crossref","first-page":"e80","DOI":"10.1002\/cphg.80","volume":"100","author":"SA Pendergrass","year":"2019","unstructured":"Pendergrass SA, Crawford DC. Using electronic health records to generate phenotypes for research. Curr Protoc Hum Genet. 2019;100(1):e80.","journal-title":"Curr Protoc Hum Genet"},{"issue":"10","key":"1441_CR29","doi-asserted-by":"publisher","first-page":"1054","DOI":"10.1016\/j.jclinepi.2011.01.001","volume":"64","author":"C van Walraven","year":"2011","unstructured":"van Walraven C, Bennett C, Forster AJ. Administrative database research infrequently used validated diagnostic or procedural codes. J Clin Epidemiol. 2011;64(10):1054\u20139.","journal-title":"J Clin Epidemiol"},{"issue":"1","key":"1441_CR30","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1186\/s12911-020-1092-5","volume":"20","author":"R Kashyap","year":"2020","unstructured":"Kashyap R, et al. Derivation and validation of a computable phenotype for acute decompensated heart failure in hospitalized patients. BMC Med Inform Decis Mak. 
2020;20(1):85.","journal-title":"BMC Med Inform Decis Mak"},{"issue":"1","key":"1441_CR31","doi-asserted-by":"publisher","first-page":"e012012","DOI":"10.1136\/bmjopen-2016-012012","volume":"7","author":"RG Jackson","year":"2017","unstructured":"Jackson RG, et al. Natural language processing to extract symptoms of severe mental illness from clinical text: the Clinical Record Interactive Search Comprehensive Data Extraction (CRIS-CODE) project. BMJ Open. 2017;7(1):e012012.","journal-title":"BMJ Open"},{"issue":"2","key":"1441_CR32","doi-asserted-by":"publisher","first-page":"126","DOI":"10.1016\/j.jclinepi.2011.08.002","volume":"65","author":"C van Walraven","year":"2012","unstructured":"van Walraven C, Austin P. Administrative database research has unique characteristics that can risk biased results. J Clin Epidemiol. 2012;65(2):126\u201331.","journal-title":"J Clin Epidemiol"},{"issue":"2","key":"1441_CR33","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1016\/j.jaci.2019.12.897","volume":"145","author":"Y Juhn","year":"2020","unstructured":"Juhn Y, Liu H. Artificial intelligence approaches using natural language processing to advance EHR-based clinical research. J Allergy Clin Immunol. 2020;145(2):463\u20139.","journal-title":"J Allergy Clin Immunol"},{"issue":"3","key":"1441_CR34","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1093\/jamia\/ocz200","volume":"27","author":"S Wu","year":"2020","unstructured":"Wu S, et al. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc. 2020;27(3):457\u201370.","journal-title":"J Am Med Inform Assoc"},{"key":"1441_CR35","unstructured":"Review HB. Using AI to improve electronic medical records. 2018. 
https:\/\/hbr.org\/2018\/12\/using-ai-to-improve-electronic-health-records."},{"issue":"2","key":"1441_CR36","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1016\/j.jbi.2006.06.003","volume":"40","author":"JF Penz","year":"2007","unstructured":"Penz JF, Wilcox AB, Hurdle JF. Automated identification of adverse events related to central venous catheters. J Biomed Inform. 2007;40(2):174\u201382.","journal-title":"J Biomed Inform"},{"key":"1441_CR37","first-page":"755","volume":"2019","author":"LV Rasmussen","year":"2019","unstructured":"Rasmussen LV, et al. Considerations for improving the portability of electronic health record-based phenotype algorithms. AMIA Annu Symp Proc. 2019;2019:755\u201364.","journal-title":"AMIA Annu Symp Proc"},{"issue":"2","key":"1441_CR38","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1002\/jts.22399","volume":"32","author":"KM Harrington","year":"2019","unstructured":"Harrington KM, et al. Validation of an electronic medical record-based algorithm for identifying posttraumatic stress disorder in U.S. Veterans. J Trauma Stress. 2019;32(2):226\u201337.","journal-title":"J Trauma Stress"},{"issue":"9","key":"1441_CR39","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1016\/j.mayocp.2012.04.015","volume":"87","author":"B Singh","year":"2012","unstructured":"Singh B, et al. Derivation and validation of automated electronic search strategies to extract Charlson comorbidities from electronic medical records. Mayo Clin Proc. 2012;87(9):817\u201324.","journal-title":"Mayo Clin Proc"},{"key":"1441_CR40","doi-asserted-by":"crossref","unstructured":"Saavedra A, Morris RW, Tam C, Killedar M, Ratwatte S, Huynh R, Yu C, Yuan DZ, Cretikos M, Gullick J, Vernon ST, Figtree GA, Morris J, Brieger D. Validation of acute myocardial infarction (AMI) in electronic medical records: the SPEED-EXTRACT study. 2020. 
https:\/\/www.medrxiv.org\/content\/10.1101\/2020.12.08.20245720v1.","DOI":"10.1101\/2020.12.08.20245720"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01441-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12911-021-01441-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01441-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,22]],"date-time":"2023-10-22T13:51:10Z","timestamp":1697982670000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-021-01441-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,8]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["1441"],"URL":"https:\/\/doi.org\/10.1186\/s12911-021-01441-w","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.07.27.20163279","asserted-by":"object"}]},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,8]]},"assertion":[{"value":"8 October 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 February 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 March 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Ethics and governance approval for the study (called <i>SPEED-EXTRACT<\/i>) was 
provided by the Northern Sydney Local Health District (NSLHD) Human Research Ethics Committee. We had a waiver of consent for the patients in this study which was approved by the NSLHD Human Research Ethics Committee.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"91"}}