{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,25]],"date-time":"2025-10-25T12:13:44Z","timestamp":1761394424543},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"5","content-domain":{"domain":["bmj.com"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2014,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Objectives To evaluate factors affecting performance of influenza detection, including accuracy of natural language processing (NLP), discriminative ability of Bayesian network (BN) classifiers, and feature selection.<\/jats:p>\n               <jats:p>Methods We derived a testing dataset of 124 influenza patients and 87 non-influenza (shigellosis) patients. To assess NLP finding-extraction performance, we measured the overall accuracy, recall, and precision of Topaz and MedLEE parsers for 31 influenza-related findings against a reference standard established by three physician reviewers. To elucidate the relative contribution of NLP and BN classifier to classification performance, we compared the discriminative ability of nine combinations of finding-extraction methods (expert, Topaz, and MedLEE) and classifiers (one human-parameterized BN and two machine-parameterized BNs). To assess the effects of feature selection, we conducted secondary analyses of discriminative ability using the most influential findings defined by their likelihood ratios.<\/jats:p>\n               <jats:p>Results The overall accuracy of Topaz was significantly better than MedLEE (with post-processing) (0.78 vs 0.71, p&lt;0.0001). 
Classifiers using human-annotated findings were superior to classifiers using Topaz\/MedLEE-extracted findings (average area under the receiver operating characteristic (AUROC): 0.75 vs 0.68, p=0.0113), and machine-parameterized classifiers were superior to the human-parameterized classifier (average AUROC: 0.73 vs 0.66, p=0.0059). The classifiers using the 17 \u2018most influential\u2019 findings were more accurate than classifiers using all 31 subject-matter expert-identified findings (average AUROC: 0.76&gt;0.70, p&lt;0.05).<\/jats:p>\n               <jats:p>Conclusions Using a three-component evaluation method we demonstrated how one could elucidate the relative contributions of components under an integrated framework. To improve classification performance, this study encourages researchers to improve NLP accuracy, use a machine-parameterized classifier, and apply feature selection methods.<\/jats:p>","DOI":"10.1136\/amiajnl-2013-001934","type":"journal-article","created":{"date-parts":[[2014,1,10]],"date-time":"2014-01-10T04:52:10Z","timestamp":1389329530000},"page":"815-823","update-policy":"http:\/\/dx.doi.org\/10.1136\/crossmarkpolicy","source":"Crossref","is-referenced-by-count":38,"title":["Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers"],"prefix":"10.1093","volume":"21","author":[{"given":"Ye","family":"Ye","sequence":"first","affiliation":[{"name":"1Real-time Outbreak and Disease Surveillance Laboratory (RODS), Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA"},{"name":"2Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fuchiang","family":"Tsui","sequence":"additional","affiliation":[{"name":"1Real-time Outbreak and Disease Surveillance Laboratory (RODS), Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, 
Pennsylvania, USA"},{"name":"2Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Wagner","sequence":"additional","affiliation":[{"name":"1Real-time Outbreak and Disease Surveillance Laboratory (RODS), Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA"},{"name":"2Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jessi","family":"Espino","sequence":"additional","affiliation":[{"name":"1Real-time Outbreak and Disease Surveillance Laboratory (RODS), Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qi","family":"Li","sequence":"additional","affiliation":[{"name":"3Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2014,1,9]]},"reference":[{"key":"2023090112313464700_R1","unstructured":"Chu D. Clinical feature extraction from emergency department reports for biosurveillance [master's thesis]. 
Pittsburgh, University of Pittsburgh, 2007."},{"key":"2023090112313464700_R2","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1136\/jamia.1994.95236146","article-title":"A general natural-language text processor for clinical radiology","volume":"1","author":"Friedman","year":"1994","journal-title":"J Am Med Inform Assoc"},{"key":"2023090112313464700_R3","doi-asserted-by":"crossref","first-page":"392","DOI":"10.1197\/jamia.M1552","article-title":"Automated encoding of clinical documents based on natural language processing","volume":"11","author":"Friedman","year":"2004","journal-title":"J Am Med Inform Assoc"},{"key":"2023090112313464700_R4","first-page":"4","article-title":"The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies","author":"McCarty","year":"2011","journal-title":"BMC Med Genomics"},{"key":"2023090112313464700_R5","first-page":"274","article-title":"Analyzing the heterogeneity and complexity of electronic health record oriented phenotyping algorithms","author":"Conway","year":"2011","journal-title":"AMIA Annu Symp Proc"},{"key":"2023090112313464700_R6","first-page":"248","article-title":"The SHARPn project on secondary use of electronic medical record data: progress, plans, and possibilities","volume":"2011","author":"Chute","year":"2011","journal-title":"AMIA Annu Symp Proc"},{"key":"2023090112313464700_R7","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1136\/amiajnl-2011-000456","article-title":"Importance of multi-modal approaches to effectively identify cataract cases from electronic health records","volume":"19","author":"Peissig","year":"2012","journal-title":"J Am Med Inform Assoc"},{"key":"2023090112313464700_R8","first-page":"532","article-title":"Modeling and executing electronic health records driven phenotyping algorithms using the NQF quality data model and JBoss\u00ae drools engine","volume":"2012","author":"Li","year":"2012","journal-title":"AMIA 
Annu Symp Proc"},{"key":"2023090112313464700_R9","doi-asserted-by":"crossref","first-page":"e147","DOI":"10.1136\/amiajnl-2012-000896","article-title":"Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network","volume":"20","author":"Newton","year":"2013","journal-title":"J Am Med Inform Assoc"},{"key":"2023090112313464700_R10","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1136\/jamia.1997.0040376","article-title":"Automated tuberculosis detection","volume":"4","author":"Hripcsak","year":"1997","journal-title":"J Am Med Inform Assoc"},{"key":"2023090112313464700_R11","first-page":"632","article-title":"Diagnosing community-acquired pneumonia with a Bayesian network","author":"Aronsky","year":"1998","journal-title":"Proc AMIA Symp"},{"key":"2023090112313464700_R12","doi-asserted-by":"crossref","first-page":"494","DOI":"10.1197\/jamia.M1330","article-title":"Creating a text classifier to detect chest radiograph reports consistent with features of inhalational anthrax","volume":"10","author":"Chapman","year":"2003","journal-title":"J Am Med Inform Assoc"},{"key":"2023090112313464700_R13","doi-asserted-by":"crossref","first-page":"568","DOI":"10.1136\/jamia.2010.004366","article-title":"Leveraging informatics for genetic studies: use of the electronic medical record to enable a genome-wide association study of peripheral arterial disease","volume":"17","author":"Kullo","year":"2010","journal-title":"J Am Med Inform Assoc"},{"key":"2023090112313464700_R14","doi-asserted-by":"crossref","first-page":"1120","DOI":"10.1002\/acr.20184","article-title":"Electronic medical records for discovery research in rheumatoid arthritis","volume":"62","author":"Liao","year":"2010","journal-title":"Arthritis Care Res (Hoboken)"},{"key":"2023090112313464700_R15","doi-asserted-by":"crossref","first-page":"11","DOI":"10.7326\/0003-4819-156-1-201201030-00003","article-title":"Comparison of natural language 
processing biosurveillance methods for identifying influenza from encounter notes","volume":"156","author":"Elkin","year":"2012","journal-title":"Ann Intern Med"},{"key":"2023090112313464700_R16","unstructured":"RODS_Laboratory. Demonstration of Influenza Monitoring System. 2013. http:\/\/www.youtube.com\/watch?v=qOlGbrTsS-A&hd=1"},{"key":"2023090112313464700_R17","article-title":"Probabilistic, decision-theoretic disease surveillance and control","volume":"3","author":"Wagner","year":"2011","journal-title":"Online J Public Health"},{"key":"2023090112313464700_R18","first-page":"3","article-title":"Probabilistic case detection for disease surveillance using data in electronic medical records","author":"Tsui","year":"2011","journal-title":"Online J Public Health"},{"key":"2023090112313464700_R19","article-title":"Building an automated Bayesian case detection system","volume-title":"9th Annual Conference of the International Society for Disease Surveillance","author":"Tsui","year":"2010"},{"key":"2023090112313464700_R20","doi-asserted-by":"crossref","first-page":"839","DOI":"10.1016\/j.jbi.2009.05.002","article-title":"ConText: an algorithm for determining negation, experiencer, and temporal status from clinical reports","volume":"42","author":"Harkema","year":"2009","journal-title":"J Biomed Inform"},{"key":"2023090112313464700_R21","volume-title":"Probabilistic similarity networks","author":"Heckerman","year":"1991"},{"key":"2023090112313464700_R22","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1007\/BF00994110","article-title":"A Bayesian method for the induction of probabilistic networks from data","volume":"9","author":"Cooper","year":"1992","journal-title":"Mach Learn"},{"key":"2023090112313464700_R23","first-page":"127","article-title":"An efficient Bayesian method for predicting clinical outcomes from genome-wide data","author":"Cooper","year":"2010","journal-title":"AMIA Annu Symp 
Proc"},{"key":"2023090112313464700_R24","first-page":"902","article-title":"SMILE: structural modeling, inference, and learning engine and GeNIe: a development environment for graphical decision-theoretic models (Intelligent Systems Demonstration)","volume-title":"Proceedings of the Sixteenth National Conference on Artificial Intelligence Menlo Park","author":"Druzdzel","year":"1999"},{"key":"2023090112313464700_R25","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1023\/A:1007413511361","article-title":"On the optimality of the simple Bayesian classifier under zero-one loss","volume":"29","author":"Domingos","year":"1997","journal-title":"Mach Learn"},{"key":"2023090112313464700_R26","first-page":"41","article-title":"An empirical study of the naive Bayes classifier","volume-title":"IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence","author":"Rish","year":"2001"},{"key":"2023090112313464700_R27","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1309\/E6K33GBPE5C27FYU","article-title":"Evaluation of a deidentification (De-Id) software engine to share pathology reports and clinical documents for research","volume":"121","author":"Gupta","year":"2004","journal-title":"Am J Clin Pathol"},{"key":"2023090112313464700_R28","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-68282-2","volume-title":"Bayesian networks and decision graphs","author":"Jensen","year":"2007","edition":"2nd edn."},{"key":"2023090112313464700_R29","doi-asserted-by":"crossref","DOI":"10.1201\/9780429246593","volume-title":"An Introduction to the bootstrap","author":"Efron","year":"1994"},{"key":"2023090112313464700_R30","doi-asserted-by":"crossref","first-page":"837","DOI":"10.2307\/2531595","article-title":"Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric 
approach","volume":"44","author":"DeLong","year":"1988","journal-title":"Biometrics"},{"key":"2023090112313464700_R31","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1186\/1471-2105-12-77","article-title":"pROC: an open-source package for R and S+ to analyze and compare ROC curves","volume":"12","author":"Robin","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023090112313464700_R32","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1080\/01621459.1937.10503522","article-title":"The use of ranks to avoid the assumption of normality implicit in the analysis of variance","volume":"32","author":"Friedman","year":"1937","journal-title":"J Am Statist Assoc"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/21\/5\/815\/51327327\/jamia_21_5_815.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/21\/5\/815\/51327327\/jamia_21_5_815.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,1]],"date-time":"2023-09-01T15:16:23Z","timestamp":1693581383000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/21\/5\/815\/758360"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,1,9]]},"references-count":32,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2014,1,9]]},"published-print":{"date-parts":[[2014,9,1]]}},"URL":"https:\/\/doi.org\/10.1136\/amiajnl-2013-001934","relation":{},"ISSN":["1527-974X","1067-5027"],"issn-type":[{"value":"1527-974X","type":"electronic"},{"value":"1067-5027","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,9]]},"published":{"date-parts":[[2014,1,9]]}}}