{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T15:54:16Z","timestamp":1772639656005,"version":"3.50.1"},"reference-count":54,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2022,8,3]],"date-time":"2022-08-03T00:00:00Z","timestamp":1659484800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"NIH\/NIDA","award":["R01DA051464"],"award-info":[{"award-number":["R01DA051464"]}]},{"name":"NIH\/NIGM","award":["R01 HL157262"],"award-info":[{"award-number":["R01 HL157262"]}]},{"name":"NIH\/NLM","award":["R01LM012973"],"award-info":[{"award-number":["R01LM012973"]}]},{"name":"NIH\/NLM","award":["R01LM012918"],"award-info":[{"award-number":["R01LM012918"]}]},{"name":"NIH NLM","award":["R01LM010090"],"award-info":[{"award-number":["R01LM010090"]}]},{"name":"NIH\/NLM","award":["R13LM013127"],"award-info":[{"award-number":["R13LM013127"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,9,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>To provide a scoping review of papers on clinical natural language processing (NLP) shared tasks that use publicly available electronic health record data from a cohort of patients.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We searched 6 databases, including biomedical research and computer science literature databases. A round of title\/abstract screening and full-text screening were conducted by 2 reviewers. Our method followed the PRISMA-ScR guidelines.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>A total of 35 papers with 48 clinical NLP tasks met inclusion criteria between 2007 and 2021. We categorized the tasks by the type of NLP problems, including named entity recognition, summarization, and other NLP tasks. Some tasks were introduced as potential clinical decision support applications, such as substance abuse detection, and phenotyping. We summarized the tasks by publication venue and dataset type.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>The breadth of clinical NLP tasks continues to grow as the field of NLP evolves with advancements in language systems. However, gaps exist with divergent interests between the general domain NLP community and the clinical informatics community for task motivation and design, and in generalizability of the data sources. We also identified issues in data preparation.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>The existing clinical NLP tasks cover a wide range of topics and the field is expected to grow and attract more attention from both general domain NLP and clinical informatics community. We encourage future work to incorporate multidisciplinary collaboration, reporting transparency, and standardization in data preparation. We provide a listing of all the shared task papers and datasets from this review in a GitLab repository.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocac127","type":"journal-article","created":{"date-parts":[[2022,8,4]],"date-time":"2022-08-04T05:13:33Z","timestamp":1659590013000},"page":"1797-1806","source":"Crossref","is-referenced-by-count":21,"title":["A scoping review of publicly available language tasks in clinical natural language processing"],"prefix":"10.1093","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9341-7360","authenticated-orcid":false,"given":"Yanjun","family":"Gao","sequence":"first","affiliation":[{"name":"ICU Data Science Lab, Department of Medicine, School of Medicine and Public Health, University of Wisconsin , Madison, Wisconsin, USA"}]},{"given":"Dmitriy","family":"Dligach","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Loyola University Chicago , Chicago, Illinois, USA"}]},{"given":"Leslie","family":"Christensen","sequence":"additional","affiliation":[{"name":"School of Medicine and Public Health, University of Wisconsin , Madison, Wisconsin, USA"}]},{"given":"Samuel","family":"Tesch","sequence":"additional","affiliation":[{"name":"School of Medicine and Public Health, University of Wisconsin , Madison, Wisconsin, USA"}]},{"given":"Ryan","family":"Laffin","sequence":"additional","affiliation":[{"name":"School of Medicine and Public Health, University of Wisconsin , Madison, Wisconsin, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0828-1102","authenticated-orcid":false,"given":"Dongfang","family":"Xu","sequence":"additional","affiliation":[{"name":"Computational Health Informatics Program, Boston Children's Hospital, Harvard University , Boston, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4513-403X","authenticated-orcid":false,"given":"Timothy","family":"Miller","sequence":"additional","affiliation":[{"name":"Computational Health Informatics Program, Boston Children's Hospital, Harvard University , Boston, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8011-9850","authenticated-orcid":false,"given":"Ozlem","family":"Uzuner","sequence":"additional","affiliation":[{"name":"Department of Information Sciences and Technology, George Mason University , Fairfax, Virginia, USA"}]},{"given":"Matthew M","family":"Churpek","sequence":"additional","affiliation":[{"name":"ICU Data Science Lab, Department of Medicine, School of Medicine and Public Health, University of Wisconsin , Madison, Wisconsin, USA"}]},{"given":"Majid","family":"Afshar","sequence":"additional","affiliation":[{"name":"ICU Data Science Lab, Department of Medicine, School of Medicine and Public Health, University of Wisconsin , Madison, Wisconsin, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,8,3]]},"reference":[{"issue":"5","key":"2022091407552821500_ocac127-B1","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1136\/amiajnl-2011-000465","article-title":"Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions","volume":"18","author":"Chapman","year":"2011","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2022091407552821500_ocac127-B2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci Data"},{"key":"2022091407552821500_ocac127-B3","first-page":"171","author":"Yetisgen","year":"2017"},{"key":"2022091407552821500_ocac127-B4","first-page":"3417","author":"Klassen","year":"2016"},{"issue":"1","key":"2022091407552821500_ocac127-B5","doi-asserted-by":"crossref","first-page":"e24008","DOI":"10.2196\/24008","article-title":"Family history extraction from synthetic clinical narratives using natural language processing: overview and evaluation of a challenge data set and solutions for the 2019 National NLP Clinical Challenges (n2c2)\/Open Health Natural Language Processing (OHNLP) competition","volume":"9","author":"Shen","year":"2021","journal-title":"JMIR Med Inform"},{"key":"2022091407552821500_ocac127-B6","first-page":"370","author":"Abacha","year":"2019"},{"key":"2022091407552821500_ocac127-B7","first-page":"1586","author":"Romanov","year":"2018"},{"issue":"7","key":"2022091407552821500_ocac127-B8","doi-asserted-by":"crossref","first-page":"467","DOI":"10.7326\/M18-0850","article-title":"PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation","volume":"169","author":"Tricco","year":"2018","journal-title":"Ann Intern Med"},{"issue":"1","key":"2022091407552821500_ocac127-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13643-016-0384-4","article-title":"Rayyan\u2014a web and mobile app for systematic reviews","volume":"5","author":"Ouzzani","year":"2016","journal-title":"Syst Rev"},{"issue":"5","key":"2022091407552821500_ocac127-B10","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1197\/jamia.M2444","article-title":"Evaluating the state-of-the-art in automatic de-identification","volume":"14","author":"Uzuner","year":"2007","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2022091407552821500_ocac127-B11","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1197\/jamia.M2408","article-title":"Identifying patient smoking status from medical discharge records","volume":"15","author":"Uzuner","year":"2008","journal-title":"J Am Med Inform Assoc"},{"issue":"4","key":"2022091407552821500_ocac127-B12","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1197\/jamia.M3115","article-title":"Recognizing obesity and comorbidities in sparse data","volume":"16","author":"Uzuner","year":"2009","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2022091407552821500_ocac127-B13","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1136\/jamia.2010.003947","article-title":"Extracting medication information from clinical text","volume":"17","author":"Uzuner","year":"2010","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2022091407552821500_ocac127-B14","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1136\/amiajnl-2011-000203","article-title":"2010 i2b2\/VA challenge on concepts, assertions, and relations in clinical text","volume":"18","author":"Uzuner","year":"2011","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2022091407552821500_ocac127-B15","doi-asserted-by":"crossref","first-page":"786","DOI":"10.1136\/amiajnl-2011-000784","article-title":"Evaluating the state of the art in coreference resolution for electronic medical records","volume":"19","author":"Uzuner","year":"2012","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2022091407552821500_ocac127-B16","doi-asserted-by":"crossref","first-page":"806","DOI":"10.1136\/amiajnl-2013-001628","article-title":"Evaluating temporal relations in clinical text: 2012 i2b2 Challenge","volume":"20","author":"Sun","year":"2013","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2022091407552821500_ocac127-B17","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1136\/amiajnl-2013-002544","article-title":"Evaluating the state of the art in disorder recognition and normalization of the clinical narrative","volume":"22","author":"Pradhan","year":"2015","journal-title":"J Am Med Inform Assoc"},{"issue":"10","key":"2022091407552821500_ocac127-B18","first-page":"1529","article-title":"The 2019 national natural language processing (NLP) clinical challenges (n2c2)\/Open health NLP (OHNLP) shared task on clinical concept normalization for clinical records","volume":"27","author":"Henry","year":"2020","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2022091407552821500_ocac127-B19","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1093\/jamia\/ocz166","article-title":"2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records","volume":"27","author":"Henry","year":"2020","journal-title":"J Am Med Inform Assoc"},{"issue":"11","key":"2022091407552821500_ocac127-B20","doi-asserted-by":"crossref","first-page":"1163","DOI":"10.1093\/jamia\/ocz163","article-title":"Cohort selection for clinical trials: n2c2 2018 shared task Track 1","volume":"26","author":"Stubbs","year":"2019","journal-title":"J Am Med Inform Assoc"},{"key":"2022091407552821500_ocac127-B21","doi-asserted-by":"crossref","first-page":"S67","DOI":"10.1016\/j.jbi.2015.07.001","article-title":"Identifying risk factors for heart disease over time: overview of 2014 i2b2\/UTHealth shared task Track 2","volume":"58","author":"Stubbs","year":"2015","journal-title":"J Biomed Inform"},{"key":"2022091407552821500_ocac127-B22","doi-asserted-by":"crossref","first-page":"S20","DOI":"10.1016\/j.jbi.2015.07.020","article-title":"Annotating longitudinal clinical narratives for de-identification: the 2014 i2b2\/UTHealth corpus","volume":"58","author":"Stubbs","year":"2015","journal-title":"J Biomed Inform"},{"key":"2022091407552821500_ocac127-B23","doi-asserted-by":"crossref","first-page":"S62","DOI":"10.1016\/j.jbi.2017.04.017","article-title":"Symptom severity prediction from neuropsychiatric clinical records: overview of 2016 CEGS N-GRID shared tasks Track 2","volume":"75","author":"Filannino","year":"2017","journal-title":"J Biomed Inform"},{"key":"2022091407552821500_ocac127-B24","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1016\/j.jbi.2017.06.011","article-title":"De-identification of psychiatric intake records: overview of 2016 CEGS N-GRID shared tasks Track 1","volume":"75","author":"Stubbs","year":"2017","journal-title":"J Biomed Inform"},{"key":"2022091407552821500_ocac127-B25","doi-asserted-by":"crossref","first-page":"103631","DOI":"10.1016\/j.jbi.2020.103631","article-title":"Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction","volume":"113","author":"Lybarger","year":"2021","journal-title":"J Biomed Inform"},{"issue":"11","key":"2022091407552821500_ocac127-B26","doi-asserted-by":"crossref","first-page":"e23375","DOI":"10.2196\/23375","article-title":"The 2019 n2c2\/OHNLP track on clinical semantic textual similarity: overview","volume":"8","author":"Wang","year":"2020","journal-title":"JMIR Med Inform"},{"key":"2022091407552821500_ocac127-B27","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1186\/s13326-016-0084-y","article-title":"Normalizing acronyms and abbreviations to aid patient understanding of clinical texts: ShARe\/CLEF eHealth Challenge 2013, Task 2","volume":"7","author":"Mowery","year":"2016","journal-title":"J Biomed Semantics"},{"issue":"1","key":"2022091407552821500_ocac127-B28","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1007\/s40264-018-0762-z","article-title":"Overview of the first natural language processing challenge for extracting medication, indication, and adverse drug events from electronic health record notes (MADE 1.0)","volume":"42","author":"Jagannatha","year":"2019","journal-title":"Drug Saf"},{"key":"2022091407552821500_ocac127-B29","first-page":"1252","author":"Uzuner","year":"2008"},{"key":"2022091407552821500_ocac127-B30","first-page":"188","article-title":"NegBio: a high-performance tool for negation and uncertainty detection in radiology reports","author":"Peng","year":"2018","journal-title":"AMIA Jt Summits Transl Sci Proc"},{"key":"2022091407552821500_ocac127-B31","first-page":"418","article-title":"Annotating temporal relations to determine the onset of psychosis symptoms","volume":"264","author":"Viani","year":"2019","journal-title":"Stud Health Technol Inform"},{"key":"2022091407552821500_ocac127-B32","first-page":"1365","author":"Mullenbach","year":"2021"},{"key":"2022091407552821500_ocac127-B33","author":"Yue","year":"2020"},{"key":"2022091407552821500_ocac127-B34","first-page":"1362","author":"Moseley","year":"2020"},{"key":"2022091407552821500_ocac127-B35","first-page":"2357","author":"Pampari","year":"2018"},{"key":"2022091407552821500_ocac127-B36","first-page":"172","volume-title":"International Conference of the Cross-Language Evaluation Forum for European Languages","author":"Kelly","year":"2014"},{"key":"2022091407552821500_ocac127-B37","first-page":"212","volume-title":"International Conference of the Cross-Language Evaluation Forum for European Languages","author":"Suominen","year":"2013"},{"key":"2022091407552821500_ocac127-B38"},{"key":"2022091407552821500_ocac127-B39","author":"Wang","year":"2020"},{"key":"2022091407552821500_ocac127-B40","author":"Pradhan","year":"2014"},{"key":"2022091407552821500_ocac127-B41","author":"Bethard","year":"2017"},{"key":"2022091407552821500_ocac127-B42","first-page":"74","author":"Abacha","year":"2021"},{"key":"2022091407552821500_ocac127-B43","first-page":"35","author":"van Aken","year":"2021"},{"issue":"3","key":"2022091407552821500_ocac127-B44","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1007\/s10579-015-9330-7","article-title":"Annotating patient clinical records with syntactic chunks and named entities: the Harvey Corpus","volume":"50","author":"Savkov","year":"2016","journal-title":"Lang Resour Eval"},{"key":"2022091407552821500_ocac127-B45","first-page":"74","author":"Lin","year":"2004"},{"key":"2022091407552821500_ocac127-B46","first-page":"5679","author":"M\u2019Rabet","year":"2020"},{"key":"2022091407552821500_ocac127-B47","author":"Zhang","year":"2019"},{"key":"2022091407552821500_ocac127-B48","first-page":"1500","author":"Smit","year":"2020"},{"key":"2022091407552821500_ocac127-B49","first-page":"5998","author":"Vaswani","year":"2017"},{"key":"2022091407552821500_ocac127-B50","first-page":"4171","author":"Devlin","year":"2019"},{"key":"2022091407552821500_ocac127-B51","first-page":"3615","author":"Beltagy","year":"2019"},{"key":"2022091407552821500_ocac127-B52","author":"Radford","year":"2018"},{"key":"2022091407552821500_ocac127-B53","first-page":"5418","author":"Roberts","year":"2020"},{"key":"2022091407552821500_ocac127-B54","first-page":"11328","author":"Zhang","year":"2020"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/10\/1797\/45798275\/ocac127.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/10\/1797\/45798275\/ocac127.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,14]],"date-time":"2022-09-14T07:57:20Z","timestamp":1663142240000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/29\/10\/1797\/6654732"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,3]]},"references-count":54,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2022,8,3]]},"published-print":{"date-parts":[[2022,9,12]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocac127","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,10,1]]},"published":{"date-parts":[[2022,8,3]]}}}