{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T19:29:34Z","timestamp":1772134174314,"version":"3.50.1"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2017,11,24]],"date-time":"2017-11-24T00:00:00Z","timestamp":1511481600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["GM102282 and GM103859"],"award-info":[{"award-number":["GM102282 and GM103859"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["CA194215"],"award-info":[{"award-number":["CA194215"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004917","name":"Cancer Prevention and Research Institute of Texas","doi-asserted-by":"publisher","award":["R1307"],"award-info":[{"award-number":["R1307"]}],"id":[{"id":"10.13039\/100004917","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.<\/jats:p>","DOI":"10.1093\/jamia\/ocx132","type":"journal-article","created":{"date-parts":[[2017,10,20]],"date-time":"2017-10-20T03:10:05Z","timestamp":1508469005000},"page":"331-336","source":"Crossref","is-referenced-by-count":263,"title":["CLAMP \u2013 a toolkit for efficiently building customized clinical natural language processing pipelines"],"prefix":"10.1093","volume":"25","author":[{"given":"Ergin","family":"Soysal","sequence":"first","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Jingqi","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Min","family":"Jiang","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Yonghui","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]},{"given":"Serguei","family":"Pakhomov","sequence":"additional","affiliation":[{"name":"Department of Pharmaceutical Care and Health System, University of Minnesota Twin Cities, Minneapolis, MN, USA"}]},{"given":"Hongfang","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA"}]},{"given":"Hua","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,11,24]]},"reference":[{"issue":"5","key":"2020110612390386400_ocx132-B1","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1016\/j.jbi.2009.08.007","article-title":"What can natural language processing do for clinical decision support?","volume":"42","author":"Demner-Fushman","year":"2009","journal-title":"J Biomed Inform."},{"issue":"5","key":"2020110612390386400_ocx132-B2","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1136\/jamia.2009.001560","article-title":"Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications","volume":"17","author":"Savova","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"3","key":"2020110612390386400_ocx132-B3","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1136\/jamia.2009.002733","article-title":"An overview of MetaMap: historical perspective and recent advances","volume":"17","author":"Aronson","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"4","key":"2020110612390386400_ocx132-B4","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1093\/jamia\/ocw177","article-title":"MetaMap Lite: an evaluation of a new Java implementation of MetaMap","volume":"24","author":"Demner-Fushman","year":"2017","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612390386400_ocx132-B5","first-page":"595","article-title":"Towards a comprehensive medical language processing system: methods and issues","author":"Friedman","year":"1997","journal-title":"Proc AMIA Annu Fall Symp."},{"issue":"1","key":"2020110612390386400_ocx132-B6","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1197\/jamia.M2437","article-title":"Mayo Clinic NLP system for patient smoking status identification","volume":"15","author":"Savova","year":"2008","journal-title":"J Am Med Inform Assoc."},{"issue":"Pt 1","key":"2020110612390386400_ocx132-B7","first-page":"487","article-title":"Identifying respiratory findings in emergency department reports for biosurveillance using MetaMap","volume":"11","author":"Chapman","year":"2004","journal-title":"Medinfo."},{"key":"2020110612390386400_ocx132-B8","first-page":"829","article-title":"Identification of findings suspicious for breast cancer based on natural language processing of mammogram reports","author":"Jain","year":"1997","journal-title":"Proc AMIA Annu Fall Symp."},{"issue":"1","key":"2020110612390386400_ocx132-B9","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1197\/jamia.M3378","article-title":"MedEx: a medication information extraction system for clinical narratives","volume":"17","author":"Xu","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612390386400_ocx132-B10","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1136\/amiajnl-2013-001635","article-title":"A hybrid system for temporal information extraction from clinical text","volume":"20","author":"Tang","year":"2013","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612390386400_ocx132-B11","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1197\/jamia.M2444","article-title":"Evaluating the state-of-the-art in automatic de-identification","volume":"14","author":"Uzuner","year":"2007","journal-title":"J Am Med Inform Assoc."},{"issue":"3","key":"2020110612390386400_ocx132-B12","doi-asserted-by":"crossref","first-page":"596","DOI":"10.1093\/jamia\/ocw156","article-title":"De-identification of patient notes with recurrent neural networks","volume":"24","author":"Dernoncourt","year":"2017","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612390386400_ocx132-B13","doi-asserted-by":"crossref","first-page":"S20","DOI":"10.1016\/j.jbi.2015.07.020","article-title":"Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2\/UTHealth corpus","volume":"58","author":"Stubbs","year":"2015","journal-title":"J Biomed Inform."},{"key":"2020110612390386400_ocx132-B14","doi-asserted-by":"crossref","first-page":"S189","DOI":"10.1016\/j.jbi.2015.07.008","article-title":"Ease of adoption of clinical natural language processing software: an evaluation of five systems","volume":"58","author":"Zheng","year":"2015","journal-title":"J Biomed Inform."},{"issue":"5","key":"2020110612390386400_ocx132-B15","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1136\/amiajnl-2011-000465","article-title":"Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions","volume":"18","author":"Chapman","year":"2011","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612390386400_ocx132-B16","first-page":"577","article-title":"A study of transportability of an existing smoking status detection module across institutions","author":"Liu","year":"2012","journal-title":"AMIA Annu Symp Proc."},{"key":"2020110612390386400_ocx132-B17","volume-title":"Unstructured Information Management Architecture (UIMA) Version 1.0","author":"Ferrucci","year":"2008"},{"issue":"5","key":"2020110612390386400_ocx132-B18","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1136\/jamia.2010.003947","article-title":"Extracting medication information from clinical text","volume":"17","author":"Uzuner","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612390386400_ocx132-B19","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1136\/amiajnl-2011-000203","article-title":"2010 i2b2\/VA challenge on concepts, assertions, and relations in clinical text","volume":"18","author":"Uzuner","year":"2011","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612390386400_ocx132-B20","article-title":"Clinical Acronym\/Abbreviation Normalization using a Hybrid Approach","author":"Wu","year":"2013","journal-title":"Proc CLEF Evaluation Labs and Workshop."},{"key":"2020110612390386400_ocx132-B21","first-page":"802","article-title":"UTH_CCB: a report for SemEval 2014\u2013task 7 analysis of clinical text","author":"Tang","year":"2014","journal-title":"SemEval"},{"key":"2020110612390386400_ocx132-B22","volume-title":"The OpenNLP Project","author":"Baldridge","year":"2015"},{"issue":"6","key":"2020110612390386400_ocx132-B23","doi-asserted-by":"crossref","first-page":"1168","DOI":"10.1136\/amiajnl-2013-001810","article-title":"Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences","volume":"20","author":"Fan","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612390386400_ocx132-B24","volume-title":"Task 2: ShARe\/CLEF eHealth Evaluation Lab","author":"Murtola","year":"2013"},{"issue":"e1","key":"2020110612390386400_ocx132-B25","doi-asserted-by":"crossref","first-page":"e79","DOI":"10.1093\/jamia\/ocw109","article-title":"A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD)","volume":"24","author":"Wu","year":"2017","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612390386400_ocx132-B26","volume-title":"CRFsuite: a Fast Implementation of Conditional Random Fields (CRFs)","author":"Okazaki","year":"2007"},{"issue":"5","key":"2020110612390386400_ocx132-B27","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1006\/jbin.2001.1029","article-title":"A simple algorithm for identifying negated findings and diseases in discharge summaries","volume":"34","author":"Chapman","year":"2001","journal-title":"J Biomed Inform."},{"issue":"99","key":"2020110612390386400_ocx132-B28","first-page":"54","article-title":"SemEval-2014 Task 7: analysis of clinical text","volume":"199","author":"Pradhan","year":"2014","journal-title":"SemEval 2014."},{"issue":"01","key":"2020110612390386400_ocx132-B29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1017\/S1351324914000114","article-title":"UIMA Ruta: rapid development of rule-based information extraction applications","volume":"22","author":"Kluegl","year":"2016","journal-title":"Nat Language Eng."},{"key":"2020110612390386400_ocx132-B30","volume-title":"Transcribed Medical Transcription Sample Reports and Examples \u2013 MTSamples","year":"2015"},{"key":"2020110612390386400_ocx132-B31","article-title":"BRAT: a web-based tool for NLP-assisted text annotation","volume-title":"Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics","author":"Stenetorp"},{"issue":"5","key":"2020110612390386400_ocx132-B32","doi-asserted-by":"crossref","first-page":"e64933","DOI":"10.1371\/journal.pone.0064933","article-title":"Extracting physician group intelligence from electronic health records to support evidence based medicine","volume":"8","author":"Weber","year":"2013","journal-title":"PLoS One."},{"issue":"1","key":"2020110612390386400_ocx132-B33","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1197\/jamia.M2408","article-title":"Identifying patient smoking status from medical discharge records","volume":"15","author":"Uzuner","year":"2008","journal-title":"J Am Med Inform Assoc."},{"issue":"2011","key":"2020110612390386400_ocx132-B34","first-page":"382","article-title":"Part-of-speech tagging for clinical text: wall or bridge between institutions?","author":"Fan","year":"2011","journal-title":"AMIA Annu Symp Proc."},{"issue":"2016","key":"2020110612390386400_ocx132-B35","first-page":"88","article-title":"A quantitative and qualitative evaluation of sentence boundary detection for the clinical domain","author":"Griffis","year":"2016","journal-title":"AMIA Jt Summits Transl Sci Proc."},{"issue":"2015","key":"2020110612390386400_ocx132-B36","first-page":"873012","article-title":"Recognition and evaluation of clinical section headings in clinical documents using token-based formulation with conditional random fields","author":"Dai","year":"2015","journal-title":"Biomed Res Int."},{"issue":"2","key":"2020110612390386400_ocx132-B37","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1023\/A:1014348124664","article-title":"GATE, a general architecture for text engineering","volume":"36","author":"Cunningham","year":"2002","journal-title":"Comput Hum."},{"key":"2020110612390386400_ocx132-B38","article-title":"CliNER: A lightweight tool for clinical named entity recognition","author":"Boag","year":"2015","journal-title":"AMIA Jt Summits Clin Res Inform (poster)."},{"key":"2020110612390386400_ocx132-B39","article-title":"NeuroNER: an easy-to-use program for named-entity recognition based on neural networks","author":"Dernoncourt","year":"2017","journal-title":"arXiv preprint. arXiv:170505487."},{"key":"2020110612390386400_ocx132-B40","first-page":"1356","article-title":"Rapid NLP development with Leo","volume":"2014","author":"Cornia","year":"2014","journal-title":"AMIA Annu Symp Proc."},{"issue":"1","key":"2020110612390386400_ocx132-B41","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1136\/amiajnl-2011-000376","article-title":"Validation of a common data model for active safety surveillance research","volume":"19","author":"Overhage","year":"2012","journal-title":"J Am Med Inform Assoc."}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/25\/3\/331\/34150625\/ocx132.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/25\/3\/331\/34150625\/ocx132.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,4]],"date-time":"2022-08-04T22:42:49Z","timestamp":1659652969000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/25\/3\/331\/4657212"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,11,24]]},"references-count":41,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2017,11,24]]},"published-print":{"date-parts":[[2018,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocx132","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,3]]},"published":{"date-parts":[[2017,11,24]]}}}