{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,24]],"date-time":"2026-01-24T01:42:42Z","timestamp":1769218962196,"version":"3.49.0"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2019,7,11]],"date-time":"2019-07-11T00:00:00Z","timestamp":1562803200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000092","name":"National Library of Medicine","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Active Learning (AL) attempts to reduce annotation cost (ie, time) by selecting the most informative examples for annotation. Most approaches tacitly (and unrealistically) assume that the cost for annotating each sample is identical. This study introduces a cost-aware AL method, which simultaneously models both the annotation cost and the informativeness of the samples and evaluates both via simulation and user studies.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We designed a novel, cost-aware AL algorithm (Cost-CAUSE) for annotating clinical named entities; we first utilized lexical and syntactic features to estimate annotation cost, then we incorporated this cost measure into an existing AL algorithm. Using the 2010 i2b2\/VA data set, we then conducted a simulation study comparing Cost-CAUSE with noncost-aware AL methods, and a user study comparing Cost-CAUSE with passive learning.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Our cost model fit empirical annotation data well, and Cost-CAUSE increased the simulation area under the learning curve (ALC) scores by up to 5.6% and 4.9%, compared with random sampling and alternate AL methods. Moreover, in a user annotation task, Cost-CAUSE outperformed passive learning on the ALC score and reduced annotation time by 20.5%\u201330.2%.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>Although AL has proven effective in simulations, our user study shows that a real-world environment is far more complex. Other factors have a noticeable effect on the AL method, such as the annotation accuracy of users, the tiredness of users, and even the physical and mental condition of users.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>Cost-CAUSE saves significant annotation cost compared to random sampling.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocz102","type":"journal-article","created":{"date-parts":[[2019,6,6]],"date-time":"2019-06-06T03:19:09Z","timestamp":1559791149000},"page":"1314-1322","source":"Crossref","is-referenced-by-count":23,"title":["Cost-aware active learning for named entity recognition in clinical text"],"prefix":"10.1093","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8665-0201","authenticated-orcid":false,"given":"Qiang","family":"Wei","sequence":"first","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Yukun","family":"Chen","sequence":"additional","affiliation":[{"name":"Pieces Technologies Inc, Dallas, Texas, USA"}]},{"given":"Mandana","family":"Salimi","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Joshua C","family":"Denny","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA"},{"name":"Department of Medicine, Vanderbilt University, Nashville, Tennessee, USA"}]},{"given":"Qiaozhu","family":"Mei","sequence":"additional","affiliation":[{"name":"School of Information, University of Michigan, Ann Arbor, Michigan, USA"}]},{"given":"Thomas A","family":"Lasko","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA"}]},{"given":"Qingxia","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA"},{"name":"Department of Biostatistics, Vanderbilt University, Nashville, Tennessee, USA"}]},{"given":"Stephen","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Amy","family":"Franklin","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Trevor","family":"Cohen","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, USA"}]},{"given":"Hua","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]}],"member":"286","published-online":{"date-parts":[[2019,7,11]]},"reference":[{"key":"2020110613071515900_ocz102-B1","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/j.jbi.2017.11.011","article-title":"Clinical information extraction applications: a literature review","volume":"77","author":"Wang","year":"2018","journal-title":"J Biomed Inform"},{"key":"2020110613071515900_ocz102-B2","author":"Liu"},{"key":"2020110613071515900_ocz102-B3","author":"Kim"},{"key":"2020110613071515900_ocz102-B4","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.jbi.2015.09.010","article-title":"A study of active learning methods for named entity recognition in clinical text","volume":"58","author":"Chen","year":"2015","journal-title":"J Biomed Inform"},{"key":"2020110613071515900_ocz102-B5","author":"Lewis"},{"key":"2020110613071515900_ocz102-B6","first-page":"287","author":"Seung","year":"1992"},{"key":"2020110613071515900_ocz102-B7","author":"Settles","year":"2009"},{"key":"2020110613071515900_ocz102-B8","doi-asserted-by":"crossref","first-page":"82.","DOI":"10.1186\/s12911-017-0466-9","article-title":"An active learning-enabled annotation system for clinical named entity recognition","volume":"17","author":"Chen","year":"2016","journal-title":"BMC Med Inform Decis Mak"},{"key":"2020110613071515900_ocz102-B9","author":"Settles"},{"key":"2020110613071515900_ocz102-B10","doi-asserted-by":"crossref","article-title":"Active learning: a step towards automating medical concept extraction","author":"Kholghi","DOI":"10.1093\/jamia\/ocv069"},{"key":"2020110613071515900_ocz102-B11","author":"Settles"},{"key":"2020110613071515900_ocz102-B12","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.ijmedinf.2017.08.001","article-title":"Active learning reduces annotation time for clinical concept extraction","volume":"106","author":"Kholghi","year":"2017","journal-title":"Int J Med Inform"},{"key":"2020110613071515900_ocz102-B13"},{"key":"2020110613071515900_ocz102-B14","author":"Tomanek"},{"key":"2020110613071515900_ocz102-B15","article-title":"Assessing the costs of machine-assisted corpus annotation through a user study","author":"Ringger"},{"key":"2020110613071515900_ocz102-B16","author":"Arora"},{"key":"2020110613071515900_ocz102-B17","author":"Tomanek"},{"key":"2020110613071515900_ocz102-B18","author":"Haertel"},{"issue":"5","key":"2020110613071515900_ocz102-B19","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1136\/amiajnl-2011-000203","article-title":"i2b2\/VA challenge on concepts, assertions, and relations in clinical text","volume":"18","author":"Uzuner","year":"2011","journal-title":"J Am Med Inform Assoc"},{"key":"2020110613071515900_ocz102-B20","author":"Wu"},{"issue":"1","key":"2020110613071515900_ocz102-B21","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1080\/23273798.2015.1102299","article-title":"What do we mean by prediction in language comprehension?","volume":"31","author":"Kuperberg","year":"2016","journal-title":"Lang Cogn Neurosci"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/26\/11\/1314\/34151979\/ocz102.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/26\/11\/1314\/34151979\/ocz102.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,6]],"date-time":"2020-11-06T19:13:37Z","timestamp":1604690017000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/26\/11\/1314\/5531148"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,11]]},"references-count":21,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2019,7,11]]},"published-print":{"date-parts":[[2019,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocz102","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,11]]},"published":{"date-parts":[[2019,7,11]]}}}