{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T16:06:30Z","timestamp":1772467590321,"version":"3.50.1"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2024,7,31]],"date-time":"2024-07-31T00:00:00Z","timestamp":1722384000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100003819","name":"Natural Science Foundation of Hubei Province","doi-asserted-by":"publisher","award":["2023AFB414"],"award-info":[{"award-number":["2023AFB414"]}],"id":[{"id":"10.13039\/501100003819","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007839","name":"Zhongnan University of Economics and Law","doi-asserted-by":"publisher","award":["2722023BQ053"],"award-info":[{"award-number":["2722023BQ053"]}],"id":[{"id":"10.13039\/100007839","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science KAKENHI","doi-asserted-by":"crossref","award":["18H03336"],"award-info":[{"award-number":["18H03336"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objectives<\/jats:title>\n                  <jats:p>Active learning (AL) has rarely integrated diversity-based and uncertainty-based strategies into a dynamic sampling framework for clinical named entity recognition (NER). 
Machine-assisted annotation is becoming popular for creating gold-standard labels. This study investigated the effectiveness of dynamic AL strategies under simulated machine-assisted annotation scenarios for clinical NER.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We proposed 3 new AL strategies: a diversity-based strategy (CLUSTER) based on Sentence-BERT and 2 dynamic strategies (CLC and CNBSE) capable of switching from diversity-based to uncertainty-based strategies. Using BioClinicalBERT as the foundational NER model, we conducted simulation experiments on 3 medication-related clinical NER datasets independently: i2b2 2009, n2c2 2018 (Track 2), and MADE 1.0. We compared the proposed strategies with uncertainty-based (LC and NBSE) and passive-learning (RANDOM) strategies. Performance was primarily measured by the number of edits made by the annotators to achieve a desired target effectiveness evaluated on independent test sets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>When aiming for 98% overall target effectiveness, on average, CLUSTER required the fewest edits. When aiming for 99% overall target effectiveness, CNBSE required 20.4% fewer edits than NBSE did. CLUSTER and RANDOM could not achieve such a high target under the pool-based simulation experiment. 
For high-difficulty entities, CNBSE required 22.5% fewer edits than NBSE to achieve 99% target effectiveness, whereas neither CLUSTER nor RANDOM achieved 93% target effectiveness.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion and Conclusion<\/jats:title>\n                  <jats:p>When the target effectiveness was set high, the proposed dynamic strategy CNBSE exhibited both strong learning capabilities and low annotation costs in machine-assisted annotation. CLUSTER required the fewest edits when the target effectiveness was set low.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocae197","type":"journal-article","created":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T01:42:17Z","timestamp":1722476537000},"page":"2632-2640","source":"Crossref","is-referenced-by-count":7,"title":["Utilizing active learning strategies in machine-assisted annotation for clinical named entity recognition: a comprehensive analysis considering annotation costs and target effectiveness"],"prefix":"10.1093","volume":"31","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5184-4313","authenticated-orcid":false,"given":"Jiaxing","family":"Liu","sequence":"first","affiliation":[{"name":"School of Statistics and Mathematics, Zhongnan University of Economics and Law , Wuhan, Hubei 430073,","place":["China"]}]},{"given":"Zoie S Y","family":"Wong","sequence":"additional","affiliation":[{"name":"Graduate School of Public Health, St Luke\u2019s International University , OMURA Susumu & Mieko Memorial St Luke\u2019s Center for Clinical Academia , Chuo-ku, Tokyo 104-0045,","place":["Japan"]},{"name":"The Kirby Institute, University of New South Wales , Sydney, NSW 2052,","place":["Australia"]},{"name":"School of Medical Sciences, The University of Sydney , Camperdown, NSW 
2050,","place":["Australia"]}]}],"member":"286","published-online":{"date-parts":[[2024,7,31]]},"reference":[{"issue":"1","key":"2024102107521798400_ocae197-B1","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1093\/jamia\/ocz166","article-title":"2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records","volume":"27","author":"Henry","year":"2019","journal-title":"J Am Med Inform Assoc"},{"issue":"3","key":"2024102107521798400_ocae197-B2","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1093\/jamia\/ocz200","article-title":"Deep learning in clinical natural language processing: a methodical review","volume":"27","author":"Wu","year":"2019","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2024102107521798400_ocae197-B3","doi-asserted-by":"crossref","first-page":"833","DOI":"10.1136\/amiajnl-2013-002255","article-title":"Assisted annotation of medical free text using RapTAT","volume":"21","author":"Gobbel","year":"2014","journal-title":"J Am Med Inform Assoc"},{"issue":"3","key":"2024102107521798400_ocae197-B4","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1136\/amiajnl-2013-001837","article-title":"Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements","volume":"21","author":"Lingren","year":"2014","journal-title":"J Am Med Inform Assoc"},{"issue":"30","key":"2024102107521798400_ocae197-B5","doi-asserted-by":"crossref","first-page":"e2305016120","DOI":"10.1073\/pnas.2305016120","article-title":"ChatGPT outperforms crowd workers for text-annotation tasks","volume":"120","author":"Gilardi","year":"2023","journal-title":"Proc Natl Acad Sci"},{"key":"2024102107521798400_ocae197-B6","first-page":"ocad259","article-title":"Improving large language models for clinical named entity recognition via prompt 
engineering","author":"Hu","year":"2024","journal-title":"J Am Med Inform Assoc"},{"issue":"9","key":"2024102107521798400_ocae197-B7","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btad557","article-title":"An extensive benchmark study on biomedical text generation and mining with ChatGPT","volume":"39","author":"Chen","year":"2023","journal-title":"Bioinformatics"},{"key":"2024102107521798400_ocae197-B8","first-page":"72","author":"Alsentzer","year":"2019"},{"key":"2024102107521798400_ocae197-B9","first-page":"1183","author":"Gal","year":"2017"},{"key":"2024102107521798400_ocae197-B10","first-page":"9368","author":"Beluch","year":"2018"},{"issue":"5","key":"2024102107521798400_ocae197-B11","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1136\/amiajnl-2011-000648","article-title":"Active learning for clinical text classification: is it better than random sampling?","volume":"19","author":"Figueroa","year":"2012","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2024102107521798400_ocae197-B12","doi-asserted-by":"crossref","first-page":"893","DOI":"10.1136\/amiajnl-2013-002516","article-title":"Supervised machine learning and active learning in classification of radiology reports","volume":"21","author":"Nguyen","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"2024102107521798400_ocae197-B13","first-page":"7949","author":"Ein-Dor","year":"2020"},{"key":"2024102107521798400_ocae197-B14","first-page":"363","author":"Gu\u00e9lorget","year":"2020"},{"issue":"12","key":"2024102107521798400_ocae197-B15","doi-asserted-by":"crossref","first-page":"2551","DOI":"10.1093\/jamia\/ocab158","article-title":"Active neural networks to detect mentions of changes to medication treatment in social media","volume":"28","author":"Weissenbacher","year":"2021","journal-title":"J Am Med Inform Assoc"},{"issue":"7","key":"2024102107521798400_ocae197-B16","first-page":"8897","article-title":"EASAL: entity-aware subsequence-based active 
learning for named entity recognition","volume":"37","author":"Liu","year":"2023","journal-title":"Proc AAAI Conf Artif Intell"},{"issue":"3","key":"2024102107521798400_ocae197-B17","doi-asserted-by":"crossref","first-page":"2433","DOI":"10.1007\/s11063-021-10737-x","article-title":"LTP: a new active learning strategy for CRF-based named entity recognition","volume":"54","author":"Liu","year":"2022","journal-title":"Neural Process Lett"},{"key":"2024102107521798400_ocae197-B18","first-page":"482","author":"Shelmanov","year":"2019"},{"key":"2024102107521798400_ocae197-B19","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.ijmedinf.2017.08.001","article-title":"Active learning reduces annotation time for clinical concept extraction","volume":"106","author":"Kholghi","year":"2017","journal-title":"Int J Med Inform"},{"key":"2024102107521798400_ocae197-B20","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.jbi.2015.09.010","article-title":"A study of active learning methods for named entity recognition in clinical text","volume":"58","author":"Chen","year":"2015","journal-title":"J Biomed Inform"},{"key":"2024102107521798400_ocae197-B21","first-page":"746","author":"Culotta","year":"2005"},{"key":"2024102107521798400_ocae197-B22","author":"Shen","year":", , .    
;\u00a02017:252-256"},{"issue":"2","key":"2024102107521798400_ocae197-B23","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1007\/s13748-021-00230-w","article-title":"Active learning approach using a modified least confidence sampling strategy for named entity recognition","volume":"10","author":"Agrawal","year":"2021","journal-title":"Prog Artif Intell"},{"key":"2024102107521798400_ocae197-B24","author":"Settles","year":"1079"},{"issue":"11","key":"2024102107521798400_ocae197-B25","doi-asserted-by":"crossref","first-page":"2543","DOI":"10.1002\/asi.23936","article-title":"Clinical information extraction using small data: an active learning approach based on sequence representations and word embeddings","volume":"68","author":"Kholghi","year":"2017","journal-title":"J Assoc Inf Sci Technol"},{"key":"2024102107521798400_ocae197-B26","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1186\/s12911-017-0466-9","article-title":"An active learning-enabled annotation system for clinical named entity recognition","volume":"17(Suppl 2)","author":"Chen","year":"2017","journal-title":"BMC Med Inform Decis Mak"},{"issue":"11","key":"2024102107521798400_ocae197-B27","doi-asserted-by":"crossref","first-page":"1314","DOI":"10.1093\/jamia\/ocz102","article-title":"Cost-aware active learning for named entity recognition in clinical text","volume":"26","author":"Wei","year":"2019","journal-title":"J Am Med Inform Assoc"},{"key":"2024102107521798400_ocae197-B28","author":"Shen","year":", , .    
;\u00a02004:589-596"},{"issue":"5","key":"2024102107521798400_ocae197-B29","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1136\/jamia.2010.003947","article-title":"Extracting medication information from clinical text","volume":"17","author":"Uzuner","year":"2010","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2024102107521798400_ocae197-B30","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1007\/s40264-018-0762-z","article-title":"Overview of the first natural language processing challenge for extracting medication, indication, and adverse drug events from electronic health record notes (MADE 1.0)","volume":"42","author":"Jagannatha","year":"2019","journal-title":"Drug Saf"},{"key":"2024102107521798400_ocae197-B31","first-page":"4171","author":"Devlin"},{"issue":"9","key":"2024102107521798400_ocae197-B32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3472291","article-title":"A survey of deep active learning","volume":"54","author":"Ren","year":"2021","journal-title":"ACM Comput Surv"},{"key":"2024102107521798400_ocae197-B33","first-page":"69","author":"Kim","year":"2006"},{"key":"2024102107521798400_ocae197-B34","author":"Reimers","year":"3992"},{"key":"2024102107521798400_ocae197-B35","doi-asserted-by":"crossref","first-page":"103481","DOI":"10.1016\/j.jbi.2020.103481","article-title":"Adversarial active learning for the identification of medical concepts and annotation inconsistency","volume":"108","author":"Yu","year":"2020","journal-title":"J Biomed Inform"},{"issue":"8","key":"2024102107521798400_ocae197-B36","first-page":"707","article-title":"Binary codes capable of correcting deletions, insertions, and reversals","volume":"10","author":"Levenshtein","year":"1966","journal-title":"Sov Phys Dokl"},{"issue":"2","key":"2024102107521798400_ocae197-B37","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1093\/jamia\/ocv069","article-title":"Active learning: a step towards automating medical concept 
extraction","volume":"23","author":"Kholghi","year":"2015","journal-title":"J Am Med Inform Assoc"},{"key":"2024102107521798400_ocae197-B38","author":"seqeval: a Python framework for sequence labeling evaluation","year":"2018"},{"key":"2024102107521798400_ocae197-B39","author":"Liu"},{"key":"2024102107521798400_ocae197-B40","first-page":"5753","author":"Yang","year":"2019"},{"key":"2024102107521798400_ocae197-B41","author":"Label Studio"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/31\/11\/2632\/59813511\/ocae197.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/31\/11\/2632\/59813511\/ocae197.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,21]],"date-time":"2024-10-21T07:52:37Z","timestamp":1729497157000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/31\/11\/2632\/7724491"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,31]]},"references-count":41,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,7,31]]},"published-print":{"date-parts":[[2024,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocae197","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,7,31]]}}}