{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T09:43:49Z","timestamp":1770457429923,"version":"3.49.0"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"S7","license":[{"start":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T00:00:00Z","timestamp":1598918400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T00:00:00Z","timestamp":1601424000000},"content-version":"vor","delay-in-days":29,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>While clinical entity recognition mostly aims at electronic health records (EHRs), there are also the demands of dealing with the other type of text data. Automatic medical diagnosis is an example of new applications using a different data source. In this work, we are interested in extracting Korean clinical entities from a new medical dataset, which is completely different from EHRs. The dataset is collected from an online QA site for medical diagnosis. Bidirectional Encoder Representations from Transformers (BERT), which is one of the best language representation models, is used to extract the entities.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>A slightly modified version of BERT labeling strategy replaces the original labeling to enhance the separation of postpositions in Korean. A new clinical entity recognition dataset that we construct, as well as a standard NER dataset, have been used for the experiments. A pre-trained multilingual BERT model is used for the initialization of the entity recognition model. BERT significantly outperforms a character-level bidirectional LSTM-CRF, a benchmark model, in terms of all metrics. The micro-averaged precision, recall, and f1 of BERT are 0.83, 0.85 and 0.84, whereas that of bi-LSTM-CRF are 0.82, 0.79 and 0.81 respectively. The recall values of BERT are especially better than that of the other model. It can be interpreted that the trained BERT model could detect out of vocabulary (OOV) words better than bi-LSTM-CRF.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>The recently developed BERT and its WordPiece tokenization are effective for the Korean clinical entity recognition. The experiments using a new dataset constructed for the purpose and a standard NER dataset show the superiority of BERT compared to a state-of-the-art method. To the best of our knowledge, this work is one of the first studies dealing with clinical entity extraction from non-EHR data.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12911-020-01241-8","type":"journal-article","created":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T10:04:01Z","timestamp":1601460241000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":31,"title":["Korean clinical entity recognition from diagnosis text using BERT"],"prefix":"10.1186","volume":"20","author":[{"given":"Young-Min","family":"Kim","sequence":"first","affiliation":[]},{"given":"Tae-Hoon","family":"Lee","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,30]]},"reference":[{"key":"1241_CR1","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1016\/j.jbi.2017.11.011","volume":"77","author":"Y Wang","year":"2018","unstructured":"Wang Y, Wang L, Rastegar-Mojarad M, Moon S, Shen F, Afzal N, Liu S, Zeng Y, Mehrabi S, Sohn S, Liu H. Clinical information extraction applications: A literature review. J Biomed Inform. 2018; 77:34\u201349.","journal-title":"J Biomed Inform"},{"issue":"5","key":"1241_CR2","doi-asserted-by":"publisher","first-page":"1589","DOI":"10.1109\/JBHI.2017.2767063","volume":"22","author":"B Shickel","year":"2018","unstructured":"Shickel B, Tighe P, Bihorac A, Rashidi P. Deep EHR: A survey of recent advances on deep learning techniques for electronic health record (EHR) analysis. IEEE J Biomed Health Inform. 2018; 22(5):1589\u2013604.","journal-title":"IEEE J Biomed Health Inform"},{"issue":"5","key":"1241_CR3","doi-asserted-by":"publisher","first-page":"552","DOI":"10.1136\/amiajnl-2011-000203","volume":"18","author":"O Uzuner","year":"2011","unstructured":"Uzuner O, South B, Shen S, DuVall S. 2010 i2b2\/va challenge on concepts, assertions, and relations in clinical text. J Am Med Inform Assoc. 2011; 18(5):552\u20136.","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"1241_CR4","doi-asserted-by":"publisher","first-page":"601","DOI":"10.1136\/amiajnl-2011-000163","volume":"18","author":"M Jiang","year":"2011","unstructured":"Jiang M, Chen Y, Liu M, Rosenbloom S, Mani S, Denny J, Xu H. A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. J Am Med Inform Assoc. 2011; 18(5):601\u20136.","journal-title":"J Am Med Inform Assoc"},{"key":"1241_CR5","volume-title":"2017 24th Asia-Pacific Software Engineering Conference (APSEC)","author":"L Feng","year":"2017","unstructured":"Feng L, Chiam Y, Lo S. Text-mining techniques and tools for systematic literature reviews: A systematic literature review. In: 2017 24th Asia-Pacific Software Engineering Conference (APSEC). Nanjing: IEEE: 2017. p. 41\u201350."},{"key":"1241_CR6","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1016\/j.jbi.2014.01.004","volume":"(49)","author":"R Zhang","year":"2014","unstructured":"Zhang R, Cairelli M, Fiszman M, Rosemblat G, Kilicoglu H, Rindflesch T, Pakhomov S, Melton G. Using semantic predications to uncover drug-drug interactions in clinical data. J Biomed Inform. 2014; (49):134\u201347.","journal-title":"J Biomed Inform"},{"key":"1241_CR7","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"Z Wei","year":"2018","unstructured":"Wei Z, Liu Q, Peng B, Tou H, Chen T, Huang X, Wong K-F, Dai X. Task-oriented dialogue system for automatic diagnosis. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Melbourne: ACL: 2018. p. 201\u20137."},{"key":"1241_CR8","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"J Devlin","year":"2019","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K. Bert: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis: ACL: 2019. p. 4171\u201386."},{"key":"1241_CR9","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts","author":"J Gao","year":"2018","unstructured":"Gao J, Galley M, Li L. Neural approaches to conversational AI. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts. Melbourne: ACL: 2018. p. 2\u20137."},{"issue":"2","key":"1241_CR10","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1145\/3166054.3166058","volume":"19","author":"H Chen","year":"2017","unstructured":"Chen H, Liu X, Yin D, Tang J. A survey on dialogue systems: Recent advances and new frontiers. SIGKDD Explor Newsl. 2017; 19(2):25\u201335.","journal-title":"SIGKDD Explor Newsl"},{"key":"1241_CR11","volume-title":"Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"X Li","year":"2017","unstructured":"Li X, Chen Y-N, Li L, Gao J, Celikyilmaz A. End-to-end task-completion neural dialogue systems. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Taipei: ACL: 2017. p. 733\u201343."},{"key":"1241_CR12","first-page":"1","volume":"abs\/1508.01991","author":"Z Huang","year":"2015","unstructured":"Huang Z, Xu W, Yu K. Bidirectional lstm-crf models for sequence tagging. CoRR. 2015; abs\/1508.01991:1\u201310.","journal-title":"CoRR"},{"key":"1241_CR13","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"X Ma","year":"2016","unstructured":"Ma X, Hovy E. End-to-end sequence labeling via bi-directional lstm-cnns-crf. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin: ACL: 2016. p. 1064\u201374."},{"key":"1241_CR14","volume-title":"Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)","author":"R Chalapathy","year":"2016","unstructured":"Chalapathy R, Zare Borzeshi E, Piccardi M. Bidirectional LSTM-CRF for clinical concept extraction. In: Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP). Osaka: COLING: 2016. p. 7\u201312."},{"issue":"17","key":"1241_CR15","first-page":"53","volume":"67","author":"Z Liu","year":"2017","unstructured":"Liu Z, Yang M, Wang X, Chen Q, Tang B, Wang Z, Xu H. Entity recognition from clinical texts via recurrent neural network. BMC Med Informatics Decis Mak. 2017; 67(17):53\u201361.","journal-title":"BMC Med Informatics Decis Mak"},{"key":"1241_CR16","volume-title":"Advances in Neural Information Processing Systems 30","author":"A Vaswani","year":"2017","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A, Kaiser Lu, Polosukhin I. Attention is all you need. In: Advances in Neural Information Processing Systems 30. Long Beach: Curran Associates, Inc.: 2017. p. 5998\u20136008."},{"key":"1241_CR17","doi-asserted-by":"crossref","unstructured":"Liu B, Lane I. Attention-based recurrent neural network models for joint intent detection and slot filling. In: INTERSPEECH: 2016.","DOI":"10.21437\/Interspeech.2016-1352"},{"key":"1241_CR18","doi-asserted-by":"crossref","unstructured":"Tan Z, Wang M, Xie J, Chen Y, Shi X. Deep semantic role labeling with self-attention. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence: 2018. p. 4929\u201336.","DOI":"10.1609\/aaai.v32i1.11928"},{"key":"1241_CR19","volume-title":"Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers","author":"M Rei","year":"2016","unstructured":"Rei M, Crichton G, Pyysalo S. Attending to characters in neural sequence labeling models. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. Osaka: COLING: 2016. p. 309\u201318."},{"key":"1241_CR20","volume-title":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","author":"S Kwon","year":"2017","unstructured":"Kwon S, Ko Y, Seo J. A robust named-entity recognition system using syllable bigram embedding with eojeol prefix information. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. Singapore: ACM: 2017. p. 2139\u201342."},{"key":"1241_CR21","doi-asserted-by":"crossref","unstructured":"Park S, Byun J, Baek S, Cho Y, Oh A. Subword-level word vector representations for korean. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: 2018. p. 2429\u201338.","DOI":"10.18653\/v1\/P18-1226"},{"key":"1241_CR22","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1016\/j.csl.2018.09.005","volume":"54","author":"S Na","year":"2019","unstructured":"Na S, Kim H, Min J, Kim K. Improving LSTM crfs using character-based compositions for korean named entity recognition. Comput Speech Lang. 2019; 54:106\u201321.","journal-title":"Comput Speech Lang"},{"key":"1241_CR23","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"M Peters","year":"2018","unstructured":"Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L. Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. New Orleans: ACL: 2018. p. 2227\u201337."},{"key":"1241_CR24","unstructured":"Alec Radford TS, Karthik Narasimhan, Sutskever I. Improving language understanding with unsupervised learning. Technical report, OpenAI. 2018. https:\/\/openai.com\/blog\/language-unsupervised\/."},{"key":"1241_CR25","unstructured":"Wu Y, Schuster M, Chen Z, Le Q, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser L, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J. Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. CoRR. 2016; abs\/1609.08144."},{"issue":"3","key":"1241_CR26","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1109\/MCI.2018.2840738","volume":"13","author":"T Young","year":"2017","unstructured":"Young T, Hazarika D, Poria S, Cambria E. Recent trends in deep learning based natural language processing. IEEE Comput Intell Mag. 2017; 13(3):55\u201375.","journal-title":"IEEE Comput Intell Mag"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-020-01241-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-020-01241-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-020-01241-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,11,21]],"date-time":"2022-11-21T09:17:39Z","timestamp":1669022259000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-020-01241-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9]]},"references-count":26,"journal-issue":{"issue":"S7","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["1241"],"URL":"https:\/\/doi.org\/10.1186\/s12911-020-01241-8","relation":{},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9]]},"assertion":[{"value":"4 August 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 September 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 September 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"242"}}