{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T08:19:37Z","timestamp":1772525977485,"version":"3.50.1"},"reference-count":62,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,5,8]],"date-time":"2022-05-08T00:00:00Z","timestamp":1651968000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,5,8]],"date-time":"2022-05-08T00:00:00Z","timestamp":1651968000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100015703","name":"Philips Research North America","doi-asserted-by":"crossref","award":["NLP for Portuguese clinical texts"],"award-info":[{"award-number":["NLP for Portuguese clinical texts"]}],"id":[{"id":"10.13039\/100015703","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior","doi-asserted-by":"publisher","award":["Finance Code 001"],"award-info":[{"award-number":["Finance Code 001"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Biomed Semant"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>The high volume of research focusing on extracting patient information from electronic health records (EHRs) has led to an increase in the demand for annotated corpora, which are a precious resource for both the development and evaluation of natural language processing (NLP) algorithms. 
The absence of a multipurpose clinical corpus outside the scope of the English language, especially in Brazilian Portuguese, is glaring and severely impacts scientific progress in the biomedical NLP field.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>In this study, a semantically annotated corpus was developed using clinical text from multiple medical specialties, document types, and institutions. In addition, we present (1) a survey listing common aspects, differences, and lessons learned from previous research, (2) a fine-grained annotation schema that can be replicated to guide other annotation initiatives, (3) a web-based annotation tool focusing on an annotation suggestion feature, and (4) both intrinsic and extrinsic evaluation of the annotations.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>This study resulted in SemClinBr, a corpus that has 1000 clinical notes, labeled with 65,117 entities and 11,263 relations. In addition, both negation cues and medical abbreviation dictionaries were generated from the annotations. The average annotator agreement score varied from 0.71 (applying strict match) to 0.92 (considering a relaxed match) while accepting partial overlaps and hierarchically related semantic types. 
The extrinsic evaluation, when applying the corpus to two downstream NLP tasks, demonstrated the reliability and usefulness of annotations, with the systems achieving results that were consistent with the agreement scores.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>The SemClinBr corpus and other resources produced in this work can support clinical NLP studies, providing a common development and evaluation resource for the research community, boosting the utilization of EHRs in both clinical practice and biomedical research. To the best of our knowledge, SemClinBr is the first available Portuguese clinical corpus.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s13326-022-00269-1","type":"journal-article","created":{"date-parts":[[2022,5,8]],"date-time":"2022-05-08T05:02:58Z","timestamp":1651986178000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["SemClinBr - a multi-institutional and multi-specialty semantically annotated corpus for Portuguese clinical NLP tasks"],"prefix":"10.1186","volume":"13","author":[{"given":"Lucas Emanuel Silva e","family":"Oliveira","sequence":"first","affiliation":[]},{"given":"Ana Carolina","family":"Peters","sequence":"additional","affiliation":[]},{"given":"Adalniza Moura Pucca","family":"da Silva","sequence":"additional","affiliation":[]},{"given":"Caroline Pilatti","family":"Gebeluca","sequence":"additional","affiliation":[]},{"given":"Yohan Bonescki","family":"Gumiel","sequence":"additional","affiliation":[]},{"given":"Lilian Mie Mukai","family":"Cintho","sequence":"additional","affiliation":[]},{"given":"Deborah Ribeiro","family":"Carvalho","sequence":"additional","affiliation":[]},{"given":"Sadid","family":"Al Hasan","sequence":"additional","affiliation":[]},{"given":"Claudia Maria 
Cabral","family":"Moro","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,5,8]]},"reference":[{"key":"269_CR1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3127881","volume":"50","author":"P Yadav","year":"2018","unstructured":"Yadav P, Steinbach M, Kumar V, Simon G. Mining electronic health records (EHRs): a survey. ACM Comput Surv. 2018;50:1\u201340. https:\/\/doi.org\/10.1145\/3127881.","journal-title":"ACM Comput Surv"},{"key":"269_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3389\/fmed.2019.00066","volume":"6","author":"M Assale","year":"2019","unstructured":"Assale M, Dui LG, Cina A, Seveso A, Cabitza F. The revival of the notes field: leveraging the unstructured content in electronic health records. Front Med. 2019;6:1\u201323. https:\/\/doi.org\/10.3389\/fmed.2019.00066.","journal-title":"Front Med"},{"key":"269_CR3","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1186\/s13326-018-0179-8","volume":"9","author":"A N\u00e9v\u00e9ol","year":"2018","unstructured":"N\u00e9v\u00e9ol A, Dalianis H, Velupillai S, Savova G, Zweigenbaum P. Clinical natural language processing in languages other than English: opportunities and challenges. J Biomed Semantics. 2018;9:12. https:\/\/doi.org\/10.1186\/s13326-018-0179-8.","journal-title":"J Biomed Semantics"},{"key":"269_CR4","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1186\/s13326-017-0153-x","volume":"8","author":"J Jovanovi\u0107","year":"2017","unstructured":"Jovanovi\u0107 J, Bagheri E. Semantic annotation in biomedicine: the current landscape. J Biomed Semantics. 2017;8:44. https:\/\/doi.org\/10.1186\/s13326-017-0153-x.","journal-title":"J Biomed Semantics"},{"key":"269_CR5","unstructured":"Summary of the HIPAA privacy rule. https:\/\/www.hhs.gov\/hipaa\/for-professionals\/privacy\/laws-regulations\/index.html. 
Accessed 25 Apr\u00a02022."},{"key":"269_CR6","doi-asserted-by":"publisher","first-page":"950","DOI":"10.1016\/j.jbi.2008.12.013","volume":"42","author":"A Roberts","year":"2009","unstructured":"Roberts A, Gaizauskas R, Hepple M, Demetriou G, Guo Y, Roberts I, et al. Building a semantically annotated corpus of clinical texts. J Biomed Inform. 2009;42:950\u201366. https:\/\/doi.org\/10.1016\/j.jbi.2008.12.013.","journal-title":"J Biomed Inform"},{"key":"269_CR7","doi-asserted-by":"publisher","first-page":"18","DOI":"10.3115\/1667884.1667888","volume-title":"Proceedings of the ACL-IJCNLP 2009 Student Research Workshop on \u2013 ACL-IJCNLP \u201809","author":"Y Wang","year":"2009","unstructured":"Wang Y. Annotating and recognising named entities in clinical notes. In:  Proceedings of the ACL-IJCNLP 2009 Student Research Workshop on \u2013 ACL-IJCNLP \u201809. Morristown: Association for Computational Linguistics; 2009. p. 18. https:\/\/doi.org\/10.3115\/1667884.1667888."},{"key":"269_CR8","doi-asserted-by":"publisher","first-page":"552","DOI":"10.1136\/amiajnl-2011-000203","volume":"18","author":"\u00d6 Uzuner","year":"2011","unstructured":"Uzuner \u00d6, South BR, Shen S, DuVall SL. 2010 i2b2\/VA challenge on concepts, assertions, and relations in clinical text. J Am Med Inform Assoc. 2011;18:552\u20136. https:\/\/doi.org\/10.1136\/amiajnl-2011-000203.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR9","doi-asserted-by":"publisher","first-page":"212","DOI":"10.1007\/978-3-642-40802-1_24","volume-title":"Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)","author":"H Suominen","year":"2013","unstructured":"Suominen H, Salanter\u00e4 S, Velupillai S, Chapman WW, Savova G, Elhadad N, et al. Overview of the ShARe\/CLEF eHealth evaluation lab 2013. In:  Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 2013. p. 212\u201331. 
https:\/\/doi.org\/10.1007\/978-3-642-40802-1_24."},{"key":"269_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jbi.2013.12.006","volume":"47","author":"RI Do\u011fan","year":"2014","unstructured":"Do\u011fan RI, Leaman R, Lu Z. NCBI disease corpus: a resource for disease name recognition and concept normalization. J Biomed Inform. 2014;47:1\u201310. https:\/\/doi.org\/10.1016\/j.jbi.2013.12.006.","journal-title":"J Biomed Inform"},{"key":"269_CR11","doi-asserted-by":"publisher","first-page":"54","DOI":"10.3115\/v1\/S14-2007","volume-title":"Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)","author":"S Pradhan","year":"2014","unstructured":"Pradhan S, Elhadad N, Chapman W, Manandhar S, Savova G. SemEval-2014 Task 7: Analysis of clinical text. In:  Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Stroudsburg: Association for Computational Linguistics; 2014. p. 54\u201362. https:\/\/doi.org\/10.3115\/v1\/S14-2007."},{"key":"269_CR12","doi-asserted-by":"publisher","first-page":"303","DOI":"10.18653\/v1\/S15-2051","volume-title":"Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)","author":"N Elhadad","year":"2015","unstructured":"Elhadad N, Pradhan S, Gorman S, Manandhar S, Chapman W, Savova G. SemEval-2015 Task 14: Analysis of clinical text. In:  Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Stroudsburg: Association for Computational Linguistics; 2015. p. 303\u201310. https:\/\/doi.org\/10.18653\/v1\/S15-2051."},{"issue":"Supplement","key":"269_CR13","doi-asserted-by":"publisher","first-page":"S78","DOI":"10.1016\/j.jbi.2015.05.009","volume":"58","author":"A Stubbs","year":"2015","unstructured":"Stubbs A, Uzuner \u00d6. Annotating risk factors for heart disease in clinical narratives for diabetic patients. J Biomed Inform. 2015;58(Supplement):S78\u201391. 
https:\/\/doi.org\/10.1016\/j.jbi.2015.05.009.","journal-title":"J Biomed Inform"},{"key":"269_CR14","doi-asserted-by":"publisher","first-page":"318","DOI":"10.1016\/j.jbi.2015.06.016","volume":"56","author":"M Oronoz","year":"2015","unstructured":"Oronoz M, Gojenola K, P\u00e9rez A, de Ilarraza AD, Casillas A. On the creation of a clinical gold standard corpus in Spanish: mining adverse drug reactions. J Biomed Inform. 2015;56:318\u201332. https:\/\/doi.org\/10.1016\/j.jbi.2015.06.016.","journal-title":"J Biomed Inform"},{"key":"269_CR15","doi-asserted-by":"publisher","first-page":"571","DOI":"10.1007\/s10579-017-9382-y","volume":"52","author":"L Campillos","year":"2018","unstructured":"Campillos L, Del\u00e9ger L, Grouin C, Hamon T, Ligozat A-L, N\u00e9v\u00e9ol A. A French clinical corpus with comprehensive semantic annotations: development of the medical entity and relation LIMSI annOtated text corpus (Merlot). Lang Resource Eval. 2018;52:571\u2013601. https:\/\/doi.org\/10.1007\/s10579-017-9382-y.","journal-title":"Lang Resource Eval"},{"key":"269_CR16","unstructured":"Xia F, Yetisgen-Yildiz M. Clinical corpus annotation: challenges and strategies. In:  Proceedings of the Third Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM\u20192012) of the International Conference on Language Resources and Evaluation (LREC). Istanbul: European Language Resources Association (ELRA);\u00a02012. http:\/\/faculty.washington.edu\/melihay\/publications\/LREC_BioTxtM_2012.pdf."},{"key":"269_CR17","doi-asserted-by":"publisher","DOI":"10.1075\/nlp.11","volume-title":"Biomedical natural language processing","author":"K Bretonnel Cohen","year":"2014","unstructured":"Bretonnel Cohen K, Demner-Fushman D. Biomedical natural language processing. Amsterdam: John Benjamins Publishing Company; 2014. 
https:\/\/doi.org\/10.1075\/nlp.11."},{"key":"269_CR18","first-page":"39","volume-title":"En: Technologia Del Habla and II Iberian SL Tech Workshop VI Jornadas","author":"L Ferreira","year":"2010","unstructured":"Ferreira L, Teixeira A, JPS C. Information extraction from Portuguese hospital discharge letters. In:  En: Technologia Del Habla and II Iberian SL Tech Workshop VI Jornadas; 2010. p. 39\u201342."},{"key":"269_CR19","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1055\/s-0038-1634945","volume":"32","author":"DAB Lindberg","year":"1993","unstructured":"Lindberg DAB, Humphreys BL, McCray AT. The unified medical language system. Methods Inf Med. 1993;32:281\u201391.","journal-title":"Methods Inf Med"},{"key":"269_CR20","doi-asserted-by":"publisher","first-page":"550","DOI":"10.1197\/jamia.M2444","volume":"14","author":"O Uzuner","year":"2007","unstructured":"Uzuner O, Luo Y, Szolovits P. Evaluating the state-of-the-art in automatic de-identification. J Am Med Inform Assoc. 2007;14:550\u201363. https:\/\/doi.org\/10.1197\/jamia.M2444.","journal-title":"J Am Med Inform Assoc"},{"issue":"Supplement","key":"269_CR21","doi-asserted-by":"publisher","first-page":"S11","DOI":"10.1016\/j.jbi.2015.06.007","volume":"58","author":"A Stubbs","year":"2015","unstructured":"Stubbs A, Kotfila C, Uzuner \u00d6. Automated systems for the de-identification of longitudinal clinical narratives: overview of 2014 i2b2\/UTHealth shared task track 1. J Biomed Inform. 2015;58(Supplement):S11\u20139. https:\/\/doi.org\/10.1016\/j.jbi.2015.06.007.","journal-title":"J Biomed Inform"},{"key":"269_CR22","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1197\/jamia.M2408","volume":"15","author":"O Uzuner","year":"2008","unstructured":"Uzuner O, Goldstein I, Luo Y, Kohane I. Identifying patient smoking status from medical discharge records. J Am Med Inform Assoc. 2008;15:14\u201324. 
https:\/\/doi.org\/10.1197\/jamia.M2408.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR23","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1197\/jamia.M3115","volume":"16","author":"O Uzuner","year":"2009","unstructured":"Uzuner O. Recognizing obesity and comorbidities in sparse data. J Am Med Inform Assoc. 2009;16:561\u201370. https:\/\/doi.org\/10.1197\/jamia.M3115.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR24","doi-asserted-by":"publisher","first-page":"514","DOI":"10.1136\/jamia.2010.003947","volume":"17","author":"O Uzuner","year":"2010","unstructured":"Uzuner O, Solti I, Cadag E. Extracting medication information from clinical text. J Am Med Inform Assoc. 2010;17:514\u20138. https:\/\/doi.org\/10.1136\/jamia.2010.003947.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR25","doi-asserted-by":"publisher","first-page":"786","DOI":"10.1136\/amiajnl-2011-000784","volume":"19","author":"O Uzuner","year":"2012","unstructured":"Uzuner O, Bodnari A, Shen S, Forbush T, Pestian J, South BR. Evaluating the state of the art in coreference resolution for electronic medical records. J Am Med Inform Assoc. 2012;19:786\u201391. https:\/\/doi.org\/10.1136\/amiajnl-2011-000784.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR26","doi-asserted-by":"publisher","first-page":"806","DOI":"10.1136\/amiajnl-2013-001628","volume":"20","author":"W Sun","year":"2013","unstructured":"Sun W, Rumshisky A, Uzuner O. Evaluating temporal relations in clinical text: 2012 i2b2 challenge. J Am Med Inform Assoc. 2013;20:806\u201313. https:\/\/doi.org\/10.1136\/amiajnl-2013-001628.","journal-title":"J Am Med Inform Assoc"},{"issue":"Supplement","key":"269_CR27","doi-asserted-by":"publisher","first-page":"S67","DOI":"10.1016\/j.jbi.2015.07.001","volume":"58","author":"A Stubbs","year":"2015","unstructured":"Stubbs A, Kotfila C, Xu H, Uzuner \u00d6. 
Identifying risk factors for heart disease over time: overview of 2014 i2b2\/UTHealth shared task track 2. J Biomed Inform. 2015;58(Supplement):S67\u201377. https:\/\/doi.org\/10.1016\/j.jbi.2015.07.001.","journal-title":"J Biomed Inform"},{"key":"269_CR28","doi-asserted-by":"publisher","first-page":"519","DOI":"10.1136\/jamia.2010.004200","volume":"17","author":"O Uzuner","year":"2010","unstructured":"Uzuner O, Solti I, Xia F, Cadag E. Community annotation experiment for ground truth generation for the i2b2 medication challenge. J Am Med Inform Assoc. 2010;17:519\u201323. https:\/\/doi.org\/10.1136\/jamia.2010.004200.","journal-title":"J Am Med Inform Assoc"},{"issue":"Supplement","key":"269_CR29","doi-asserted-by":"publisher","first-page":"S5","DOI":"10.1016\/j.jbi.2013.07.004","volume":"46","author":"W Sun","year":"2013","unstructured":"Sun W, Rumshisky A, Uzuner O. Annotating temporal information in clinical narratives. J Biomed Inform. 2013;46(Supplement):S5\u2013S12. https:\/\/doi.org\/10.1016\/j.jbi.2013.07.004.","journal-title":"J Biomed Inform"},{"issue":"Supplement","key":"269_CR30","doi-asserted-by":"publisher","first-page":"S20","DOI":"10.1016\/j.jbi.2015.07.020","volume":"58","author":"A Stubbs","year":"2015","unstructured":"Stubbs A, Uzuner \u00d6. Annotating longitudinal clinical narratives for de-identification: the 2014 i2b2\/UTHealth corpus. J Biomed Inform. 2015;58(Supplement):S20\u20139. https:\/\/doi.org\/10.1016\/j.jbi.2015.07.020.","journal-title":"J Biomed Inform"},{"key":"269_CR31","doi-asserted-by":"publisher","first-page":"1052","DOI":"10.18653\/v1\/S16-1165","volume-title":"Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)","author":"S Bethard","year":"2016","unstructured":"Bethard S, Savova G, Chen W-T, Derczynski L, Pustejovsky J, Verhagen M. SemEval-2016 task 12: clinical TempEval. In:  Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). 
Stroudsburg: Association for Computational Linguistics; 2016. p. 1052\u201362. https:\/\/doi.org\/10.18653\/v1\/S16-1165."},{"key":"269_CR32","doi-asserted-by":"publisher","first-page":"565","DOI":"10.18653\/v1\/S17-2093","volume-title":"Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)","author":"S Bethard","year":"2017","unstructured":"Bethard S, Savova G, Palmer M, Pustejovsky J. SemEval-2017 Task 12: Clinical TempEval. In:  Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Stroudsburg: Association for Computational Linguistics; 2017. p. 565\u201372. https:\/\/doi.org\/10.18653\/v1\/S17-2093."},{"key":"269_CR33","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1007\/978-3-319-11382-1_17","volume-title":"Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)","author":"L Kelly","year":"2014","unstructured":"Kelly L, Goeuriot L, Suominen H, Schreck T, Leroy G, Mowery DL, et al. Overview of the ShARe\/CLEF eHealth evaluation lab 2014. In:  Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 2014. p. 172\u201391. https:\/\/doi.org\/10.1007\/978-3-319-11382-1_17."},{"key":"269_CR34","doi-asserted-by":"publisher","first-page":"2033","DOI":"10.18653\/v1\/D18-1228","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"P Patel","year":"2018","unstructured":"Patel P, Davey D, Panchal V, Pathak P. Annotation of a large clinical entity corpus. In:  Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels. https:\/\/www.aclweb.org\/anthology\/D18-1228: Association for Computational Linguistics; 2018. p. 
2033\u201342."},{"key":"269_CR35","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1162\/tacl_a_00172","volume":"2","author":"WF Styler","year":"2014","unstructured":"Styler WF, Bethard S, Finan S, Palmer M, Pradhan S, de Groen PC, et al. Temporal annotation in the clinical domain. Trans Assoc Comput Linguist. 2014;2:143\u201354 http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/29082229.","journal-title":"Trans Assoc Comput Linguist"},{"key":"269_CR36","doi-asserted-by":"publisher","first-page":"922","DOI":"10.1136\/amiajnl-2012-001317","volume":"20","author":"D Albright","year":"2013","unstructured":"Albright D, Lanfranchi A, Fredriksen A, Styler WF, Warner C, Hwang JD, et al. Towards comprehensive syntactic and semantic annotations of the clinical narrative. J Am Med Inform Assoc. 2013;20:922\u201330. https:\/\/doi.org\/10.1136\/amiajnl-2012-001317.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR37","doi-asserted-by":"publisher","first-page":"216","DOI":"10.3233\/978-1-60750-928-8-216","volume":"84","author":"AT McCray","year":"2001","unstructured":"McCray AT, Burgun A, Bodenreider O. Aggregating UMLS semantic types for reducing conceptual complexity. Stud Health Technol Inform. 2001;84:216\u201320. https:\/\/doi.org\/10.3233\/978-1-60750-928-8-216.","journal-title":"Stud Health Technol Inform"},{"key":"269_CR38","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1186\/s13326-017-0135-z","volume":"8","author":"L Del\u00e9ger","year":"2017","unstructured":"Del\u00e9ger L, Campillos L, Ligozat AL, N\u00e9v\u00e9ol A. Design of an extensive information representation scheme for clinical narratives. J Biomed Semantics. 2017;8:37. 
https:\/\/doi.org\/10.1186\/s13326-017-0135-z.","journal-title":"J Biomed Semantics"},{"key":"269_CR39","first-page":"69","volume-title":"Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)","author":"R Roller","year":"2016","unstructured":"Roller R, Uszkoreit H, Xu F, Seiffe L, Mikhailov M, Staeck O, et al. A fine-grained corpus annotation schema of German nephrology records. In:  Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP). https:\/\/www.aclweb.org\/anthology\/W16-4210:. Osaka: The COLING; Organizing Committee; 2016. p. 69\u201377."},{"key":"269_CR40","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1016\/j.jbi.2014.01.012","volume":"49","author":"M Skeppstedt","year":"2014","unstructured":"Skeppstedt M, Kvist M, Nilsson GH, Dalianis H. Automatic recognition of disorders, findings, pharmaceuticals and body structures from clinical text: an annotation and machine learning study. J Biomed Inform. 2014;49:148\u201358. https:\/\/doi.org\/10.1016\/j.jbi.2014.01.012.","journal-title":"J Biomed Inform"},{"key":"269_CR41","unstructured":"Deleger L, Li Q, Lingren T, Kaiser M, Molnar K, Stoutenborough L, et al. Building gold standard corpora for medical natural language processing tasks. AMIA Annu Symp Proceedings AMIA Symp. 2012;2012:144\u201353.\u00a0http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?artid=PMC3540456."},{"key":"269_CR42","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1515\/cllt-2016-0046","volume":"15","author":"L Crible","year":"2017","unstructured":"Crible L, Degand L. Reliability vs. granularity in discourse annotation: what is the trade-off? Corpus Linguist Theor. 2017;15:71\u201399. https:\/\/doi.org\/10.1515\/cllt-2016-0046.","journal-title":"Corpus Linguist Theor"},{"key":"269_CR43","first-page":"13","volume":"22","author":"E Hovy","year":"2010","unstructured":"Hovy E, Lavid J. 
Towards a \u201cscience\u201d of corpus annotation: a new methodological challenge for corpus linguistics. Int J Transl. 2010;22:13\u201336.","journal-title":"Int J Transl"},{"key":"269_CR44","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1162\/coli.07-034-R2","volume":"34","author":"R Artstein","year":"2008","unstructured":"Artstein R, Poesio M. Inter-coder agreement for computational linguistics. Comput Linguist. 2008;34:555\u201396. https:\/\/doi.org\/10.1162\/coli.07-034-R2.","journal-title":"Comput Linguist"},{"key":"269_CR45","doi-asserted-by":"publisher","first-page":"296","DOI":"10.1197\/jamia.M1733","volume":"12","author":"G Hripcsak","year":"2005","unstructured":"Hripcsak G, Rothschild AS. Agreement, the F-measure, and reliability in information retrieval. J Am Med Inform Assoc. 2005;12:296\u20138. https:\/\/doi.org\/10.1197\/jamia.M1733.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR46","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1136\/amiajnl-2013-002544","volume":"22","author":"S Pradhan","year":"2015","unstructured":"Pradhan S, Elhadad N, South BR, Martinez D, Christensen L, Vogel A, et al. Evaluating the state of the art in disorder recognition and normalization of the clinical narrative. J Am Med Inform Assoc. 2015;22:143\u201354. https:\/\/doi.org\/10.1136\/amiajnl-2013-002544.","journal-title":"J Am Med Inform Assoc"},{"key":"269_CR47","doi-asserted-by":"publisher","first-page":"159","DOI":"10.2307\/2529310","volume":"33","author":"JR Landis","year":"1977","unstructured":"Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159\u201374. https:\/\/doi.org\/10.2307\/2529310.","journal-title":"Biometrics"},{"key":"269_CR48","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1162\/coli.2008.34.3.319","volume":"34","author":"D Reidsma","year":"2008","unstructured":"Reidsma D, Carletta J. Reliability measurement without limits. Comput Linguist. 
2008;34:319\u201326. https:\/\/doi.org\/10.1162\/coli.2008.34.3.319.","journal-title":"Comput Linguist"},{"key":"269_CR49","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1007\/978-3-319-78503-5_4","volume-title":"Clinical Text Mining","author":"H Dalianis","year":"2018","unstructured":"Dalianis H. Characteristics of patient records and clinical corpora. In:  Clinical Text Mining. Cham: Springer International Publishing; 2018. p. 21\u201334. https:\/\/doi.org\/10.1007\/978-3-319-78503-5_4."},{"key":"269_CR50","unstructured":"Andrade GHB, Oliveira LES, Moro CMC. Metodologias E Ferramentas Para Anota\u00e7\u00e3o De Narrativas Cl\u00ednicas. In: CBIS Congresso Brasileiro de Inform\u00e1tica em Sa\u00fade Goi\u00e2nia, vol. 2016\u2013XV. 2016. p. 1031\u201340."},{"key":"269_CR51","unstructured":"Oliveira LES, Hasan SA, Farri O, Moro CMC. Translation of UMLS ontologies from European Portuguese to Brazilian Portuguese. CBIS. In:  Congresso Brasileiro de Inform\u00e1tica em Sa\u00fade Goi\u00e2nia, vol. 2016-XV; 2016. p. 373\u201380."},{"key":"269_CR52","doi-asserted-by":"publisher","unstructured":"Oliveira LES, Gebeluca CP, Silva AMP, Moro CMC, Hasan SA, Farri O. A statistics and UMLS-based tool for assisted semantic annotation of Brazilian clinical documents. In:  IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 2017. p. 1072\u20138. https:\/\/doi.org\/10.1109\/BIBM.2017.8217805.","DOI":"10.1109\/BIBM.2017.8217805"},{"key":"269_CR53","unstructured":"Boisen S, Crystal MR, Schwartz R, Stone R, Weischedel R. Annotating resources for information extraction. In:  Proceedings of the Second International Conference on Language Resources and Evaluation (LREC\u201900). Athens: European Language Resources Association (ELRA); 2000. p. 1211\u20134. 
http:\/\/www.lrec-conf.org\/proceedings\/lrec2000\/pdf\/263.pdf."},{"key":"269_CR54","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1145\/2347736.2347755","volume":"55","author":"P Domingos","year":"2012","unstructured":"Domingos P. A few useful things to know about machine learning. Commun ACM. 2012;55:78\u201387. https:\/\/doi.org\/10.1145\/2347736.2347755.","journal-title":"Commun ACM"},{"key":"269_CR55","first-page":"89","volume":"1 Maio","author":"L Ferreira","year":"2009","unstructured":"Ferreira L, Oliveira CT, Teixeira A, Cunha JPda S. Extrac\u00e7\u00e3o de informa\u00e7\u00e3o de Relat\u00f3rios m\u00e9dicos. Linguam\u00e1tica. 2009;1 Maio:89\u2013102.","journal-title":"Linguam\u00e1tica"},{"key":"269_CR56","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1017\/S1351324920000352","volume":"27","author":"C Dalloux","year":"2021","unstructured":"Dalloux C, Claveau V, Grabar N, Oliveira LES, Moro CMC, Gumiel YB, et al. Supervised learning for the detection of negation and of its scope in French and Brazilian Portuguese biomedical corpora. Nat Lang Eng. 2021;27:181\u2013201. https:\/\/doi.org\/10.1017\/S1351324920000352.","journal-title":"Nat Lang Eng"},{"key":"269_CR57","doi-asserted-by":"publisher","first-page":"318","DOI":"10.5753\/sbcas.2019.6269","volume-title":"Anais do XIX Simp\u00f3sio Brasileiro de Computa\u00e7\u00e3o Aplicada \u00e0 Sa\u00fade","author":"JVA de Souza","year":"2019","unstructured":"de Souza JVA, Gumiel YB, Oliveira LES, Moro CMC. Named entity recognition for clinical Portuguese corpus with conditional random fields and semantic groups. In:  Anais do XIX Simp\u00f3sio Brasileiro de Computa\u00e7\u00e3o Aplicada \u00e0 Sa\u00fade. Niter\u00f3i: Sociedade Brasileira de Computa\u00e7\u00e3o; 2019. p. 318\u201323."},{"key":"269_CR58","doi-asserted-by":"publisher","unstructured":"Schneider ETR, de Souza JVA, Knafou J, Oliveira LES, Copara J, Gumiel YB, et al. 
BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition. In:  Proceedings of the 3rd Clinical Natural Language Processing Workshop. Stroudsburg: Association for Computational Linguistics; 2020. p. 65\u201372. https:\/\/doi.org\/10.18653\/v1\/2020.clinicalnlp-1.7.","DOI":"10.18653\/v1\/2020.clinicalnlp-1.7"},{"key":"269_CR59","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1186\/2041-1480-5-6","volume":"5","author":"A Henriksson","year":"2014","unstructured":"Henriksson A, Moen H, Skeppstedt M, Daudaravi\u010dius V, Duneld M. Synonym extraction and abbreviation expansion with ensembles of semantic spaces. J Biomed Semant. 2014;5:6. https:\/\/doi.org\/10.1186\/2041-1480-5-6.","journal-title":"J Biomed Semant"},{"key":"269_CR60","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3462475","volume":"54","author":"YB Gumiel","year":"2022","unstructured":"Gumiel YB, Oliveira LES, Claveau V, Grabar N, Paraiso EC, Moro C, et al. Temporal relation extraction in clinical texts. ACM Comput Surv. 2022;54:1\u201336. https:\/\/doi.org\/10.1145\/3462475.","journal-title":"ACM Comput Surv"},{"key":"269_CR61","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1186\/s13326-017-0173-6","volume":"9","author":"JD Osborne","year":"2018","unstructured":"Osborne JD, Neu MB, Danila MI, Solorio T, Bethard SJ. CUILESS2016: a clinical corpus applying compositional normalization of text mentions. J Biomed Semantics. 2018;9:2. https:\/\/doi.org\/10.1186\/s13326-017-0173-6.","journal-title":"J Biomed Semantics"},{"key":"269_CR62","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1186\/2041-1480-4-3","volume":"4","author":"KB Wagholikar","year":"2013","unstructured":"Wagholikar KB, Torii M, Jonnalagadda SR, Liu H. Pooling annotated corpora for clinical concept extraction. J Biomed Semantics. 2013;4:3. 
https:\/\/doi.org\/10.1186\/2041-1480-4-3.","journal-title":"J Biomed Semantics"}],"container-title":["Journal of Biomedical Semantics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13326-022-00269-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13326-022-00269-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13326-022-00269-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,8]],"date-time":"2022-05-08T05:15:57Z","timestamp":1651986957000},"score":1,"resource":{"primary":{"URL":"https:\/\/jbiomedsem.biomedcentral.com\/articles\/10.1186\/s13326-022-00269-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,8]]},"references-count":62,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["269"],"URL":"https:\/\/doi.org\/10.1186\/s13326-022-00269-1","relation":{},"ISSN":["2041-1480"],"issn-type":[{"value":"2041-1480","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,8]]},"assertion":[{"value":"18 March 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 April 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 May 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The EHR data used in this study were approved by the PUCPR Research Ethics Committee (certificate of presentation for ethical appreciation) number 
51376015.4.0000.0020.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"13"}}