{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T22:33:39Z","timestamp":1770071619095,"version":"3.49.0"},"reference-count":38,"publisher":"Cambridge University Press (CUP)","issue":"2","license":[{"start":{"date-parts":[[2020,6,30]],"date-time":"2020-06-30T00:00:00Z","timestamp":1593475200000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2021,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Automatic detection of negated content is often a prerequisite in information extraction systems in various domains. In the biomedical domain especially, this task is important because negation plays an important role. In this work, two main contributions are proposed. First, we work with languages which have been poorly addressed up to now: Brazilian Portuguese and French. Thus, we developed new corpora for these two languages which have been manually annotated for marking up the negation cues and their scope. Second, we propose automatic methods based on supervised machine learning approaches for the automatic detection of negation marks and of their scopes. The methods show to be robust in both languages (Brazilian Portuguese and French) and in cross-domain (general and biomedical languages) contexts. The approach is also validated on English data from the state of the art: it yields very good results and outperforms other existing approaches. Besides, the application is accessible and usable online. We assume that, through these issues (new annotated corpora, application accessible online, and cross-domain robustness), the reproducibility of the results and the robustness of the NLP applications will be augmented.<\/jats:p>","DOI":"10.1017\/s1351324920000352","type":"journal-article","created":{"date-parts":[[2020,6,30]],"date-time":"2020-06-30T07:48:11Z","timestamp":1593503291000},"page":"181-201","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":16,"title":["Supervised learning for the detection of negation and of its scope in French and Brazilian Portuguese biomedical corpora"],"prefix":"10.1017","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2605-6777","authenticated-orcid":false,"given":"Cl\u00e9ment","family":"Dalloux","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vincent","family":"Claveau","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Natalia","family":"Grabar","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lucas Emanuel Silva","family":"Oliveira","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Claudia Maria Cabral","family":"Moro","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yohan Bonescki","family":"Gumiel","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Deborah Ribeiro","family":"Carvalho","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"56","published-online":{"date-parts":[[2020,6,30]]},"reference":[{"key":"S1351324920000352_ref13","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1047"},{"key":"S1351324920000352_ref26","unstructured":"N\u00e9v\u00e9ol, A. , Grouin, C. , Leixa, J. , Rosset, S. and Zweigenbaum, P. (2014). The Quaero French medical corpus: A ressource for medical entity recognition and normalization. In Proc BioText M, Reykjavik, Iceland: Citeseer."},{"key":"S1351324920000352_ref20","unstructured":"Lafferty, J. , McCallum, A. and Pereira, F.C. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning 2001 (ICML 2001), pp. 282\u2013289, San Francisco, CA, USA."},{"key":"S1351324920000352_ref6","doi-asserted-by":"publisher","DOI":"10.1006\/jbin.2001.1029"},{"key":"S1351324920000352_ref7","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"S1351324920000352_ref12","doi-asserted-by":"publisher","DOI":"10.1186\/1472-6947-5-13"},{"key":"S1351324920000352_ref36","doi-asserted-by":"publisher","DOI":"10.1162\/COLI_a_00126"},{"key":"S1351324920000352_ref33","unstructured":"Rehurek, R. and Sojka, P. (2010). Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp. 45\u201350. Citeseer."},{"key":"S1351324920000352_ref28","unstructured":"Oliveira, L.E.S. , Peters, A.C. , Da Silva, A.M.P. , Gebeluca, C.P. , Gumiel, Y.B. , Cintho, L.M.M. , Carvalho, D.R. , Hasan, S.A. and Moro, C.M.C. (2020). SemClinBr \u2013 a multi institutional and multi specialty semantically annotated corpus for Portuguese clinical NLP tasks. arXiv preprint."},{"key":"S1351324920000352_ref17","unstructured":"Hartmann, N. , Fonseca, E. , Shulby, C. , Treviso, M. , Rodrigues, J. and Aluisio, S. (2017). Portuguese word embeddings: Evaluating on word analogies and natural language tasks. Proceedings of Symposium in Information and Human Language Technology, Uberl\u00e2ndia, MG, Brazil, October 2\u20135, 2017. Sociedade Brasileira de computa\u00e7\u00e3o."},{"key":"S1351324920000352_ref35","doi-asserted-by":"publisher","DOI":"10.1136\/amiajnl-2011-000203"},{"key":"S1351324920000352_ref24","unstructured":"Morante, R. , Schrauwen, S. and Daelemans, W. (2011). Annotation of Negation Cues and their Scope. Guidelines v1.0. Computational linguistics and psycholinguistics technical report series, CTRS-003, pp. 1\u201342."},{"key":"S1351324920000352_ref11","unstructured":"Devlin, J. , Chang, M.-W. , Lee, K. and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT 2019, Minneapolis, Minnesota, June 2\u20137, 2019. Association for Computational Linguistics, pp. 4171\u20134186."},{"key":"S1351324920000352_ref34","unstructured":"Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing, pp. 44\u201349. Manchester, UK."},{"key":"S1351324920000352_ref32","unstructured":"Read, J. , Velldal, E. , \u00d8vrelid, L. and Oepen, S. (2012, June). Uio 1: Constituent-based discriminative ranking for negation resolution. In Proceedings of the First Joint Conference on Lexical and Computational Semantics, Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation. Montr\u00e9al, Canada: Association for Computational Linguistics, pp. 310\u2013318."},{"key":"S1351324920000352_ref29","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1007"},{"key":"S1351324920000352_ref30","unstructured":"Peng, Y. , Wang, X. , Lu, L. , Bagheri, M. , Summers, R. and Lu, Z. (2018). NegBio: A high-performance tool for negation and uncertainty detection in radiology reports. AMIA Summits on Translational Science Proceedings 2017, 188\u2013196."},{"key":"S1351324920000352_ref31","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"S1351324920000352_ref1","unstructured":"Abadi, M. , Agarwal, A. , Barham, P. , Brevdo, E. , Chen, Z. , Citro, C. , Corrado, G.S. , Davis, A. , Dean, J. , Devin, M. , Ghemawat, S. , Goodfellow, I. , Harp, A. , Irving, G. , Isard, M. , Jia, Y. , Jozefowicz, R. , Kaiser, L. , Kudlur, M. , Levenberg, J. , Mane, D. , Monga, R. , Moore, S. , Murray, D. , Olah, C. , Schuster, M. , Shlens, J. , Steiner, B. , Sutskever, I. , Talwar, K. , Tucker, P. , Vanhoucke, V. , Vasudevan, V. , Viegas, F. , Vinyals, O. , Warden, P. , Wattenberg, M. , Wicke, M. , Yu, Y. and Zheng, X. (2016) TensorFlow: Large-scale machine learning on heterogeneous distributed systems. pp. 1\u201319."},{"key":"S1351324920000352_ref5","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00051"},{"key":"S1351324920000352_ref37","unstructured":"Velupillai, S. , Dalianis, H. and Kvist, M. (2011). Factuality levels of diagnoses in Swedish clinical text. In Proceedings of the Medical Informatics Europe conference 2011 - The XXIIIrd International Congress of the European Federation for Medical Informatics. Oslo, Norway: Studies in Health Technology and Informatics 169, IOS Press, pp. 559\u2013563."},{"key":"S1351324920000352_ref3","doi-asserted-by":"publisher","DOI":"10.1136\/amiajnl-2012-001317"},{"key":"S1351324920000352_ref9","unstructured":"Denny, J.C. and Peterson, J.F. (2007). Identifying QT prolongation from ECG impressions using natural language processing and negation detection. In Medinfo 2007: Proceedings of the 12th World Congress on Health (Medical) Informatics; Building Sustainable Health Systems. Studies in Health Technology and Informatics, Vol. 129, IOS Press, pp. 1283\u20131288."},{"key":"S1351324920000352_ref8","doi-asserted-by":"publisher","DOI":"10.1145\/2110363.2110443"},{"key":"S1351324920000352_ref10","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9"},{"key":"S1351324920000352_ref23","unstructured":"Mikolov, T. , Sutskever, I. , Chen, K. , Corrado, G.S. and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. Lake Tahoe, Nevada: Curran Associates Inc. Red Hook, NY, USA, pp. 3111\u20133119."},{"key":"S1351324920000352_ref14","first-page":"187","article-title":"Syntactical negation detection in clinical practice guidelines","volume":"136","author":"Gindl","year":"2008","journal-title":"Studies in Health Technology and Informatics"},{"key":"S1351324920000352_ref2","unstructured":"Abdaoui, A. , Tchechmedjiev, A. , Digan, W. , Bringay, S. and Jonquet, C. (2017) French ConText: D\u00e9tecter la n\u00e9gation, la temporalit\u00e9 et le sujet dans les textes cliniques Fran\u00e7ais. 4e \u00e9dition du Symposium sur l\u2019Ing\u00e9nierie de l\u2019Information M\u00e9dicale. Toulouse, France, pp. 1\u201310."},{"key":"S1351324920000352_ref38","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-9-S11-S9"},{"key":"S1351324920000352_ref18","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"S1351324920000352_ref16","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2009.05.002"},{"key":"S1351324920000352_ref19","unstructured":"Jozefowicz, R. , Zaremba, W. and Sutskever, I. (2015, June). An empirical exploration of recurrent network architectures. In International Conference on Machine Learning, pp. 2342\u20132350."},{"key":"S1351324920000352_ref27","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/E14-2005"},{"key":"S1351324920000352_ref25","doi-asserted-by":"publisher","DOI":"10.1136\/jamia.2001.0080598"},{"key":"S1351324920000352_ref21","unstructured":"Lapponi, E. , Velldal, E. , \u00d8vrelid, L. and Read, J. (2012, June). Uio 2: Sequence-labeling negation using dependency features. In Proceedings of the First Joint Conference on Lexical and Computational Semantics-Volume 1: Proceedings of the Main Conference and the Shared Task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation. Montr\u00e9al, Canada: Association for Computational Linguistics, pp. 319\u2013327."},{"key":"S1351324920000352_ref22","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-2085"},{"key":"S1351324920000352_ref4","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324920000352_ref15","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-5614"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324920000352","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,12]],"date-time":"2021-05-12T11:02:32Z","timestamp":1620817352000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324920000352\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,30]]},"references-count":38,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,3]]}},"alternative-id":["S1351324920000352"],"URL":"https:\/\/doi.org\/10.1017\/s1351324920000352","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6,30]]},"assertion":[{"value":"\u00a9 The Author(s), 2020. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http:\/\/creativecommons.org\/licenses\/by\/4.0\/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.","name":"license","label":"License","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}