{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T07:05:20Z","timestamp":1756191920293},"reference-count":48,"publisher":"Walter de Gruyter GmbH","issue":"4","license":[{"start":{"date-parts":[[2017,12,13]],"date-time":"2017-12-13T00:00:00Z","timestamp":1513123200000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,12,13]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Word sense disambiguation (WSD) is an important step in biomedical text mining, which is responsible for assigning an unequivocal concept to an ambiguous term, improving the accuracy of biomedical information extraction systems. In this work we followed supervised and knowledge-based disambiguation approaches, with the best results obtained by supervised means. In the supervised method we used bag-of-words as local features, and word embeddings as global features. In the knowledge-based method we combined word embeddings, concept textual definitions extracted from the UMLS database, and concept association values calculated from the MeSH co-occurrence counts from MEDLINE articles. Also, in the knowledge-based method, we tested different word embedding averaging functions to calculate the surrounding context vectors, with the goal to give more importance to closest words of the ambiguous term. The MSH WSD dataset, the most common dataset used for evaluating biomedical concept disambiguation, was used to evaluate our methods. We obtained a top accuracy of 95.6 % by supervised means, while the best knowledge-based accuracy was 87.4 %. Our results show that word embedding models improved the disambiguation accuracy, proving to be a powerful resource in the WSD task.<\/jats:p>","DOI":"10.1515\/jib-2017-0051","type":"journal-article","created":{"date-parts":[[2017,12,13]],"date-time":"2017-12-13T07:59:31Z","timestamp":1513151971000},"source":"Crossref","is-referenced-by-count":10,"title":["Supervised Learning and Knowledge-Based Approaches Applied to Biomedical Word Sense Disambiguation"],"prefix":"10.1515","volume":"14","author":[{"given":"Rui","family":"Antunes","sequence":"first","affiliation":[{"name":"DETI\/IEETA, University of Aveiro, 3810-193Aveiro, Portugal"}]},{"given":"S\u00e9rgio","family":"Matos","sequence":"additional","affiliation":[{"name":"DETI\/IEETA, University of Aveiro, 3810-193Aveiro, Portugal"}]}],"member":"374","reference":[{"key":"ref361","first-page":"943","year":"2013","journal-title":"Biomedical text disambiguation using UMLS"},{"key":"ref371","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1016\/j.jbi.2014.11.015","article-title":"Knowledge based word-concept model estimation and refinement for biomedical text mining","volume":"53","year":"2015","journal-title":"J Biomed Inform"},{"key":"ref31","first-page":"746","year":"2001","journal-title":"Developing a test collection for biomedical word sense disambiguation"},{"key":"ref21","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1016\/j.jbi.2013.09.009","article-title":"Determining the difficulty of word sense disambiguation","volume":"47","year":"2014","journal-title":"J Biomed Inform"},{"key":"ref01","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1459352.1459355","article-title":"Word sense disambiguation: a survey","volume":"41","year":"2009","journal-title":"ACM Comput Surv"},{"key":"ref311","first-page":"265","article-title":"Medical Subject Headings (MeSH)","volume":"88","year":"2000","journal-title":"Bull Med Libr Assoc"},{"key":"ref391","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.jbi.2017.08.001","article-title":"Word embeddings and recurrent neural networks based on long-short term memory nodes in supervised biomedical word sense disambiguation","volume":"73","year":"2017","journal-title":"J Biomed Inform"},{"key":"ref471","first-page":"77","year":"2016","journal-title":"Using distributed representations to disambiguate biomedical and clinical concepts"},{"key":"ref61","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The Unified Medical Language System (UMLS): integrating biomedical terminology","volume":"32","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"ref221","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","year":"2011","journal-title":"J Mach Learn Res"},{"key":"ref41","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1136\/amiajnl-2012-001506","article-title":"A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources","volume":"21","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"ref71","first-page":"265","article-title":"Medical Subject Headings (MeSH)","volume":"88","year":"2000","journal-title":"Bull Med Libr Assoc"},{"key":"ref261","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1016\/j.jbi.2013.09.009","article-title":"Determining the difficulty of word sense disambiguation","volume":"47","year":"2014","journal-title":"J Biomed Inform"},{"key":"ref281","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1136\/amiajnl-2012-001506","article-title":"A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources","volume":"21","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"ref271","first-page":"746","year":"2001","journal-title":"Developing a test collection for biomedical word sense disambiguation"},{"key":"ref331","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1186\/1471-2105-10-28","article-title":"Biomedical word sense disambiguation with ontologies and metadata: automation meets accuracy","volume":"10","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"ref431","first-page":"314","year":"2015","journal-title":"Semi-supervised word sense disambiguation using word embeddings in general and specific domains"},{"key":"ref251","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1089\/cmb.2005.12.554","article-title":"Word sense disambiguation in the biomedical domain: an overview","volume":"12","year":"2005","journal-title":"J Comput Biol"},{"key":"ref461","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","year":"2011","journal-title":"J Mach Learn Res"},{"key":"ref401","first-page":"897","year":"2016","journal-title":"Embeddings for word sense disambiguation: an evaluation study"},{"key":"ref341","doi-asserted-by":"crossref","first-page":"882","DOI":"10.1136\/amiajnl-2012-001350","article-title":"Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification","volume":"20","year":"2013","journal-title":"J Am Med Inform Assoc"},{"key":"ref121","first-page":"943","year":"2013","journal-title":"Biomedical text disambiguation using UMLS"},{"key":"ref131","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1016\/j.jbi.2014.11.015","article-title":"Knowledge based word-concept model estimation and refinement for biomedical text mining","volume":"53","year":"2015","journal-title":"J Biomed Inform"},{"key":"ref321","first-page":"273","year":"2017","journal-title":"Biomedical word sense disambiguation with word embeddings"},{"key":"ref211","first-page":"45","year":"2010","journal-title":"Software framework for topic modelling with large corpora"},{"key":"ref411","article-title":"Efficient estimation of word representations in vector space","year":"2013","journal-title":"arXiv e-print"},{"key":"ref381","year":"2017","journal-title":"Knowledge-based biomedical word sense disambiguation with neural concept embeddings"},{"key":"ref441","first-page":"171","year":"2015","journal-title":"Clinical abbreviation disambiguation using neural word embeddings"},{"key":"ref111","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1016\/j.jbi.2013.08.008","article-title":"Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text","volume":"46","year":"2013","journal-title":"J Biomed Inform"},{"key":"ref161","first-page":"897","year":"2016","journal-title":"Embeddings for word sense disambiguation: an evaluation study"},{"key":"ref11","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1089\/cmb.2005.12.554","article-title":"Word sense disambiguation in the biomedical domain: an overview","volume":"12","year":"2005","journal-title":"J Comput Biol"},{"key":"ref151","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.jbi.2017.08.001","article-title":"Word embeddings and recurrent neural networks based on long-short term memory nodes in supervised biomedical word sense disambiguation","volume":"73","year":"2017","journal-title":"J Biomed Inform"},{"key":"ref301","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The Unified Medical Language System (UMLS): integrating biomedical terminology","volume":"32","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"ref81","first-page":"273","year":"2017","journal-title":"Biomedical word sense disambiguation with word embeddings"},{"key":"ref241","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1459352.1459355","article-title":"Word sense disambiguation: a survey","volume":"41","year":"2009","journal-title":"ACM Comput Surv"},{"key":"ref51","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1186\/1471-2105-12-223","article-title":"Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation","volume":"12","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"ref91","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1186\/1471-2105-10-28","article-title":"Biomedical word sense disambiguation with ontologies and metadata: automation meets accuracy","volume":"10","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"ref101","doi-asserted-by":"crossref","first-page":"882","DOI":"10.1136\/amiajnl-2012-001350","article-title":"Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification","volume":"20","year":"2013","journal-title":"J Am Med Inform Assoc"},{"key":"ref201","first-page":"171","year":"2015","journal-title":"Clinical abbreviation disambiguation using neural word embeddings"},{"key":"ref451","first-page":"45","year":"2010","journal-title":"Software framework for topic modelling with large corpora"},{"key":"ref351","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1016\/j.jbi.2013.08.008","article-title":"Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text","volume":"46","year":"2013","journal-title":"J Biomed Inform"},{"key":"ref191","first-page":"314","year":"2015","journal-title":"Semi-supervised word sense disambiguation using word embeddings in general and specific domains"},{"key":"ref231","first-page":"77","year":"2016","journal-title":"Using distributed representations to disambiguate biomedical and clinical concepts"},{"key":"ref181","doi-asserted-by":"crossref","first-page":"i37","DOI":"10.1093\/bioinformatics\/btx228","article-title":"Deep learning with word embeddings improves biomedical named entity recognition","volume":"33","year":"2017","journal-title":"Bioinformatics"},{"key":"ref291","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1186\/1471-2105-12-223","article-title":"Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation","volume":"12","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"ref171","article-title":"Efficient estimation of word representations in vector space","year":"2013","journal-title":"arXiv e-print"},{"key":"ref421","doi-asserted-by":"crossref","first-page":"i37","DOI":"10.1093\/bioinformatics\/btx228","article-title":"Deep learning with word embeddings improves biomedical named entity recognition","volume":"33","year":"2017","journal-title":"Bioinformatics"},{"key":"ref141","year":"2017","journal-title":"Knowledge-based biomedical word sense disambiguation with neural concept embeddings"}],"container-title":["Journal of Integrative Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.degruyter.com\/view\/journals\/jib\/14\/4\/article-20170051.xml","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/jib-2017-0051\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,4,22]],"date-time":"2021-04-22T01:57:40Z","timestamp":1619056660000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/jib-2017-0051\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,12,13]]},"references-count":48,"journal-issue":{"issue":"4"},"URL":"https:\/\/doi.org\/10.1515\/jib-2017-0051","relation":{},"ISSN":["1613-4516"],"issn-type":[{"value":"1613-4516","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,12,13]]}}}