{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T22:37:31Z","timestamp":1775601451468,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2020,10,8]],"date-time":"2020-10-08T00:00:00Z","timestamp":1602115200000},"content-version":"vor","delay-in-days":7,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Intramural Research Program of the National Institutes of Health, National Library of Medicine"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Objective<\/jats:title><jats:p>The study sought to explore the use of deep learning techniques to measure the semantic relatedness between Unified Medical Language System (UMLS) concepts.<\/jats:p><\/jats:sec><jats:sec><jats:title>Materials and Methods<\/jats:title><jats:p>Concept sentence embeddings were generated for UMLS concepts by applying the word embedding models BioWordVec and various flavors of BERT to concept sentences formed by concatenating UMLS terms. Graph embeddings were generated by the graph convolutional networks and 4 knowledge graph embedding models, using graphs built from UMLS hierarchical relations. Semantic relatedness was measured by the cosine between the concepts\u2019 embedding vectors. Performance was compared with 2 traditional path-based (shortest path and Leacock-Chodorow) measurements and the publicly available concept embeddings, cui2vec, generated from large biomedical corpora. The concept sentence embeddings were also evaluated on a word sense disambiguation (WSD) task. 
Reference standards used included the semantic relatedness and semantic similarity datasets from the University of Minnesota, concept pairs generated from the Standardized MedDRA Queries and the MeSH (Medical Subject Headings) WSD corpus.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Sentence embeddings generated by BioWordVec outperformed all other methods used individually in semantic relatedness measurements. Graph convolutional network graph embedding uniformly outperformed path-based measurements and was better than some word embeddings for the Standardized MedDRA Queries dataset. When used together, combined word and graph embedding achieved the best performance in all datasets. For WSD, the enhanced versions of BERT outperformed BioWordVec.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Word and graph embedding techniques can be used to harness terms and relations in the UMLS to measure semantic relatedness between concepts. Concept sentence embedding outperforms path-based measurements and cui2vec, and can be further enhanced by combining with graph embedding.<\/jats:p><\/jats:sec>","DOI":"10.1093\/jamia\/ocaa136","type":"journal-article","created":{"date-parts":[[2020,7,18]],"date-time":"2020-07-18T04:25:55Z","timestamp":1595046355000},"page":"1538-1546","source":"Crossref","is-referenced-by-count":27,"title":["Use of word and graph embedding to measure semantic relatedness between Unified Medical Language System concepts"],"prefix":"10.1093","volume":"27","author":[{"given":"Yuqing","family":"Mao","sequence":"first","affiliation":[{"name":"National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA"}]},{"given":"Kin Wah","family":"Fung","sequence":"additional","affiliation":[{"name":"National Library of Medicine, National Institutes of Health, Bethesda, Maryland, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2020,10,8]]},"reference":[{"issue":"3","key":"2020110613121544000_ocaa136-B1","doi-asserted-by":"crossref","first-page":"288","DOI":"10.1016\/j.jbi.2006.06.004","article-title":"Measures of semantic similarity and relatedness in the biomedical domain","volume":"40","author":"Pedersen","year":"2007","journal-title":"J Biomed Inform"},{"key":"2020110613121544000_ocaa136-B2","first-page":"245","article-title":"Intelligent indexing and semantic retrieval of multimodal documents","volume-title":"Information Retrieval","author":"Srihari","year":"2000"},{"key":"2020110613121544000_ocaa136-B3","first-page":"379","author":"Stevenson","year":"2005"},{"issue":"1","key":"2020110613121544000_ocaa136-B4","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1162\/coli.2006.32.1.13","article-title":"Evaluating wordnet-based measures of lexical semantic relatedness","volume":"32","author":"Budanitsky","year":"2006","journal-title":"Comput Linguistics"},{"key":"2020110613121544000_ocaa136-B5","author":"Liu","year":"2007"},{"issue":"1","key":"2020110613121544000_ocaa136-B6","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1186\/1471-2105-13-261","article-title":"Semantic similarity in the biomedical domain: an evaluation across knowledge sources","volume":"13","author":"Garla","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2020110613121544000_ocaa136-B7","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1016\/j.jbi.2015.12.007","article-title":"Computing semantic similarity between biomedical concepts using new information content approach","volume":"59","author":"Aouicha","year":"2016","journal-title":"J Biomed Inform"},{"key":"2020110613121544000_ocaa136-B8","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.knosys.2017.05.021","article-title":"Sematch: Semantic similarity framework for knowledge graphs","volume":"130","author":"Zhu","year":"2017","journal-title":"Knowledge 
Based Syst"},{"key":"2020110613121544000_ocaa136-B9","first-page":"895","author":"Sch\u00fctze","year":"1992"},{"issue":"23","key":"2020110613121544000_ocaa136-B10","doi-asserted-by":"crossref","first-page":"3635","DOI":"10.1093\/bioinformatics\/btw529","article-title":"Corpus domain effects on distributional semantic modeling of medical terms","volume":"32","author":"Pakhomov","year":"2016","journal-title":"Bioinformatics"},{"key":"2020110613121544000_ocaa136-B11","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1016\/j.jbi.2018.09.008","article-title":"A comparison of word embeddings for the biomedical natural language processing","volume":"87","author":"Wang","year":"2018","journal-title":"J Biomed Inform"},{"key":"2020110613121544000_ocaa136-B12","first-page":"431","article-title":"UMLS-Interface and UMLS-Similarity: open source software for measuring paths and semantic similarity","author":"McInnes","year":"2009","journal-title":"AMIA Annu Symp Proc"},{"issue":"90001","key":"2020110613121544000_ocaa136-B13","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The unified medical language system (UMLS): integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2020110613121544000_ocaa136-B14","first-page":"3111","author":"Mikolov","year":"2013"},{"key":"2020110613121544000_ocaa136-B15","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1162\/tacl_a_00051","article-title":"Enriching word vectors with subword information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Trans Assoc Comput Linguistics"},{"issue":"1","key":"2020110613121544000_ocaa136-B16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41597-019-0055-0","article-title":"BioWordVec, improving biomedical word embeddings with subword information and MeSH","volume":"6","author":"Zhang","year":"2019","journal-title":"Sci 
Data"},{"key":"2020110613121544000_ocaa136-B17","first-page":"4171","author":"Devlin","year":"2019"},{"key":"2020110613121544000_ocaa136-B18","first-page":"58","author":"Peng","year":"2019"},{"issue":"4","key":"2020110613121544000_ocaa136-B19","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"2020110613121544000_ocaa136-B20","author":"Monti","year":"2017"},{"key":"2020110613121544000_ocaa136-B21","author":"Chen","year":"2018"},{"key":"2020110613121544000_ocaa136-B22","author":"Song","year":"2018"},{"issue":"9","key":"2020110613121544000_ocaa136-B23","doi-asserted-by":"crossref","first-page":"1616","DOI":"10.1109\/TKDE.2018.2807452","article-title":"A comprehensive survey of graph embedding: Problems, techniques, and applications","volume":"30","author":"Cai","year":"2018","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2020110613121544000_ocaa136-B24","author":"Battaglia","year":"2018"},{"key":"2020110613121544000_ocaa136-B25","author":"Defferrard","year":"2016"},{"key":"2020110613121544000_ocaa136-B26","author":"Kipf","year":"2016"},{"key":"2020110613121544000_ocaa136-B27","author":"Hamilton","year":"2017"},{"key":"2020110613121544000_ocaa136-B28","author":"Berg","year":"2017"},{"key":"2020110613121544000_ocaa136-B29","author":"Chen","year":"2018"},{"key":"2020110613121544000_ocaa136-B30","first-page":"7370","article-title":"Graph convolutional networks for text classification","volume":"33","author":"Yao","year":"2019","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"2020110613121544000_ocaa136-B31","first-page":"2787","author":"Bordes","year":"2013"},{"key":"2020110613121544000_ocaa136-B32","first-page":"1955","article-title":"Holographic embeddings of knowledge 
graphs","volume":"30","author":"Nickel","year":"2016","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"2020110613121544000_ocaa136-B33","author":"Yang","year":"2014"},{"key":"2020110613121544000_ocaa136-B34","first-page":"2071","article-title":"Complex embeddings for simple link prediction","volume":"48","author":"Trouillon","year":"2016","journal-title":"Proc Mach Learn Res"},{"key":"2020110613121544000_ocaa136-B35","author":"Kipf"},{"key":"2020110613121544000_ocaa136-B36","author":"Kingma","year":"2014"},{"key":"2020110613121544000_ocaa136-B37","first-page":"572","article-title":"Semantic similarity and relatedness between clinical terms: an experimental study","author":"Pakhomov","year":"2010","journal-title":"AMIA Annu Symp Proc"},{"key":"2020110613121544000_ocaa136-B38","first-page":"2012: 43","article-title":"Evaluating semantic relatedness and similarity measures with standardized MedDRA queries","volume":"2012","author":"Bill","journal-title":"AMIA Annu Symp Proc"},{"key":"2020110613121544000_ocaa136-B39","author":"Beam","year":"2020"},{"key":"2020110613121544000_ocaa136-B40","first-page":"895","article-title":"Knowledge-based method for determining the meaning of ambiguous biomedical terms using information content measures of similarity","author":"McInnes","year":"2011","journal-title":"AMIA Annu Symp Proc"},{"key":"2020110613121544000_ocaa136-B41","doi-asserted-by":"crossref","first-page":"265","DOI":"10.7551\/mitpress\/7287.003.0018","volume-title":"Fellbaum C, Miller G, eds. 
WordNet: An Electronic Lexical Database","author":"Leacock","year":"1998"},{"issue":"1","key":"2020110613121544000_ocaa136-B42","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1186\/1471-2105-12-223","article-title":"Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation","volume":"12","author":"Jimeno-Yepes","year":"2011","journal-title":"BMC Bioinformatics"},{"issue":"6","key":"2020110613121544000_ocaa136-B43","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1016\/j.jbi.2013.08.008","article-title":"Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text","volume":"46","author":"McInnes","year":"2013","journal-title":"J Biomed Inform"},{"key":"2020110613121544000_ocaa136-B44","author":"Huang","year":"2019"},{"key":"2020110613121544000_ocaa136-B45","author":"Salehi","year":"2019"},{"key":"2020110613121544000_ocaa136-B46","first-page":"2609","author":"Pan","year":"2018"},{"key":"2020110613121544000_ocaa136-B47","first-page":"657","article-title":"Retrofitting concept vector representations of medical concepts to improve estimates of semantic similarity and relatedness","volume":"245","author":"Yu","year":"2017","journal-title":"Stud Health Technol Inform"},{"key":"2020110613121544000_ocaa136-B48","doi-asserted-by":"crossref","first-page":"103182","DOI":"10.1016\/j.jbi.2019.103182","article-title":"Concept embedding to measure semantic relatedness for biomedical information ontologies","volume":"94","author":"Park","year":"2019","journal-title":"J Biomed Inform"}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/10\/1538\/34153534\/ocaa136.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/10\/1538\/34153534\/ocaa136.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,10]],"date-time":"2024-08-10T08:38:16Z","timestamp":1723279096000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/27\/10\/1538\/5919213"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,1]]},"references-count":48,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2020,10,8]]},"published-print":{"date-parts":[[2020,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaa136","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,10]]},"published":{"date-parts":[[2020,10,1]]}}}