{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T07:39:09Z","timestamp":1763105949659,"version":"3.37.3"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"12","funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"National Library of Medicine","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,11,17]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Use heuristic, deep learning (DL), and hybrid AI methods to predict semantic group (SG) assignments for new UMLS Metathesaurus atoms, with target accuracy \u226595%.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We used train-test datasets from successive 2020AA\u20132022AB UMLS Metathesaurus releases. Our heuristic \u201cwaterfall\u201d approach employed a sequence of 7 different SG prediction methods. Atoms not qualifying for a method were passed on to the next method. The DL approach generated BioWordVec and SapBERT embeddings for atom names, BioWordVec embeddings for source vocabulary names, and BioWordVec embeddings for atom names of the second-to-top nodes of an atom\u2019s source hierarchy. We fed a concatenation of the 4 embeddings into a fully connected multilayer neural network with an output layer of 15 nodes (one for each SG). For both approaches, we developed methods to estimate the probability that their predicted SG for an atom would be correct. Based on these estimations, we developed 2 hybrid SG prediction methods combining the strengths of heuristic and DL methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The heuristic waterfall approach accurately predicted 94.3% of SGs for 1\u200a563\u200a692 new unseen atoms. The DL accuracy on the same dataset was also 94.3%. The hybrid approaches achieved an average accuracy of 96.5%.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>Our study demonstrated that AI methods can predict SG assignments for new UMLS atoms with sufficient accuracy to be potentially useful as an intermediate step in the time-consuming task of assigning new atoms to UMLS concepts. We showed that for SG prediction, combining heuristic methods and DL methods can produce better results than either alone.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocad152","type":"journal-article","created":{"date-parts":[[2023,8,2]],"date-time":"2023-08-02T02:31:09Z","timestamp":1690943469000},"page":"1887-1894","source":"Crossref","is-referenced-by-count":5,"title":["Two complementary AI approaches for predicting UMLS semantic group assignment: heuristic reasoning and deep learning"],"prefix":"10.1093","volume":"30","author":[{"given":"Yuqing","family":"Mao","sequence":"first","affiliation":[{"name":"National Library of Medicine, National Institutes of Health , Bethesda, Maryland, USA"}]},{"given":"Randolph A","family":"Miller","sequence":"additional","affiliation":[{"name":"National Library of Medicine, National Institutes of Health , Bethesda, Maryland, USA"}]},{"given":"Olivier","family":"Bodenreider","sequence":"additional","affiliation":[{"name":"National Library of Medicine, National Institutes of Health , Bethesda, Maryland, USA"}]},{"given":"Vinh","family":"Nguyen","sequence":"additional","affiliation":[{"name":"National Library of Medicine, National Institutes of Health , Bethesda, Maryland, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0593-5377","authenticated-orcid":false,"given":"Kin Wah","family":"Fung","sequence":"additional","affiliation":[{"name":"National Library of Medicine, National Institutes of Health , Bethesda, Maryland, USA"}]}],"member":"286","published-online":{"date-parts":[[2023,8,1]]},"reference":[{"key":"2023111709552080800_ocad152-B1","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The unified medical language system (UMLS): integrating biomedical terminology","volume":"32 (Database issue)","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"2023111709552080800_ocad152-B2","doi-asserted-by":"crossref","first-page":"1606","DOI":"10.1093\/jamia\/ocaa084","article-title":"UMLS users and uses: a current overview","volume":"27","author":"Amos","year":"2020","journal-title":"J Am Med Inform Assoc"},{"first-page":"2672","year":"2021","author":"Nguyen","key":"2023111709552080800_ocad152-B3"},{"first-page":"1037","year":"2022","author":"Nguyen","key":"2023111709552080800_ocad152-B4"},{"issue":"1","key":"2023111709552080800_ocad152-B5","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1055\/s-0038-1637976","article-title":"The unified medical language system","volume":"2","author":"Lindberg","year":"1993","journal-title":"Yearb Med Inform"},{"key":"2023111709552080800_ocad152-B6","first-page":"216","article-title":"Aggregating UMLS semantic types for reducing conceptual complexity","volume":"84 (Pt 1)","author":"McCray","year":"2001","journal-title":"Stud Health Technol Inform"},{"volume-title":"Readings in Medical Artificial Intelligence: The First Decade","year":"1984","author":"Clancey","key":"2023111709552080800_ocad152-B7"},{"issue":"1","key":"2023111709552080800_ocad152-B8","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1038\/s41597-019-0055-0","article-title":"BioWordVec, improving biomedical word embeddings with subword information and MeSH","volume":"6","author":"Zhang","year":"2019","journal-title":"Sci Data"},{"first-page":"4171","year":"2019","author":"Devlin","key":"2023111709552080800_ocad152-B9"},{"year":"2018","author":"Radford","key":"2023111709552080800_ocad152-B10"},{"year":"2022","author":"Hoffmann","key":"2023111709552080800_ocad152-B11"},{"year":"2020","author":"Liu","key":"2023111709552080800_ocad152-B12"},{"year":"2023","author":"UMLS","key":"2023111709552080800_ocad152-B13"},{"key":"2023111709552080800_ocad152-B14","first-page":"82","article-title":"Evaluating biomedical word embeddings for vocabulary alignment at scale in the UMLS Metathesaurus using Siamese networks","volume":"2022","author":"Bajaj","year":"2022","journal-title":"Proc Conf Assoc Comput Linguist Meet"},{"issue":"2","key":"2023111709552080800_ocad152-B15","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1007\/BF02295996","article-title":"Note on the sampling error of the difference between correlated proportions or percentages","volume":"12","author":"McNemar","year":"1947","journal-title":"Psychometrika"},{"key":"2023111709552080800_ocad152-B16","article-title":"Data from: two complementary AI approaches for predicting UMLS semantic group assignment: heuristic reasoning and deep learning","author":"Mao","year":"2023","journal-title":"Dryad"},{"year":"2023","key":"2023111709552080800_ocad152-B17"},{"key":"2023111709552080800_ocad152-B18","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1007\/10968987_3","volume-title":"Job Scheduling Strategies for Parallel Processing: 9th International Workshop, JSSPP 2003","author":"Yoo","year":"2003"},{"issue":"4","key":"2023111709552080800_ocad152-B19","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1197\/jamia.M2314","article-title":"Semantic classification of biomedical concepts using distributional similarity","volume":"14","author":"Fan","year":"2007","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2023111709552080800_ocad152-B20","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1186\/1471-2105-8-264","article-title":"Using contextual and lexical features to restructure and validate the classification of biomedical concepts","volume":"8","author":"Fan","year":"2007","journal-title":"BMC Bioinformatics"},{"first-page":"335","year":"2014","author":"Kudama","key":"2023111709552080800_ocad152-B21"},{"issue":"10","key":"2023111709552080800_ocad152-B22","doi-asserted-by":"crossref","first-page":"1625","DOI":"10.1093\/jamia\/ocaa108","article-title":"A review of auditing techniques for the Unified Medical Language System","volume":"27","author":"Zheng","year":"2020","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2023111709552080800_ocad152-B23","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1016\/j.artmed.2004.02.002","article-title":"Auditing concept categorizations in the UMLS","volume":"31","author":"Gu","year":"2004","journal-title":"Artif Intell Med"},{"first-page":"294","year":"2007","author":"Gu","key":"2023111709552080800_ocad152-B24"},{"issue":"6","key":"2023111709552080800_ocad152-B25","doi-asserted-by":"crossref","first-page":"1042","DOI":"10.1016\/j.jbi.2012.05.006","article-title":"A study of terminology auditors\u2019 performance for UMLS semantic type assignments","volume":"45","author":"Gu","year":"2012","journal-title":"J Biomed Inform"},{"first-page":"234","year":"2001","author":"Halper","key":"2023111709552080800_ocad152-B26"},{"issue":"5","key":"2023111709552080800_ocad152-B27","doi-asserted-by":"crossref","first-page":"746","DOI":"10.1197\/jamia.M2951","article-title":"Expanding the extent of a UMLS semantic type via group neighborhood auditing","volume":"16","author":"Chen","year":"2009","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2023111709552080800_ocad152-B28","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.jbi.2008.06.001","article-title":"Structural group auditing of a UMLS semantic type\u2019s extent","volume":"42","author":"Chen","year":"2009","journal-title":"J Biomed Inform"},{"issue":"3","key":"2023111709552080800_ocad152-B29","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/j.artmed.2011.05.003","article-title":"Resolution of redundant semantic type assignments for organic chemicals in the UMLS","volume":"52","author":"Morrey","year":"2011","journal-title":"Artif Intell Med"},{"key":"2023111709552080800_ocad152-B30","first-page":"1262","article-title":"Auditing the assignments of top-level semantic types in the UMLS semantic network to UMLS concepts","volume":"2017","author":"He","year":"2017","journal-title":"Proceedings (IEEE Int Conf Bioinformatics Biomed)"},{"issue":"1","key":"2023111709552080800_ocad152-B31","first-page":"43","article-title":"Validating UMLS semantic type assignments using SNOMED CT semantic tags","volume":"57","author":"Gu","year":"2018","journal-title":"Methods Inf Med"},{"year":"2022","author":"OpenAI","key":"2023111709552080800_ocad152-B32"},{"year":"2023","author":"OpenAI","key":"2023111709552080800_ocad152-B33"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/12\/1887\/53477601\/ocad152.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/12\/1887\/53477601\/ocad152.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T13:28:58Z","timestamp":1700227738000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/30\/12\/1887\/7235063"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,1]]},"references-count":33,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2023,8,1]]},"published-print":{"date-parts":[[2023,11,17]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocad152","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"type":"print","value":"1067-5027"},{"type":"electronic","value":"1527-974X"}],"subject":[],"published-other":{"date-parts":[[2023,12,1]]},"published":{"date-parts":[[2023,8,1]]}}}