{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T19:58:53Z","timestamp":1759694333969},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2020,5,27]],"date-time":"2020-05-27T00:00:00Z","timestamp":1590537600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Institute of Health","award":["U01TR0062-1"],"award-info":[{"award-number":["U01TR0062-1"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>As coronavirus disease 2019 (COVID-19) started its rapid emergence and gradually transformed into an unprecedented pandemic, the need for having a knowledge repository for the disease became crucial. To address this issue, a new COVID-19 machine-readable dataset known as the COVID-19 Open Research Dataset (CORD-19) has been released. Based on this, our objective was to build a computable co-occurrence network embeddings to assist association detection among COVID-19\u2013related biomedical entities.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>Leveraging a Linked Data version of CORD-19 (ie, CORD-19-on-FHIR), we first utilized SPARQL to extract co-occurrences among chemicals, diseases, genes, and mutations and build a co-occurrence network. We then trained the representation of the derived co-occurrence network using node2vec with 4 edge embeddings operations (L1, L2, Average, and Hadamard). Six algorithms (decision tree, logistic regression, support vector machine, random forest, na\u00efve Bayes, and multilayer perceptron) were applied to evaluate performance on link prediction. An unsupervised learning strategy was also developed incorporating the t-SNE (t-distributed stochastic neighbor embedding) and DBSCAN (density-based spatial clustering of applications with noise) algorithms for case studies.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The random forest classifier showed the best performance on link prediction across different network embeddings. For edge embeddings generated using the Average operation, random forest achieved the optimal average precision of 0.97 along with a F1 score of 0.90. For unsupervised learning, 63 clusters were formed with silhouette score of 0.128. Significant associations were detected for 5 coronavirus infectious diseases in their corresponding subgroups.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusions<\/jats:title>\n                  <jats:p>In this study, we constructed COVID-19\u2013centered co-occurrence network embeddings. Results indicated that the generated embeddings were able to extract significant associations for COVID-19 and coronavirus infectious diseases.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocaa117","type":"journal-article","created":{"date-parts":[[2020,5,23]],"date-time":"2020-05-23T04:22:13Z","timestamp":1590207733000},"page":"1259-1267","source":"Crossref","is-referenced-by-count":19,"title":["Constructing co-occurrence network embeddings to assist association extraction for COVID-19 and other coronavirus infectious diseases"],"prefix":"10.1093","volume":"27","author":[{"given":"David","family":"Oniani","sequence":"first","affiliation":[{"name":"Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, Minnesota, USA"}]},{"given":"Guoqian","family":"Jiang","sequence":"additional","affiliation":[{"name":"Division of Digital Health Sciences, Mayo Clinic, Rochester, Minnesota, USA"}]},{"given":"Hongfang","family":"Liu","sequence":"additional","affiliation":[{"name":"Division of Digital Health Sciences, Mayo Clinic, Rochester, Minnesota, USA"}]},{"given":"Feichen","family":"Shen","sequence":"additional","affiliation":[{"name":"Division of Digital Health Sciences, Mayo Clinic, Rochester, Minnesota, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,5,27]]},"reference":[{"issue":"7798","key":"2020110613103449800_ocaa117-B1","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1038\/d41586-020-00694-1","article-title":"Keep up with the latest coronavirus research","volume":"579","author":"Chen","year":"2020","journal-title":"Nature"},{"key":"2020110613103449800_ocaa117-B2","first-page":"775","author":"Mihalcea","year":"2006"},{"issue":"4","key":"2020110613103449800_ocaa117-B3","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1016\/j.datak.2011.01.002","article-title":"SyMSS: a syntax-based measure for short-text semantic similarity","volume":"70","author":"Oliva","year":"2011","journal-title":"Data Knowl Eng"},{"key":"2020110613103449800_ocaa117-B4","first-page":"1","volume-title":"Synthetic Lectures on the Semantic Web: Theory and Technology","author":"Heath","year":"2011"},{"key":"2020110613103449800_ocaa117-B5","author":"Ahamed","year":"2020"},{"key":"2020110613103449800_ocaa117-B6","article-title":"COVID-19 and company knowledge graphs: assessing golden powers and economic impact of selective lockdown via AI reasoning","author":"Bellomarini","year":"2020","journal-title":"arXiv:2004.10119"},{"key":"2020110613103449800_ocaa117-B7","author":"Tsiotas","year":"2020"},{"key":"2020110613103449800_ocaa117-B8"},{"key":"2020110613103449800_ocaa117-B9","article-title":"Visualization of diseases at risk in the COVID-19 Literature","author":"Wolinski","year":"2020","journal-title":"arXiv:2005.00848"},{"key":"2020110613103449800_ocaa117-B10","author":"Wang","year":"2020"},{"key":"2020110613103449800_ocaa117-B11","year":"2020"},{"key":"2020110613103449800_ocaa117-B12","first-page":"326","author":"Bender","year":"2013"},{"issue":"1","key":"2020110613103449800_ocaa117-B13","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1002\/bult.105","article-title":"An introduction to the resource description framework","volume":"25","author":"Miller","year":"2005","journal-title":"Bull Am Soc Inf Sci Technol"},{"key":"2020110613103449800_ocaa117-B14","article-title":"Detecting fake news for the new coronavirus by reasoning on the Covid-19 ontology","author":"Groza","year":"2020","journal-title":"arXiv:2004.12330"},{"key":"2020110613103449800_ocaa117-B15","first-page":"746","article-title":"Linguistic regularities in continuous space word representations","author":"Mikolov","year":"2013"},{"key":"2020110613103449800_ocaa117-B16","first-page":"855","article-title":"node2vec: Scalable feature learning for networks","author":"Grover","year":"2016"},{"key":"2020110613103449800_ocaa117-B17","first-page":"29","author":"Shen","year":"2018"},{"key":"2020110613103449800_ocaa117-B18","doi-asserted-by":"crossref","first-page":"103246","DOI":"10.1016\/j.jbi.2019.103246","article-title":"HPO2Vec+: leveraging heterogeneous knowledge resources to enrich node embeddings for the human phenotype ontology","volume":"96","author":"Shen","year":"2019","journal-title":"J Biomed Inform"},{"issue":"W1","key":"2020110613103449800_ocaa117-B19","doi-asserted-by":"crossref","first-page":"W518","DOI":"10.1093\/nar\/gkt441","article-title":"PubTator: a web-based text mining tool for assisting biocuration","volume":"41","author":"Wei","year":"2013","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"2020110613103449800_ocaa117-B20","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1007\/s10618-010-0210-x","article-title":"Leveraging social media networks for classification","volume":"23","author":"Tang","year":"2011","journal-title":"Data Min Knowl Disc"},{"issue":"3\u20135","key":"2020110613103449800_ocaa117-B21","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.physrep.2009.11.002","article-title":"Community detection in graphs","volume":"486","author":"Fortunato","year":"2010","journal-title":"Phys Rep"},{"key":"2020110613103449800_ocaa117-B22","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"issue":"2","key":"2020110613103449800_ocaa117-B23","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1023\/A:1009745219419","article-title":"Density-based clustering in spatial databases: the algorithm GDBSCAN and its applications","volume":"2","author":"Sander","year":"1998","journal-title":"Data Min Knowl Discov"},{"issue":"1","key":"2020110613103449800_ocaa117-B24","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/BF00116251","article-title":"Induction of decision trees","volume":"1","author":"Quinlan","year":"1986","journal-title":"Mach Learn"},{"issue":"1\u20132","key":"2020110613103449800_ocaa117-B25","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1093\/biomet\/54.1-2.167","article-title":"Estimation of the probability of an event as a function of several independent variables","volume":"54","author":"Walker","year":"1967","journal-title":"Biometrika"},{"issue":"3","key":"2020110613103449800_ocaa117-B26","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach Learn"},{"key":"2020110613103449800_ocaa117-B27","first-page":"278","article-title":"Random decision forests","author":"Ho","year":"1995"},{"key":"2020110613103449800_ocaa117-B28","first-page":"41","article-title":"An empirical study of the naive Bayes classifier","author":"Rish","year":"2001"},{"issue":"1","key":"2020110613103449800_ocaa117-B29","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1186\/s12872-016-0402-4","article-title":"The association of functional polymorphisms in genes encoding growth factors for endothelial cells and smooth muscle cells with the severity of coronary artery disease","volume":"16","author":"Osadnik","year":"2016","journal-title":"BMC Cardiovasc Disord"},{"key":"2020110613103449800_ocaa117-B30","year":"2020"},{"issue":"1","key":"2020110613103449800_ocaa117-B31","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1186\/s12967-015-0496-y","article-title":"Hegemonic structure of basic, clinical and patented knowledge on Ebola research: a US army reductionist initiative","volume":"13","author":"Fajardo-Ortiz","year":"2015","journal-title":"J Transl Med"},{"key":"2020110613103449800_ocaa117-B32","doi-asserted-by":"crossref","first-page":"103998","DOI":"10.1016\/j.micpath.2020.103998","article-title":"Gga-miR-30d regulates infectious bronchitis virus infection by targeting USP47 in HD11 cells","volume":"141","author":"Li","year":"2020","journal-title":"Microbial Pathog"},{"issue":"3","key":"2020110613103449800_ocaa117-B33","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1016\/j.compbiolchem.2005.04.006","article-title":"Structural analysis of inhibition mechanisms of aurintricarboxylic acid on SARS-CoV polymerase and other proteins","volume":"29","author":"Yap","year":"2005","journal-title":"Comput Biol Chem"},{"issue":"541","key":"2020110613103449800_ocaa117-B34","doi-asserted-by":"crossref","first-page":"eabb5883","DOI":"10.1126\/scitranslmed.abb5883","article-title":"An orally bioavailable broad-spectrum antiviral inhibits SARS-CoV-2 in human airway epithelial cell cultures and multiple coronaviruses in mice","volume":"12","author":"Sheahan","year":"2020","journal-title":"Sci Transl Med"},{"issue":"1","key":"2020110613103449800_ocaa117-B35","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/j.antiviral.2006.03.001","article-title":"Enhancement of the infectivity of SARS-CoV in BALB\/c mice by IMP dehydrogenase inhibitors, including ribavirin","volume":"71","author":"Barnard","year":"2006","journal-title":"Antiviral Res"},{"issue":"10","key":"2020110613103449800_ocaa117-B36","doi-asserted-by":"crossref","first-page":"1401","DOI":"10.1086\/386321","article-title":"Sedation, sucralfate, and antibiotic use are potential means for protection against early-onset ventilator-associated pneumonia","volume":"38","author":"Bornstain","year":"2004","journal-title":"Clin Infect Dis"},{"key":"2020110613103449800_ocaa117-B37","first-page":"204","article-title":"Detection of Actinobacillus pleuropneumoniae in the porcine upper respiratory tract as a complement to serological tests","volume":"57","author":"Sidibe","year":"1993","journal-title":"Can J Vet Res"},{"key":"2020110613103449800_ocaa117-B38","doi-asserted-by":"crossref","DOI":"10.1111\/jth.14828","article-title":"Tissue plasminogen activator (TPA) treatment for COVID-19 associated acute respiratory distress syndrome (ARDS): a case series","author":"Wang","year":"2020","journal-title":"J Thromb Haemost"},{"key":"2020110613103449800_ocaa117-B39","doi-asserted-by":"crossref","DOI":"10.1148\/radiol.2020200770","article-title":"FDG PET\/CT of COVID-19","author":"Zou","year":"2020","journal-title":"Radiology"},{"issue":"3","key":"2020110613103449800_ocaa117-B40","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1038\/nm1001","article-title":"Pegylated interferon-\u03b1 protects type 1 pneumocytes against SARS coronavirus infection in macaques","volume":"10","author":"Haagmans","year":"2004","journal-title":"Nat Med"},{"issue":"8","key":"2020110613103449800_ocaa117-B41","doi-asserted-by":"crossref","first-page":"e0160005","DOI":"10.1371\/journal.pone.0160005","article-title":"Knowledge discovery from biomedical ontologies in cross domains","volume":"11","author":"Shen","year":"2016","journal-title":"PLoS One"},{"issue":"1","key":"2020110613103449800_ocaa117-B42","first-page":"1","article-title":"Biobroker: Knowledge discovery framework for heterogeneous biomedical ontologies and data","volume":"10","author":"Shen","year":"2018","journal-title":"J Intell Learn Syst Appl"},{"key":"2020110613103449800_ocaa117-B43","first-page":"1092","author":"Shen","year":"2015"},{"issue":"3","key":"2020110613103449800_ocaa117-B44","first-page":"66","article-title":"Predicate oriented pattern analysis for biomedical knowledge discovery","volume":"8","author":"Shen","year":"2016","journal-title":"Intell Inf Manag"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/advance-article-pdf\/doi\/10.1093\/jamia\/ocaa117\/33673057\/ocaa117.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/8\/1259\/34152768\/ocaa117.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/8\/1259\/34152768\/ocaa117.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,6]],"date-time":"2020-11-06T19:34:20Z","timestamp":1604691260000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/27\/8\/1259\/5847598"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,27]]},"references-count":44,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2020,5,27]]},"published-print":{"date-parts":[[2020,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaa117","relation":{},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,8]]},"published":{"date-parts":[[2020,5,27]]}}}