{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T14:21:36Z","timestamp":1776090096701,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2021,12,16]],"date-time":"2021-12-16T00:00:00Z","timestamp":1639612800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"US National Science Foundation (Convergence Accelerator","award":["NSF_1937160"],"award-info":[{"award-number":["NSF_1937160"]}]},{"name":"Bakar Family Foundation and the Bakar Computational Health Sciences Institute"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Early identification of chronic diseases is a pillar of precision medicine as it can lead to improved outcomes, reduction of disease burden, and lower healthcare costs. Predictions of a patient\u2019s health trajectory have been improved through the application of machine learning approaches to electronic health records (EHRs). However, these methods have traditionally relied on \u201cblack box\u201d algorithms that can process large amounts of data but are unable to incorporate domain knowledge, thus limiting their predictive and explanatory power. 
Here, we present a method for incorporating domain knowledge into clinical classifications by embedding individual patient data into a biomedical knowledge graph.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>A modified version of the PageRank algorithm was implemented to embed millions of deidentified EHRs into a biomedical knowledge graph (SPOKE). This resulted in high-dimensional, knowledge-guided patient health signatures (ie, SPOKEsigs) that were subsequently used as features in a random forest environment to classify patients at risk of developing a chronic disease.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Our model predicted disease status of 5752 subjects 3 years before being diagnosed with multiple sclerosis (MS) (AUC = 0.83). SPOKEsigs outperformed predictions using EHRs alone, and the biological drivers of the classifiers provided insight into the underpinnings of prodromal MS.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>Using data from EHRs as input, SPOKEsigs describe patients at both the clinical and biological levels. 
We provide a clinical use case for detecting MS up to 5 years prior to their documented diagnosis in the clinic and illustrate the biological features that distinguish the prodromal MS state.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocab270","type":"journal-article","created":{"date-parts":[[2021,11,26]],"date-time":"2021-11-26T20:11:18Z","timestamp":1637957478000},"page":"424-434","source":"Crossref","is-referenced-by-count":45,"title":["Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis"],"prefix":"10.1093","volume":"29","author":[{"given":"Charlotte A","family":"Nelson","sequence":"first","affiliation":[{"name":"Integrated Program in Quantitative Biology, University of California San Francisco, San Francisco, California, USA"},{"name":"Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2034-8800","authenticated-orcid":false,"given":"Riley","family":"Bove","sequence":"additional","affiliation":[{"name":"Department of Neurology, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California, USA"}]},{"given":"Atul J","family":"Butte","sequence":"additional","affiliation":[{"name":"Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA"},{"name":"Department of Pediatrics, University of California San Francisco, San Francisco, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0067-194X","authenticated-orcid":false,"given":"Sergio E","family":"Baranzini","sequence":"additional","affiliation":[{"name":"Integrated Program in Quantitative Biology, University of California San Francisco, San Francisco, California, USA"},{"name":"Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, 
USA"},{"name":"Department of Neurology, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,12,16]]},"reference":[{"issue":"8","key":"2022012920324804000_ocab270-B1","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1186\/gm178","article-title":"Predictive, preventive, personalized and participatory medicine: back to the future","volume":"2","author":"Auffray","year":"2010","journal-title":"Genome Med"},{"issue":"2","key":"2022012920324804000_ocab270-B2","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1136\/amiajnl-2013-001847","article-title":"Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface","volume":"21","author":"Tate","year":"2014","journal-title":"J Am Med Inform Assoc"},{"issue":"6","key":"2022012920324804000_ocab270-B3","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1038\/nrg3208","article-title":"Mining electronic health records: towards better research applications and clinical care","volume":"13","author":"Jensen","year":"2012","journal-title":"Nat Rev Genet"},{"issue":"5","key":"2022012920324804000_ocab270-B4","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1109\/JBHI.2017.2767063","article-title":"Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis","volume":"22","author":"Shickel","year":"2018","journal-title":"IEEE J Biomed Health Inform"},{"issue":"9","key":"2022012920324804000_ocab270-B5","doi-asserted-by":"crossref","first-page":"600","DOI":"10.7326\/0003-4819-153-9-201011020-00010","article-title":"Advancing the science for active surveillance: rationale and design for the Observational Medical Outcomes Partnership","volume":"153","author":"Stang","year":"2010","journal-title":"Ann Intern 
Med"},{"issue":"9","key":"2022012920324804000_ocab270-B6","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","article-title":"PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations","volume":"26","author":"Denny","year":"2010","journal-title":"Bioinformatics"},{"issue":"4","key":"2022012920324804000_ocab270-B7","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1016\/j.ajhg.2010.03.003","article-title":"Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record","volume":"86","author":"Ritchie","year":"2010","journal-title":"Am J Hum Genet"},{"issue":"79","key":"2022012920324804000_ocab270-B8","doi-asserted-by":"crossref","first-page":"79re1","DOI":"10.1126\/scitranslmed.3001807","article-title":"Electronic medical records for genetic research: results of the eMERGE consortium","volume":"3","author":"Kho","year":"2011","journal-title":"Sci Transl Med"},{"issue":"3","key":"2022012920324804000_ocab270-B9","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1016\/j.ajhg.2019.07.018","article-title":"Harmonizing clinical sequencing and interpretation for the eMERGE III Network","volume":"105","year":"2019","journal-title":"Am J Hum Genet"},{"issue":"10","key":"2022012920324804000_ocab270-B10","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1186\/s13073-014-0082-6","article-title":"Modules, networks and systems medicine for understanding disease and aiding diagnosis","volume":"6","author":"Gustafsson","year":"2014","journal-title":"Genome Med"},{"key":"2022012920324804000_ocab270-B11","doi-asserted-by":"crossref","first-page":"1414","DOI":"10.1016\/j.csbj.2020.05.017","article-title":"Constructing knowledge graphs and their biomedical applications","volume":"18","author":"Nicholson","year":"2020","journal-title":"Comput Struct Biotechnol 
J"},{"issue":"1","key":"2022012920324804000_ocab270-B12","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nrg2918","article-title":"Network medicine: a network-based approach to human disease","volume":"12","author":"Barabasi","year":"2011","journal-title":"Nat Rev Genet"},{"issue":"21","key":"2022012920324804000_ocab270-B13","doi-asserted-by":"crossref","first-page":"8685","DOI":"10.1073\/pnas.0701361104","article-title":"The human disease network","volume":"104","author":"Goh","year":"2007","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022012920324804000_ocab270-B14","doi-asserted-by":"crossref","first-page":"485","DOI":"10.12688\/f1000research.6836.1","article-title":"iCTNet2: integrating heterogeneous biological interactions to understand complex traits","volume":"4","author":"Wang","year":"2015","journal-title":"F1000Res"},{"key":"2022012920324804000_ocab270-B15","doi-asserted-by":"crossref","first-page":"4212","DOI":"10.1038\/ncomms5212","article-title":"Human symptoms-disease network","volume":"5","author":"Zhou","year":"2014","journal-title":"Nat Commun"},{"issue":"7","key":"2022012920324804000_ocab270-B16","doi-asserted-by":"crossref","first-page":"e1004259","DOI":"10.1371\/journal.pcbi.1004259","article-title":"Heterogeneous network edge prediction: a data integration approach to prioritize disease-associated genes","volume":"11","author":"Himmelstein","year":"2015","journal-title":"PLoS Comput Biol"},{"issue":"7","key":"2022012920324804000_ocab270-B17","doi-asserted-by":"crossref","first-page":"527","DOI":"10.7326\/L17-0326","article-title":"Single-payer reform","volume":"167","author":"Woolhandler","year":"2017","journal-title":"Ann Intern Med"},{"issue":"1","key":"2022012920324804000_ocab270-B18","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1370\/afm.1196","article-title":"Impact of electronic health record clinical decision support on diabetes care: a randomized 
trial","volume":"9","author":"O\u2019Connor","year":"2011","journal-title":"Ann Fam Med"},{"issue":"1","key":"2022012920324804000_ocab270-B19","doi-asserted-by":"crossref","first-page":"e22","DOI":"10.2196\/jmir.9268","article-title":"Prediction of incident hypertension within the next year: prospective study using statewide electronic health records and machine learning","volume":"20","author":"Ye","year":"2018","journal-title":"J Med Internet Res"},{"issue":"3","key":"2022012920324804000_ocab270-B20","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1001\/jamaneurol.2016.5056","article-title":"Assessment of early evidence of multiple sclerosis in a prospective study of asymptomatic high-risk family members","volume":"74","author":"Xia","year":"2017","journal-title":"JAMA Neurol"},{"issue":"6","key":"2022012920324804000_ocab270-B21","doi-asserted-by":"crossref","first-page":"1162","DOI":"10.1002\/ana.25247","article-title":"Prodromal symptoms of multiple sclerosis in primary care","volume":"83","author":"Disanto","year":"2018","journal-title":"Ann Neurol"},{"issue":"10","key":"2022012920324804000_ocab270-B22","doi-asserted-by":"crossref","first-page":"978","DOI":"10.1212\/WNL.0000000000003078","article-title":"The 11-year long-term follow-up study from the randomized BENEFIT CIS trial","volume":"87","author":"Kappos","year":"2016","journal-title":"Neurology"},{"issue":"1","key":"2022012920324804000_ocab270-B23","doi-asserted-by":"crossref","first-page":"3045","DOI":"10.1038\/s41467-019-11069-0","article-title":"Integrating biomedical research and electronic health records to create knowledge-based biologically meaningful machine-readable embeddings","volume":"10","author":"Nelson","year":"2019","journal-title":"Nat Commun"},{"key":"2022012920324804000_ocab270-B24","year":"2002: 
517\u201326"},{"key":"2022012920324804000_ocab270-B25","author":"Mikolov","year":"2013"},{"key":"2022012920324804000_ocab270-B26","first-page":"3111","author":"Mikolov","year":"2013"},{"issue":"1","key":"2022012920324804000_ocab270-B27","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1186\/s12859-018-2264-5","article-title":"Random forest versus logistic regression: a large-scale benchmark experiment","volume":"19","author":"Couronne","year":"2018","journal-title":"BMC Bioinformatics"},{"issue":"5","key":"2022012920324804000_ocab270-B28","doi-asserted-by":"crossref","first-page":"326","DOI":"10.1212\/01.wnl.0000252807.38124.a3","article-title":"How common are the \u201ccommon\u201d neurologic disorders?","volume":"68","author":"Hirtz","year":"2007","journal-title":"Neurology"},{"issue":"10","key":"2022012920324804000_ocab270-B29","doi-asserted-by":"crossref","first-page":"e1029","DOI":"10.1212\/WNL.0000000000007035","article-title":"The prevalence of MS in the United States","volume":"92","author":"Wallin","year":"2019","journal-title":"Neurology"},{"issue":"9","key":"2022012920324804000_ocab270-B30","doi-asserted-by":"crossref","first-page":"800","DOI":"10.1212\/01.wnl.0000335764.14513.1a","article-title":"Incidental MRI anomalies suggestive of multiple sclerosis the radiologically isolated syndrome","volume":"72","author":"Okuda","year":"2009","journal-title":"Neurology"},{"issue":"3","key":"2022012920324804000_ocab270-B31","doi-asserted-by":"crossref","first-page":"e90509","DOI":"10.1371\/journal.pone.0090509","article-title":"Radiologically isolated syndrome: 5-year risk for an initial clinical event","volume":"9","author":"Okuda","year":"2014","journal-title":"PLoS One"},{"issue":"1","key":"2022012920324804000_ocab270-B32","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/j.cyto.2014.09.011","article-title":"T cell subsets and their signature cytokines in autoimmune and inflammatory 
diseases","volume":"74","author":"Raphael","year":"2015","journal-title":"Cytokine"},{"issue":"4","key":"2022012920324804000_ocab270-B33","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1093\/intimm\/dxw006","article-title":"CD4+T-cell subsets in inflammatory diseases: beyond the Th1\/Th2 paradigm","volume":"28","author":"Hirahara","year":"2016","journal-title":"Int Immunol"},{"issue":"1","key":"2022012920324804000_ocab270-B34","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1177\/1756285614564152","article-title":"Dimethyl fumarate in the treatment of relapsing\u2013remitting multiple sclerosis: an overview","volume":"8","author":"Bomprezzi","year":"2015","journal-title":"Adv Neurol Disord"},{"issue":"2","key":"2022012920324804000_ocab270-B35","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1615\/CritRevImmunol.v25.i2.10","article-title":"Th1 and Th2 lymphocytes in autoimmune disease","volume":"25","author":"Crane","year":"2005","journal-title":"Crit Rev Immunol"},{"issue":"11","key":"2022012920324804000_ocab270-B36","doi-asserted-by":"crossref","first-page":"1506","DOI":"10.1177\/1352458516681198","article-title":"Infection-related health care utilization among people with and without multiple sclerosis","volume":"23","author":"Wijnands","year":"2017","journal-title":"Mult Scler"},{"issue":"6","key":"2022012920324804000_ocab270-B37","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1016\/S1474-4422(17)30076-5","article-title":"Health-care use before a first demyelinating event suggestive of a multiple sclerosis prodrome: a matched cohort study","volume":"16","author":"Wijnands","year":"2017","journal-title":"Lancet Neurol"},{"issue":"8","key":"2022012920324804000_ocab270-B38","doi-asserted-by":"crossref","first-page":"1092","DOI":"10.1177\/1352458518783662","article-title":"Five years before multiple sclerosis onset: Phenotyping the prodrome","volume":"25","author":"Wijnands","year":"2019","journal-title":"Mult 
Scler"},{"issue":"5","key":"2022012920324804000_ocab270-B39","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1038\/gene.2011.3","article-title":"A knowledge-driven interaction analysis reveals potential neurodegenerative mechanism of multiple sclerosis susceptibility","volume":"12","author":"Bush","year":"2011","journal-title":"Genes Immun"},{"issue":"31","key":"2022012920324804000_ocab270-B40","doi-asserted-by":"crossref","first-page":"10807","DOI":"10.1074\/jbc.RA120.013696","article-title":"A myelin basic protein fragment induces sexually dimorphic transcriptome signatures of neuropathic pain in mice","volume":"295","author":"Chernov","year":"2020","journal-title":"J Biol Chem"},{"issue":"1","key":"2022012920324804000_ocab270-B41","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1055\/s-2007-1019124","article-title":"The immunology of multiple sclerosis","volume":"28","author":"Bar-Or","year":"2008","journal-title":"Semin Neurol"},{"issue":"11","key":"2022012920324804000_ocab270-B42","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1093\/qjmed\/95.11.753","article-title":"Asthma and multiple sclerosis: an inverse association in a case-control general practice population","volume":"95","author":"Tremlett","year":"2002","journal-title":"QJM"},{"key":"2022012920324804000_ocab270-B43","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/B978-0-7020-6896-6.00016-8","volume-title":"Clinical Immunology","author":"Eagar","year":"2019","edition":"5th ed."}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/3\/424\/42333234\/ocab270.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/3\/424\/42333234\/ocab270.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,29]],"date-time":"2022-01-29T20:34:09Z","timestamp":1643488449000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/29\/3\/424\/6463510"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,16]]},"references-count":43,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2021,12,16]]},"published-print":{"date-parts":[[2022,1,29]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocab270","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,3,1]]},"published":{"date-parts":[[2021,12,16]]}}}