{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T09:08:42Z","timestamp":1768900122588,"version":"3.49.0"},"reference-count":59,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2021,5,3]],"date-time":"2021-05-03T00:00:00Z","timestamp":1620000000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Institutes of Health Common Fund"},{"DOI":"10.13039\/100006086","name":"Office of Strategic Coordination","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006086","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Office of the National Institutes of Health Director","award":["U01HG007530"],"award-info":[{"award-number":["U01HG007530"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,7,30]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Objective<\/jats:title><jats:p>When studying any specific rare disease, heterogeneity and scarcity of affected individuals has historically hindered investigators from discerning on what to focus to understand and diagnose a disease. New nongenomic methodologies must be developed that identify similarities in seemingly dissimilar conditions.<\/jats:p><\/jats:sec><jats:sec><jats:title>Materials and Methods<\/jats:title><jats:p>This observational study analyzes 1042 patients from the Undiagnosed Diseases Network (2015-2019), a multicenter, nationwide research study using phenotypic data annotated by specialized staff using Human Phenotype Ontology terms. We used Louvain community detection to cluster patients linked by Jaccard pairwise similarity and 2 support vector classifier to assign new cases. We further validated the clusters\u2019 most representative comorbidities using a national claims database (67 million patients).<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Patients were divided into 2 groups: those with symptom onset before 18 years of age (n\u2009=\u2009810) and at 18 years of age or older (n\u2009=\u2009232) (average symptom onset age: 10 [interquartile range, 0-14] years). For 810 pediatric patients, we identified 4 statistically significant clusters. Two clusters were characterized by growth disorders, and developmental delay enriched for hypotonia presented a higher likelihood of diagnosis. Support vector classifier showed 0.89 balanced accuracy (0.83 for Human Phenotype Ontology terms only) on test data.<\/jats:p><\/jats:sec><jats:sec><jats:title>Discussions<\/jats:title><jats:p>To set the framework for future discovery, we chose as our endpoint the successful grouping of patients by phenotypic similarity and provide a classification tool to assign new patients to those clusters.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>This study shows that despite the scarcity and heterogeneity of patients, we can still find commonalities that can potentially be harnessed to uncover new insights and targets for therapy.<\/jats:p><\/jats:sec>","DOI":"10.1093\/jamia\/ocab050","type":"journal-article","created":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T20:14:28Z","timestamp":1614975268000},"page":"1694-1702","source":"Crossref","is-referenced-by-count":5,"title":["Finding commonalities in rare diseases through the undiagnosed diseases network"],"prefix":"10.1093","volume":"28","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8346-4502","authenticated-orcid":false,"given":"Josephine","family":"Yates","sequence":"first","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Alba","family":"Guti\u00e9rrez-Sacrist\u00e1n","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5272-2265","authenticated-orcid":false,"given":"Vianney","family":"Jouhet","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Kimberly","family":"LeBlanc","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Cecilia","family":"Esteves","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"name":"Undiagnosed Diseases Network","sequence":"additional","affiliation":[]},{"given":"Thomas N","family":"DeSain","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Nick","family":"Benik","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Jason","family":"Stedman","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Nathan","family":"Palmer","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Guillaume","family":"Mellon","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"given":"Isaac","family":"Kohane","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0235-7543","authenticated-orcid":false,"given":"Paul","family":"Avillach","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,5,3]]},"reference":[{"issue":"2","key":"2021073020264922600_ocab050-B1","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1038\/d41573-019-00180-y","article-title":"How many rare diseases are there?","volume":"19","author":"Haendel","year":"2020","journal-title":"Nat Rev Drug Discov"},{"key":"2021073020264922600_ocab050-B2","volume-title":"Profile of Rare Diseases, pp. 2\u20133","year":"2010"},{"issue":"1","key":"2021073020264922600_ocab050-B3","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1186\/s13023-017-0622-4","article-title":"Australian children living with rare diseases: experiences of diagnosis and perceived consequences of diagnostic delays","volume":"12","author":"Zurynski","year":"2017","journal-title":"Orphanet J Rare Dis"},{"key":"2021073020264922600_ocab050-B4","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1016\/j.ymgmr.2016.07.006","article-title":"The diagnostic journey of patients with mucopolysaccharidosis I: A real-world survey of patient and physician experiences","volume":"8","author":"Bruni","year":"2016","journal-title":"Mol Genet Metab Rep"},{"issue":"1","key":"2021073020264922600_ocab050-B5","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/s13023-017-0733-y","article-title":"Failure to shorten the diagnostic delay in two ultra-orphan diseases (mucopolysaccharidosis types I and III): potential causes and implications","volume":"13","author":"Kuiper","year":"2018","journal-title":"Orphanet J Rare Dis"},{"issue":"1","key":"2021073020264922600_ocab050-B6","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1242\/dmm.009258","article-title":"The NIH Undiagnosed Diseases Program: bonding scientists and clinicians","volume":"5","author":"Gahl","year":"2012","journal-title":"Dis Model Mech"},{"issue":"1","key":"2021073020264922600_ocab050-B7","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1038\/gim.0b013e318232a005","article-title":"The National Institutes of Health Undiagnosed Diseases Program: insights into rare diseases","volume":"14","author":"Gahl","year":"2012","journal-title":"Genet Med"},{"issue":"22","key":"2021073020264922600_ocab050-B8","doi-asserted-by":"crossref","first-page":"2131","DOI":"10.1056\/NEJMoa1714458","article-title":"Effect of genetic diagnosis on patients with previously undiagnosed disease","volume":"379","author":"Splinter","year":"2018","journal-title":"N Engl J Med"},{"issue":"4","key":"2021073020264922600_ocab050-B9","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1016\/j.ymgme.2016.01.007","article-title":"The NIH Undiagnosed Diseases Program and Network: applications to modern medicine","volume":"117","author":"Gahl","year":"2016","journal-title":"Mol Genet Metab"},{"issue":"2","key":"2021073020264922600_ocab050-B10","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/j.ajhg.2017.01.006","article-title":"The Undiagnosed Diseases Network: accelerating discovery about health and disease","volume":"100","author":"Ramoni","year":"2017","journal-title":"Am J Hum Genet"},{"issue":"1","key":"2021073020264922600_ocab050-B11","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1186\/s13023-017-0755-5","article-title":"An overview of the impact of rare disease characteristics on research methodology","volume":"13","author":"Whicher","year":"2018","journal-title":"Orphanet J Rare Dis"},{"issue":"3","key":"2021073020264922600_ocab050-B12","doi-asserted-by":"crossref","first-page":"404","DOI":"10.2174\/0929867324666170718101946","article-title":"NGS technologies as a turning point in rare disease research, diagnosis and treatment","volume":"25","author":"Fernandez-Marmiesse","year":"2018","journal-title":"Curr Med Chem"},{"key":"2021073020264922600_ocab050-B13","doi-asserted-by":"crossref","first-page":"e15","DOI":"10.1017\/S0016672315000166","article-title":"The long tail and rare disease research: the impact of next-generation sequencing for rare Mendelian disorders","volume":"97","author":"Shen","year":"2015","journal-title":"Genet Res"},{"key":"2021073020264922600_ocab050-B14","year":"2019"},{"key":"2021073020264922600_ocab050-B15","year":"2019"},{"issue":"18","key":"2021073020264922600_ocab050-B16","doi-asserted-by":"crossref","first-page":"1870","DOI":"10.1001\/jama.2014.14601","article-title":"Molecular findings among patients referred for clinical whole-exome sequencing","volume":"312","author":"Yang","year":"2014","journal-title":"JAMA"},{"issue":"18","key":"2021073020264922600_ocab050-B17","doi-asserted-by":"crossref","first-page":"1880","DOI":"10.1001\/jama.2014.14604","article-title":"Clinical exome sequencing for genetic identification of rare Mendelian disorders","volume":"312","author":"Lee","year":"2014","journal-title":"JAMA"},{"key":"2021073020264922600_ocab050-B18","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.atg.2016.03.003","article-title":"Limited resources of genome sequencing in developing countries: challenges and solutions","volume":"9","author":"Helmy","year":"2016","journal-title":"Appl Transl Genom"},{"key":"2021073020264922600_ocab050-B19","year":"2019"},{"key":"2021073020264922600_ocab050-B20"},{"issue":"8","key":"2021073020264922600_ocab050-B21","doi-asserted-by":"crossref","first-page":"1057","DOI":"10.1002\/humu.22347","article-title":"PhenoTips: patient phenotyping software for clinical and research use","volume":"34","author":"Girdea","year":"2013","journal-title":"Hum Mutat"},{"issue":"7","key":"2021073020264922600_ocab050-B22","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1002\/ajmg.b.32579","article-title":"Phelan-McDermid syndrome data network: Integrating patient reported outcomes with clinical notes and curated genetic reports","volume":"177","author":"Kothari","year":"2018","journal-title":"Am J Med Genet B Neuropsychiatr Genet"},{"key":"2021073020264922600_ocab050-B23","year":"2020"},{"issue":"D1","key":"2021073020264922600_ocab050-B24","doi-asserted-by":"crossref","first-page":"D865","DOI":"10.1093\/nar\/gkw1039","article-title":"The Human Phenotype Ontology in 2017","volume":"45","author":"K\u00f6hler","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2021073020264922600_ocab050-B25","year":"2019"},{"key":"2021073020264922600_ocab050-B26","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.patrec.2018.12.007","article-title":"A note on the triangle inequality for the Jaccard distance","volume":"120","author":"Kosub","year":"2019","journal-title":"Pattern Recognit Lett"},{"key":"2021073020264922600_ocab050-B27","author":"Blondel"},{"key":"2021073020264922600_ocab050-B28","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1038\/srep00336","article-title":"Consensus clustering in complex networks","volume":"2","author":"Lancichinetti","year":"2012","journal-title":"Sci Rep"},{"key":"2021073020264922600_ocab050-B29","year":"2020"},{"key":"2021073020264922600_ocab050-B30","year":"2020"},{"issue":"9","key":"2021073020264922600_ocab050-B31","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","article-title":"PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations","volume":"26","author":"Denny","year":"2010","journal-title":"Bioinformatics"},{"issue":"6381","key":"2021073020264922600_ocab050-B32","doi-asserted-by":"crossref","first-page":"1233","DOI":"10.1126\/science.aal4043","article-title":"Phenotype risk scores identify patients with unrecognized Mendelian disease patterns","volume":"359","author":"Bastarache","year":"2018","journal-title":"Science"},{"key":"2021073020264922600_ocab050-B33","year":"2019"},{"issue":"D1","key":"2021073020264922600_ocab050-B34","doi-asserted-by":"crossref","first-page":"D1207","DOI":"10.1093\/nar\/gkaa1043","article-title":"The Human Phenotype Ontology in 2021","volume":"49","author":"K\u00f6hler","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2021073020264922600_ocab050-B35","doi-asserted-by":"crossref","first-page":"D793","DOI":"10.1093\/nar\/gkn665","article-title":"McKusick\u2019s Online Mendelian Inheritance in Man (OMIM)","volume":"37 (Database issue","author":"Amberger","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2021073020264922600_ocab050-B36","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/S1474-4422(11)70001-1","article-title":"Rare neurological diseases: a united approach is needed","volume":"10","year":"2011","journal-title":"Lancet Neurol"},{"issue":"1","key":"2021073020264922600_ocab050-B37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4274\/Jcrpe.1149","article-title":"Syndromic disorders with short stature","volume":"6","author":"\u015e\u0131klar","year":"2014","journal-title":"J Clin Res Pediatr Endocrinol"},{"key":"2021073020264922600_ocab050-B38","year":"2019"},{"key":"2021073020264922600_ocab050-B39","year":"2019"},{"key":"2021073020264922600_ocab050-B40","year":"2019"},{"key":"2021073020264922600_ocab050-B41","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1684\/j.1950-6945.2006.tb00195.x","article-title":"Developmental delay and epilepsy","volume":"8","author":"Bednarek","year":"2006","journal-title":"Epileptic Disord"},{"key":"2021073020264922600_ocab050-B42","year":"2019"},{"key":"2021073020264922600_ocab050-B43","year":"2019"},{"key":"2021073020264922600_ocab050-B44","year":"2019"},{"key":"2021073020264922600_ocab050-B45","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1016\/j.jpsychires.2014.09.007","article-title":"Association between mental disorders and subsequent adult onset asthma","volume":"59","author":"Alonso","year":"2014","journal-title":"J Psychiatr Res"},{"key":"2021073020264922600_ocab050-B46","year":"2019"},{"issue":"8","key":"2021073020264922600_ocab050-B47","doi-asserted-by":"crossref","first-page":"5107","DOI":"10.19082\/5107","article-title":"Anxiety and depression in patients with gastroesophageal reflux disorder","volume":"9","author":"Javadi","year":"2017","journal-title":"Electron Physician"},{"issue":"5","key":"2021073020264922600_ocab050-B48","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1002\/mds.26158","article-title":"Treatable causes of cerebellar ataxia","volume":"30","author":"Ramirez-Zamora","year":"2015","journal-title":"Mov Disord"},{"issue":"3","key":"2021073020264922600_ocab050-B49","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1001\/archneur.1968.00480030109013","article-title":"Familial myoclonus, cerebellar ataxia, and deafness. Specific genetically-determined disease","volume":"19","author":"May","year":"1968","journal-title":"Arch Neurol"},{"issue":"9","key":"2021073020264922600_ocab050-B50","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1177\/0883073808315622","article-title":"Spinocerebellar ataxia type 2 presenting with cognitive regression in childhood","volume":"23","author":"Ramocki","year":"2008","journal-title":"J Child Neurol"},{"key":"2021073020264922600_ocab050-B51","year":"2019"},{"key":"2021073020264922600_ocab050-B52","year":"2019"},{"key":"2021073020264922600_ocab050-B53","year":"2019"},{"issue":"4","key":"2021073020264922600_ocab050-B54","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1016\/j.ajhg.2018.08.017","article-title":"Phenotype-specific enrichment of mendelian disorder genes near GWAS regions across 62 complex traits","volume":"103","author":"Freund","year":"2018","journal-title":"Am J Hum Genet"},{"issue":"7265","key":"2021073020264922600_ocab050-B55","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/nature08494","article-title":"Finding the missing heritability of complex diseases","volume":"461","author":"Manolio","year":"2009","journal-title":"Nature"},{"issue":"4","key":"2021073020264922600_ocab050-B56","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1001\/jama.2015.19394","article-title":"Analyzing repeated measurements using mixed models","volume":"315","author":"Detry","year":"2016","journal-title":"JAMA"},{"issue":"4","key":"2021073020264922600_ocab050-B57","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1016\/j.jclinepi.2007.07.008","article-title":"Efficient ways exist to obtain the optimal sample size in clinical trials in rare diseases","volume":"61","author":"van der Lee","year":"2008","journal-title":"J Clin Epidemiol"},{"issue":"6","key":"2021073020264922600_ocab050-B58","doi-asserted-by":"crossref","first-page":"783","DOI":"10.1001\/jama.283.6.783","article-title":"Effect of out-of-hospital pediatric endotracheal intubation on survival and neurological outcome: a controlled clinical trial","volume":"283","author":"Gausche","year":"2000","journal-title":"JAMA"},{"issue":"2\/3","key":"2021073020264922600_ocab050-B59","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1023\/A:1012801612483","article-title":"On clustering validation techniques","volume":"17","author":"Halkidi","year":"2001","journal-title":"J Intell Inf Syst"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/8\/1694\/39502222\/ocab050.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/8\/1694\/39502222\/ocab050.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T13:09:30Z","timestamp":1724591370000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/28\/8\/1694\/6262054"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,3]]},"references-count":59,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2021,5,3]]},"published-print":{"date-parts":[[2021,7,30]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocab050","relation":{},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,8,1]]},"published":{"date-parts":[[2021,5,3]]}}}