{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T02:46:16Z","timestamp":1773197176375,"version":"3.50.1"},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2018,1,18]],"date-time":"2018-01-18T00:00:00Z","timestamp":1516233600000},"content-version":"vor","delay-in-days":1,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000274","name":"British Heart Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000274","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004440","name":"Wellcome Trust","doi-asserted-by":"publisher","award":["WT\/104955\/Z\/14\/Z"],"award-info":[{"award-number":["WT\/104955\/Z\/14\/Z"]}],"id":[{"id":"10.13039\/100004440","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Genome-wide association studies have identified thousands of loci associated with human disease, but identifying the causal genes at these loci is often difficult. Several methods prioritize genes most likely to be disease causing through the integration of biological data, including protein\u2013protein interaction and phenotypic data. Data availability is not the same for all genes however, potentially influencing the performance of these methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We demonstrate that whilst disease genes tend to be associated with greater numbers of data, this may be at least partially a result of them being better studied. 
With this observation we develop PhenoRank, which prioritizes disease genes whilst avoiding being biased towards genes with more available data. Bias is avoided by comparing gene scores generated for the query disease against gene scores generated using simulated sets of phenotype terms, which ensures that differences in data availability do not affect the ranking of genes. We demonstrate that whilst existing prioritization methods are biased by data availability, PhenoRank is not similarly biased. Avoiding this bias allows PhenoRank to effectively prioritize genes with fewer available data and improves its overall performance. PhenoRank outperforms three available prioritization methods in cross-validation (PhenoRank area under receiver operating characteristic curve [AUC]\u2009=\u20090.89, DADA AUC\u2009=\u20090.87, EXOMISER AUC\u2009=\u20090.71, PRINCE AUC\u2009=\u20090.83, P\u2009&lt;\u20092.2\u2009\u00d7\u200910<jats:sup>\u221216<\/jats:sup>).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>PhenoRank is freely available for download at https:\/\/github.com\/alexjcornish\/PhenoRank.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty028","type":"journal-article","created":{"date-parts":[[2018,1,16]],"date-time":"2018-01-16T20:11:11Z","timestamp":1516133471000},"page":"2087-2095","source":"Crossref","is-referenced-by-count":34,"title":["PhenoRank: reducing study bias in gene prioritization through simulation"],"prefix":"10.1093","volume":"34","author":[{"given":"Alex J","family":"Cornish","sequence":"first","affiliation":[{"name":"Department of Life Sciences, Center of Bioinformatics and Systems Biology, Imperial College 
London, London, UK"}]},{"given":"Alessia","family":"David","sequence":"additional","affiliation":[{"name":"Department of Life Sciences, Center of Bioinformatics and Systems Biology, Imperial College London, London, UK"}]},{"given":"Michael J E","family":"Sternberg","sequence":"additional","affiliation":[{"name":"Department of Life Sciences, Center of Bioinformatics and Systems Biology, Imperial College London, London, UK"}]}],"member":"286","published-online":{"date-parts":[[2018,1,17]]},"reference":[{"key":"2023012713391074300_bty028-B1","doi-asserted-by":"crossref","first-page":"D789","DOI":"10.1093\/nar\/gku1205","article-title":"OMIM.org: online Mendelian\n      Inheritance in Man (OMIM), an online catalog of human genes and genetic\n      disorders","volume":"43","author":"Amberger","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B2","doi-asserted-by":"crossref","first-page":"71.","DOI":"10.1186\/s13075-015-0572-y","article-title":"Identification of NF-\u03baB and PLCL2 as\n      new susceptibility genes and highlights on a potential role of IRF8 through interferon\n      signature modulation in systemic sclerosis","volume":"17","author":"Arismendi","year":"2015","journal-title":"Arthritis Res.\n      Ther"},{"key":"2023012713391074300_bty028-B3","doi-asserted-by":"crossref","first-page":"632","DOI":"10.1007\/s00335-012-9427-x","article-title":"The International Mouse Phenotyping Consortium: past\n      and future perspectives on mouse phenotyping","volume":"23","author":"Brown","year":"2012","journal-title":"Mamm.\n      Genome"},{"key":"2023012713391074300_bty028-B4","doi-asserted-by":"crossref","first-page":"D840","DOI":"10.1093\/nar\/gkv1211","article-title":"Mouse genome database\n      2016","volume":"44","author":"Bult","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B5","doi-asserted-by":"crossref","first-page":"D470","DOI":"10.1093\/nar\/gku1204","article-title":"The BioGRID 
interaction database:\n      2015 update","volume":"43","author":"Chatr-Aryamontri","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B6","doi-asserted-by":"crossref","first-page":"858","DOI":"10.1002\/humu.22051","article-title":"Mousefinder: candidate disease genes\n      from mouse phenotype data","volume":"33","author":"Chen","year":"2012","journal-title":"Hum. Mutat"},{"key":"2023012713391074300_bty028-B7","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1038\/nrg.2017.38","article-title":"Network propagation: a universal\n      amplifier of genetic associations","volume":"18","author":"Cowen","year":"2017","journal-title":"Nat. Rev. Genet"},{"key":"2023012713391074300_bty028-B8","doi-asserted-by":"crossref","first-page":"92.","DOI":"10.1186\/1752-0509-6-92","article-title":"HINT: high-quality protein interactomes and their\n      applications in understanding human disease","volume":"6","author":"Das","year":"2012","journal-title":"BMC Syst.\n      Biol"},{"key":"2023012713391074300_bty028-B9","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1186\/1756-0381-4-19","article-title":"DADA: degree-aware algorithms for\n      network-based disease gene prioritization","volume":"4","author":"Erten","year":"2011","journal-title":"BioData\n     Min"},{"key":"2023012713391074300_bty028-B10","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1186\/s13059-014-0480-5","article-title":"FunSeq2: a framework for\n      prioritizing noncoding regulatory variants in cancer","volume":"15","author":"Fu","year":"2014","journal-title":"Genome\n      Biol"},{"key":"2023012713391074300_bty028-B11","doi-asserted-by":"crossref","first-page":"e1002444.","DOI":"10.1371\/journal.pcbi.1002444","article-title":"Guilt by association\u2019 is the exception rather than\n      the rule in gene networks","volume":"8","author":"Gillis","year":"2012","journal-title":"PLOS Comput. 
Biol"},{"key":"2023012713391074300_bty028-B12","doi-asserted-by":"crossref","first-page":"10888","DOI":"10.1038\/srep10888","article-title":"Analysis of the human diseasome\n      reveals phenotype modules across common, genetic, and infectious\n      diseases","volume":"5","author":"Hoehndorf","year":"2015","journal-title":"Sci. Rep"},{"key":"2023012713391074300_bty028-B13","doi-asserted-by":"crossref","first-page":"6178.","DOI":"10.1038\/ncomms7178","article-title":"Capture Hi-C identifies the\n      chromatin interactome of colorectal cancer risk loci","volume":"6","author":"J\u00e4ger","year":"2015","journal-title":"Nat.\n      Commun"},{"key":"2023012713391074300_bty028-B14","doi-asserted-by":"crossref","first-page":"2630","DOI":"10.1002\/art.30425","article-title":"Genome-wide and species-wide\n      dissection of the genetics of arthritis severity in heterogeneous stock\n      mice","volume":"63","author":"Johnsen","year":"2011","journal-title":"Arthritis Rheum"},{"key":"2023012713391074300_bty028-B15","doi-asserted-by":"crossref","first-page":"D767","DOI":"10.1093\/nar\/gkn892","article-title":"Human protein reference\n      database-2009 update","volume":"37","author":"Keshava Prasad","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B16","doi-asserted-by":"crossref","first-page":"D1071","DOI":"10.1093\/nar\/gku1011","article-title":"Disease Ontology 2015 update: an\n      expanded and updated database of human diseases for linking biomedical knowledge through\n      disease data","volume":"43","author":"Kibbe","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B17","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1016\/j.ajhg.2008.02.013","article-title":"Walking the interactome for\n      prioritization of candidate disease genes","volume":"82","author":"K\u00f6hler","year":"2008","journal-title":"Am. J. 
Hum.\n      Genet"},{"key":"2023012713391074300_bty028-B18","doi-asserted-by":"crossref","first-page":"30","DOI":"10.12688\/f1000research.2-30.v1","article-title":"Construction and accessibility of a\n      cross-species phenotype ontology along with gene annotations for biomedical\n      research","volume":"2","author":"K\u00f6hler","year":"2013","journal-title":"F1000Research"},{"key":"2023012713391074300_bty028-B19","doi-asserted-by":"crossref","first-page":"D966","DOI":"10.1093\/nar\/gkt1026","article-title":"The Human Phenotype Ontology\n      project: linking molecular biology and disease through phenotype data","volume":"42","author":"K\u00f6hler","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B20","doi-asserted-by":"crossref","first-page":"7.","DOI":"10.1186\/s13326-017-0119-z","article-title":"Evaluating the effect of annotation size on measures\n      of semantic similarity","volume":"8","author":"Kulmanov","year":"2017","journal-title":"J. Biomed. 
Seman"},{"key":"2023012713391074300_bty028-B21","doi-asserted-by":"crossref","first-page":"D862","DOI":"10.1093\/nar\/gkv1222","article-title":"ClinVar: public archive of\n      interpretations of clinically relevant variants","volume":"44","author":"Landrum","year":"2016","journal-title":"Nucleic Acids\n      Res"},{"key":"2023012713391074300_bty028-B22","doi-asserted-by":"crossref","first-page":"3555","DOI":"10.1093\/bioinformatics\/btv402","article-title":"LDlink: a web-based application for exploring\n      population-specific haplotype structure and linking correlated alleles of possible\n      functional variants","volume":"31","author":"Machiela","year":"2014","journal-title":"Bioinformatics"},{"key":"2023012713391074300_bty028-B23","doi-asserted-by":"crossref","first-page":"D7","DOI":"10.1093\/nar\/gkv1290","article-title":"Database\n      resources of the National Center for Biotechnology Information","volume":"44","author":"NCBI Resource Coordinators","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B24","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature12873","article-title":"Genetics of rheumatoid arthritis\n      contributes to biology and drug discovery","volume":"506","author":"Okada","year":"2014","journal-title":"Nature"},{"key":"2023012713391074300_bty028-B25","doi-asserted-by":"crossref","first-page":"D358","DOI":"10.1093\/nar\/gkt1115","article-title":"The MIntAct project\u2013IntAct as a\n      common curation platform for 11 molecular interaction databases","volume":"42","author":"Orchard","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023012713391074300_bty028-B26","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2105-9-S5-S4","article-title":"Metrics for GO based protein\n      semantic similarity: a systematic evaluation","volume":"9","author":"Pesquita","year":"2008","journal-title":"BMC\n      
Bioinformatics"},{"key":"2023012713391074300_bty028-B27","doi-asserted-by":"crossref","first-page":"77.","DOI":"10.1186\/1471-2105-12-77","article-title":"pROC: an open-source package for R\n      and S+ to analyze and compare ROC curves","volume":"12","author":"Robin","year":"2011","journal-title":"BMC\n      Bioinformatics"},{"key":"2023012713391074300_bty028-B28","doi-asserted-by":"crossref","first-page":"1212","DOI":"10.1016\/j.cell.2014.10.050","article-title":"A proteome-scale map of the human\n      interactome network","volume":"159","author":"Rolland","year":"2014","journal-title":"Cell"},{"key":"2023012713391074300_bty028-B29","doi-asserted-by":"crossref","first-page":"1083","DOI":"10.1038\/nmeth.2656","article-title":"eXtasy: variant prioritization by\n      genomic data fusion","volume":"10","author":"Sifrim","year":"2013","journal-title":"Nat. Methods"},{"key":"2023012713391074300_bty028-B30","doi-asserted-by":"crossref","first-page":"2004","DOI":"10.1038\/nprot.2015.124","article-title":"Next-generation diagnostics and\n      disease\u2013gene discovery with the Exomiser","volume":"10","author":"Smedley","year":"2015","journal-title":"Nat. 
Protoc"},{"key":"2023012713391074300_bty028-B31","doi-asserted-by":"crossref","first-page":"R7","DOI":"10.1186\/gb-2004-6-1-r7","article-title":"The Mammalian Phenotype Ontology as\n      a tool for annotating, analyzing and comparing phenotypic information","volume":"6","author":"Smith","year":"2005","journal-title":"Genome Biol"},{"key":"2023012713391074300_bty028-B32","doi-asserted-by":"crossref","first-page":"7486","DOI":"10.1093\/nar\/gku469","article-title":"Activities at\n      the Universal Protein Resource (UniProt)","volume":"42","author":"The UniProt Consortium","year":"2014","journal-title":"Nucleic Acids\n      Res"},{"key":"2023012713391074300_bty028-B33","doi-asserted-by":"crossref","first-page":"1910","DOI":"10.4049\/jimmunol.1501165","article-title":"Galectin-1 couples glycobiology to\n      inflammation in osteoarthritis through the activation of an NF-\u03baB-regulated gene\n      network","volume":"196","author":"Toegel","year":"2016","journal-title":"J. Immunol"},{"key":"2023012713391074300_bty028-B34","first-page":"16","article-title":"How correlated are network\n      centrality measures?","volume":"28","author":"Valente","year":"2008","journal-title":"Connections"},{"key":"2023012713391074300_bty028-B35","doi-asserted-by":"crossref","first-page":"e1000641","DOI":"10.1371\/journal.pcbi.1000641","article-title":"Associating genes and protein\n      complexes with disease via network propagation","volume":"6","author":"Vanunu","year":"2010","journal-title":"PLOS Comput.\n      Biol"},{"key":"2023012713391074300_bty028-B36","doi-asserted-by":"crossref","first-page":"1274","DOI":"10.1016\/j.jmb.2013.01.026","article-title":"Proteins and domains vary in their tolerance of\n      non-synonymous single nucleotide polymorphisms (nsSNPs)","volume":"425","author":"Yates","year":"2013","journal-title":"J. 
Mol.\n      Biol"},{"key":"2023012713391074300_bty028-B37","doi-asserted-by":"crossref","first-page":"2692","DOI":"10.1016\/j.jmb.2014.04.026","article-title":"SuSPect: enhanced prediction of\n      single amino acid variant (SAV) phenotype using network features","volume":"426","author":"Yates","year":"2014","journal-title":"J. Mol. Biol"},{"key":"2023012713391074300_bty028-B38","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1146\/annurev-immunol-030409-101212","article-title":"Differentiation of effector CD4 T\n      cell populations","volume":"28","author":"Zhu","year":"2010","journal-title":"Annu. Rev. Immunol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/12\/2087\/48935799\/bioinformatics_34_12_2087.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/12\/2087\/48935799\/bioinformatics_34_12_2087.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T14:19:16Z","timestamp":1674829156000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/12\/2087\/4816110"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,1,17]]},"references-count":38,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2018,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty028","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,6,15]]},"published":{"date-parts":[[2018,1,17]]}}}