{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T20:32:59Z","timestamp":1776457979019,"version":"3.51.2"},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"19","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: High-throughput sequencing platforms are increasingly used to screen patients with genetic disease for pathogenic mutations, but prediction of the effects of mutations remains challenging. Previously we developed SAAPdap (Single Amino Acid Polymorphism Data Analysis Pipeline) and SAAPpred (Single Amino Acid Polymorphism Predictor) that use a combination of rule-based structural measures to predict whether a missense genetic variant is pathogenic. Here we investigate whether the same methodology can be used to develop a differential phenotype predictor, which, once a mutation has been predicted as pathogenic, is able to distinguish between phenotypes\u2014in this case the two major clinical phenotypes (hypertrophic cardiomyopathy, HCM and dilated cardiomyopathy, DCM) associated with mutations in the beta-myosin heavy chain (MYH7) gene product (Myosin-7).<\/jats:p>\n               <jats:p>Results: A random forest predictor trained on rule-based structural analyses together with structural clustering data gave a Matthews\u2019 correlation coefficient (MCC) of 0.53 (accuracy, 75%). A post hoc removal of machine learning models that performed particularly badly, increased the performance (MCC\u2009=\u20090.61, Acc\u2009=\u200979%). 
This proof of concept suggests that methods used for pathogenicity prediction can be extended for use in differential phenotype prediction.<\/jats:p>\n               <jats:p>Availability and Implementation: Analyses were implemented in Perl and C and used the Java-based Weka machine learning environment. Please contact the authors for availability.<\/jats:p>\n               <jats:p>Contacts: andrew@bioinf.org.uk or andrew.martin@ucl.ac.uk<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btw362","type":"journal-article","created":{"date-parts":[[2016,6,19]],"date-time":"2016-06-19T00:18:20Z","timestamp":1466295500000},"page":"2947-2955","source":"Crossref","is-referenced-by-count":12,"title":["The structural effects of mutations can aid in differential phenotype prediction of beta-myosin heavy chain (Myosin-7) missense variants"],"prefix":"10.1093","volume":"32","author":[{"given":"Nouf S.","family":"Al-Numair","sequence":"first","affiliation":[{"name":"1 Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London WC1E 6BT, UK"}]},{"given":"Luis","family":"Lopes","sequence":"additional","affiliation":[{"name":"2 Institute of Cardiovascular Science, UCL, London, UK"}]},{"given":"Petros","family":"Syrris","sequence":"additional","affiliation":[{"name":"2 Institute of Cardiovascular Science, UCL, London, UK"}]},{"given":"Lorenzo","family":"Monserrat","sequence":"additional","affiliation":[{"name":"3 Complejo Hospitalario Universitario de A Coru\u00f1a, Instituto de Investigaci\u00f3n Biom\u00e9dica, Coru\u00f1a, Spain"}]},{"given":"Perry","family":"Elliott","sequence":"additional","affiliation":[{"name":"2 Institute of Cardiovascular Science, UCL, London, UK"}]},{"given":"Andrew C. 
R.","family":"Martin","sequence":"additional","affiliation":[{"name":"1 Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London WC1E 6BT, UK"}]}],"member":"286","published-online":{"date-parts":[[2016,6,17]]},"reference":[{"key":"2023020113443082900_btw362-B1","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1038\/nmeth0410-248","article-title":"A method and server for predicting damaging missense mutations","volume":"7","author":"Adzhubei","year":"2010","journal-title":"Nat. Methods"},{"key":"2023020113443082900_btw362-B2","first-page":"7.20.","article-title":"Predicting functional effect of human missense mutations using PolyPhen-2","volume":"76","author":"Adzhubei","year":"2013","journal-title":"Curr. Protoc. Hum. Genet"},{"key":"2023020113443082900_btw362-B3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2164-14-S3-S4","article-title":"The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations","volume":"14","author":"Al-Numair","year":"2013","journal-title":"BMC Genomics"},{"key":"2023020113443082900_btw362-B4","doi-asserted-by":"crossref","first-page":"918","DOI":"10.1038\/ejhg.2012.283","article-title":"New population-based exome data are questioning the pathogenicity of previously cardiomyopathy-associated genetic variants","volume":"21","author":"Andreasen","year":"2013","journal-title":"Eur. J. Hum. Genet"},{"key":"2023020113443082900_btw362-B5","doi-asserted-by":"crossref","first-page":"2499","DOI":"10.1093\/hmg\/11.20.2499","article-title":"Phenotypic diversity in hypertrophic cardiomyopathy","volume":"11","author":"Arad","year":"2002","journal-title":"Hum. Mol. 
Genet"},{"key":"2023020113443082900_btw362-B6","doi-asserted-by":"crossref","first-page":"W480","DOI":"10.1093\/nar\/gki372","article-title":"nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms","volume":"33","author":"Bao","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B7","doi-asserted-by":"crossref","first-page":"3823","DOI":"10.1093\/nar\/gkm238","article-title":"SNAP: predict effect of non-synonymous polymorphisms on function","volume":"35","author":"Bromberg","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B8","doi-asserted-by":"crossref","first-page":"2397","DOI":"10.1093\/bioinformatics\/btn435","article-title":"SNAP predicts effect of mutations on protein function","volume":"24","author":"Bromberg","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020113443082900_btw362-B9","doi-asserted-by":"crossref","first-page":"1237","DOI":"10.1002\/humu.21047","article-title":"Functional annotations improve the predictive score of human disease-related mutations in proteins","volume":"30","author":"Calabrese","year":"2009","journal-title":"Hum. 
Mutat"},{"key":"2023020113443082900_btw362-B10","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1161\/01.res.0000435859.24609.b3","article-title":"Organization and sequence of human cardiac myosin binding protein C gene (MYBPC3) and identification of mutations predicted to produce truncated proteins in familial hypertrophic cardiomyopathy","volume":"80","author":"Carrier","year":"1997","journal-title":"Circulation Res"},{"key":"2023020113443082900_btw362-B11","doi-asserted-by":"crossref","first-page":"W311","DOI":"10.1093\/nar\/gki404","article-title":"MutDB services: interactive structural analysis of mutation data","volume":"33","author":"Dantzer","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B12","doi-asserted-by":"crossref","first-page":"D222","DOI":"10.1093\/nar\/gkt1223","article-title":"Pfam: the protein families database","volume":"42","author":"Finn","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B13","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1016\/j.ajhg.2011.03.004","article-title":"Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel","volume":"88","author":"Gonz\u00e1lez-P\u00e9rez","year":"2011","journal-title":"Am. J. Hum. Genet"},{"key":"2023020113443082900_btw362-B14","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1093\/eurheartj\/ehu301","article-title":"Atlas of the clinical genetics of human dilated cardiomyopathy","volume":"36","author":"Haas","year":"2014","journal-title":"Eur. 
Heart J"},{"key":"2023020113443082900_btw362-B15","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1136\/hrt.2004.040337","article-title":"New insights into the pathology of inherited cardiomyopathy","volume":"91","author":"Hughes","year":"2005","journal-title":"Heart"},{"key":"2023020113443082900_btw362-B16","doi-asserted-by":"crossref","first-page":"D306","DOI":"10.1093\/nar\/gkr948","article-title":"InterPro in 2011: new developments in the family and domain prediction database","volume":"40","author":"Hunter","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B17","doi-asserted-by":"crossref","first-page":"616","DOI":"10.1002\/humu.20898","article-title":"The SAAPdb web resource: a large-scale structural analysis of mutant proteins","volume":"30","author":"Hurst","year":"2009","journal-title":"Hum. Mutat"},{"key":"2023020113443082900_btw362-B18","doi-asserted-by":"crossref","first-page":"2814","DOI":"10.1093\/bioinformatics\/bti442","article-title":"LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources","volume":"21","author":"Karchin","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020113443082900_btw362-B19","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1038\/ng.2892","article-title":"A general framework for estimating the relative pathogenicity of human genetic variants","volume":"46","author":"Kircher","year":"2014","journal-title":"Nat. 
Genet"},{"key":"2023020113443082900_btw362-B20","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/j.gene.2013.01.056","article-title":"Roadmap to determine the point mutations involved in cardiomyopathy disorder: a Bayesian approach","volume":"519","author":"Kumar","year":"2013","journal-title":"Gene"},{"key":"2023020113443082900_btw362-B21","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1002\/humu.10036","article-title":"G6PDdb, an integrated database of glucose-6-phosphate dehydrogenase (G6PD) mutations","volume":"19","author":"Kwok","year":"2002","journal-title":"Hum. Mutat"},{"key":"2023020113443082900_btw362-B22","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1016\/0022-2836(71)90324-X","article-title":"The interpretation of protein structures: estimation of static accessibility","volume":"55","author":"Lee","year":"1971","journal-title":"J. Mol. Biol"},{"key":"2023020113443082900_btw362-B23","doi-asserted-by":"crossref","first-page":"D302","DOI":"10.1093\/nar\/gkr931","article-title":"SMART 7: Recent updates to the protein domain annotation resource","volume":"40","author":"Letunic","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B24","doi-asserted-by":"crossref","first-page":"2744","DOI":"10.1093\/bioinformatics\/btp528","article-title":"Automated inference of molecular mechanisms of disease from amino acid substitutions","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020113443082900_btw362-B25","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1136\/jmedgenet-2012-101270","article-title":"Genetic complexity in hypertrophic cardiomyopathy revealed by high-throughput sequencing","volume":"50","author":"Lopes","year":"2013","journal-title":"J. Med. 
Genet"},{"key":"2023020113443082900_btw362-B26","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1002\/humu.10032","article-title":"Integrating mutation data and structural analysis of the TP53 tumor-suppressor protein","volume":"19","author":"Martin","year":"2002","journal-title":"Hum. Mutat"},{"key":"2023020113443082900_btw362-B27","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1016\/j.ydbio.2005.11.019","article-title":"Characterization of loss-of-function and gain-of-function Eph receptor tyrosine kinase signaling in C. elegans axon targeting and cell migration","volume":"290","author":"Mohamed","year":"2006","journal-title":"Dev. Biol"},{"key":"2023020113443082900_btw362-B28","doi-asserted-by":"crossref","first-page":"3812","DOI":"10.1093\/nar\/gkg509","article-title":"SIFT: predicting amino acid changes that affect protein function","volume":"31","author":"Ng","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B29","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1161\/CIRCGENETICS.112.963421","article-title":"Cardiac structural and sarcomere genes associated with cardiomyopathy exhibit marked intolerance of genetic variation","volume":"5","author":"Pan","year":"2012","journal-title":"Circ. Cardiovasc. 
Genet"},{"key":"2023020113443082900_btw362-B30","doi-asserted-by":"crossref","first-page":"D527","DOI":"10.1093\/nar\/gki086","article-title":"SNPeffect: a database mapping molecular phenotypic effects of human non-synonymous coding SNPs","volume":"33","author":"Reumers","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B31","doi-asserted-by":"crossref","first-page":"e118","DOI":"10.1093\/nar\/gkr407","article-title":"Predicting the functional impact of protein mutations: application to cancer genomics","volume":"39","author":"Reva","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B32","doi-asserted-by":"crossref","first-page":"2227","DOI":"10.1161\/01.CIR.0000066323.15244.54","article-title":"Hypertrophic cardiomyopathy: Distribution of disease genes, spectrum of mutations, and implications for a molecular diagnosis strategy","volume":"107","author":"Richard","year":"2003","journal-title":"Circulation"},{"key":"2023020113443082900_btw362-B33","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1038\/nmeth0810-575","article-title":"MutationTaster evaluates disease-causing potential of sequence alterations","volume":"7","author":"Schwarz","year":"2010","journal-title":"Nat. Methods"},{"key":"2023020113443082900_btw362-B34","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1002\/humu.22225","article-title":"Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models","volume":"34","author":"Shihab","year":"2013","journal-title":"Hum. Mutat"},{"key":"2023020113443082900_btw362-B35","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.1016\/j.bpj.2014.02.011","article-title":"Hypertrophic and dilated cardiomyopathy: four decades of basic research on muscle lead to potential therapeutic approaches to these devastating genetic diseases","volume":"106","author":"Spudich","year":"2014","journal-title":"Biophys. 
J"},{"key":"2023020113443082900_btw362-B36","doi-asserted-by":"crossref","first-page":"2181","DOI":"10.1093\/bioinformatics\/btr365","article-title":"Kvsnp: accurately predicting the effect of genetic variants in voltage-gated potassium channels","volume":"27","author":"Stead","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020113443082900_btw362-B37","volume-title":"The Human Gene Mutation Database (HGMD) and Its Exploitation in the Fields of Personalized Genomics and Molecular Evolution","author":"Stenson","year":"2002"},{"key":"2023020113443082900_btw362-B38","doi-asserted-by":"crossref","first-page":"D520","DOI":"10.1093\/nar\/gkh104","article-title":"topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association","volume":"32","author":"Stitziel","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B39","doi-asserted-by":"crossref","first-page":"7486","DOI":"10.1093\/nar\/gku469","article-title":"Activities at the Universal Protein Resource (UniProt)","volume":"42","author":"UniProt Consortium","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B40","doi-asserted-by":"crossref","first-page":"W384","DOI":"10.1093\/nar\/gkm232","article-title":"Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways","volume":"35","author":"Uzun","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B41","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1159\/000252808","article-title":"Cardiomyopathy: a systematic review of disease-causing mutations in myosin heavy chain 7 and their phenotypic manifestations","volume":"115","author":"Walsh","year":"2010","journal-title":"Cardiology"},{"key":"2023020113443082900_btw362-B42","volume-title":"Data Mining: Practical Machine Learning Tools and 
Techniques","author":"Witten","year":"2011","edition":"3rd ed"},{"key":"2023020113443082900_btw362-B43","doi-asserted-by":"crossref","first-page":"1179","DOI":"10.1136\/heart.89.10.1179","article-title":"Mutations of the beta myosin heavy chain gene in hypertrophic cardiomyopathy: critical functional sites determine prognosis","volume":"89","author":"Woo","year":"2003","journal-title":"Heart"},{"key":"2023020113443082900_btw362-B44","doi-asserted-by":"crossref","first-page":"W215","DOI":"10.1093\/nar\/gkr363","article-title":"SDM \u2014 a server for predicting effects of mutations on protein stability and malfunction","volume":"39","author":"Worth","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023020113443082900_btw362-B45","doi-asserted-by":"crossref","first-page":"2692","DOI":"10.1016\/j.jmb.2014.04.026","article-title":"SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features","volume":"426","author":"Yates","year":"2014","journal-title":"J. Mol. Biol"},{"key":"2023020113443082900_btw362-B46","doi-asserted-by":"crossref","first-page":"464","DOI":"10.1002\/humu.20021","article-title":"The Swiss-Prot variant page and the ModSNP database: a resource for sequence and structure information on human protein variants","volume":"23","author":"Yip","year":"2004","journal-title":"Hum. 
Mutat"},{"key":"2023020113443082900_btw362-B47","doi-asserted-by":"crossref","first-page":"166166.","DOI":"10.1186\/1471-2105-7-166","article-title":"SNPs3D: candidate gene and SNP selection for association studies","volume":"7","author":"Yue","year":"2006","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/19\/2947\/49020969\/bioinformatics_32_19_2947.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/19\/2947\/49020969\/bioinformatics_32_19_2947.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T23:50:15Z","timestamp":1675295415000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/19\/2947\/2196564"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,6,17]]},"references-count":47,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2016,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw362","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,10,1]]},"published":{"date-parts":[[2016,6,17]]}}}