{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:35Z","timestamp":1772138015679,"version":"3.50.1"},"reference-count":121,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T00:00:00Z","timestamp":1653955200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000268","name":"Biotechnology and Biological Sciences Research Council","doi-asserted-by":"publisher","award":["BB\/S020144\/1"],"award-info":[{"award-number":["BB\/S020144\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000268","name":"Biotechnology and Biological Sciences Research Council","doi-asserted-by":"publisher","award":["BB\/R009597\/1"],"award-info":[{"award-number":["BB\/R009597\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000268","name":"Biotechnology and Biological Sciences Research Council","doi-asserted-by":"publisher","award":["BB\/R014892\/1"],"award-info":[{"award-number":["BB\/R014892\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000268","name":"Biotechnology and Biological Sciences Research Council","doi-asserted-by":"publisher","award":["BB\/S017135\/1"],"award-info":[{"award-number":["BB\/S017135\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["# DBI 1937533"],"award-info":[{"award-number":["# DBI 1937533"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Institute for Protein Design"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,7,18]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Mutations in human proteins lead to diseases. The structure of these proteins can help understand the mechanism of such diseases and develop therapeutics against them. With improved deep learning techniques, such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the absence of structural homologs. We modeled and extracted the domains from 553 disease-associated human proteins without known protein structures or close homologs in the Protein Databank. We noticed that the model quality was higher and the Root mean square deviation (RMSD) lower between AlphaFold and RoseTTAFold models for domains that could be assigned to CATH families as compared to those which could only be assigned to Pfam families of unknown structure or could not be assigned to either. We predicted ligand-binding sites, protein\u2013protein interfaces and conserved residues in these predicted structures. We then explored whether the disease-associated missense mutations were in the proximity of these predicted functional sites, whether they destabilized the protein structure based on ddG calculations or whether they were predicted to be pathogenic. We could explain 80% of these disease-associated mutations based on proximity to functional sites, structural destabilization or pathogenicity. When compared to polymorphisms, a larger percentage of disease-associated missense mutations were buried, closer to predicted functional sites, predicted as destabilizing and pathogenic. Usage of models from the two state-of-the-art techniques provide better confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models which could not be explained based solely on AlphaFold models.<\/jats:p>","DOI":"10.1093\/bib\/bbac187","type":"journal-article","created":{"date-parts":[[2022,4,27]],"date-time":"2022-04-27T15:30:46Z","timestamp":1651073446000},"source":"Crossref","is-referenced-by-count":27,"title":["Characterizing and explaining the impact of disease-associated mutations in proteins without known structures or structural homologs"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3324-5755","authenticated-orcid":false,"given":"Neeladri","family":"Sen","sequence":"first","affiliation":[{"name":"Institute of Structural and Molecular Biology, University College London , London, WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ivan","family":"Anishchenko","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, University of Washington , Seattle, WA 98195, USA"},{"name":"Institute for Protein Design, University of Washington , Seattle, WA 98195, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6568-9035","authenticated-orcid":false,"given":"Nicola","family":"Bordin","sequence":"additional","affiliation":[{"name":"Institute of Structural and Molecular Biology, University College London , London, WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ian","family":"Sillitoe","sequence":"additional","affiliation":[{"name":"Institute of Structural and Molecular Biology, University College London , London, WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sameer","family":"Velankar","sequence":"additional","affiliation":[{"name":"Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus , Hinxton, Cambridge CB10 1SD, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Baker","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, University of Washington , Seattle, WA 98195, USA"},{"name":"Institute for Protein Design, University of Washington , Seattle, WA 98195, USA"},{"name":"Howard Hughes Medical Institute, University of Washington , Seattle, WA 98195, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christine","family":"Orengo","sequence":"additional","affiliation":[{"name":"Institute of Structural and Molecular Biology, University College London , London, WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2022,6,1]]},"reference":[{"key":"2022071905584563200_ref1","doi-asserted-by":"crossref","first-page":"527484","DOI":"10.3389\/fgene.2020.527484","article-title":"Disease-causing mutations and rearrangements in long non-coding RNA gene loci","volume":"11","author":"Aznaourova","year":"2020","journal-title":"Front Genet"},{"key":"2022071905584563200_ref2","doi-asserted-by":"crossref","first-page":"103084","DOI":"10.1016\/j.ebiom.2020.103084","article-title":"Somatic mutation in noncoding regions: the sound of silence","volume":"61","author":"Tan","year":"2020","journal-title":"EBioMedicine"},{"key":"2022071905584563200_ref3","doi-asserted-by":"crossref","first-page":"659","DOI":"10.1097\/MOP.0000000000000283","article-title":"Mutations in the noncoding genome","volume":"27","author":"Scacheri","year":"2015","journal-title":"Curr Opin Pediatr"},{"key":"2022071905584563200_ref4","doi-asserted-by":"crossref","first-page":"500","DOI":"10.1038\/s41568-021-00371-z","article-title":"Non-coding driver mutations in human cancer","volume":"21","author":"Elliott","year":"2021","journal-title":"Nat Rev Cancer"},{"key":"2022071905584563200_ref5","doi-asserted-by":"crossref","first-page":"24572","DOI":"10.1016\/S0021-9258(19)74505-0","article-title":"Effect of mutations at active site residues on the activity of ornithine decarboxylase and its inhibition by active site-directed irreversible inhibitors","volume":"268","author":"Coleman","year":"1993","journal-title":"J Biol Chem"},{"key":"2022071905584563200_ref6","doi-asserted-by":"crossref","first-page":"e0207747","DOI":"10.1371\/journal.pone.0207747","article-title":"Mutation of a serine near the catalytic site of the choline acetyltransferase a gene almost completely abolishes motility of the zebrafish embryo","volume":"13","author":"Joshi","year":"2018","journal-title":"PLOS ONE"},{"key":"2022071905584563200_ref7","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1038\/s41598-018-36391-3","article-title":"Effects of point mutations in the binding pocket of the mouse major urinary protein MUP20 on ligand affinity and specificity","volume":"9","author":"Ricatti","year":"2019","journal-title":"Sci Rep"},{"key":"2022071905584563200_ref8","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1038\/ng0797-307","article-title":"Missense mutations abolishing DNA binding of the osteoblast-specific transcription factor OSF2\/CBFA1 in cleidocranial dysplasia","volume":"16","author":"Lee","year":"1997","journal-title":"Nat Genet"},{"key":"2022071905584563200_ref9","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.pbiomolbio.2016.10.002","article-title":"Mutations at protein-protein interfaces: Small changes over big surfaces have large impacts on human health","volume":"128","author":"Jubb","year":"2017","journal-title":"Prog Biophys Mol Biol"},{"key":"2022071905584563200_ref10","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1038\/s41588-020-00774-y","article-title":"Comprehensive characterization of protein\u2013protein interactions perturbed by disease mutations","volume":"53","author":"Cheng","year":"2021","journal-title":"Nat Genet"},{"key":"2022071905584563200_ref11","doi-asserted-by":"crossref","first-page":"1719","DOI":"10.1038\/s41598-017-19135-7","article-title":"Effects of distal mutations on the structure, dynamics and catalysis of human Monoacylglycerol lipase","volume":"8","author":"Tyukhtenko","year":"2018","journal-title":"Sci Rep"},{"key":"2022071905584563200_ref12","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1126\/science.181.4096.223","article-title":"Principles that govern the folding of protein chains","volume":"181","author":"Anfinsen","year":"1973","journal-title":"Science"},{"key":"2022071905584563200_ref13","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1038\/nrn1007","article-title":"Unfolding the role of protein misfolding in neurodegenerative diseases","volume":"4","author":"Soto","year":"2003","journal-title":"Nat Rev Neurosci"},{"key":"2022071905584563200_ref14","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1111\/bpa.12695","article-title":"Recent advances in the histo-molecular pathology of human prion disease: histo-molecular pathology of human prion disease","volume":"29","author":"Baiardi","year":"2019","journal-title":"Brain Pathol"},{"key":"2022071905584563200_ref15","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1093\/nar\/28.1.352","article-title":"dbSNP: a database of single nucleotide polymorphisms","volume":"28","author":"Smigielski","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref16","doi-asserted-by":"crossref","first-page":"D941","DOI":"10.1093\/nar\/gkz836","article-title":"The International Genome Sample Resource (IGSR) collection of open human genomic variation resources","volume":"48","author":"Fairley","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref17","doi-asserted-by":"crossref","first-page":"D835","DOI":"10.1093\/nar\/gkz972","article-title":"ClinVar: improvements to accessing data","volume":"48","author":"Landrum","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref18","doi-asserted-by":"crossref","first-page":"D777","DOI":"10.1093\/nar\/gkw1121","article-title":"COSMIC: somatic cancer genetics at high-resolution","volume":"45","author":"Forbes","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref19","doi-asserted-by":"crossref","first-page":"D1289","DOI":"10.1093\/nar\/gkaa1033","article-title":"OncoVar: an integrated database and analysis platform for oncogenic driver variants in cancers","volume":"49","author":"Wang","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref20","doi-asserted-by":"crossref","first-page":"806","DOI":"10.1038\/nmeth.4000","article-title":"DoCM: a database of curated mutations in cancer","volume":"13","author":"Ainscough","year":"2016","journal-title":"Nat Methods"},{"key":"2022071905584563200_ref21","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1093\/nar\/27.1.362","article-title":"KinMutBase, a database of human disease-causing protein kinase mutations","volume":"27","author":"Stenberg","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref22","doi-asserted-by":"crossref","first-page":"D901","DOI":"10.1093\/nar\/gkx973","article-title":"ActiveDriverDB: human disease mutations and genome variation in post-translational modification sites of proteins","volume":"46","author":"Krassowski","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref23","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkaa1100","article-title":"UniProt: the universal protein knowledgebase in 2021","volume":"49","author":"The UniProt Consortium","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref24","doi-asserted-by":"crossref","first-page":"D344","DOI":"10.1093\/nar\/gkz853","article-title":"PDBe-KB: a community-driven resource for structural and functional annotations","volume":"48","author":"PDBe-KB consortium","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref25","doi-asserted-by":"crossref","first-page":"166915","DOI":"10.1016\/j.jmb.2021.166915","article-title":"The DBSAV database: predicting deleteriousness of single amino acid variations in the human proteome","volume":"433","author":"Pei","year":"2021","journal-title":"J Mol Biol"},{"key":"2022071905584563200_ref26","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1006\/jmbi.1993.1626","article-title":"Comparative protein modelling by satisfaction of spatial restraints","volume":"234","author":"\u0160ali","year":"1993","journal-title":"J Mol Biol"},{"key":"2022071905584563200_ref27","doi-asserted-by":"crossref","first-page":"5.6.1","DOI":"10.1002\/cpbi.3","article-title":"Comparative protein structure modeling using MODELLER","volume":"54","author":"Webb","year":"2016","journal-title":"Curr Protoc Bioinforma"},{"key":"2022071905584563200_ref28","doi-asserted-by":"crossref","first-page":"W296","DOI":"10.1093\/nar\/gky427","article-title":"SWISS-MODEL: homology modelling of protein structures and complexes","volume":"46","author":"Waterhouse","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref29","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/S0076-6879(04)83004-0","article-title":"Protein structure prediction using Rosetta","volume":"383","author":"Rohl","year":"2004","journal-title":"Methods Enzymol"},{"key":"2022071905584563200_ref30","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1038\/nprot.2010.5","article-title":"I-TASSER: a unified platform for automated protein structure and function prediction","volume":"5","author":"Roy","year":"2010","journal-title":"Nat Protoc"},{"key":"2022071905584563200_ref31","doi-asserted-by":"crossref","first-page":"16856","DOI":"10.1073\/pnas.1821309116","article-title":"Distance-based protein folding powered by deep learning","volume":"116","author":"Xu","year":"2019","journal-title":"Proc Natl Acad Sci"},{"key":"2022071905584563200_ref32","doi-asserted-by":"crossref","first-page":"3977","DOI":"10.1038\/s41467-019-11994-0","article-title":"Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints","volume":"10","author":"Greener","year":"2019","journal-title":"Nat Commun"},{"key":"2022071905584563200_ref33","doi-asserted-by":"crossref","first-page":"9122","DOI":"10.1073\/pnas.1702664114","article-title":"Origins of coevolution between residues distant in protein 3D structures","volume":"114","author":"Anishchenko","year":"2017","journal-title":"Proc Natl Acad Sci"},{"key":"2022071905584563200_ref34","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","article-title":"Improved protein structure prediction using potentials from deep learning","volume":"577","author":"Senior","year":"2020","journal-title":"Nature"},{"key":"2022071905584563200_ref35","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2022071905584563200_ref36","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"2022071905584563200_ref37","doi-asserted-by":"crossref","first-page":"590","DOI":"10.1038\/s41586-021-03828-1","article-title":"Highly accurate protein structure prediction for the human proteome","volume":"596","author":"Tunyasuvunakool","year":"2021","journal-title":"Nature"},{"key":"2022071905584563200_ref38","doi-asserted-by":"crossref","DOI":"10.1101\/2021.09.26.461876","volume-title":"A Structural Biology Community Assessment of AlphaFold 2 Applications","author":"Akdel","year":"2021"},{"key":"2022071905584563200_ref39","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1093\/bfgp\/ely039","article-title":"Research progress in protein posttranslational modification site prediction","volume":"18","author":"He","year":"2019","journal-title":"Brief Funct Genomics"},{"key":"2022071905584563200_ref40","doi-asserted-by":"crossref","DOI":"10.1002\/cpps.62","article-title":"Computational methods for predicting protein-protein interactions using various protein features","volume":"93","author":"Ding","year":"2018","journal-title":"Curr Protoc Protein Sci"},{"key":"2022071905584563200_ref41","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1016\/j.sbi.2021.05.012","article-title":"Computational approaches to predict protein functional families and functional sites","volume":"70","author":"Rauer","year":"2021","journal-title":"Curr Opin Struct Biol"},{"key":"2022071905584563200_ref42","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.sbi.2017.10.002","article-title":"Structure-based prediction of protein allostery","volume":"50","author":"Greener","year":"2018","journal-title":"Curr Opin Struct Biol"},{"key":"2022071905584563200_ref43","doi-asserted-by":"crossref","first-page":"W382","DOI":"10.1093\/nar\/gki387","article-title":"The FoldX web server: an online force field","volume":"33","author":"Schymkowitz","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref44","doi-asserted-by":"crossref","first-page":"5461","DOI":"10.1021\/acs.jctc.9b00538","article-title":"QresFEP: an automated protocol for free energy calculations of protein mutations in Q","volume":"15","author":"Jespers","year":"2019","journal-title":"J Chem Theory Comput"},{"key":"2022071905584563200_ref45","doi-asserted-by":"crossref","first-page":"948","DOI":"10.1016\/j.jmb.2016.12.007","article-title":"Predicting the effect of amino acid single-point mutations on protein stability-large-scale validation of MD-based relative free energy calculations","volume":"429","author":"Steinbrecher","year":"2017","journal-title":"J Mol Biol"},{"key":"2022071905584563200_ref46","doi-asserted-by":"crossref","first-page":"7364","DOI":"10.1002\/anie.201510054","article-title":"Accurate and rigorous prediction of the changes in protein free energies in a large-scale mutation scan","volume":"55","author":"Gapsys","year":"2016","journal-title":"Angew Chem Int Ed Engl"},{"key":"2022071905584563200_ref47","doi-asserted-by":"crossref","first-page":"5918","DOI":"10.1038\/s41467-020-19669-x","article-title":"Inferring the molecular and phenotypic impact of amino acid variants with MutPred2","volume":"11","author":"Pejaver","year":"2020","journal-title":"Nat Commun"},{"key":"2022071905584563200_ref48","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1038\/s41586-021-04043-8","article-title":"Disease variant prediction with deep generative models of evolutionary data","volume":"599","author":"Frazer","year":"2021","journal-title":"Nature"},{"key":"2022071905584563200_ref49","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1038\/ng.2892","article-title":"A general framework for estimating the relative pathogenicity of human genetic variants","volume":"46","author":"Kircher","year":"2014","journal-title":"Nat Genet"},{"key":"2022071905584563200_ref50","doi-asserted-by":"crossref","first-page":"9289","DOI":"10.1021\/bi049334h","article-title":"Homology modeling of the human microsomal glucose 6-phosphate transporter explains the mutations that cause the glycogen storage disease type Ib","volume":"43","author":"Almqvist","year":"2004","journal-title":"Biochemistry"},{"key":"2022071905584563200_ref51","doi-asserted-by":"crossref","first-page":"2197","DOI":"10.1016\/j.jmb.2019.04.009","article-title":"Can predicted protein 3D structures provide reliable insights into whether missense variants are disease associated?","volume":"431","author":"Ittisoponpisan","year":"2019","journal-title":"J Mol Biol"},{"key":"2022071905584563200_ref52","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref53","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1002\/pro.3746","article-title":"VarSite: disease variants and protein structure","volume":"29","author":"Laskowski","year":"2020","journal-title":"Protein Sci Publ Protein Soc"},{"key":"2022071905584563200_ref54","doi-asserted-by":"crossref","first-page":"1551","DOI":"10.1038\/nprot.2013.092","article-title":"Large-scale gene function analysis with the PANTHER classification system","volume":"8","author":"Mi","year":"2013","journal-title":"Nat Protoc"},{"key":"2022071905584563200_ref55","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1002\/pro.4218","article-title":"PANTHER: making genome-scale phylogenetics accessible to all","volume":"31","author":"Thomas","year":"2022","journal-title":"Protein Sci Publ Protein Soc"},{"key":"2022071905584563200_ref56","doi-asserted-by":"crossref","first-page":"D394","DOI":"10.1093\/nar\/gkaa1106","article-title":"PANTHER version 16: a revised family classification, tree-based classification tool, enhancer regions and extensive API","volume":"49","author":"Mi","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref57","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","article-title":"CATH\u2014a hierarchic classification of protein domain structures","volume":"5","author":"Orengo","year":"1997","journal-title":"Structure"},{"key":"2022071905584563200_ref58","doi-asserted-by":"crossref","first-page":"D266","DOI":"10.1093\/nar\/gkaa1079","article-title":"CATH: increased structural coverage of functional space","volume":"49","author":"Sillitoe","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref59","doi-asserted-by":"crossref","first-page":"D427","DOI":"10.1093\/nar\/gky995","article-title":"The Pfam protein families database in 2019","volume":"47","author":"El-Gebali","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref60","doi-asserted-by":"crossref","first-page":"869","DOI":"10.1016\/j.str.2009.03.015","article-title":"PSI-2: Structural genomics to cover protein domain family space","volume":"17","author":"Dessailly","year":"2009","journal-title":"Structure"},{"key":"2022071905584563200_ref61","doi-asserted-by":"crossref","first-page":"3460","DOI":"10.1093\/bioinformatics\/btv398","article-title":"Functional classification of CATH superfamilies: a domain-based approach for protein function annotation","volume":"31","author":"Das","year":"2015","journal-title":"Bioinformatics"},{"key":"2022071905584563200_ref62","doi-asserted-by":"crossref","first-page":"166788","DOI":"10.1016\/j.jmb.2020.166788","article-title":"A fifth of the protein world: Rossmann-like proteins as an evolutionarily successful structural unit","volume":"433","author":"Medvedev","year":"2021","journal-title":"J Mol Biol"},{"key":"2022071905584563200_ref63","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1093\/protein\/12.7.563","article-title":"The immunoglobulin fold family: sequence analysis and 3D structure comparisons","volume":"12","author":"Halaby","year":"1999","journal-title":"Protein Eng Des Sel"},{"key":"2022071905584563200_ref64","doi-asserted-by":"crossref","DOI":"10.1101\/2022.03.10.483805","article-title":"CATHe: Detection of remote homologues for CATH superfamilies using embeddings from protein language models","volume-title":"bioRxiv","author":"Nallapareddy","year":"2022"},{"key":"2022071905584563200_ref65","article-title":"ProtTrans: towards cracking the language of life\u2019s code through self-supervised deep learning and high performance","volume":"14","author":"Elnaggar","journal-title":"IEEE Trans Pattern analysis and Machine Intelligence;"},{"key":"2022071905584563200_ref66","doi-asserted-by":"crossref","first-page":"2722","DOI":"10.1093\/bioinformatics\/btt473","article-title":"lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests","volume":"29","author":"Mariani","year":"2013","journal-title":"Bioinformatics"},{"key":"2022071905584563200_ref67","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1002\/prot.10146","article-title":"Scoring residue conservation","volume":"48","author":"Valdar","year":"2002","journal-title":"Proteins Struct Funct Genet"},{"key":"2022071905584563200_ref68","doi-asserted-by":"crossref","first-page":"W329","DOI":"10.1093\/nar\/gky384","article-title":"IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding","volume":"46","author":"M\u00e9sz\u00e1ros","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref69","doi-asserted-by":"crossref","first-page":"D1255","DOI":"10.1093\/nar\/gkab1063","article-title":"The human disease ontology 2022 update","volume":"50","author":"Schriml","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref70","doi-asserted-by":"crossref","first-page":"2301","DOI":"10.1016\/j.ajhg.2021.10.007","article-title":"Identification of discriminative gene-level and protein-level features associated with pathogenic gain-of-function and loss-of-function variants","volume":"108","author":"Sevim Bayrak","year":"2021","journal-title":"Am J Hum Genet"},{"key":"2022071905584563200_ref71","doi-asserted-by":"crossref","first-page":"1197","DOI":"10.1007\/s00439-020-02199-3","article-title":"The Human Gene Mutation Database (HGMD\u00ae): optimizing its use in a clinical diagnostic or research setting","volume":"139","author":"Stenson","year":"2020","journal-title":"Hum Genet"},{"key":"2022071905584563200_ref72","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1186\/s13059-019-1845-6","article-title":"MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect","volume":"20","author":"Esposito","year":"2019","journal-title":"Genome Biol"},{"key":"2022071905584563200_ref73","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1111\/j.1365-2958.2010.07231.x","article-title":"Gain-of-function mutations cluster in distinct regions associated with the signalling pathway in the PAS domain of the aerotaxis receptor, Aer: Signalling in the Aer-PAS domain","volume":"77","author":"Campbell","year":"2010","journal-title":"Mol Microbiol"},{"key":"2022071905584563200_ref74","doi-asserted-by":"crossref","first-page":"E5486","DOI":"10.1073\/pnas.1516373112","article-title":"Comprehensive assessment of cancer missense mutation clustering in protein structures","volume":"112","author":"Kamburov","year":"2015","journal-title":"Proc Natl Acad Sci"},{"key":"2022071905584563200_ref75","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1002\/humu.22963","article-title":"mutation3D: cancer gene prediction through atomic clustering of coding variants in the structural proteome","volume":"37","author":"Meyer","year":"2016","journal-title":"Hum Mutat"},{"key":"2022071905584563200_ref76","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1186\/1471-2105-8-211","article-title":"Composition profiler: a tool for discovery and visualization of amino acid composition differences","volume":"8","author":"Vacic","year":"2007","journal-title":"BMC Bioinform"},{"key":"2022071905584563200_ref77","doi-asserted-by":"crossref","first-page":"1362","DOI":"10.1016\/j.str.2015.03.028","article-title":"Insights into disease-associated mutations in the human proteome through protein structural analysis","volume":"23","author":"Gao","year":"2015","journal-title":"Structure"},{"key":"2022071905584563200_ref78","doi-asserted-by":"crossref","first-page":"3246","DOI":"10.1021\/acs.jcim.0c00104","article-title":"GalaxySagittarius: structure- and similarity-based prediction of protein targets for druglike compounds","volume":"60","author":"Yang","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2022071905584563200_ref79","doi-asserted-by":"crossref","first-page":"105495","DOI":"10.1016\/j.ejps.2020.105495","article-title":"Structure-based drug repositioning over the human TMPRSS2 protease domain: search for chemical probes able to repress SARS-CoV-2 Spike protein cleavages","volume":"153","author":"Singh","year":"2020","journal-title":"Eur J Pharm Sci"},{"key":"2022071905584563200_ref80","doi-asserted-by":"crossref","first-page":"3516","DOI":"10.1016\/j.febslet.2015.10.003","article-title":"Computational prediction of protein interfaces: a review of data driven methods","volume":"589","author":"Xue","year":"2015","journal-title":"FEBS Lett"},{"key":"2022071905584563200_ref81","doi-asserted-by":"crossref","first-page":"631297","DOI":"10.3389\/fmicb.2021.631297","article-title":"The archaeal elongation factor EF-2 induces the release of aIF6 from 50S ribosomal subunit","volume":"12","author":"Lo Gullo","year":"2021","journal-title":"Front Microbiol"},{"key":"2022071905584563200_ref82","doi-asserted-by":"crossref","first-page":"16807","DOI":"10.1038\/s41598-018-34244-7","article-title":"The 2.1 \u00c5 structure of protein F9 and its comparison to L1, two components of the conserved poxvirus entry-fusion complex","volume":"8","author":"Diesterbeck","year":"2018","journal-title":"Sci Rep"},{"key":"2022071905584563200_ref83","doi-asserted-by":"crossref","first-page":"620554","DOI":"10.3389\/fmolb.2020.620554","article-title":"Influence of disease-causing mutations on protein structural networks","volume":"7","author":"Prabantu","year":"2021","journal-title":"Front Mol Biosci"},{"key":"2022071905584563200_ref84","doi-asserted-by":"crossref","first-page":"W375","DOI":"10.1093\/nar\/gkw383","article-title":"NAPS: network analysis of protein structures","volume":"44","author":"Chakrabarty","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref85","doi-asserted-by":"crossref","first-page":"e1002452","DOI":"10.1371\/journal.pbio.1002452","article-title":"Functional sites induce long-range evolutionary constraints in enzymes","volume":"14","author":"Jack","year":"2016","journal-title":"PLoS Biol"},{"key":"2022071905584563200_ref86","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1002\/pro.3942","article-title":"DynaMut2: assessing changes in stability and flexibility upon single and multiple point missense mutations","volume":"30","author":"Rodrigues","year":"2021","journal-title":"Protein Sci. Publ. Protein Soc."},{"key":"2022071905584563200_ref87","doi-asserted-by":"crossref","first-page":"626363","DOI":"10.3389\/fmolb.2020.626363","article-title":"Solvent accessibility of residues undergoing pathogenic variations in humans: from protein structures to protein sequences","volume":"7","author":"Savojardo","year":"2021","journal-title":"Front Mol Biosci"},{"key":"2022071905584563200_ref88","doi-asserted-by":"crossref","first-page":"28201","DOI":"10.1073\/pnas.2002660117","article-title":"Comprehensive characterization of amino acid positions in protein structures reveals molecular effect of missense variants","volume":"117","author":"Iqbal","year":"2020","journal-title":"Proc Natl Acad Sci"},{"key":"2022071905584563200_ref89","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1214\/aoms\/1177730491","article-title":"On a test of whether one of two random variables is stochastically larger than the other","volume":"18","author":"Mann","year":"1947","journal-title":"Ann Math Stat"},{"key":"2022071905584563200_ref90","doi-asserted-by":"crossref","first-page":"e1009818","DOI":"10.1371\/journal.pcbi.1009818","article-title":"The structural coverage of the human proteome before and after AlphaFold","volume":"18","author":"Porta-Pardo","year":"2022","journal-title":"PLoS Comput Biol"},{"key":"2022071905584563200_ref91","doi-asserted-by":"crossref","first-page":"e0007419","DOI":"10.1371\/journal.pntd.0007419","article-title":"Predicting and designing therapeutics against the Nipah virus","volume":"13","author":"Sen","year":"2019","journal-title":"PLoS Negl Trop Dis"},{"key":"2022071905584563200_ref92","doi-asserted-by":"crossref","first-page":"1529","DOI":"10.1021\/acs.jcim.8b00762","article-title":"Discovering putative protein targets of small molecules: a study of the p53 activator nutlin","volume":"59","author":"Nguyen","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2022071905584563200_ref93","doi-asserted-by":"crossref","first-page":"742","DOI":"10.1093\/bib\/bbaa362","article-title":"The impact of structural bioinformatics tools and resources on SARS-CoV-2 research and therapeutic strategies","volume":"22","author":"Waman","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022071905584563200_ref94","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.pbiomolbio.2017.02.004","article-title":"Depth dependent amino acid substitution matrices and their use in predicting deleterious mutations","volume":"128","author":"Farheen","year":"2017","journal-title":"Prog Biophys Mol Biol"},{"key":"2022071905584563200_ref95","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1007\/978-1-0716-1406-8_3","article-title":"Methods for molecular modelling of protein complexes","volume":"2305","author":"Kanitkar","year":"2021","journal-title":"Struct Proteomics"},{"key":"2022071905584563200_ref96","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1038\/s41598-018-36401-4","article-title":"A CATH domain functional family based approach to identify putative cancer driver genes and driver mutations","volume":"9","author":"Ashford","year":"2019","journal-title":"Sci Rep"},{"key":"2022071905584563200_ref97","doi-asserted-by":"crossref","first-page":"1099","DOI":"10.1093\/bioinformatics\/btaa937","article-title":"CATH functional families predict functional sites in proteins","volume":"37","author":"Das","year":"2021","journal-title":"Bioinformatics"},{"key":"2022071905584563200_ref98","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J Mol Biol"},{"key":"2022071905584563200_ref99","doi-asserted-by":"crossref","first-page":"e121","DOI":"10.1093\/nar\/gkt263","article-title":"Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions","volume":"41","author":"Mistry","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref100","doi-asserted-by":"crossref","first-page":"1766","DOI":"10.1093\/bioinformatics\/bty863","article-title":"cath-resolve-hits: a new tool that resolves domain matches suspiciously quickly","volume":"35","author":"Lewis","year":"2019","journal-title":"Bioinformatics"},{"key":"2022071905584563200_ref101","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1093\/molbev\/mst010","article-title":"MAFFT multiple sequence alignment software version 7: Improvements in Performance and Usability","volume":"30","author":"Katoh","year":"2013","journal-title":"Mol Biol Evol"},{"key":"2022071905584563200_ref102","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1186\/s12859-019-3019-7","article-title":"HH-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"Steinegger","year":"2019","journal-title":"BMC Bioinform"},{"key":"2022071905584563200_ref103","doi-asserted-by":"crossref","first-page":"D170","DOI":"10.1093\/nar\/gkw1081","article-title":"Uniclust databases of clustered and deeply annotated protein sequences and alignments","volume":"45","author":"Mirdita","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref104","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1038\/s41592-019-0437-4","article-title":"Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold","volume":"16","author":"Steinegger","year":"2019","journal-title":"Nat Methods"},{"key":"2022071905584563200_ref105","doi-asserted-by":"crossref","first-page":"2542","DOI":"10.1038\/s41467-018-04964-5","article-title":"Clustering huge protein sequence sets in linear time","volume":"9","author":"Steinegger","year":"2018","journal-title":"Nat Commun"},{"key":"2022071905584563200_ref106","doi-asserted-by":"crossref","first-page":"prot.26194","DOI":"10.1002\/prot.26194","article-title":"Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14","volume":"89","author":"Anishchenko","year":"2021","journal-title":"Proteins Struct Funct Bioinforma"},{"key":"2022071905584563200_ref107","doi-asserted-by":"crossref","first-page":"689","DOI":"10.1093\/bioinformatics\/btq007","article-title":"PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta","volume":"26","author":"Chaudhury","year":"2010","journal-title":"Bioinformatics"},{"key":"2022071905584563200_ref108","doi-asserted-by":"crossref","first-page":"1340","DOI":"10.1038\/s41467-021-21511-x","article-title":"Improved protein structure refinement guided by deep learning based accuracy estimation","volume":"12","author":"Hiranuma","year":"2021","journal-title":"Nat Commun"},{"key":"2022071905584563200_ref109","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1093\/nar\/gkg571","article-title":"LGA: a method for finding 3D similarities in protein structures","volume":"31","author":"Zemla","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2022071905584563200_ref110","doi-asserted-by":"crossref","first-page":"6201","DOI":"10.1021\/acs.jctc.6b00819","article-title":"Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules","volume":"12","author":"Park","year":"2016","journal-title":"J Chem Theory Comput"},{"key":"2022071905584563200_ref111","first-page":"76","article-title":"Enhanced fold recognition using efficient short fragment clustering","volume":"1","author":"Krissinel","year":"2012","journal-title":"J Mol Biochem"},{"key":"2022071905584563200_ref112","doi-asserted-by":"crossref","first-page":"1940","DOI":"10.1002\/prot.26192","article-title":"Assessment of protein model structure accuracy estimation in CASP14: Old and new challenges","volume":"89","author":"Kwon","year":"2021","journal-title":"Proteins Struct Funct Bioinforma"},{"key":"2022071905584563200_ref113","author":"Soni"},{"key":"2022071905584563200_ref114","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.cpc.2004.04.004","article-title":"Improved neighbor list algorithm in molecular simulations using cell decomposition and data sorting method","volume":"161","author":"Yao","year":"2004","journal-title":"Comput Phys Commun"},{"key":"2022071905584563200_ref115","article-title":"Cell list algorithms for nonequilibrium molecular dynamics","author":"Dobson","year":"2014","journal-title":"arXiv:1412.3784"},{"key":"2022071905584563200_ref116","doi-asserted-by":"crossref","first-page":"3739","DOI":"10.1093\/bioinformatics\/btaa207","article-title":"A knowledge-based scoring function to assess quaternary associations of proteins","volume":"36","author":"Dhawanjewar","year":"2020","journal-title":"Bioinformatics"},{"key":"2022071905584563200_ref117","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1186\/s13321-018-0285-8","article-title":"P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure","volume":"10","author":"Kriv\u00e1k","year":"2018","journal-title":"J Chem"},{"key":"2022071905584563200_ref118","doi-asserted-by":"crossref","first-page":"3386","DOI":"10.1093\/bioinformatics\/btm434","article-title":"meta-PPISP: a meta web server for protein-protein interaction site prediction","volume":"23","author":"Qin","year":"2007","journal-title":"Bioinformatics"},{"key":"2022071905584563200_ref119"},{"key":"2022071905584563200_ref120","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1186\/s12859-015-0611-3","article-title":"InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams","volume":"16","author":"Heberle","year":"2015","journal-title":"BMC Bioinform"},{"key":"2022071905584563200_ref121","article-title":"UCSF Chimera\u2013A visualization system for exploratory research and analysis","volume":"25","year":"2004","journal-title":"J. Comput. Chem."}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/advance-article-pdf\/doi\/10.1093\/bib\/bbac187\/45016938\/bbac187.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/advance-article-pdf\/doi\/10.1093\/bib\/bbac187\/45016938\/bbac187.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,19]],"date-time":"2022-07-19T02:00:56Z","timestamp":1658196056000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac187\/6596316"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,1]]},"references-count":121,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,7,18]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac187","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.11.17.468998","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,7,18]]},"published":{"date-parts":[[2022,6,1]]},"article-number":"bbac187"}}