{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:45:26Z","timestamp":1740185126483,"version":"3.37.3"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2021,9,20]],"date-time":"2021-09-20T00:00:00Z","timestamp":1632096000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Kakenhi","doi-asserted-by":"publisher","award":["18H02395"],"award-info":[{"award-number":["18H02395"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>An accurate estimation of the quality of protein model structures typifies as a cornerstone in protein structure prediction regimes. Despite the recent groundbreaking success in the field of protein structure prediction, there are certain prospects for the improvement in model quality estimation at multiple stages of protein structure prediction and thus, to further push the prediction accuracy. Here, a novel approach, named ProFitFun, for assessing the quality of protein models is proposed by harnessing the sequence and structural features of experimental protein structures in terms of the preferences of backbone dihedral angles and relative surface accessibility of their amino acid residues at the tripeptide level. The proposed approach leverages upon the backbone dihedral angle and surface accessibility preferences of the residues by accounting for its N-terminal and C-terminal neighbors in the protein structure. These preferences are used to evaluate protein structures through a machine learning approach and tested on an extensive dataset of diverse proteins.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>The approach was extensively validated on a large test dataset (n\u2009=\u200925\u00a0005) of protein structures, comprising 23\u00a0661 models of 82 non-homologous proteins and 1344 non-homologous experimental structures. In addition, an external dataset of 40\u00a0000 models of 200 non-homologous proteins was also used for the validation of the proposed method. Both datasets were further used for benchmarking the proposed method with four different state-of-the-art methods for protein structure quality assessment. In the benchmarking, the proposed method outperformed some state-of-the-art methods in terms of Spearman\u2019s and Pearson\u2019s correlation coefficients, average GDT-TS loss, sum of z-scores and average absolute difference of predictions over corresponding observed values. The high accuracy of the proposed approach promises a potential use of the sequence and structural features in computational protein design.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>http:\/\/github.com\/KYZ-LSB\/ProTerS-FitFun.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab666","type":"journal-article","created":{"date-parts":[[2021,9,16]],"date-time":"2021-09-16T11:17:30Z","timestamp":1631791050000},"page":"369-376","source":"Crossref","is-referenced-by-count":9,"title":["ProFitFun: a protein tertiary structure fitness function for quantifying the accuracies of model structures"],"prefix":"10.1093","volume":"38","author":[{"given":"Rahul","family":"Kaushik","sequence":"first","affiliation":[{"name":"Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN , Yokohama, Kanagawa 230-0045, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9282-8045","authenticated-orcid":false,"given":"Kam Y J","family":"Zhang","sequence":"additional","affiliation":[{"name":"Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN , Yokohama, Kanagawa 230-0045, Japan"}]}],"member":"286","published-online":{"date-parts":[[2021,9,20]]},"reference":[{"key":"2023020108442954000_btab666-B1","doi-asserted-by":"crossref","first-page":"4862","DOI":"10.1093\/bioinformatics\/btz422","article-title":"AlphaFold at CASP13","volume":"35","author":"AlQuraishi","year":"2019","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B2","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1093\/bioinformatics\/btaa714","article-title":"GraphQA: protein model quality assessment using graph convolutional networks","volume":"37","author":"Baldassarre","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B3","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1093\/bioinformatics\/btr072","article-title":"Entropy-accelerated exact clustering of protein decoys","volume":"27","author":"Berenger","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B4","doi-asserted-by":"crossref","first-page":"586","DOI":"10.1093\/bioinformatics\/btw694","article-title":"QAcon: single model quality assessment using protein structural and contact information with machine learning techniques","volume":"33","author":"Cao","year":"2017","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B5","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1186\/s12859-016-1405-y","article-title":"DeepQA: improving the estimation of single protein model quality with deep belief networks","volume":"17","author":"Cao","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023020108442954000_btab666-B6","doi-asserted-by":"crossref","first-page":"23990","DOI":"10.1038\/srep23990","article-title":"Protein single-model quality assessment by feature-based probability density functions","volume":"6","author":"Cao","year":"2016","journal-title":"Sci. Rep"},{"key":"2023020108442954000_btab666-B7","doi-asserted-by":"crossref","first-page":"D475","DOI":"10.1093\/nar\/gky1134","article-title":"SCOPe: classification of large macromolecular structures in the structural classification of proteins-extended database","volume":"47","author":"Chandonia","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023020108442954000_btab666-B8","doi-asserted-by":"crossref","first-page":"11136","DOI":"10.1021\/acs.jpcb.5b02999","article-title":"From Ramachandran maps to tertiary structures of proteins","volume":"119","author":"DasGupta","year":"2015","journal-title":"J. Phys. Chem. B"},{"key":"2023020108442954000_btab666-B9","doi-asserted-by":"crossref","first-page":"378","DOI":"10.1093\/bioinformatics\/btv601","article-title":"3DRobot: automated generation of diverse and well-packed protein structure decoys","volume":"32","author":"Deng","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B10","doi-asserted-by":"crossref","first-page":"4046","DOI":"10.1093\/bioinformatics\/bty494","article-title":"Deep convolutional networks for quality assessment of protein folds","volume":"34","author":"Derevyanko","year":"2018","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B11","doi-asserted-by":"crossref","first-page":"W500","DOI":"10.1093\/nar\/gkh429","article-title":"STRIDE: a web server for secondary structure assignment from known atomic coordinates of proteins","volume":"32","author":"Heinig","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023020108442954000_btab666-B12","doi-asserted-by":"crossref","first-page":"1340","DOI":"10.1038\/s41467-021-21511-x","article-title":"Improved protein structure refinement guided by deep learning based accuracy estimation","volume":"12","author":"Hiranuma","year":"2021","journal-title":"Nat. Commun"},{"key":"2023020108442954000_btab666-B13","doi-asserted-by":"crossref","first-page":"2332","DOI":"10.1093\/bioinformatics\/btab118","article-title":"VoroCNN: deep convolutional neural network built on 3D Voronoi tessellation of protein structures","volume":"37","author":"Igashov","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B14","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1186\/s12859-017-1691-z","article-title":"MQAPRank: improved global protein model quality assessment by learning-to-rank","volume":"18","author":"Jing","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2023020108442954000_btab666-B15","doi-asserted-by":"crossref","first-page":"2801","DOI":"10.1093\/bioinformatics\/bty1037","article-title":"Smooth orientation-dependent scoring function for coarse-grained protein quality assessment","volume":"35","author":"Karasikov","year":"2019","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B16","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1021\/acs.biochem.7b01073","article-title":"Where informatics lags chemistry leads","volume":"57","author":"Kaushik","year":"2018","journal-title":"Biochemistry"},{"key":"2023020108442954000_btab666-B17","doi-asserted-by":"crossref","first-page":"1271","DOI":"10.1002\/prot.25900","article-title":"A protein sequence fitness function for identifying natural and nonnatural proteins","volume":"88","author":"Kaushik","year":"2020","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B18","first-page":"1021","article-title":"Recent advances in sequence-based protein structure prediction","volume":"18","author":"Kc","year":"2017","journal-title":"Brief. Bioinf"},{"key":"2023020108442954000_btab666-B19","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1002\/prot.25823","article-title":"Critical assessment of methods of protein structure prediction (CASP)-Round XIII","volume":"87","author":"Kryshtafovych","year":"2019","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B20","doi-asserted-by":"crossref","first-page":"2496","DOI":"10.1093\/bioinformatics\/btx222","article-title":"SVMQA: support-vector-machine-based protein single-model quality assessment","volume":"33","author":"Manavalan","year":"2017","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B21","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1093\/bioinformatics\/btp629","article-title":"Rapid model quality assessment for protein structure predictions using the comparison of multiple models without structural alignments","volume":"26","author":"McGuffin","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B22","doi-asserted-by":"crossref","first-page":"W425","DOI":"10.1093\/nar\/gkab321","article-title":"ModFOLD8: accurate global and local quality estimates for 3D protein models","volume":"49","author":"McGuffin","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2023020108442954000_btab666-B23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/prot.24452","article-title":"Critical assessment of methods of protein structure prediction (CASP)\u2013round x","volume":"82","author":"Moult","year":"2014","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B24","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1002\/prot.25415","article-title":"Critical assessment of methods of protein structure prediction (CASP)-Round XII","volume":"86","author":"Moult","year":"2018","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B25","doi-asserted-by":"crossref","first-page":"1131","DOI":"10.1002\/prot.25278","article-title":"VoroMQA: assessment of protein structure quality using interatomic contact areas","volume":"85","author":"Olechnovi\u010d","year":"2017","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B26","doi-asserted-by":"crossref","first-page":"3313","DOI":"10.1093\/bioinformatics\/btz122","article-title":"Protein model quality assessment using 3D oriented convolutional neural networks","volume":"35","author":"Pag\u00e8s","year":"2019","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B27","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023020108442954000_btab666-B28","doi-asserted-by":"crossref","first-page":"i294","DOI":"10.1093\/bioinformatics\/btq192","article-title":"Low-homology protein threading","volume":"26","author":"Peng","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B29","first-page":"579","article-title":"Multilayer perceptron and neural networks","volume":"8","author":"Popescu","year":"2009","journal-title":"WSEAS Trans. Cir. Syst"},{"key":"2023020108442954000_btab666-B30","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1007\/978-1-4939-3145-3_23","article-title":"Toolbox for protein structure prediction","volume":"1369","author":"Roche","year":"2016","journal-title":"Methods Mol. Biol"},{"key":"2023020108442954000_btab666-B31","doi-asserted-by":"crossref","first-page":"1531","DOI":"10.1016\/j.str.2013.08.007","article-title":"Protein modeling: what happened to the \"protein structure gap\"?","volume":"21","author":"Schwede","year":"2013","journal-title":"Structure"},{"key":"2023020108442954000_btab666-B32","doi-asserted-by":"crossref","DOI":"10.1002\/prot.26232","article-title":"When homologous sequences meet structural decoys: accurate contact prediction by tFold in CASP14","author":"Shen","year":"2021","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B33","doi-asserted-by":"crossref","first-page":"1522","DOI":"10.1107\/S0907444912037961","article-title":"Error-estimation-guided rebuilding of de novo models increases the success rate of ab initio phasing","volume":"68","author":"Shrestha","year":"2012","journal-title":"Acta Crystallogr. Sect. D Biol. Crystallogr"},{"key":"2023020108442954000_btab666-B34","doi-asserted-by":"crossref","first-page":"e68954","DOI":"10.1371\/journal.pone.0068954","article-title":"Efficient sampling in fragment-based protein structure prediction using an estimation of distribution algorithm","volume":"8","author":"Simoncini","year":"2013","journal-title":"PLoS One"},{"key":"2023020108442954000_btab666-B35","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.bbapap.2015.10.004","article-title":"ProTSAV: a protein tertiary structure analysis and validation server","volume":"1864","author":"Singh","year":"2016","journal-title":"Biochim. Biophys. Acta"},{"key":"2023020108442954000_btab666-B36","doi-asserted-by":"crossref","first-page":"40","DOI":"10.3390\/bioengineering8030040","article-title":"P3CMQA: single-model quality assessment using 3DCNN with profile-based features","volume":"8","author":"Takei","year":"2021","journal-title":"Bioengineering (Basel)"},{"key":"2023020108442954000_btab666-B37","doi-asserted-by":"crossref","first-page":"1411","DOI":"10.1093\/bioinformatics\/btv767","article-title":"ProQ2: estimation of model accuracy implemented in Rosetta","volume":"32","author":"Uziela","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B38","doi-asserted-by":"crossref","first-page":"1578","DOI":"10.1093\/bioinformatics\/btw819","article-title":"ProQ3D: improved model quality assessments using deep learning","volume":"33","author":"Uziela","year":"2017","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B39","doi-asserted-by":"crossref","first-page":"e0218149","DOI":"10.1371\/journal.pone.0218149","article-title":"RFQAmodel: random forest quality assessment to identify a predicted protein structure in the correct fold","volume":"14","author":"West","year":"2019","journal-title":"PLoS One"},{"key":"2023020108442954000_btab666-B40","doi-asserted-by":"crossref","first-page":"1351","DOI":"10.1002\/prot.25804","article-title":"Assessment of protein model structure accuracy estimation in CASP13: challenges in the era of deep learning","volume":"87","author":"Won","year":"2019","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B41","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1002\/prot.24179","article-title":"Toward optimal fragment generations for ab initio protein structure assembly","volume":"81","author":"Xu","year":"2013","journal-title":"Proteins"},{"key":"2023020108442954000_btab666-B42","doi-asserted-by":"crossref","first-page":"889","DOI":"10.1093\/bioinformatics\/btq066","article-title":"How significant is a protein structure similarity with TM-score = 0.5?","volume":"26","author":"Xu","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020108442954000_btab666-B43","doi-asserted-by":"crossref","first-page":"1496","DOI":"10.1073\/pnas.1914677117","article-title":"Improved protein structure prediction using predicted interresidue orientations","volume":"117","author":"Yang","year":"2020","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023020108442954000_btab666-B44","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1093\/nar\/gkg571","article-title":"LGA: a method for finding 3D similarities in protein structures","volume":"31","author":"Zemla","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020108442954000_btab666-B45","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1002\/prot.20264","article-title":"Scoring function for automated assessment of protein structure template quality","volume":"57","author":"Zhang","year":"2004","journal-title":"Proteins"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab666\/40545008\/btab666.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/2\/369\/49007064\/btab666.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/2\/369\/49007064\/btab666.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,8]],"date-time":"2023-11-08T22:01:32Z","timestamp":1699480892000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/2\/369\/6372688"}},"subtitle":[],"editor":[{"given":"Jan","family":"Gorodkin","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,9,20]]},"references-count":45,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,1,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab666","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2022,1,15]]},"published":{"date-parts":[[2021,9,20]]}}}