{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,29]],"date-time":"2026-03-29T13:20:30Z","timestamp":1774790430795,"version":"3.50.1"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2020,7,13]],"date-time":"2020-07-13T00:00:00Z","timestamp":1594598400000},"content-version":"vor","delay-in-days":12,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS-1217886"],"award-info":[{"award-number":["IIS-1217886"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CCF-1617192"],"award-info":[{"award-number":["CCF-1617192"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Protein secondary structure prediction is a fundamental precursor to many bioinformatics tasks. Nearly all state-of-the-art tools when computing their secondary structure prediction do not explicitly leverage the vast number of proteins whose structure is known. Leveraging this additional information in a so-called template-based method has the potential to significantly boost prediction accuracy.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Method<\/jats:title>\n                  <jats:p>We present a new hybrid approach to secondary structure prediction that gains the advantages of both template- and non-template-based methods. 
Our core template-based method is an algorithmic approach that uses metric-space nearest neighbor search over a template database of fixed-length amino acid words to determine estimated class-membership probabilities for each residue in the protein. These probabilities are then input to a dynamic programming algorithm that finds a physically valid maximum-likelihood prediction for the entire protein. Our hybrid approach exploits a novel accuracy estimator for our core method, which estimates the unknown true accuracy of its prediction, to discern when to switch between template- and non-template-based methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>On challenging CASP benchmarks, the resulting hybrid approach boosts the state-of-the-art Q8 accuracy by more than 2\u201310%, and Q3 accuracy by more than 1\u20133%, yielding the most accurate method currently available for both 3- and 8-state secondary structure prediction.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>A preliminary implementation in a new tool we call Nnessy is available free for non-commercial use at http:\/\/nnessy.cs.arizona.edu.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa336","type":"journal-article","created":{"date-parts":[[2020,4,30]],"date-time":"2020-04-30T19:14:41Z","timestamp":1588274081000},"page":"i317-i325","source":"Crossref","is-referenced-by-count":11,"title":["Boosting the accuracy of protein secondary structure prediction through nearest neighbor search and method hybridization"],"prefix":"10.1093","volume":"36","author":[{"given":"Spencer","family":"Krieger","sequence":"first","affiliation":[{"name":"Department of Computer Science, The University of Arizona , Tucson, AZ 85721, 
USA"}]},{"given":"John","family":"Kececioglu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, The University of Arizona , Tucson, AZ 85721, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,7,13]]},"reference":[{"key":"2024021913341283600_btaa336-B1","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1002\/prot.20176","article-title":"Accurate prediction of solvent accessibility using neural networks-based regression","volume":"56","author":"Adamczak","year":"2004","journal-title":"Proteins"},{"key":"2024021913341283600_btaa336-B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2024021913341283600_btaa336-B3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2024021913341283600_btaa336-B4","author":"Beygelzimer","year":"2006"},{"key":"2024021913341283600_btaa336-B5","author":"DeBlasio","year":"2017"},{"key":"2024021913341283600_btaa336-B6","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1186\/1471-2105-12-472","article-title":"MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue\u2013residue contacts","volume":"12","author":"Deng","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2024021913341283600_btaa336-B7","doi-asserted-by":"crossref","first-page":"1042","DOI":"10.1126\/science.1219021","article-title":"The protein-folding problem, 50 years 
on","volume":"338","author":"Dill","year":"2012","journal-title":"Science"},{"key":"2024021913341283600_btaa336-B8","doi-asserted-by":"crossref","first-page":"838","DOI":"10.1002\/prot.21298","article-title":"Achieving 80% ten-fold cross-validated accuracy for secondary structure prediction by large-scale training","volume":"66","author":"Dor","year":"2006","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2024021913341283600_btaa336-B9","doi-asserted-by":"crossref","first-page":"W389","DOI":"10.1093\/nar\/gkv332","article-title":"JPred4: a protein secondary structure prediction server","volume":"43","author":"Drozdetskiy","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2024021913341283600_btaa336-B10","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1002\/jcc.21968","article-title":"SPINE X: improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles","volume":"33","author":"Faraggi","year":"2012","journal-title":"J. Comput. Chem"},{"key":"2024021913341283600_btaa336-B11","doi-asserted-by":"crossref","first-page":"W29","DOI":"10.1093\/nar\/gkr367","article-title":"HMMER web server: interactive sequence similarity searching","volume":"39","author":"Finn","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2024021913341283600_btaa336-B12","doi-asserted-by":"crossref","first-page":"11476","DOI":"10.1038\/srep11476","article-title":"Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning","volume":"5","author":"Heffernan","year":"2015","journal-title":"Sci. Rep"},{"key":"2024021913341283600_btaa336-B13","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. 
Biol"},{"key":"2024021913341283600_btaa336-B14","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2024021913341283600_btaa336-B15","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1089\/cmb.2009.0222","article-title":"Aligning protein sequences with predicted secondary structure","volume":"17","author":"Kececioglu","year":"2010","journal-title":"J. Comput. Biol"},{"key":"2024021913341283600_btaa336-B16","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1093\/bioinformatics\/btr611","article-title":"A novel structural position-specific scoring matrix for the prediction of protein secondary structures","volume":"28","author":"Li","year":"2012","journal-title":"Bioinformatics"},{"key":"2024021913341283600_btaa336-B17","doi-asserted-by":"crossref","first-page":"767","DOI":"10.1089\/cmb.2007.0132","article-title":"Multiple sequence alignment based on profile alignment of intermediate sequences","volume":"15","author":"Lu","year":"2008","journal-title":"J. Comput. Biol"},{"key":"2024021913341283600_btaa336-B18","doi-asserted-by":"crossref","first-page":"9856","DOI":"10.1038\/s41598-018-28084-8","article-title":"Protein secondary structure prediction based on data partition and semi-random subspace method","volume":"8","author":"Ma","year":"2018","journal-title":"Sci. 
Rep"},{"key":"2024021913341283600_btaa336-B19","doi-asserted-by":"crossref","first-page":"2056","DOI":"10.1093\/bioinformatics\/btt344","article-title":"Porter, PaleAle 4.0: high-accuracy prediction of protein secondary structure and relative solvent accessibility","volume":"29","author":"Mirabello","year":"2013","journal-title":"Bioinformatics"},{"key":"2024021913341283600_btaa336-B20","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1093\/oxfordjournals.molbev.a003985","article-title":"Estimating amino acid substitution models: a comparison of Dayhoff\u2019s estimator, the resolvent approach and a maximum likelihood method","volume":"19","author":"M\u00fcller","year":"2002","journal-title":"Mol. Biol. Evol"},{"key":"2024021913341283600_btaa336-B21","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1002\/prot.10082","article-title":"Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles","volume":"47","author":"Pollastri","year":"2002","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2024021913341283600_btaa336-B22","doi-asserted-by":"crossref","first-page":"e32235","DOI":"10.1371\/journal.pone.0032235","article-title":"A unified multitask architecture for predicting local protein properties","volume":"7","author":"Qi","year":"2012","journal-title":"PLoS One"},{"key":"2024021913341283600_btaa336-B23","doi-asserted-by":"crossref","first-page":"4275","DOI":"10.1007\/s00894-012-1410-7","article-title":"Fast learning optimized prediction methodology (FLOPRED) for protein secondary structure prediction","volume":"18","author":"Saraswathi","year":"2012","journal-title":"J. Mol. 
Model"},{"key":"2024021913341283600_btaa336-B24","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1109\/TCBB.2014.2343960","article-title":"A deep learning network approach to ab initio protein secondary structure prediction","volume":"12","author":"Spencer","year":"2015","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinf"},{"key":"2024021913341283600_btaa336-B25","doi-asserted-by":"crossref","first-page":"18962","DOI":"10.1038\/srep18962","article-title":"Protein secondary structure prediction using deep convolutional neural fields","volume":"6","author":"Wang","year":"2016","journal-title":"Sci. Rep"},{"key":"2024021913341283600_btaa336-B26","author":"Woerner","year":"2016"},{"key":"2024021913341283600_btaa336-B27","first-page":"482","article-title":"Sixty-five years of the long march in protein secondary structure prediction: the final stretch?","volume":"19","author":"Yang","year":"2016","journal-title":"Brief. Bioinf"},{"key":"2024021913341283600_btaa336-B28","doi-asserted-by":"crossref","first-page":"992","DOI":"10.1021\/ci400647u","article-title":"Context-based features enhance protein secondary structure prediction accuracy","volume":"54","author":"Yaseen","year":"2014","journal-title":"J. Chem. Inf. 
Model"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i317\/56702644\/bioinformatics_36_supplement1_i317.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i317\/56702644\/bioinformatics_36_supplement1_i317.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,19]],"date-time":"2024-02-19T13:43:47Z","timestamp":1708350227000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/Supplement_1\/i317\/5870492"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,1]]},"references-count":28,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2020,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa336","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,7]]},"published":{"date-parts":[[2020,7,1]]}}}