{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,16]],"date-time":"2023-10-16T08:05:59Z","timestamp":1697443559119},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"19","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Summary: One of the challenges in protein secondary structure prediction is to overcome the cross-validated 80% prediction accuracy barrier. Here, we propose a novel approach to surpass this barrier. Instead of using a single algorithm that relies on a limited data set for training, we combine two complementary methods having different strengths: Fragment Database Mining (FDM) and GOR V. FDM harnesses the availability of the known protein structures in the Protein Data Bank and provides highly accurate secondary structure predictions when sequentially similar structural fragments are identified. In contrast, the GOR V algorithm is based on information theory, Bayesian statistics, and PSI-BLAST multiple sequence alignments to predict the secondary structure of residues inside a sliding window along a protein chain. 
A combination of these two different methods benefits from the large number of structures in the PDB and significantly improves the secondary structure prediction accuracy, resulting in Q3 ranging from 67.5 to 93.2%, depending on the availability of highly similar fragments in the Protein Data Bank.<\/jats:p><jats:p>Availability: The CDM server is freely accessible by public users and private institutions at http:\/\/gor.bb.iastate.edu\/cdm<\/jats:p><jats:p>Contact: taner@iastate.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm379","type":"journal-article","created":{"date-parts":[[2007,7,28]],"date-time":"2007-07-28T00:29:50Z","timestamp":1185582590000},"page":"2628-2630","source":"Crossref","is-referenced-by-count":23,"title":["Consensus Data Mining (CDM) Protein Secondary Structure Prediction Server: Combining GOR V and Fragment Database Mining (FDM)"],"prefix":"10.1093","volume":"23","author":[{"given":"Haitao","family":"Cheng","sequence":"first","affiliation":[{"name":"Department of Biochemistry, Biophysics and Molecular Biology, Bioinformatics and Computational Biology Program and L.H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA"}]},{"given":"Taner Z.","family":"Sen","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, Biophysics and Molecular Biology, Bioinformatics and Computational Biology Program and L.H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA"}]},{"given":"Robert L.","family":"Jernigan","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, Biophysics and Molecular Biology, Bioinformatics and Computational Biology Program and L.H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA"}]},{"given":"Andrzej","family":"Kloczkowski","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, Biophysics and Molecular Biology, Bioinformatics and Computational Biology Program and L.H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,7,27]]},"reference":[{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. 
Biol."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"4314","DOI":"10.1016\/j.polymer.2005.02.040","article-title":"Prediction of protein secondary structure by mining structural fragment database","volume":"46","author":"Cheng","year":"2005","journal-title":"Polymer"},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"1242","DOI":"10.1093\/bioinformatics\/17.12.1242","article-title":"EVA: continuous automatic evaluation of protein structure prediction servers","volume":"17","author":"Eyrich","year":"2001","journal-title":"Bioinformatics"},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1007\/978-1-4613-1571-1_10","article-title":"The GOR method for predicting secondary structures in proteins","volume-title":"Prediction of Protein Structure and the Principles of Protein Conformation","author":"Garnier","year":"1989"},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1016\/0022-2836(78)90297-8","article-title":"Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins","volume":"120","author":"Garnier","year":"1978","journal-title":"J. Mol. 
Biol."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1016\/S0076-6879(96)66034-0","article-title":"GOR method for predicting protein secondary structure from amino acid sequence","volume":"266","author":"Garnier","year":"1996","journal-title":"Methods Enzymol."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1016\/0022-2836(87)90292-0","article-title":"Further developments of protein secondary structure prediction using information theory: new parameters and consideration of residue pairs","volume":"198","author":"Gibrat","year":"1987","journal-title":"J. Mol. Biol."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"6195","DOI":"10.1093\/nar\/gkl789","article-title":"Bhageerath: an energy based web enabled computer software suite for limiting the search space of tertiary structures of small globular proteins","volume":"34","author":"Jayaram","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position specific matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"30455","DOI":"10.1074\/jbc.M604615200","article-title":"Distinct structural elements in the first membrane-spanning segment of the epithelial sodium channel","volume":"281","author":"Kashlan","year":"2006","journal-title":"J. Biol. 
Chem."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"1955","DOI":"10.1110\/ps.051479505","article-title":"The effect of long\u2013range interactions on the secondary structure formation of proteins","volume":"14","author":"Kihara","year":"2005","journal-title":"Protein Sci."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1002\/prot.10181","article-title":"Combining the GOR V algorithm with evolutionary information for protein secondary structure prediction from amino acid sequence","volume":"49","author":"Kloczkowski","year":"2002","journal-title":"Proteins"},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"2256","DOI":"10.1107\/S0907444904026460","article-title":"Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions","volume":"60","author":"Krissinel","year":"2004","journal-title":"Acta Crystallogr. D Biol. Crystallogr."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"12105","DOI":"10.1073\/pnas.1831973100","article-title":"Coupled prediction of protein secondary and tertiary structure","volume":"100","author":"Meiler","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1098\/rstb.2005.1810","article-title":"Rigorous performance evaluation in protein structure modelling and implications for computational biology","volume":"361","author":"Moult","year":"2006","journal-title":"Philos. Trans. R. Soc. Lond., B., Biol. Sci."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1016\/S0076-6879(96)66033-9","article-title":"PHD: Predicting one-dimensional protein structure by profile-based neural networks","volume":"266","author":"Rost","year":"1996","journal-title":"Comput. Methods Macromol. 
Sequence Anal."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1006\/jsbi.2001.4336","article-title":"Review: protein secondary structure prediction continues to rise","volume":"134","author":"Rost","year":"2001","journal-title":"J. Struct. Biol."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"2787","DOI":"10.1093\/bioinformatics\/bti408","article-title":"GOR V server for protein secondary structure prediction","volume":"21","author":"Sen","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"2499","DOI":"10.1110\/ps.062125306","article-title":"A Consensus Data Mining secondary structure prediction by combining GOR V and Fragment Database Mining","volume":"15","author":"Sen","year":"2006","journal-title":"Protein Sci."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1006\/jmbi.1997.0959","article-title":"Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions","volume":"268","author":"Simons","year":"1997","journal-title":"J. Mol. Biol."},{"key":"2023041208443431600_","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1128\/JB.01238-06","article-title":"Functional analysis of the carboxy-terminal region of Bacillus subtilis TnrA, a MerR family protein","volume":"189","author":"Wray","year":"2007","journal-title":"J. 
Bacteriol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/19\/2628\/49858051\/bioinformatics_23_19_2628.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/19\/2628\/49858051\/bioinformatics_23_19_2628.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,13]],"date-time":"2023-05-13T16:02:37Z","timestamp":1683993757000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/19\/2628\/185995"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,7,27]]},"references-count":22,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2007,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm379","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,10,1]]},"published":{"date-parts":[[2007,7,27]]}}}