{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T20:06:33Z","timestamp":1768680393367,"version":"3.49.0"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3015,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The 3D structure of a protein sequence can be assembled from the substructures corresponding to small segments of this sequence. For each small sequence segment, there are only a few more likely substructures. We call them the \u2018structural alphabet\u2019 for this segment. Classical approaches such as ROSETTA used sequence profile and secondary structure information, to predict structural fragments. In contrast, we utilize more structural information, such as solvent accessibility and contact capacity, for finding structural fragments.<\/jats:p><jats:p>Results: Integer linear programming technique is applied to derive the best combination of these sequence and structural information items. This approach generates significantly more accurate and succinct structural alphabets with more than 50% improvement over the previous accuracies. With these novel structural alphabets, we are able to construct more accurate protein structures than the state-of-art ab initio protein structure prediction programs such as ROSETTA. 
We are also able to reduce the Kolodny's library size by a factor of 8, at the same accuracy.<\/jats:p><jats:p>Availability: The online FRazor server is under construction.<\/jats:p><jats:p>Contact: scli@uwaterloo.ca, mli@uwaterloo.ca, j3xu@tti-c.org<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn165","type":"journal-article","created":{"date-parts":[[2008,6,27]],"date-time":"2008-06-27T07:43:13Z","timestamp":1214552593000},"page":"i182-i189","source":"Crossref","is-referenced-by-count":12,"title":["Designing succinct structural alphabets"],"prefix":"10.1093","volume":"24","author":[{"given":"Shuai Cheng","family":"Li","sequence":"first","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada, 2Institute for Computing Technology, Chinese Academy of Sciences, China and 3Toyota Technological Institute at Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dongbo","family":"Bu","sequence":"additional","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada, 2Institute for Computing Technology, Chinese Academy of Sciences, China and 3Toyota Technological Institute at Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xin","family":"Gao","sequence":"additional","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ont. 
N2L 3G1, Canada, 2Institute for Computing Technology, Chinese Academy of Sciences, China and 3Toyota Technological Institute at Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jinbo","family":"Xu","sequence":"additional","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada, 2Institute for Computing Technology, Chinese Academy of Sciences, China and 3Toyota Technological Institute at Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ming","family":"Li","sequence":"additional","affiliation":[{"name":"1 David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada, 2Institute for Computing Technology, Chinese Academy of Sciences, China and 3Toyota Technological Institute at Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2008,7,1]]},"reference":[{"key":"2023020210375129800_B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023020210375129800_B2","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023020210375129800_B3","volume-title":"Pattern Recognition and Machine Learning (Information Science and Statistics)","author":"Bishop","year":"2006"},{"key":"2023020210375129800_B4","doi-asserted-by":"crossref","first-page":"4436","DOI":"10.1073\/pnas.91.10.4436","article-title":"An evolutionary approach to folding small \u03b1-helical proteins that uses sequence information and an empirical guiding fitness 
function","volume":"91","author":"Bowie","year":"1994","journal-title":"Proc. Natl Acad. Sci."},{"key":"2023020210375129800_B5","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1002\/prot.10552","article-title":"Rosetta predictions in CASP5: successes, failures, and prospects for complete automation","volume":"53","author":"Bradley","year":"2003","journal-title":"Proteins Struct. Funct. Genet."},{"key":"2023020210375129800_B6","doi-asserted-by":"crossref","first-page":"381","DOI":"10.3233\/ISB-00141","article-title":"Local backbone structure prediction of proteins","volume":"4","author":"De Brevern","year":"2004","journal-title":"In Silico Biol."},{"key":"2023020210375129800_B7","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1002\/prot.20733","article-title":"Prediction of CASP6 structures using automated Robetta protocols","volume":"61","author":"Chivian","year":"2005","journal-title":"Proteins Struct. Funct. Genet."},{"key":"2023020210375129800_B8","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1093\/protein\/2.5.335","article-title":"Modelling the polypeptide backbone with \u2018spare parts\u2019 from known protein structures","volume":"2","author":"Claessens","year":"1989","journal-title":"Protein Eng."},{"key":"2023020210375129800_B9","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1093\/biomet\/52.3-4.331","article-title":"The convex hull of a random set of points","volume":"52","author":"Efron","year":"1965","journal-title":"Biometrika"},{"key":"2023020210375129800_B10","doi-asserted-by":"crossref","first-page":"953","DOI":"10.1093\/protein\/7.8.953","article-title":"Comparison of systematic search and database methods for constructing segments of protein structure","volume":"7","author":"Fidelis","year":"1994","journal-title":"Protein Eng."},{"key":"2023020210375129800_B11","doi-asserted-by":"crossref","first-page":"e131","DOI":"10.1371\/journal.pcbi.0020131","article-title":"Sampling realistic 
protein conformations using local structural bias","volume":"2","author":"Hamelryck","year":"2006","journal-title":"PLoS Comput. Biol."},{"key":"2023020210375129800_B12","doi-asserted-by":"crossref","first-page":"1177","DOI":"10.1110\/ps.0232903","article-title":"Reducing the computational complexity of protein folding via fragment folding and assembly","volume":"12","author":"Haspel","year":"2003","journal-title":"Protein Sci."},{"key":"2023020210375129800_B13","doi-asserted-by":"crossref","first-page":"1636","DOI":"10.1110\/ps.03494504","article-title":"Some fundamental aspects of building protein structures from fragment libraries","volume":"13","author":"Holmes","year":"2004","journal-title":"Protein Sci."},{"key":"2023020210375129800_B14","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1093\/bioinformatics\/btg1020","article-title":"Protein structure prediction via combinatorial assembly of sub-structural units","volume":"19","author":"Inbar","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020210375129800_B15","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. 
Biol."},{"key":"2023020210375129800_B16","doi-asserted-by":"crossref","first-page":"819","DOI":"10.1002\/j.1460-2075.1986.tb04287.x","article-title":"Using known substructures in protein model building and crystallography","volume":"5","author":"Jones","year":"1986","journal-title":"EMBO J."},{"key":"2023020210375129800_B17","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2023020210375129800_B18","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1093\/protein\/gzg081","article-title":"PROSPECT II: protein structure prediction program for genome-scale applications","volume":"16","author":"Kim","year":"2003","journal-title":"Protein Eng."},{"key":"2023020210375129800_B19","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1016\/S0022-2836(02)00942-7","article-title":"Small libraries of protein fragments model native protein structures accurately","volume":"323","author":"Kolodny","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023020210375129800_B20","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1016\/j.bpc.2004.12.046","article-title":"Protein structure prediction based on fragment assembly and parameter optimization","volume":"115","author":"Lee","year":"2005","journal-title":"Biophys. Chem."},{"key":"2023020210375129800_B21","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1016\/0022-2836(92)90964-L","article-title":"Accurate modeling of protein conformation by automatic segment matching","volume":"226","author":"Levitt","year":"1992","journal-title":"J. Mol. 
Biol."},{"key":"2023020210375129800_B22","article-title":"FALCON: zero in on the native protein structure","volume-title":"Technical Report","author":"Li","year":"2007"},{"key":"2023020210375129800_B23","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1002\/prot.10285","article-title":"Ab initio construction of polypeptide fragments: efficient generation of accurate, representative ensembles","volume":"51","author":"Lovell","year":"2003","journal-title":"Proteins"},{"key":"2023020210375129800_B24","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1002\/prot.20716","article-title":"Critical assessment of methods of protein structure prediction (CASP): round 6","volume":"61","author":"Moult","year":"2005","journal-title":"Proteins Struct. Funct. Genet."},{"key":"2023020210375129800_B25","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1073\/pnas.37.5.251","article-title":"The pleated sheet, a new layer configuration of polypeptide chains","volume":"37","author":"Pauling","year":"1951","journal-title":"Proc. Natl Acad. Sci."},{"key":"2023020210375129800_B26","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/S0076-6879(04)83004-0","article-title":"Protein structure prediction using Rosetta","volume":"383","author":"Rohl","year":"2004","journal-title":"Methods Enzymol."},{"key":"2023020210375129800_B27","doi-asserted-by":"crossref","first-page":"3661","DOI":"10.1073\/pnas.88.9.3661","article-title":"Calculation of protein conformation as an assembly of stable overlapping segments: application to Bovine pancreatic trypsin inhibitor","volume":"88","author":"Simon","year":"1991","journal-title":"Proc. Natl Acad. Sci."},{"key":"2023020210375129800_B28","doi-asserted-by":"crossref","DOI":"10.1006\/jmbi.1997.0959","article-title":"Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions","volume":"268","author":"Simons","year":"1997","journal-title":"J. Mol. 
Biol."},{"key":"2023020210375129800_B29","doi-asserted-by":"crossref","first-page":"4428","DOI":"10.1073\/pnas.0511333103","article-title":"A method for evaluating the structural quality of protein models by using higher-order \u03c6-\u03c8 pairs scoring","volume":"103","author":"Sims","year":"2006","journal-title":"Proc. Natl Acad. Sci."},{"key":"2023020210375129800_B30","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1002\/prot.340170404","article-title":"Recognition of errors in three-dimensional structures of proteins","volume":"17","author":"Sippl","year":"1993","journal-title":"Proteins"},{"key":"2023020210375129800_B31","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1002\/prot.340050410","article-title":"A 3D building blocks approach to analyzing and predicting structure of proteins","volume":"5","author":"Unger","year":"1989","journal-title":"Proteins Struct. Funct. Genet."},{"key":"2023020210375129800_B32","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1016\/0263-7855(92)80066-M","article-title":"Probit: a statistical approach to modeling proteins from partial coordinate data using substructure libraries","volume":"10","author":"Wendoloski","year":"1992","journal-title":"J. Mol. Graph."},{"key":"2023020210375129800_B33","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","article-title":"PISCES: a protein sequence culling server","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020210375129800_B34","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1109\/TCBB.2005.24","article-title":"Fold recognition by predicted alignment accuracy","volume":"2","author":"Xu","year":"2005","journal-title":"IEEE\/ACM Trans. Comput. Biol. 
Bioinform."},{"key":"2023020210375129800_B35","article-title":"Template-based modeling and free modeling by I-TASSER in CASP7","volume-title":"Proteins","author":"Zhang","year":"2007"},{"key":"2023020210375129800_B36","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1002\/prot.20724","article-title":"TASSER: an automated method for the prediction of protein tertiary structures in CASP6","volume":"61","author":"Zhang","year":"2005","journal-title":"Proteins"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/13\/i182\/49053841\/bioinformatics_24_13_i182.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/13\/i182\/49053841\/bioinformatics_24_13_i182.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,30]],"date-time":"2025-01-30T21:40:24Z","timestamp":1738273224000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/13\/i182\/232453"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,7,1]]},"references-count":36,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2008,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn165","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,7,1]]},"published":{"date-parts":[[2008,7,1]]}}}