{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T14:53:22Z","timestamp":1761490402166,"version":"3.37.3"},"reference-count":59,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2017,11,9]],"date-time":"2017-11-09T00:00:00Z","timestamp":1510185600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resource. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold cotranslationally.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We have investigated whether a pseudo-greedy search approach, which begins sequentially from one of the termini, can improve the performance and accuracy of de novo protein structure prediction. We observed that our sequential approach converges when fewer than 20\u00a0000 decoys have been produced, fewer than commonly expected. Using our software, SAINT2, we also compared the run time and quality of models produced in a sequential fashion against a standard, non-sequential approach. Sequential prediction produces an individual decoy 1.5\u20132.5 times faster than non-sequential prediction. 
When considering the quality of the best model, sequential prediction led to a better model being produced for 31 out of 41 soluble protein validation cases and for 18 out of 24 transmembrane protein cases. Correct models (TM-Score &gt; 0.5) were produced for 29 of these cases by the sequential mode and for only 22 by the non-sequential mode. Our comparison reveals that a sequential search strategy can be used to drastically reduce computational time of de novo protein structure prediction and improve accuracy.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>Data are available for download from: http:\/\/opig.stats.ox.ac.uk\/resources. SAINT2 is available for download from: https:\/\/github.com\/sauloho\/SAINT2.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx722","type":"journal-article","created":{"date-parts":[[2017,11,8]],"date-time":"2017-11-08T12:12:53Z","timestamp":1510143173000},"page":"1132-1140","source":"Crossref","is-referenced-by-count":10,"title":["Sequential search leads to faster, more efficient fragment-based <i>de novo<\/i> protein structure prediction"],"prefix":"10.1093","volume":"34","author":[{"given":"Saulo H P","family":"de Oliveira","sequence":"first","affiliation":[{"name":"Department of Statistics, University of Oxford, Oxford, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eleanor C","family":"Law","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Oxford, Oxford, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiye","family":"Shi","sequence":"additional","affiliation":[{"name":"Department of Informatics, UCB Pharma, Slough, UK"},{"name":"Division of Physical Biology, Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Charlotte M","family":"Deane","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Oxford, Oxford, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2017,11,9]]},"reference":[{"key":"2023012712572761800_btx722-B1","doi-asserted-by":"crossref","first-page":"136.","DOI":"10.1186\/s12859-015-0576-2","article-title":"Customised fragments libraries for protein structure prediction based on structural class annotations","volume":"16","author":"Abbass","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2023012712572761800_btx722-B2","doi-asserted-by":"crossref","first-page":"1380","DOI":"10.1023\/A:1002800822475","article-title":"Cotranslational folding of proteins","volume":"65","author":"Basharov","year":"2000","journal-title":"Biochemistry (Moscow)"},{"key":"2023012712572761800_btx722-B3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Research"},{"key":"2023012712572761800_btx722-B4","doi-asserted-by":"crossref","first-page":"2791","DOI":"10.1093\/bioinformatics\/btw316","article-title":"UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling","volume":"32","author":"Bhattacharya","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B5","doi-asserted-by":"crossref","first-page":"W406","DOI":"10.1093\/nar\/gkt462","article-title":"CABS-fold: server for the de novo and consensus-based prediction of protein structure","volume":"41","author":"Blaszczyk","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023012712572761800_btx722-B6","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1002\/prot.22123","article-title":"Guiding conformation 
space search with an all-atom energy potential","volume":"73","author":"Brunette","year":"2008","journal-title":"Proteins: Structure, Function, and Bioinformatics"},{"key":"2023012712572761800_btx722-B7","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1002\/prot.24782","article-title":"Optimized distance-dependent atom-pair-based potential DOOP for protein structure prediction","volume":"83","author":"Chae","year":"2015","journal-title":"Proteins: Structure, Function, and Bioinformatics"},{"key":"2023012712572761800_btx722-B8","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1016\/j.asoc.2013.10.029","article-title":"A multiple minima genetic algorithm for protein structure prediction","volume":"15","author":"Custodio","year":"2014","journal-title":"Appl. Soft Comput"},{"key":"2023012712572761800_btx722-B9","doi-asserted-by":"crossref","first-page":"e0123998.","DOI":"10.1371\/journal.pone.0123998","article-title":"Building a better fragment library for de novo protein structure prediction","volume":"10","author":"de Oliveira","year":"2015","journal-title":"PLoS One"},{"key":"2023012712572761800_btx722-B10","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1093\/bioinformatics\/btw618","article-title":"Comparing co-evolution methods and their application to template-free protein structure prediction","volume":"33","author":"de Oliveira","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B11","doi-asserted-by":"crossref","first-page":"1224.","DOI":"10.12688\/f1000research.11543.1","article-title":"Co-evolution techniques are reshaping the way we do structural bioinformatics","volume":"6","author":"de Oliveira","year":"2017","journal-title":"F1000Research"},{"key":"2023012712572761800_btx722-B12","doi-asserted-by":"crossref","first-page":"i142","DOI":"10.1093\/bioinformatics\/btm175","article-title":"Cotranslational protein folding\u2014fact or 
fiction?","volume":"23","author":"Deane","year":"2007","journal-title":"Bioinformatics"},{"issue":"1","key":"2023012712572761800_btx722-B13","doi-asserted-by":"crossref","first-page":"172.","DOI":"10.1186\/1471-2105-11-172","article-title":"Directionality in protein fold prediction","volume":"11","author":"Ellis","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012712572761800_btx722-B14","doi-asserted-by":"crossref","first-page":"1515","DOI":"10.1016\/j.str.2009.09.006","article-title":"Predicting continuous local structure and the effect of its substitution for secondary structure in fragment-free protein structure prediction","volume":"17","author":"Faraggi","year":"2009","journal-title":"Structure"},{"key":"2023012712572761800_btx722-B15","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1002\/jcc.21968","article-title":"Spine x: improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles","volume":"33","author":"Faraggi","year":"2012","journal-title":"J. Comput Chem"},{"key":"2023012712572761800_btx722-B16","doi-asserted-by":"crossref","first-page":"32715","DOI":"10.1074\/jbc.272.52.32715","article-title":"Cotranslational protein folding","volume":"272","author":"Fedorov","year":"1997","journal-title":"J. Biol. Chem"},{"key":"2023012712572761800_btx722-B17","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1162\/EVCO_a_00176","article-title":"Generating, maintaining, and exploiting diversity in a memetic algorithm for protein structure prediction","volume":"24","author":"Garza-Fabre","year":"2016","journal-title":"Evolutionary Comput"},{"key":"2023012712572761800_btx722-B18","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1016\/j.tibs.2009.04.003","article-title":"Cotranslational processing mechanisms: towards a dynamic 3d model","volume":"34","author":"Giglione","year":"2009","journal-title":"Trends Biochem. 
Sci"},{"key":"2023012712572761800_btx722-B19","doi-asserted-by":"crossref","first-page":"1104","DOI":"10.1126\/science.aad0344","article-title":"Cotranslational protein folding on the ribosome monitored in real time","volume":"350","author":"Holtkamp","year":"2015","journal-title":"Science"},{"key":"2023012712572761800_btx722-B20","doi-asserted-by":"crossref","first-page":"7684","DOI":"10.1073\/pnas.1305887110","article-title":"Stepwise protein folding at near amino acid resolution by hydrogen exchange and mass spectrometry","volume":"110","author":"Hu","year":"2013","journal-title":"Proc. Natl. Acad. Sci"},{"key":"2023012712572761800_btx722-B21","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023012712572761800_btx722-B22","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1093\/bioinformatics\/btr638","article-title":"Psicov: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments","volume":"28","author":"Jones","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B23","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1093\/bioinformatics\/btu791","article-title":"Metapsicov: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins","volume":"31","author":"Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B24","doi-asserted-by":"crossref","first-page":"15674","DOI":"10.1073\/pnas.1314045110","article-title":"Assessing the utility of coevolution-based residue\u2013residue contact predictions in a sequence-and structure-rich era","volume":"110","author":"Kamisetty","year":"2013","journal-title":"Proc. Natl. Acad. 
Sci"},{"key":"2023012712572761800_btx722-B25","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1002\/prot.24987","article-title":"Toward a detailed understanding of search trajectories in fragment assembly approaches to protein structure prediction","volume":"84","author":"Kandathil","year":"2016","journal-title":"Proteins: Struct. Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B26","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1002\/prot.24374","article-title":"One contact for every twelve residues allows robust and accurate topology-level protein structure modeling","volume":"82","author":"Kim","year":"2014","journal-title":"Proteins: Struct. Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B27","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1023\/A:1010579111510","article-title":"Cotranslational protein folding","volume":"35","author":"Kolb","year":"2001","journal-title":"Mol. Biol"},{"key":"2023012712572761800_btx722-B28","doi-asserted-by":"crossref","first-page":"e92197.","DOI":"10.1371\/journal.pone.0092197","article-title":"De novo structure prediction of globular proteins aided by sequence variation-derived contacts","volume":"9","author":"Kosciolek","year":"2014","journal-title":"PLoS One"},{"key":"2023012712572761800_btx722-B29","first-page":"349","article-title":"Methods of model accuracy estimation can help selecting the best models from decoy sets: assessment of model accuracy estimations in CASP11","volume":"84(Suppl 1)","author":"Kryshtafovych","year":"2015","journal-title":"Proteins: Struct. Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B30","doi-asserted-by":"crossref","first-page":"2006","DOI":"10.1002\/jcc.24422","article-title":"A critical assessment of hidden markov model sub-optimal sampling strategies applied to the generation of peptide 3D models","volume":"37","author":"Lamiable","year":"2016","journal-title":"J. Comput. 
Chem"},{"key":"2023012712572761800_btx722-B31","doi-asserted-by":"crossref","first-page":"e0154786.","DOI":"10.1371\/journal.pone.0154786","article-title":"Estimation of uncertainties in the Global Distance Test (GDT_TS) for CASP models","volume":"11","author":"Li","year":"2016","journal-title":"PLoS One"},{"key":"2023012712572761800_btx722-B32","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1093\/bioinformatics\/btk023","article-title":"Opm: orientations of proteins in membranes database","volume":"22","author":"Lomize","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B33","doi-asserted-by":"crossref","first-page":"W343","DOI":"10.1093\/nar\/gkv357","article-title":"RBO Aleph: leveraging novel information sources for protein structure prediction","volume":"43","author":"Mabrouk","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012712572761800_btx722-B34","doi-asserted-by":"crossref","first-page":"4741","DOI":"10.1073\/pnas.0501043102","article-title":"Protein folding: the stepwise assembly of foldon units","volume":"102","author":"Maity","year":"2005","journal-title":"Proc. Natl. Acad. 
Sci.,U.S.A"},{"key":"2023012712572761800_btx722-B35","doi-asserted-by":"crossref","first-page":"e28766.","DOI":"10.1371\/journal.pone.0028766","article-title":"Protein 3d structure computed from evolutionary sequence variation","volume":"6","author":"Marks","year":"2011","journal-title":"PLoS One"},{"key":"2023012712572761800_btx722-B36","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/prot.24452","article-title":"Critical assessment of methods of protein structure prediction (casp) - round x","volume":"82","author":"Moult","year":"2014","journal-title":"Proteins: Struct, Funct, Bioinformatics"},{"key":"2023012712572761800_btx722-B37","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"Scop: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J Mol Biol"},{"year":"2014","author":"Olson","key":"2023012712572761800_btx722-B38"},{"key":"2023012712572761800_btx722-B39","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1021\/ct500864r","article-title":"Combined covalent-electrostatic model of hydrogen bonding improves structure prediction with Rosetta","volume":"11","author":"O\u2019Meara","year":"2015","journal-title":"J. Chem. Theor. 
Comput"},{"key":"2023012712572761800_btx722-B40","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1002\/prot.24974","article-title":"Improved de novo structure prediction in CASP11 by incorporating Co-evolution information into rosetta","volume":"84","author":"Ovchinnikov","year":"2015","journal-title":"Proteins: Struct., Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B41","doi-asserted-by":"crossref","first-page":"e09248.","DOI":"10.7554\/eLife.09248","article-title":"Large-scale determination of previously unsolved protein structures using evolutionary information","volume":"4","author":"Ovchinnikov","year":"2015","journal-title":"Elife"},{"key":"2023012712572761800_btx722-B42","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1126\/science.aah4043","article-title":"Protein structure determination using metagenome sequence data","volume":"355","author":"Ovchinnikov","year":"2017","journal-title":"Science"},{"key":"2023012712572761800_btx722-B43","doi-asserted-by":"crossref","first-page":"e1601274.","DOI":"10.1126\/sciadv.1601274","article-title":"Blind protein structure prediction using accelerated free-energy simulations","volume":"2","author":"Perez","year":"2016","journal-title":"Sci. 
Adv"},{"key":"2023012712572761800_btx722-B44","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1126\/science.aab2157","article-title":"The delicate dance of translation and folding","volume":"348","author":"Puglisi","year":"2015","journal-title":"Science"},{"key":"2023012712572761800_btx722-B45","doi-asserted-by":"crossref","first-page":"D290","DOI":"10.1093\/nar\/gkr1065","article-title":"The pfam protein families database","volume":"40","author":"Punta","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023012712572761800_btx722-B46","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1002\/prot.22540","article-title":"Structure prediction for casp8 with all-atom refinement using rosetta","volume":"77","author":"Raman","year":"2009","journal-title":"Proteins: Struct., Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B47","doi-asserted-by":"crossref","first-page":"742","DOI":"10.1002\/biot.201000330","article-title":"Signatures of co-translational folding","volume":"6","author":"Saunders","year":"2011","journal-title":"Biotechnol. 
J"},{"key":"2023012712572761800_btx722-B48","doi-asserted-by":"crossref","first-page":"2240","DOI":"10.1002\/prot.24587","article-title":"Improving fragment quality for de novo structure prediction","volume":"82","author":"Shrestha","year":"2014","journal-title":"Proteins: Struct., Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B49","doi-asserted-by":"crossref","first-page":"e38799.","DOI":"10.1371\/journal.pone.0038799","article-title":"A probabilistic fragment-based protein structure prediction algorithm","volume":"7","author":"Simoncini","year":"2012","journal-title":"PLoS One"},{"key":"2023012712572761800_btx722-B50","doi-asserted-by":"crossref","first-page":"e68954.","DOI":"10.1371\/journal.pone.0068954","article-title":"Efficient sampling in fragment-based protein structure prediction using an estimation of distribution algorithm","volume":"8","author":"Simoncini","year":"2013","journal-title":"PLoS One"},{"key":"2023012712572761800_btx722-B51","doi-asserted-by":"crossref","first-page":"852","DOI":"10.1002\/prot.25244","article-title":"Balancing exploration and exploitation in population-based sampling improves fragment-based de novo protein structure prediction","volume":"85","author":"Simoncini","year":"2017","journal-title":"Proteins: Struct., Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B52","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by HMM-HMM comparison","volume":"21","author":"S\u00f6ding","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B53","doi-asserted-by":"crossref","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","article-title":"Pisces: a protein sequence culling server","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B54","doi-asserted-by":"crossref","first-page":"1715","DOI":"10.1002\/prot.24065","article-title":"Ab 
initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field","volume":"80","author":"Xu","year":"2012","journal-title":"Proteins: Struct., Funct., Bioinformatics"},{"key":"2023012712572761800_btx722-B55","doi-asserted-by":"crossref","first-page":"889","DOI":"10.1093\/bioinformatics\/btq066","article-title":"How significant is a protein structure similarity with tm-score= 0.5?","volume":"26","author":"Xu","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012712572761800_btx722-B56","doi-asserted-by":"crossref","first-page":"W174","DOI":"10.1093\/nar\/gkv342","article-title":"I-TASSER server: new development for protein structure and function predictions","volume":"43","author":"Yang","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012712572761800_btx722-B57","doi-asserted-by":"crossref","first-page":"1010","DOI":"10.1002\/prot.20817","article-title":"Multipass membrane protein structure prediction using rosetta","volume":"62","author":"Yarov-Yarovoy","year":"2005","journal-title":"Proteins: Struct., Funct., Bioinformatics"},{"year":"2016","author":"Zhang","key":"2023012712572761800_btx722-B58"},{"key":"2023012712572761800_btx722-B59","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1002\/prot.20264","article-title":"Scoring function for automated assessment of protein structure template quality","volume":"57","author":"Zhang","year":"2004","journal-title":"Proteins: Struct., Funct., 
Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/7\/1132\/48914519\/bioinformatics_34_7_1132.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/7\/1132\/48914519\/bioinformatics_34_7_1132.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,28]],"date-time":"2023-08-28T05:06:08Z","timestamp":1693199168000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/7\/1132\/4609351"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2017,11,9]]},"references-count":59,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2018,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx722","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2018,4,1]]},"published":{"date-parts":[[2017,11,9]]}}}