{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T04:30:05Z","timestamp":1770525005687,"version":"3.49.0"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"14","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Generation of structural models and recognition of homologous relationships for unannotated protein sequences are fundamental problems in bioinformatics. Improving the sensitivity and selectivity of methods designed for these two tasks therefore has downstream benefits for many other bioinformatics applications.<\/jats:p>\n               <jats:p>Results: We describe the latest implementation of the GenTHREADER method for structure prediction on a genomic scale. The method combines profile\u2013profile alignments with secondary-structure specific gap-penalties, classic pair- and solvation potentials using a linear combination optimized with a regression SVM model. We find this combination significantly improves both detection of useful templates and accuracy of sequence-structure alignments relative to other competitive approaches. We further present a second implementation of the protocol designed for the task of discriminating superfamilies from one another. 
This method, pDomTHREADER, is the first to incorporate both sequence and structural data directly in this task and improves sensitivity and selectivity over the standard version of pGenTHREADER and three other standard methods for remote homology detection.<\/jats:p>\n               <jats:p>Contact: \u00a0d.jones@cs.ucl.ac.uk<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp302","type":"journal-article","created":{"date-parts":[[2009,5,9]],"date-time":"2009-05-09T00:15:02Z","timestamp":1241828102000},"page":"1761-1767","source":"Crossref","is-referenced-by-count":254,"title":["pGenTHREADER and pDomTHREADER: new methods for improved protein fold recognition and superfamily discrimination"],"prefix":"10.1093","volume":"25","author":[{"given":"Anna","family":"Lobley","sequence":"first","affiliation":[{"name":"1 Department of Computer Science, University College London, London WC1E 6BT and 2 Division of Mathematical Biology, National Institute of Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK"}]},{"given":"Michael I.","family":"Sadowski","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University College London, London WC1E 6BT and 2 Division of Mathematical Biology, National Institute of Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK"}]},{"given":"David T.","family":"Jones","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University College London, London WC1E 6BT and 2 Division of Mathematical Biology, National Institute of Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK"}]}],"member":"286","published-online":{"date-parts":[[2009,5,7]]},"reference":[{"key":"2023013112045688800_B1","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1006\/jmbi.1997.1287","article-title":"Do aligned sequences share the same 
fold?","volume":"273","author":"Abagyan","year":"1997","journal-title":"J. Mol. Biol."},{"key":"2023013112045688800_B2","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol"},{"key":"2023013112045688800_B3","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023013112045688800_B4","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.str.2006.11.009","article-title":"The generation of new protein functions by the combination of domains","volume":"15","author":"Bashton","year":"2003","journal-title":"Structure"},{"key":"2023013112045688800_B5","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference clusters","volume":"23","author":"Baris","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013112045688800_B6","doi-asserted-by":"crossref","first-page":"D189","DOI":"10.1093\/nar\/gkh034","article-title":"The ASTRAL compendium in 2004","volume":"32","author":"Chandonia","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023013112045688800_B7","doi-asserted-by":"crossref","first-page":"1265","DOI":"10.1016\/j.jmb.2007.12.076","article-title":"Discrimination between distant homologs and structural analogs: lessons from manually constructed, reliable data sets","volume":"377","author":"Cheng","year":"2008","journal-title":"J. Mol. 
Biol."},{"key":"2023013112045688800_B8","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1006\/jsbi.2001.4335","article-title":"Fold change in evolution of protein structures","volume":"134","author":"Grishin","year":"2001","journal-title":"J. Struct. Biol."},{"key":"2023013112045688800_B9","doi-asserted-by":"crossref","first-page":"909","DOI":"10.1016\/S0022-2836(02)00992-0","article-title":"Quantifying the similarities within fold space","volume":"323","author":"Harrison","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023013112045688800_B10","doi-asserted-by":"crossref","first-page":"1632","DOI":"10.1101\/gr.183801","article-title":"Annotation transfer for genomics: measuring functional divergence in multi-domain proteins","volume":"11","author":"Hegyi","year":"2001","journal-title":"Genome Res"},{"key":"2023013112045688800_B11","doi-asserted-by":"crossref","first-page":"1702","DOI":"10.1110\/ps.4820102","article-title":"In search for more accurate alignments in the twilight zone","volume":"11","author":"Jaroszewski","year":"2002","journal-title":"Protein Sci."},{"key":"2023013112045688800_B12","doi-asserted-by":"crossref","first-page":"797","DOI":"10.1006\/jmbi.1999.2583","article-title":"GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences","volume":"287","author":"Jones","year":"1999","journal-title":"J. Mol. Biol."},{"key":"2023013112045688800_B13","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol."},{"key":"2023013112045688800_B14","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1002\/prot.1171","article-title":"Predicting novel protein folds by using FRAGFOLD","volume":"45","author":"Jones","year":"2001","journal-title":"Proteins Struct. Func. 
Bioinf"},{"key":"2023013112045688800_B15","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/S0968-0004(01)02039-4","article-title":"Getting the most from PSI-BLAST","volume":"3","author":"Jones","year":"2002","journal-title":"Trends Biochem. Sci"},{"key":"2023013112045688800_B16","doi-asserted-by":"crossref","first-page":"4321","DOI":"10.1093\/nar\/gkf544","article-title":"A comparison of profile hidden Markov model procedures for remote homology detection","volume":"30","author":"Madera","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023013112045688800_B17","article-title":"PRC \u2013 The Profile Comparer","volume-title":"PhD Thesis","author":"Madera","year":"2006"},{"key":"2023013112045688800_B18","doi-asserted-by":"crossref","first-page":"874","DOI":"10.1093\/bioinformatics\/btg097","article-title":"Improvement of the GenTHREADER method for genomic fold recognition","volume":"19","author":"McGuffin","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013112045688800_B19","doi-asserted-by":"crossref","first-page":"288","DOI":"10.1186\/1471-2105-7-288","article-title":"High throughput profile-profile based fold recognition for the entire Human proteome","volume":"7","author":"McGuffin","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112045688800_B20","doi-asserted-by":"crossref","first-page":"1531","DOI":"10.1093\/bioinformatics\/btg185","article-title":"Probabilistic scoring measures for profile-profile comparison yield more accurate short seed alignments","volume":"19","author":"Mittelman","year":"2003","journal-title":"Bioinformatics"},{"issue":"Suppl. 
8","key":"2023013112045688800_B21","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1002\/prot.21767","article-title":"Critical assessment of methods of protein structure prediction-Round VII","volume":"69","author":"Moult","year":"2007","journal-title":"Proteins"},{"key":"2023013112045688800_B22","doi-asserted-by":"crossref","first-page":"1257","DOI":"10.1006\/jmbi.1999.3233","article-title":"Benchmarking PSI-BLAST in genome annotation","volume":"293","author":"Muller","year":"1999","journal-title":"J. Mol. Biol."},{"key":"2023013112045688800_B23","doi-asserted-by":"crossref","first-page":"867","DOI":"10.1146\/annurev.biochem.74.082803.133029","article-title":"Protein families and their evolution: a structural perspective","volume":"74","author":"Orengo","year":"2005","journal-title":"Ann. Rev. Biochem."},{"key":"2023013112045688800_B24","doi-asserted-by":"crossref","first-page":"683","DOI":"10.1093\/nar\/gkg154","article-title":"Finding weak similarities between proteins by sequence profile comparison","volume":"31","author":"Panchenko","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023013112045688800_B25","first-page":"61","article-title":"Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods","volume-title":"Advances in Large Margin Classifiers","author":"Platt","year":"1999"},{"issue":"Suppl. 8","key":"2023013112045688800_B26","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1002\/prot.21662","article-title":"Assessment of CASP7 predictions in the high accuracy template-based modeling category","volume":"69","author":"Read","year":"2007","journal-title":"Proteins"},{"key":"2023013112045688800_B27","doi-asserted-by":"crossref","first-page":"394","DOI":"10.1016\/j.sbi.2008.05.007","article-title":"Exploring the structure and function paradigm","volume":"18","author":"Redfern","year":"2008","journal-title":"Curr. Opin. Struct. 
Biol."},{"key":"2023013112045688800_B28","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1016\/j.jmb.2006.05.035","article-title":"Structural diversity of domain superfamilies in the CATH Database","volume":"360","author":"Reeves","year":"2006","journal-title":"J. Mol. Biol"},{"key":"2023013112045688800_B29","doi-asserted-by":"crossref","first-page":"2353","DOI":"10.1093\/bioinformatics\/btm355","article-title":"Methods of remote homology detection can be combined to increase coverage by 10% in the midnight zone","volume":"23","author":"Reid","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013112045688800_B30","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/S0076-6879(04)83004-0","article-title":"Protein structure prediction using Rosetta","volume":"383","author":"Rohl","year":"2004","journal-title":"Meth. Enzymol."},{"key":"2023013112045688800_B31","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1110\/ps.9.2.232","article-title":"Comparison of sequence profiles. Strategies for structural predictions using sequence information","volume":"9","author":"Rychlewski","year":"2000","journal-title":"Protein Sci"},{"key":"2023013112045688800_B32","doi-asserted-by":"crossref","first-page":"240","DOI":"10.1110\/ps.04888805","article-title":"LiveBench-8: the large-scale, continuous assessment of automated protein structure prediction","volume":"14","author":"Rychlewski","year":"2005","journal-title":"Protein. 
Sci."},{"key":"2023013112045688800_B33","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1002\/prot.21531","article-title":"Benchmarking template selection and model quality assessment for high-resolution comparative modeling","volume":"69","author":"Sadowski","year":"2007","journal-title":"Proteins"},{"key":"2023013112045688800_B34","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1006\/jmbi.1993.1626","article-title":"Comparative protein modeling by satisfaction of spatial restraints","volume":"234","author":"Sali","year":"1993","journal-title":"J. Mol. Biol."},{"key":"2023013112045688800_B35","doi-asserted-by":"crossref","first-page":"3381","DOI":"10.1093\/nar\/gkg520","article-title":"SWISS-MODEL: an automated protein homology-modeling server","volume":"31","author":"Schwede","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023013112045688800_B36","volume-title":"Handbook of Parametric and Nonparametric Statistics","author":"Sheskin","year":"1998","edition":"3rd"},{"key":"2023013112045688800_B37","doi-asserted-by":"crossref","first-page":"776","DOI":"10.1093\/bioinformatics\/16.9.776","article-title":"MaxSub: an automated measure for the assessment of protein structure prediction quality","volume":"16","author":"Siew","year":"2000","journal-title":"Bioinformatics"},{"key":"2023013112045688800_B38","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by HMM-HMM comparison","volume":"21","author":"Soding","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013112045688800_B39","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference 
clusters","volume":"23","author":"Suzek","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013112045688800_B40","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1002\/prot.22186","article-title":"Information theory provides a comprehensive framework for the evaluation of protein structure predictions","volume":"74","author":"Swanson","year":"2009","journal-title":"Proteins"},{"key":"2023013112045688800_B41","doi-asserted-by":"crossref","first-page":"1257","DOI":"10.1006\/jmbi.2001.5293","article-title":"Within the twilight zone: a sensitive profile-profile comparison tool based on information theory","volume":"315","author":"Yona","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023013112045688800_B42","doi-asserted-by":"crossref","first-page":"e2325","DOI":"10.1371\/journal.pone.0002325","article-title":"SP5: improving protein fold recognition by using torsion angle profiles and profile-based gap penalty model","volume":"3","author":"Zhang","year":"2008","journal-title":"PLoS ONE"},{"key":"2023013112045688800_B43","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1002\/prot.20264","article-title":"Scoring function for automated assessment of protein structure template quality","volume":"57","author":"Zhang","year":"2004","journal-title":"Proteins"},{"key":"2023013112045688800_B44","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1002\/prot.21702","article-title":"Template-based modeling and free modeling by I-TASSER in CASP7","volume":"S8","author":"Zhang","year":"2007","journal-title":"Proteins"},{"key":"2023013112045688800_B45","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1002\/prot.21649","article-title":"Analysis of TASSER-based CASP7 protein structure prediction 
results","volume":"S8","author":"Zhou","year":"2007","journal-title":"Proteins"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/14\/1761\/48992824\/bioinformatics_25_14_1761.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/14\/1761\/48992824\/bioinformatics_25_14_1761.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T21:19:11Z","timestamp":1675199951000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/14\/1761\/224443"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,5,7]]},"references-count":45,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2009,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp302","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,7,15]]},"published":{"date-parts":[[2009,5,7]]}}}