{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:14:29Z","timestamp":1761894869146},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Correct prediction of residue\u2013residue contacts in proteins that lack good templates with known structure would take ab initio protein structure prediction a large step forward. The lack of correct contacts, and in particular long-range contacts, is considered the main reason why these methods often fail.<\/jats:p>\n               <jats:p>Results: We propose a novel hidden Markov model (HMM)-based method for predicting residue\u2013residue contacts from protein sequences using as training data homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structure). The library consists of recurring structural entities incorporating short-, medium- and long-range interactions and is general enough to reassemble the cores of nearly all proteins in the PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set as well as 151 domains with SCOP folds not present in the training set. Considering the top 0.2 \u00b7 L predictions (L=sequence length), our HMMs obtained an accuracy of 22.8% for long-range interactions in new fold targets, and an average accuracy of 28.6% for long-, medium- and short-range contacts. 
This is a significant performance increase over currently available methods when comparing against results published in the literature.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/predictioncenter.org\/Services\/FragHMMent\/<\/jats:p>\n               <jats:p>Contact: \u00a0torgeir.hvidsten@plantphys.umu.se<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp149","type":"journal-article","created":{"date-parts":[[2009,3,17]],"date-time":"2009-03-17T00:34:13Z","timestamp":1237250053000},"page":"1264-1270","source":"Crossref","is-referenced-by-count":36,"title":["Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue\u2013residue contacts"],"prefix":"10.1093","volume":"25","author":[{"given":"Patrik","family":"Bj\u00f6rkholm","sequence":"first","affiliation":[{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 10691 Stockholm, Sweden, 3Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"},{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 10691 Stockholm, Sweden, 3Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"}]},{"given":"Pawel","family":"Daniluk","sequence":"additional","affiliation":[{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 
10691 Stockholm, Sweden, 3Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"}]},{"given":"Andriy","family":"Kryshtafovych","sequence":"additional","affiliation":[{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 10691 Stockholm, Sweden, 3Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"}]},{"given":"Krzysztof","family":"Fidelis","sequence":"additional","affiliation":[{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 10691 Stockholm, Sweden, 3Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"}]},{"given":"Robin","family":"Andersson","sequence":"additional","affiliation":[{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 10691 Stockholm, Sweden, 3Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"}]},{"given":"Torgeir R.","family":"Hvidsten","sequence":"additional","affiliation":[{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 10691 Stockholm, Sweden, 3Department of 
Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"},{"name":"1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, 2Stockholm Bioinformatics Center, Albanova, Stockholm University, 10691 Stockholm, Sweden, 3Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland, 4UC Davis Genome Centre, UC Davis, USA and 5Ume\u00e5 Plant Science Centre, Department of Plant Physiology, Ume\u00e5 University, Ume\u00e5, Sweden"}]}],"member":"286","published-online":{"date-parts":[[2009,3,16]]},"reference":[{"key":"2023013110285633600_B1","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/S0968-0004(98)01298-5","article-title":"Iterated profile searches with PSI-BLAST\u2013a tool for discovery in protein databases","volume":"23","author":"Altschul","year":"1998","journal-title":"Trends Biochem. 
Sci."},{"key":"2023013110285633600_B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023013110285633600_B3","doi-asserted-by":"crossref","first-page":"D226","DOI":"10.1093\/nar\/gkh039","article-title":"SCOP database in 2004: refinements integrate structure and sequence family data","volume":"32","author":"Andreeva","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023013110285633600_B4","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1093\/nar\/28.1.254","article-title":"The ASTRAL compendium for protein structure and sequence analysis","volume":"28","author":"Brenner","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023013110285633600_B5","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1002\/cbic.200500235","article-title":"Protein-structure prediction by recombination of fragments","volume":"7","author":"Bujnicki","year":"2006","journal-title":"Chembiochem"},{"key":"2023013110285633600_B6","doi-asserted-by":"crossref","first-page":"2585","DOI":"10.1016\/S0031-3203(03)00136-5","article-title":"Efficient leave-one-out cross-validation of kernel fisher discriminant classifiers","volume":"36","author":"Cawley","year":"2003","journal-title":"Pattern Recognit. 
Soc."},{"key":"2023013110285633600_B7","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1186\/1471-2105-8-113","article-title":"Improved residue contact prediction using support vector machines and a large feature set","volume":"8","author":"Cheng","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023013110285633600_B8","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile hidden Markov models","volume":"14","author":"Eddy","year":"1998","journal-title":"Bioinformatics"},{"key":"2023013110285633600_B9","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1002\/prot.21223","article-title":"A pair-to-pair amino acids substitution matrix and its applications for protein structure prediction","volume":"67","author":"Eyal","year":"2007","journal-title":"Proteins"},{"key":"2023013110285633600_B10","doi-asserted-by":"crossref","first-page":"966","DOI":"10.1016\/j.ces.2005.04.009","article-title":"Advances in protein structure prediction and de novo protein design: a review","volume":"61","author":"Floudas","year":"2006","journal-title":"Chem. Eng. Sci."},{"key":"2023013110285633600_B11","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1002\/prot.20933","article-title":"Correlated mutations: advances and limitations. A study on fusion proteins and on the Cohesin-Dockerin families","volume":"63","author":"Halperin","year":"2006","journal-title":"Proteins"},{"key":"2023013110285633600_B12","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1002\/prot.20160","article-title":"Protein contact prediction using patterns of correlation","volume":"56","author":"Hamilton","year":"2004","journal-title":"Proteins"},{"key":"2023013110285633600_B13","first-page":"135","article-title":"Using substitution probabilities to improve position-specific scoring matrices","volume":"12","author":"Henikoff","year":"1996","journal-title":"Comput. Appl. 
Biosci."},{"key":"2023013110285633600_B14","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110285633600_B15","article-title":"Local descriptors of protein structure: a systematical analysis of the sequence-structure relationship in proteins using short- and long-range interactions","author":"Hvidsten","year":"2008","journal-title":"Proteins Struct. Funct. Bioinform."},{"issue":"Suppl. 8","key":"2023013110285633600_B16","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1002\/prot.21637","article-title":"Assessment of intramolecular contact predictions for CASP7","volume":"69","author":"Izarzugaza","year":"2007","journal-title":"Proteins"},{"issue":"Suppl. 8","key":"2023013110285633600_B17","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1002\/prot.21771","article-title":"Assessment of casp7 structure predictions for template free targets","volume":"69","author":"Jauch","year":"2007","journal-title":"Proteins Struct. Funct. 
Bioinform."},{"key":"2023013110285633600_B18","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1186\/1471-2105-7-503","article-title":"Predicting residue contacts using pragmatic correlated mutations method: reducing the false positives","volume":"7","author":"Kundrotas","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013110285633600_B19","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1093\/nar\/30.1.264","article-title":"SCOP database in 2002: refinements accommodate structural genomics","volume":"30","author":"Lo Conte","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023013110285633600_B20","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1093\/bioinformatics\/16.4.404","article-title":"The PSIPRED protein structure prediction server","volume":"16","author":"McGuffin","year":"2000","journal-title":"Bioinformatics"},{"key":"2023013110285633600_B21","doi-asserted-by":"crossref","first-page":"5361","DOI":"10.1073\/pnas.0509355103","article-title":"Physically realistic homology models built with ROSETTA can be more accurate than their templates","volume":"103","author":"Misura","year":"2006","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110285633600_B22","doi-asserted-by":"crossref","first-page":"S25","DOI":"10.1016\/S1359-0278(97)00060-6","article-title":"Improving contact predictions by the combination of correlated mutations and other sources of sequence information","volume":"2","author":"Olmea","year":"1997","journal-title":"Fold. 
Des."},{"key":"2023013110285633600_B23","doi-asserted-by":"crossref","first-page":"D501","DOI":"10.1093\/nar\/gki025","article-title":"NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins","volume":"33","author":"Pruitt","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023013110285633600_B24","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1109\/5.18626","article-title":"A tutorial on hidden Markov models and selected applications in speech recognition","volume":"77","author":"Rabiner","year":"1989","journal-title":"Proc. IEEE"},{"key":"2023013110285633600_B25","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1093\/protein\/12.2.85","article-title":"Twilight zone of protein sequence alignments","volume":"12","author":"Rost","year":"1999","journal-title":"Protein Eng."},{"issue":"Suppl. 8","key":"2023013110285633600_B26","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1002\/prot.21791","article-title":"Contact prediction using mutual information and neural nets","volume":"69","author":"Shackelford","year":"2007","journal-title":"Proteins"},{"key":"2023013110285633600_B27","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1093\/protein\/7.3.349","article-title":"Can three-dimensional contacts in protein structures be predicted by analysis of correlated mutations?","volume":"7","author":"Shindyalov","year":"1994","journal-title":"Protein Eng."},{"key":"2023013110285633600_B28","doi-asserted-by":"crossref","first-page":"502","DOI":"10.1002\/prot.20106","article-title":"Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm","volume":"56","author":"Skolnick","year":"2004","journal-title":"Proteins"},{"key":"2023013110285633600_B29","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1038\/nsb0203-87","article-title":"Of men and machines","volume":"10","author":"Tramontano","year":"2003","journal-title":"Nat. Struct. 
Biol."},{"key":"2023013110285633600_B30","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1109\/TIT.1967.1054010","article-title":"Error bounds for convolutional codes and an asymptotically optimal decoding algorithm","volume":"13","author":"Viterbi","year":"1967","journal-title":"IEEE Trans. Inf. Theory IT"},{"key":"2023013110285633600_B31","doi-asserted-by":"crossref","first-page":"924","DOI":"10.1093\/bioinformatics\/btn069","article-title":"A comprehensive assessment of sequence-based and template-based methods for protein contact prediction","volume":"24","author":"Wu","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013110285633600_B32","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1186\/1471-2105-7-180","article-title":"A two-stage approach for improved prediction of residue contact maps","volume":"7","author":"Vullo","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013110285633600_B33","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1016\/j.sbi.2008.02.004","article-title":"Progress and challenges in protein structure prediction","volume":"18","author":"Zhang","year":"2008","journal-title":"Curr. Opin. Struct. Biol."},{"key":"2023013110285633600_B34","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1016\/S0006-3495(03)74551-2","article-title":"TOUCHSTONE II: a new approach to ab initio protein structure prediction","volume":"85","author":"Zhang","year":"2003","journal-title":"Biophys. 
J."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/10\/1264\/48989965\/bioinformatics_25_10_1264.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/10\/1264\/48989965\/bioinformatics_25_10_1264.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T20:37:25Z","timestamp":1675197445000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/10\/1264\/270218"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,3,16]]},"references-count":34,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2009,5,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp149","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,5,15]]},"published":{"date-parts":[[2009,3,16]]}}}