{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T08:08:19Z","timestamp":1778054899676,"version":"3.51.4"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2006,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Protein topology representations such as residue contact maps are an important intermediate step towards <jats:italic>ab initio<\/jats:italic> prediction of protein structure. Although improvements have occurred over the last years, the problem of accurately predicting residue contact maps from primary sequences is still largely unsolved. Among the reasons for this are the unbalanced nature of the problem (with far fewer examples of contacts than non-contacts), the formidable challenge of capturing long-range interactions in the maps, the intrinsic difficulty of mapping one-dimensional input sequences into two-dimensional output maps.<\/jats:p>\n            <jats:p>In order to alleviate these problems and achieve improved contact map predictions, in this paper we split the task into two stages: the prediction of a map's principal eigenvector (PE) from the primary sequence; the reconstruction of the contact map from the PE and primary sequence. Predicting the PE from the primary sequence consists in mapping a vector into a vector. 
This task is less complex than mapping vectors directly into two-dimensional matrices since the size of the problem is drastically reduced and so is the scale length of interactions that need to be learned.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We develop architectures composed of ensembles of two-layered bidirectional recurrent neural networks to classify the components of the PE in 2, 3 and 4 classes from protein primary sequence, predicted secondary structure, and hydrophobicity interaction scales. Our predictor, tested on a non redundant set of 2171 proteins, achieves classification performances of up to 72.6%, 16% above a base-line statistical predictor.<\/jats:p>\n            <jats:p>We design a system for the prediction of contact maps from the predicted PE. Our results show that predicting maps through the PE yields sizeable gains especially for long-range contacts which are particularly critical for accurate protein 3D reconstruction. The final predictor's accuracy on a non-redundant set of 327 targets is 35.4% and 19.8% for minimum contact separations of 12 and 24, respectively, when the top length\/5 contacts are selected. On the 11 CASP6 Novel Fold targets we achieve similar accuracies (36.5% and 19.7%). This favourably compares with the best automated predictors at CASP6.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>Our final system for contact map prediction achieves state-of-the-art performances, and may provide valuable constraints for improved <jats:italic>ab initio<\/jats:italic> prediction of protein structures. 
A suite of predictors of structural features, including the PE, and PE-based contact maps, is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/distill.ucd.ie\" ext-link-type=\"uri\">http:\/\/distill.ucd.ie<\/jats:ext-link>.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-7-180","type":"journal-article","created":{"date-parts":[[2006,4,6]],"date-time":"2006-04-06T12:52:02Z","timestamp":1144327922000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":67,"title":["A two-stage approach for improved prediction of residue contact maps"],"prefix":"10.1186","volume":"7","author":[{"given":"Alessandro","family":"Vullo","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ian","family":"Walsh","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gianluca","family":"Pollastri","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2006,3,30]]},"reference":[{"key":"919_CR1","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1126\/science.1065659","volume":"294","author":"D Baker","year":"2001","unstructured":"Baker D, Sali A: Protein structure prediction and structural genomics. Science 2001, 294: 93\u201396. 10.1126\/science.1065659","journal-title":"Science"},{"issue":"1","key":"919_CR2","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1093\/protein\/12.1.15","volume":"12","author":"P Fariselli","year":"1999","unstructured":"Fariselli P, Casadio R: A neural network based predictor of residue contacts in proteins. Protein Engineering 1999, 12(1):15\u201321. 
10.1093\/protein\/12.1.15","journal-title":"Protein Engineering"},{"issue":"11","key":"919_CR3","doi-asserted-by":"publisher","first-page":"835","DOI":"10.1093\/protein\/14.11.835","volume":"14","author":"P Fariselli","year":"2001","unstructured":"Fariselli P, Olmea O, Valencia A, Casadio R: Prediction of contact maps with neural networks and correlated mutations. Protein Engineering 2001, 14(11):835\u2013843. 10.1093\/protein\/14.11.835","journal-title":"Protein Engineering"},{"issue":"Suppl 1","key":"919_CR4","doi-asserted-by":"publisher","first-page":"S62","DOI":"10.1093\/bioinformatics\/18.suppl_1.S62","volume":"18","author":"G Pollastri","year":"2002","unstructured":"Pollastri G, Baldi P: Prediction of Contact Maps by Recurrent Neural Network Architectures and Hidden Context Propagation from All Four Cardinal Corners. Bioinformatics 2002, 18(Suppl 1):S62-S70.","journal-title":"Bioinformatics"},{"key":"919_CR5","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1016\/S1359-0278(97)00041-2","volume":"2","author":"M Vendruscolo","year":"1997","unstructured":"Vendruscolo M, Kussell E, Domany E: Recovery of protein structure from contact maps. Folding and Design 1997, 2: 295\u2013306. 10.1016\/S1359-0278(97)00041-2","journal-title":"Folding and Design"},{"key":"919_CR6","doi-asserted-by":"publisher","first-page":"3001","DOI":"10.1021\/jp983429+","volume":"103","author":"D Debe","year":"1999","unstructured":"Debe D, Carlson M, Sadanobu J, Chan S, Goddard W: Protein fold determination from sparse distance restraints: the restrained generic protein direct Monte Carlo method. J Phys Chem 1999, 103: 3001\u20133008.","journal-title":"J Phys Chem"},{"key":"919_CR7","doi-asserted-by":"publisher","first-page":"308","DOI":"10.1006\/jmbi.1995.0436","volume":"251","author":"A Aszodi","year":"1995","unstructured":"Aszodi A, Gradwell M, Taylor W: Global fold determination from a small number of distance restraints. J Mol Biol 1995, 251: 308\u2013326. 
10.1006\/jmbi.1995.0436","journal-title":"J Mol Biol"},{"key":"919_CR8","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1006\/jmbi.1999.2861","volume":"290","author":"E Huang","year":"1999","unstructured":"Huang E, Samudrala R, Ponder J: Ab initio Fold Prediction of Small Helical Proteins Using Distance Geometry and Knowledge-Based Scoring Functions. J Mol Biol 1999, 290: 267\u2013281. 10.1006\/jmbi.1999.2861","journal-title":"J Mol Biol"},{"key":"919_CR9","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1006\/jmbi.1996.0720","volume":"265","author":"J Skolnick","year":"1997","unstructured":"Skolnick J, Kolinski A, Ortiz A: MONSSTER: a method for folding globular proteins with a small number of distance restraints. J Mol Biol 1997, 265: 217\u2013241. 10.1006\/jmbi.1996.0720","journal-title":"J Mol Biol"},{"key":"919_CR10","doi-asserted-by":"publisher","first-page":"311","DOI":"10.1023\/A:1026744431105","volume":"18","author":"P Bowers","year":"2000","unstructured":"Bowers P, Strauss C, Baker D: De novo protein structure determination using sparse NMR data. J Biomol NMR 2000, 18: 311\u2013318. 10.1023\/A:1026744431105","journal-title":"J Biomol NMR"},{"key":"919_CR11","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1002\/prot.10499","volume":"53","author":"W Li","year":"2003","unstructured":"Li W, Zhang Y, Kihara D, Huang Y, Zheng D, Montelione G, Kolinski A, Skolnick J: TOUCHSTONEX: Protein structure prediction with sparse NMR data. Proteins: Structure, Function, and Genetics 2003, 53: 290\u2013306. 10.1002\/prot.10499","journal-title":"Proteins: Structure, Function, and Genetics"},{"issue":"Suppl 1","key":"919_CR12","doi-asserted-by":"publisher","first-page":"224","DOI":"10.1093\/bioinformatics\/bth913","volume":"20","author":"R McCallum","year":"2004","unstructured":"McCallum R: Striped sheets and protein contact prediction. Bioinformatics 2004, 20(Suppl 1):224\u2013231. 
10.1093\/bioinformatics\/bth913","journal-title":"Bioinformatics"},{"issue":"Sep","key":"919_CR13","first-page":"575","volume":"4","author":"P Baldi","year":"2003","unstructured":"Baldi P, Pollastri G: The Principled Design of Large-Scale Recursive Neural Network Architectures \u2013 DAG-RNNs and the Protein Structure Prediction Problem. Journal of Machine Learning Research 2003, 4(Sep):575\u2013602.","journal-title":"Journal of Machine Learning Research"},{"key":"919_CR14","unstructured":"CASP6 Home[http:\/\/predictioncenter.org\/casp6\/Casp6.html]"},{"key":"919_CR15","doi-asserted-by":"publisher","first-page":"1242","DOI":"10.1093\/bioinformatics\/17.12.1242","volume":"17","author":"V Eyrich","year":"2001","unstructured":"Eyrich V, Marti-Renom M, Przybylski D, Madhusudan M, Fiser A, Pazos F, Valencia A, Sali A, Rost B: EVA: continuous automatic evaluation of protein structure prediction servers. Bioinformatics 2001, 17: 1242\u20131251. 10.1093\/bioinformatics\/17.12.1242","journal-title":"Bioinformatics"},{"issue":"10","key":"919_CR16","doi-asserted-by":"publisher","first-page":"2167","DOI":"10.1093\/bioinformatics\/bti330","volume":"21","author":"AR Kinjo","year":"2005","unstructured":"Kinjo AR, Nishikawa K: Recoverable one-dimensional encoding of three-dimensional protein structures. Bioinformatics 2005, 21(10):2167\u20132170. 10.1093\/bioinformatics\/bti330","journal-title":"Bioinformatics"},{"key":"919_CR17","volume-title":"Bioinformatics: The Machine Learning Approach","author":"P Baldi","year":"2001","unstructured":"Baldi P, Brunak S: Bioinformatics: The Machine Learning Approach. Second edition. 2001.","edition":"Second"},{"key":"919_CR18","doi-asserted-by":"publisher","first-page":"218101","DOI":"10.1103\/PhysRevLett.92.218101","volume":"92","author":"M Porto","year":"2004","unstructured":"Porto M, Bastolla U, Roman H, Vendruscolo M: Reconstruction of protein structures from a vectorial representation. Phys Rev Lett 2004, 92: 218101. 
10.1103\/PhysRevLett.92.218101","journal-title":"Phys Rev Lett"},{"key":"919_CR19","doi-asserted-by":"publisher","first-page":"256","DOI":"10.1002\/prot.340190309","volume":"19","author":"L Holm","year":"1994","unstructured":"Holm L, Sander C: Parser for protein folding units. Proteins 1994, 19: 256\u2013268. 10.1002\/prot.340190309","journal-title":"Proteins"},{"key":"919_CR20","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1002\/prot.20240","volume":"58","author":"U Bastolla","year":"2005","unstructured":"Bastolla U, Porto M, Roman H, Vendruscolo M: Principal eigenvector of contact matrices and hydrophobicity profiles in proteins. Proteins: Structure, Function, and Bioinformatics 2005, 58: 22\u201330. 10.1002\/prot.20240","journal-title":"Proteins: Structure, Function, and Bioinformatics"},{"issue":"8","key":"919_CR21","doi-asserted-by":"publisher","first-page":"1719","DOI":"10.1093\/bioinformatics\/bti203","volume":"21","author":"G Pollastri","year":"2005","unstructured":"Pollastri G, McLysaght A: Porter: a new, accurate server for protein secondary structure prediction. Bioinformatics 2005, 21(8):1719\u201320. 10.1093\/bioinformatics\/bti203","journal-title":"Bioinformatics"},{"key":"919_CR22","doi-asserted-by":"publisher","first-page":"937","DOI":"10.1093\/bioinformatics\/15.11.937","volume":"15","author":"P Baldi","year":"1999","unstructured":"Baldi P, Brunak S, Frasconi P, Soda G, Pollastri G: Exploiting the past and the future in protein secondary structure prediction. Bioinformatics 1999, 15: 937\u2013946. 10.1093\/bioinformatics\/15.11.937","journal-title":"Bioinformatics"},{"key":"919_CR23","doi-asserted-by":"publisher","first-page":"220","DOI":"10.1002\/(SICI)1097-0134(19990201)34:2<220::AID-PROT7>3.0.CO;2-K","volume":"34","author":"A Zemla","year":"1999","unstructured":"Zemla A, Venclovas C, Fidelis K, Rost B: A modified definition of Sov, a segment-based measure for protein secondary structure prediction assessment. 
Proteins 1999, 34: 220\u2013223. 10.1002\/(SICI)1097-0134(19990201)34:2<220::AID-PROT7>3.0.CO;2-K","journal-title":"Proteins"},{"key":"919_CR24","doi-asserted-by":"publisher","first-page":"1051","DOI":"10.1093\/protein\/12.12.1051","volume":"12","author":"C Richardson","year":"1999","unstructured":"Richardson C, Barlow D: The bottom line for prediction of residue solvent accessibility. Protein Engineering 1999, 12: 1051\u20131054. 10.1093\/protein\/12.12.1051","journal-title":"Protein Engineering"},{"key":"919_CR25","first-page":"146","volume-title":"Proceedings of the 2000 Conference on Intelligent Systems for Molecular Biology (ISMB00), La Jolla, CA","author":"P Fariselli","year":"2000","unstructured":"Fariselli P, Casadio R: Prediction of the number of residue contacts in proteins. Proceedings of the 2000 Conference on Intelligent Systems for Molecular Biology (ISMB00), La Jolla, CA 2000, 146\u2013151."},{"key":"919_CR26","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1002\/prot.10069","volume":"47","author":"G Pollastri","year":"2002","unstructured":"Pollastri G, Fariselli P, Casadio R, Baldi P: Prediction of Coordination Number and Relative Solvent Accessibility in Proteins. Proteins 2002, 47: 142\u2013153. 10.1002\/prot.10069","journal-title":"Proteins"},{"issue":"Suppl 6","key":"919_CR27","doi-asserted-by":"publisher","first-page":"334","DOI":"10.1002\/prot.10556","volume":"53","author":"J Moult","year":"2003","unstructured":"Moult J, Fidelis K, Zemla A, Hubbard T: Critical assessment of methods of protein structure prediction (CASP)-round V. Proteins 2003, 53(Suppl 6):334\u2013339. 
10.1002\/prot.10556","journal-title":"Proteins"},{"key":"919_CR28","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1016\/S0022-2836(02)00698-8","volume":"322","author":"R Bonneau","year":"2002","unstructured":"Bonneau R, Strauss C, Rohl C, Chivian D, Bradley P, Malmstr\u00f6m L, Robertson T, Baker D, Sali A: De Novo Prediction of Three-dimensional Structures for Major Protein Families. J Mol Biol 2002, 322: 65\u201378. 10.1016\/S0022-2836(02)00698-8","journal-title":"J Mol Biol"},{"key":"919_CR29","volume-title":"Algebraic graph theory","author":"N Biggs","year":"1994","unstructured":"Biggs N: Algebraic graph theory. Second edition. 1994.","edition":"Second"},{"issue":"3","key":"919_CR30","doi-asserted-by":"publisher","first-page":"216","DOI":"10.1002\/prot.340200303","volume":"20","author":"B Rost","year":"1994","unstructured":"Rost B, Sander C: Conservation and prediction of solvent accessibility in protein families. Proteins 1994, 20(3):216\u2013226. 10.1002\/prot.340200303","journal-title":"Proteins"},{"issue":"2","key":"919_CR31","doi-asserted-by":"publisher","first-page":"176","DOI":"10.1093\/bioinformatics\/15.2.176","volume":"15","author":"M Mucchielli-Giorgi","year":"1999","unstructured":"Mucchielli-Giorgi M, Hazout S, Tuffery P: PredAcc: prediction of solvent accessibility. Bioinformatics 1999, 15(2):176\u2013177. 10.1093\/bioinformatics\/15.2.176","journal-title":"Bioinformatics"},{"issue":"1","key":"919_CR32","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1002\/1097-0134(20001001)41:1<17::AID-PROT40>3.0.CO;2-F","volume":"41","author":"T Petersen","year":"2000","unstructured":"Petersen T, Lundegaard C, Nielsen M, Bohr H, Bohr J, Brunak S, Gippert G, Lund O: Prediction of protein secondary structure at 80% accuracy. Proteins 2000, 41(1):17\u201320. 
10.1002\/1097-0134(20001001)41:1<17::AID-PROT40>3.0.CO;2-F","journal-title":"Proteins"},{"key":"919_CR33","doi-asserted-by":"publisher","first-page":"228","DOI":"10.1002\/prot.10082","volume":"47","author":"G Pollastri","year":"2002","unstructured":"Pollastri G, Przybylski D, Rost B, Baldi P: Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins 2002, 47: 228\u2013235. 10.1002\/prot.10082","journal-title":"Proteins"},{"key":"919_CR34","unstructured":"[http:\/\/bioinfo.tg.fh-giessen.de\/pdbselect\/]"},{"key":"919_CR35","doi-asserted-by":"publisher","first-page":"2577","DOI":"10.1002\/bip.360221211","volume":"22","author":"W Kabsch","year":"1983","unstructured":"Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22: 2577\u20132637. 10.1002\/bip.360221211","journal-title":"Biopolymers"},{"key":"919_CR36","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"S Altschul","year":"1997","unstructured":"Altschul S, Madden T, Schaffer A: Gapped blast and psi-blast: a new generation of protein database search programs. Nucl Acids Res 1997, 25: 3389\u20133402. 10.1093\/nar\/25.17.3389","journal-title":"Nucl Acids Res"},{"key":"919_CR37","doi-asserted-by":"publisher","first-page":"768","DOI":"10.1109\/72.712151","volume":"9","author":"P Frasconi","year":"1998","unstructured":"Frasconi P, Gori M, Sperduti A: A general framework for adaptive processing of data structures. IEEE Transactions on Neural Networks 1998, 9: 768\u2013786. 
10.1109\/72.712151","journal-title":"IEEE Transactions on Neural Networks"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-7-180.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T03:18:49Z","timestamp":1630466329000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-7-180"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,3,30]]},"references-count":37,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,12]]}},"alternative-id":["919"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-7-180","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2006,3,30]]},"assertion":[{"value":"22 September 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 March 2006","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 March 2006","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"180"}}