{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T14:12:56Z","timestamp":1761487976063},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Prediction of disulfide bridges from protein sequences is useful for characterizing structural and functional properties of proteins. Several methods based on different machine learning algorithms have been applied to solve this problem and public domain prediction services exist. These methods are however still potentially subject to significant improvements both in terms of prediction accuracy and overall architectural complexity.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We introduce new methods for predicting disulfide bridges from protein sequences. The methods take advantage of two new decomposition kernels for measuring the similarity between protein sequences according to the amino acid environments around cysteines. Disulfide connectivity is predicted in two passes. First, a binary classifier is trained to predict whether a given protein chain has at least one intra-chain disulfide bridge. Second, a multiclass classifier (implemented by 1-nearest neighbor) is trained to predict connectivity patterns. The two passes can be easily cascaded to obtain connectivity prediction from sequence alone. 
We report an extensive experimental comparison on several data sets that have been previously employed in the literature to assess the accuracy of cysteine bonding state and disulfide connectivity predictors.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>We reach state-of-the-art results on bonding state prediction with a simple method that classifies chains rather than individual residues. The prediction accuracy reached by our connectivity prediction method compares favorably with respect to all but the most complex other approaches. On the other hand, our method does not need any model selection or hyperparameter tuning, a property that makes it less prone to overfitting and prediction accuracy overestimation.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-20","type":"journal-article","created":{"date-parts":[[2008,1,15]],"date-time":"2008-01-15T07:14:29Z","timestamp":1200381269000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["A simplified approach to disulfide connectivity prediction from protein sequences"],"prefix":"10.1186","volume":"9","author":[{"given":"Marc","family":"Vincent","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrea","family":"Passerini","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Matthieu","family":"Labb\u00e9","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Paolo","family":"Frasconi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2008,1,14]]},"reference":[{"issue":"10","key":"2005_CR1","doi-asserted-by":"publisher","first-page":"957","DOI":"10.1093\/bioinformatics\/17.10.957","volume":"17","author":"P 
Fariselli","year":"2001","unstructured":"Fariselli P, Casadio R: Prediction of disulfide connectivity in proteins. Bioinformatics 2001, 17(10):957\u2013964. 10.1093\/bioinformatics\/17.10.957","journal-title":"Bioinformatics"},{"key":"2005_CR2","doi-asserted-by":"publisher","first-page":"W182","DOI":"10.1093\/nar\/gkl189","volume":"34","author":"F Ferr\u00e8","year":"2006","unstructured":"Ferr\u00e8 F, Clote P: DiANNA 1.1: an extension of the DiANNA web server for ternary cysteine classification. Nucleic Acids Research 2006, 34: W182-W185. 10.1093\/nar\/gkl189","journal-title":"Nucleic Acids Research"},{"issue":"24","key":"2005_CR3","doi-asserted-by":"publisher","first-page":"4416","DOI":"10.1093\/bioinformatics\/bti715","volume":"21","author":"CH Tsai","year":"2005","unstructured":"Tsai CH, Chen BJ, Chan CH, Liu HL, Kao CY: Improving disulfide connectivity prediction with sequential distance between oxidized cysteines. Bioinformatics 2005, 21(24):4416\u20134419. 10.1093\/bioinformatics\/bti715","journal-title":"Bioinformatics"},{"key":"2005_CR4","first-page":"W72","volume-title":"Nucleic Acids Res","author":"J Cheng","year":"2005","unstructured":"Cheng J, Randall AZ, Sweredoski MJ, Baldi P: SCRATCH: a protein structure and structural feature prediction server. Nucleic Acids Res 2005, (33 Web Server):W72-W76. 10.1093\/nar\/gki396"},{"issue":"Web Server","key":"2005_CR5","doi-asserted-by":"publisher","first-page":"W177","DOI":"10.1093\/nar\/gkl266","volume":"34","author":"A Ceroni","year":"2006","unstructured":"Ceroni A, Passerini A, Vullo A, Frasconi P: DISULFIND: a Disulfide Bonding State and Cysteine Connectivity Prediction Server. Nucleic Acids Research 2006, 34(Web Server):W177-W181. 
10.1093\/nar\/gkl266","journal-title":"Nucleic Acids Research"},{"issue":"3","key":"2005_CR6","doi-asserted-by":"publisher","first-page":"617","DOI":"10.1002\/prot.20787","volume":"62","author":"J Cheng","year":"2006","unstructured":"Cheng J, Saigo H, Baldi P: Large-scale prediction of disulphide bridges using kernel methods, two-dimensional recursive neural networks, and weighted graph matching. Proteins 2006, 62(3):617\u2013629. 10.1002\/prot.20787","journal-title":"Proteins"},{"issue":"2","key":"2005_CR7","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1016\/0014-5793(92)80419-H","volume":"302","author":"A Fiser","year":"1992","unstructured":"Fiser A, Cserzo M, Tudos E, Simon I: Different sequence environments of cysteines and half cystines in proteins. Application to predict disulfide forming residues. FEBS Lett 1992, 302(2):117\u201320. 10.1016\/0014-5793(92)80419-H","journal-title":"FEBS Lett"},{"issue":"3","key":"2005_CR8","doi-asserted-by":"publisher","first-page":"340","DOI":"10.1002\/(SICI)1097-0134(19990815)36:3<340::AID-PROT8>3.0.CO;2-D","volume":"36","author":"P Fariselli","year":"1999","unstructured":"Fariselli P, Riccobelli P, Casadio R: Role of evolutionary information in predicting the disulfide-bonding state of cysteine in proteins. Proteins 1999, 36(3):340\u2013346. 10.1002\/(SICI)1097-0134(19990815)36:3<340::AID-PROT8>3.0.CO;2-D","journal-title":"Proteins"},{"issue":"3","key":"2005_CR9","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1093\/bioinformatics\/16.3.251","volume":"16","author":"A Fiser","year":"2000","unstructured":"Fiser A, Simon I: Predicting the oxidation state of cysteines by multiple sequence alignment. Bioinformatics 2000, 16(3):251\u2013256. 
10.1093\/bioinformatics\/16.3.251","journal-title":"Bioinformatics"},{"key":"2005_CR10","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1002\/prot.10047","volume":"46","author":"M Mucchielli-Giorgi","year":"2002","unstructured":"Mucchielli-Giorgi M, Hazout S, Tuffery P: Predicting the Disulfide Bonding State of Cysteines Using Protein Descriptors. Proteins 2002, 46: 243\u2013249. 10.1002\/prot.10047","journal-title":"Proteins"},{"issue":"3","key":"2005_CR11","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1023\/B:VLSI.0000003026.58068.ce","volume":"35","author":"A Ceroni","year":"2003","unstructured":"Ceroni A, Frasconi P, Passerini A, Vullo A: Predicting the Disulfide Bonding State of Cysteines with Combinations of Kernel Machines. Journal of VLSI Signal Processing 2003, 35(3):287\u2013295. [ps\/jvlsi-03-cys.pdf] 10.1023\/B:VLSI.0000003026.58068.ce","journal-title":"Journal of VLSI Signal Processing"},{"key":"2005_CR12","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1016\/j.bbrc.2004.03.189","volume":"318","author":"JN Song","year":"2004","unstructured":"Song JN, Wang ML, Li WJ, Xu WB: Prediction of the disulfide-bonding state of cysteines in proteins based on dipeptide composition. Biochem Biophys Res Commun 2004, 318: 142\u2013147. 10.1016\/j.bbrc.2004.03.189","journal-title":"Biochem Biophys Res Commun"},{"issue":"6","key":"2005_CR13","doi-asserted-by":"publisher","first-page":"1665","DOI":"10.1002\/pmic.200300745","volume":"4","author":"PL Martelli","year":"2004","unstructured":"Martelli PL, Fariselli P, Casadio R: Prediction of disulfide-bonded cysteines in proteomes with a hidden neural network. Proteomics 2004, 4(6):1665\u20131671. 
10.1002\/pmic.200300745","journal-title":"Proteomics"},{"issue":"4","key":"2005_CR14","doi-asserted-by":"publisher","first-page":"1036","DOI":"10.1002\/prot.20079","volume":"55","author":"YC Chen","year":"2004","unstructured":"Chen YC, Lin YS, Lin CJ, Hwang JK: Prediction of the bonding states of cysteines using the support vector machines based on multiple feature vectors and cysteine state sequences. Proteins 2004, 55(4):1036\u20131042. 10.1002\/prot.20079","journal-title":"Proteins"},{"issue":"5","key":"2005_CR15","doi-asserted-by":"publisher","first-page":"653","DOI":"10.1093\/bioinformatics\/btg463","volume":"20","author":"A Vullo","year":"2004","unstructured":"Vullo A, Frasconi P: Disulfide connectivity prediction using recursive neural networks and evolutionary information. Bioinformatics 2004, 20(5):653\u2013659. 10.1093\/bioinformatics\/btg463","journal-title":"Bioinformatics"},{"key":"2005_CR16","volume-title":"Proceedings of the Twenty Second International Conference on Machine Learning (ICML05)","author":"B Taskar","year":"2005","unstructured":"Taskar B, Chatalbashev V, Koller D, Guestrin C: Learning Structured Prediction Models: A Large Margin Approach. Proceedings of the Twenty Second International Conference on Machine Learning (ICML05) 2005."},{"issue":"10","key":"2005_CR17","doi-asserted-by":"publisher","first-page":"2336","DOI":"10.1093\/bioinformatics\/bti328","volume":"21","author":"F Ferr\u00e8","year":"2005","unstructured":"Ferr\u00e8 F, Clote P: Disulfide connectivity prediction using secondary structure information and diresidue frequencies. Bioinformatics 2005, 21(10):2336\u20132346. 
10.1093\/bioinformatics\/bti328","journal-title":"Bioinformatics"},{"issue":"8","key":"2005_CR18","doi-asserted-by":"publisher","first-page":"1415","DOI":"10.1093\/bioinformatics\/bti179","volume":"21","author":"E Zhao","year":"2005","unstructured":"Zhao E, Liu HL, Tsai CH, Tsai HK, hsiung Chan C, Kao CY: Cysteine separations profiles on protein sequences infer disulfide connectivity. Bioinformatics 2005, 21(8):1415\u20131420. 10.1093\/bioinformatics\/bti179","journal-title":"Bioinformatics"},{"issue":"3","key":"2005_CR19","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1002\/prot.20627","volume":"61","author":"YC Chen","year":"2005","unstructured":"Chen YC, Hwang JK: Prediction of disulfide connectivity from protein sequences. Proteins 2005, 61(3):507\u2013512. 10.1002\/prot.20627","journal-title":"Proteins"},{"key":"2005_CR20","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1002\/prot.20972","volume":"64","author":"BJ Chen","year":"2006","unstructured":"Chen BJ, Tsai CH, Chan CH, Kao CY: Disulfide connectivity prediction with 70% accuracy using two-level models. Proteins 2006, 64: 246\u2013252. 10.1002\/prot.20972","journal-title":"Proteins"},{"issue":"2","key":"2005_CR21","doi-asserted-by":"publisher","first-page":"262","DOI":"10.1002\/prot.21309","volume":"67","author":"CH Lu","year":"2007","unstructured":"Lu CH, Chen YC, Yu CS, Hwang JK: Predicting disulfide connectivity patterns. Proteins 2007, 67(2):262\u2013270. 10.1002\/prot.21309","journal-title":"Proteins"},{"key":"2005_CR22","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1016\/S0925-2312(03)00375-8","volume":"55","author":"C Gold","year":"2003","unstructured":"Gold C, Sollich P: Model Selection for Support Vector Machine Classification. Neurocomputing 2003, 55: 221. 
[doi:10.1016\/S0925\u20132312(03)00375\u20138] 10.1016\/S0925-2312(03)00375-8","journal-title":"Neurocomputing"},{"key":"2005_CR23","volume-title":"Advances in Kernel Methods \u2013 Support Vector Learning","author":"T Joachims","year":"1999","unstructured":"Joachims T: Making large-Scale SVM Learning Practical. In Advances in Kernel Methods \u2013 Support Vector Learning Edited by: Sch\u00f6lkopf B, Burges C, Smola A. MIT Press; 1999. [http:\/\/svmlight.joachims.org\/]"},{"key":"2005_CR24","doi-asserted-by":"publisher","first-page":"409","DOI":"10.1002\/pro.5560010313","volume":"1","author":"U Hobohm","year":"1992","unstructured":"Hobohm U, Scharf M, Schneider R, Sander C: Selection of a representative set of structures from the Brookhaven Protein Data Bank. Protein Science 1992, 1: 409\u2013417.","journal-title":"Protein Science"},{"key":"2005_CR25","doi-asserted-by":"publisher","first-page":"522","DOI":"10.1002\/pro.5560030317","volume":"3","author":"U Hobohm","year":"1994","unstructured":"Hobohm U, Sander C: Enlarged representative set of protein structures. Protein Science 1994, 3: 522.","journal-title":"Protein Science"},{"key":"2005_CR26","unstructured":"PDBselect[http:\/\/bioinfo.tg.fh-giessen.de\/pdbselect\/]"},{"key":"2005_CR27","doi-asserted-by":"publisher","first-page":"2577","DOI":"10.1002\/bip.360221211","volume":"22","author":"W Kabsch","year":"1983","unstructured":"Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22: 2577\u20132637. 
10.1002\/bip.360221211","journal-title":"Biopolymers"},{"key":"2005_CR28","unstructured":"DIpro[http:\/\/contact.ics.uci.edu\/intro.html]"},{"key":"2005_CR29","unstructured":"CysPred[http:\/\/www.biocomp.unibo.it\/piero\/cyspred\/cysdataset.tgz]"},{"issue":"17","key":"2005_CR30","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"S Altschul","year":"1997","unstructured":"Altschul S, Madden T, Schaffer A, Zhang J, Zhang Z, Miller W, Lipman D: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 1997, 25(17):3389\u20133402. 10.1093\/nar\/25.17.3389","journal-title":"Nucleic Acids Research"},{"key":"2005_CR31","volume-title":"Tech Rep UCSC-CRL-99-10","author":"D Haussler","year":"1999","unstructured":"Haussler D: Convolution Kernels on Discrete Structures. In Tech Rep UCSC-CRL-99\u201310. University of California, Santa Cruz; 1999."},{"key":"2005_CR32","volume-title":"Advances in Large Margin Classifiers","author":"J Platt","year":"1999","unstructured":"Platt J: Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. In Advances in Large Margin Classifiers. Edited by: Smola A, Bartlett P, Scholkopf B, Schurmans D. 
MIT Press; 1999."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-20.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T03:22:27Z","timestamp":1630466547000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-20"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,1,14]]},"references-count":32,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2005"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-20","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,1,14]]},"assertion":[{"value":"4 September 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 January 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 January 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"20"}}