{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T12:18:54Z","timestamp":1767961134975,"version":"3.49.0"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"23","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmatic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications.<\/jats:p><jats:p>Results: We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most of other currently available predictors. 
Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects.<\/jats:p><jats:p>Availability: The prediction web server and Supplementary Material are accessible at http:\/\/foo.maths.uq.edu.au\/~huber\/disulfide<\/jats:p><jats:p>Contact: \u00a0kb@maths.uq.edu.au<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm505","type":"journal-article","created":{"date-parts":[[2007,10,18]],"date-time":"2007-10-18T00:44:48Z","timestamp":1192668288000},"page":"3147-3154","source":"Crossref","is-referenced-by-count":54,"title":["Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure"],"prefix":"10.1093","volume":"23","author":[{"given":"Jiangning","family":"Song","sequence":"first","affiliation":[{"name":"1 Advanced Computational Modelling Centre, 2ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia, 3Caulfield School of Information Technology, Monash University, Caulfield, East VIC 3145 and 4School of Molecular & Microbial Sciences and Australian Institute for Bioengineering & Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia"}]},{"given":"Zheng","family":"Yuan","sequence":"additional","affiliation":[{"name":"1 Advanced Computational Modelling Centre, 2ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia, 3Caulfield School of Information Technology, Monash University, Caulfield, East VIC 3145 and 4School of Molecular & Microbial Sciences and Australian Institute for Bioengineering & Nanotechnology, The University of Queensland, Brisbane, QLD 4072, 
Australia"}]},{"given":"Hao","family":"Tan","sequence":"additional","affiliation":[{"name":"1 Advanced Computational Modelling Centre, 2ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia, 3Caulfield School of Information Technology, Monash University, Caulfield, East VIC 3145 and 4School of Molecular & Microbial Sciences and Australian Institute for Bioengineering & Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia"}]},{"given":"Thomas","family":"Huber","sequence":"additional","affiliation":[{"name":"1 Advanced Computational Modelling Centre, 2ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia, 3Caulfield School of Information Technology, Monash University, Caulfield, East VIC 3145 and 4School of Molecular & Microbial Sciences and Australian Institute for Bioengineering & Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia"}]},{"given":"Kevin","family":"Burrage","sequence":"additional","affiliation":[{"name":"1 Advanced Computational Modelling Centre, 2ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia, 3Caulfield School of Information Technology, Monash University, Caulfield, East VIC 3145 and 4School of Molecular & Microbial Sciences and Australian Institute for Bioengineering & Nanotechnology, The University of Queensland, Brisbane, QLD 4072, Australia"},{"name":"1 Advanced Computational Modelling Centre, 2ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia, 3Caulfield School of Information Technology, Monash University, Caulfield, East VIC 3145 and 4School of Molecular & Microbial Sciences and Australian Institute for Bioengineering & Nanotechnology, The University of Queensland, Brisbane, QLD 4072, 
Australia"}]}],"member":"286","published-online":{"date-parts":[[2007,10,17]]},"reference":[{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"975","DOI":"10.1006\/jmbi.2000.3893","article-title":"What can disulfide bonds tell us about protein energetics, function and folding: simulations and bioinformatics analysis","volume":"300","author":"Abkevich","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023041107510822000_","first-page":"97","article-title":"Large-scale prediction of disulphide bond connectivity","volume-title":"Advances in Neural Information Processing Systems","author":"Baldi","year":"2005"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1093\/nar\/28.1.45","article-title":"The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 2000","volume":"28","author":"Bairoch","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1093\/bioinformatics\/bti242","article-title":"Improved prediction of protein-protein binding sites using a support vector machines approach","volume":"21","author":"Bradford","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1073\/pnas.97.1.262","article-title":"Knowledge-based analysis of microarray gene expression data by using support vector machines","volume":"97","author":"Brown","year":"2000","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"ii54","DOI":"10.1093\/bioinformatics\/bti1109","article-title":"Predicting protein stability changes from sequences using support vector machines","volume":"21","author":"Capriotti","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"2729","DOI":"10.1093\/bioinformatics\/btl423","article-title":"Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information","volume":"22","author":"Capriotti","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"W177","DOI":"10.1093\/nar\/gkl266","article-title":"DISULFIND: a disulfide bonding state and cysteine connectivity prediction server","volume":"34","author":"Ceroni","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/j.jmb.2006.03.017","article-title":"Structural classification of small, disulfide-rich protein domains","volume":"359","author":"Cheek","year":"2006","journal-title":"J. Mol. 
Biol"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1002\/prot.20972","article-title":"Disulfide connectivity prediction with 70% accuracy using two-level models","volume":"64","author":"Chen","year":"2006","journal-title":"Proteins"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1002\/prot.20627","article-title":"Prediction of disulfide connectivity from protein sequences","volume":"61","author":"Chen","year":"2005","journal-title":"Proteins"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"1456","DOI":"10.1093\/bioinformatics\/btl102","article-title":"A machine learning information retrieval approach to protein fold recognition","volume":"22","author":"Cheng","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1002\/prot.20787","article-title":"Large-scale prediction of disulphide bridges using kernel methods, two-dimensional recursive neural networks, and weighted graph matching","volume":"62","author":"Cheng","year":"2006","journal-title":"Proteins"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/prot.10492","article-title":"Relationship between protein structures and disulfide-bonding patterns","volume":"53","author":"Chuang","year":"2003","journal-title":"Proteins"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"449","DOI":"10.4153\/CJM-1965-045-4","article-title":"Paths, trees, and flowers","volume":"17","author":"Edmonds","year":"1965","journal-title":"Can. J. 
Math"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"957","DOI":"10.1093\/bioinformatics\/17.10.957","article-title":"Prediction of disulfide connectivity in proteins","volume":"17","author":"Fariselli","year":"2001","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","first-page":"464","article-title":"A neural network based method for predicting the disulfide connectivity in proteins","volume-title":"Knowledge Based Intelligent Information Engineering Systems and Allied Technologies (KES 2002)","author":"Fariselli","year":"2002"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"W230","DOI":"10.1093\/nar\/gki412","article-title":"DiANNA: a web server for disulfide connectivity prediction","volume":"33","author":"Ferre","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"2336","DOI":"10.1093\/bioinformatics\/bti328","article-title":"Disulfide connectivity prediction using secondary structure information and diresidue frequencies","volume":"21","author":"Ferre","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"2045","DOI":"10.1110\/ps.04613004","article-title":"A classification of disulfide patterns and its relationship to protein structure and function","volume":"13","author":"Gupta","year":"2004","journal-title":"Protein Sci"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"448","DOI":"10.1006\/jmbi.1994.1742","article-title":"Analysis and classification of disulphide connectivity in proteins. The entropic effect of cross-linkage","volume":"244","author":"Harrison","year":"1994","journal-title":"J. Mol. 
Biol"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"474","DOI":"10.1110\/ps.04923305","article-title":"Intramolecular disulphide bond arrangements in nonhomologous proteins","volume":"14","author":"Hartig","year":"2005","journal-title":"Protein Sci"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1093\/bioinformatics\/17.8.721","article-title":"Support vector machine approach for protein subcellular localization prediction","volume":"17","author":"Hua","year":"2001","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1016\/j.cell.2006.10.034","article-title":"Crystal structure of the DsbB-DsbA complex reveals a mechanism of disulfide bond generation","volume":"127","author":"Inaba","year":"2006","journal-title":"Cell"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"940","DOI":"10.1002\/prot.21047","article-title":"Potential for assessing quality of protein structure based on contact number prediction","volume":"64","author":"Ishida","year":"2006","journal-title":"Proteins"},{"key":"2023041107510822000_","article-title":"Making large-scale SVM learning practical","volume-title":"Advances in Kernel Methods - Support Vector Learning","author":"Joachims","year":"1999"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1146\/annurev.biochem.72.121801.161459","article-title":"Protein disulfide bond formation in prokaryotes","volume":"72","author":"Kadokura","year":"2003","journal-title":"Annu. Rev. 
Biochem"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"534","DOI":"10.1126\/science.1091724","article-title":"Snapshots of DsbA in action: detection of proteins in the process of oxidative folding","volume":"303","author":"Kadokura","year":"2004","journal-title":"Science"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1186\/1471-2105-7-182","article-title":"Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models","volume":"7","author":"Liu","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1002\/prot.21309","article-title":"Predicting disulfide connectivity patterns","volume":"67","author":"Lu","year":"2007","journal-title":"Proteins"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"1605","DOI":"10.1002\/jcc.20084","article-title":"UCSF Chimera \u2013 a visualization system for exploratory research and analysis","volume":"25","author":"Pettersen","year":"2004","journal-title":"J. Comput. Chem"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1006\/jmbi.1993.1413","article-title":"Prediction of protein secondary structure at better than 70% accuracy","volume":"232","author":"Rost","year":"1993","journal-title":"J. Mol. 
Biol"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1186\/1471-2105-6-152","article-title":"pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties","volume":"6","author":"Sarda","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1016\/j.cell.2007.02.039","article-title":"Modulation of cellular disulfide-bond formation and the ER redox environment by feedback regulation of Ero1","volume":"129","author":"Sevier","year":"2007","journal-title":"Cell"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"4337","DOI":"10.1073\/pnas.0607879104","article-title":"Predicting protein-protein interactions based only on sequences information","volume":"104","author":"Shen","year":"2007","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1186\/1471-2105-7-425","article-title":"Predicting residue-wise contact orders in proteins by support vector regression","volume":"7","author":"Song","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1186\/1471-2105-7-124","article-title":"Prediction of cis\/trans isomerization in proteins using PSI-BLAST profiles and secondary structure information","volume":"7","author":"Song","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"866","DOI":"10.1002\/prot.20369","article-title":"Native and modeled disulfide bonds in proteins: knowledge-based approaches toward structure prediction of disulfide-rich 
polypeptides","volume":"58","author":"Thangudu","year":"2005","journal-title":"Proteins"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1016\/0022-2836(81)90515-5","article-title":"Disulphide bridges in globular proteins","volume":"151","author":"Thornton","year":"1981","journal-title":"J. Mol. Biol"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"2095","DOI":"10.1126\/science.292.5524.2095","article-title":"From genome to function","volume":"292","author":"Thornton","year":"2001","journal-title":"Science"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"4416","DOI":"10.1093\/bioinformatics\/bti715","article-title":"Improving disulfide connectivity prediction with sequential distance between oxidized cysteines","volume":"21","author":"Tsai","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"1083","DOI":"10.1016\/j.jmb.2003.10.077","article-title":"A novel database of disulfide patterns","volume":"335","author":"van Vlijmen","year":"2004","journal-title":"J. Mol. 
Biol"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-3264-1","volume-title":"The nature of statistical learning theory","author":"Vapnik","year":"2000"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1093\/bioinformatics\/btg463","article-title":"Disulfide connectivity prediction using recursive neural networks and evolutionary information","volume":"20","author":"Vullo","year":"2004","journal-title":"Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1186\/1471-2105-7-463","article-title":"SVRMHC prediction server for MHC-binding peptides","volume":"7","author":"Wan","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1186\/1471-2105-7-32","article-title":"Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme","volume":"7","author":"Wang","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1186\/1471-2105-6-248","article-title":"Better prediction of protein contact number using a support vector regression analysis of amino acid sequence","volume":"6","author":"Yuan","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"905","DOI":"10.1002\/prot.20375","article-title":"Prediction of protein B-factor profiles","volume":"58","author":"Yuan","year":"2005","journal-title":"Proteins"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"1063","DOI":"10.1021\/pr050397b","article-title":"Predicting the solvent accessibility of transmembrane residues from protein sequence","volume":"5","author":"Yuan","year":"2006","journal-title":"J. 
Proteome Res"},{"key":"2023041107510822000_","doi-asserted-by":"crossref","first-page":"1415","DOI":"10.1093\/bioinformatics\/bti179","article-title":"Cysteine separations profiles on protein sequences infer disulfide connectivity","volume":"21","author":"Zhao","year":"2005","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/23\/3147\/49822377\/bioinformatics_23_23_3147.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/23\/3147\/49822377\/bioinformatics_23_23_3147.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,18]],"date-time":"2024-02-18T14:35:15Z","timestamp":1708266915000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/23\/3147\/290860"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,10,17]]},"references-count":52,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2007,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm505","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,12,1]]},"published":{"date-parts":[[2007,10,17]]}}}