{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T12:21:54Z","timestamp":1767961314010,"version":"3.49.0"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"S1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2009,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases. Recent studies suggest that short protein regions trigger this aggregation. Thus, identifying these short peptides is critical for understanding diseases and finding potential therapeutic targets.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We propose a method, named Pafig (<jats:underline>P<\/jats:underline>rediction of <jats:underline>a<\/jats:underline>myloid <jats:underline>fi<\/jats:underline>bril-forming se<jats:underline>g<\/jats:underline>ments) based on support vector machines, to identify the hexapeptides associated with amyloid fibrillar aggregates. The features of Pafig were obtained by a two-round selection from AAindex. Using a 10-fold cross-validation test on the Hexpepset dataset, Pafig performed well, with an overall accuracy of 81% and a Matthews correlation coefficient of 0.63. Pafig was used to predict the potential fibril-forming hexapeptides among all 64,000,000 possible hexapeptides. As a result, approximately 5.08% of hexapeptides showed a high aggregation propensity. 
In the predicted fibril-forming hexapeptides, the amino acids alanine, phenylalanine, isoleucine, leucine and valine occurred at higher frequencies, and the amino acids aspartic acid, glutamic acid, histidine, lysine, arginine and proline appeared at lower frequencies.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>The performance of Pafig indicates that it is a powerful tool for identifying the hexapeptides associated with fibrillar aggregates and will be useful for large-scale analysis of proteomic data.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-10-s1-s45","type":"journal-article","created":{"date-parts":[[2009,1,30]],"date-time":"2009-01-30T20:05:15Z","timestamp":1233345915000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":82,"title":["Prediction of amyloid fibril-forming segments based on a support vector machine"],"prefix":"10.1186","volume":"10","author":[{"given":"Jian","family":"Tian","sequence":"first","affiliation":[]},{"given":"Ningfeng","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Jun","family":"Guo","sequence":"additional","affiliation":[]},{"given":"Yunliu","family":"Fan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2009,1,30]]},"reference":[{"issue":"17","key":"3228_CR1","doi-asserted-by":"publisher","first-page":"2218","DOI":"10.1093\/bioinformatics\/btm325","volume":"23","author":"Z Zhang","year":"2007","unstructured":"Zhang Z, Chen H, Lai L: Identification of amyloid fibril-forming segments based on structure and residue-based statistical potential. Bioinformatics 2007, 23(17):2218\u20132225. 
10.1093\/bioinformatics\/btm325","journal-title":"Bioinformatics"},{"issue":"Suppl","key":"3228_CR2","doi-asserted-by":"publisher","first-page":"S10","DOI":"10.1038\/nm1066","volume":"10","author":"CA Ross","year":"2004","unstructured":"Ross CA, Poirier MA: Protein aggregation and neurodegenerative disease. Nat Med 2004, 10(Suppl):S10\u201317. 10.1038\/nm1066","journal-title":"Nat Med"},{"issue":"1406","key":"3228_CR3","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1098\/rstb.2000.0758","volume":"356","author":"CM Dobson","year":"2001","unstructured":"Dobson CM: The structural basis of protein folding and its links with human disease. Philos Trans R Soc Lond B Biol Sci 2001, 356(1406):133\u2013145. 10.1098\/rstb.2000.0758","journal-title":"Philos Trans R Soc Lond B Biol Sci"},{"issue":"1","key":"3228_CR4","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1016\/S0959-440X(99)00049-4","volume":"10","author":"JC Rochet","year":"2000","unstructured":"Rochet JC, Lansbury PT Jr: Amyloid fibrillogenesis: themes and variations. Curr Opin Struct Biol 2000, 10(1):60\u201368. 10.1016\/S0959-440X(99)00049-4","journal-title":"Curr Opin Struct Biol"},{"issue":"12","key":"3228_CR5","doi-asserted-by":"publisher","first-page":"e177","DOI":"10.1371\/journal.pcbi.0020177","volume":"2","author":"OV Galzitskaya","year":"2006","unstructured":"Galzitskaya OV, Garbuzynskiy SO, Lobanov MY: Prediction of amyloidogenic and disordered regions in protein chains. PLoS Comput Biol 2006, 2(12):e177. 10.1371\/journal.pcbi.0020177","journal-title":"PLoS Comput Biol"},{"key":"3228_CR6","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1186\/1472-6807-5-18","volume":"5","author":"N Sanchez de Groot","year":"2005","unstructured":"Sanchez de Groot N, Pallares I, Aviles FX, Vendrell J, Ventura S: Prediction of \"hot spots\" of aggregation in disease-linked polypeptides. BMC Struct Biol 2005, 5: 18. 
10.1186\/1472-6807-5-18","journal-title":"BMC Struct Biol"},{"issue":"1","key":"3228_CR7","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1016\/S0959-440X(98)80016-X","volume":"8","author":"JW Kelly","year":"1998","unstructured":"Kelly JW: The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr Opin Struct Biol 1998, 8(1):101\u2013106. 10.1016\/S0959-440X(98)80016-X","journal-title":"Curr Opin Struct Biol"},{"issue":"7043","key":"3228_CR8","doi-asserted-by":"publisher","first-page":"773","DOI":"10.1038\/nature03680","volume":"435","author":"R Nelson","year":"2005","unstructured":"Nelson R, Sawaya MR, Balbirnie M, Madsen AO, Riekel C, Grothe R, Eisenberg D: Structure of the cross-beta spine of amyloid-like fibrils. Nature 2005, 435(7043):773\u2013778. 10.1038\/nature03680","journal-title":"Nature"},{"issue":"2","key":"3228_CR9","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1073\/pnas.0406847102","volume":"102","author":"OS Makin","year":"2005","unstructured":"Makin OS, Atkins E, Sikorski P, Johansson J, Serpell LC: Molecular basis for amyloid fibril formation and stability. Proc Natl Acad Sci USA 2005, 102(2):315\u2013320. 10.1073\/pnas.0406847102","journal-title":"Proc Natl Acad Sci USA"},{"issue":"19","key":"3228_CR10","doi-asserted-by":"publisher","first-page":"7258","DOI":"10.1073\/pnas.0308249101","volume":"101","author":"S Ventura","year":"2004","unstructured":"Ventura S, Zurdo J, Narayanan S, Parreno M, Mangues R, Reif B, Chiti F, Giannoni E, Dobson CM, Aviles FX, et al.: Short amino acid stretches can mediate amyloid formation in globular proteins: the Src homology 3 (SH3) case. Proc Natl Acad Sci USA 2004, 101(19):7258\u20137263. 
10.1073\/pnas.0308249101","journal-title":"Proc Natl Acad Sci USA"},{"issue":"1","key":"3228_CR11","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1073\/pnas.2634884100","volume":"101","author":"M Lopez de la Paz","year":"2004","unstructured":"Lopez de la Paz M, Serrano L: Sequence determinants of amyloid fibril formation. Proc Natl Acad Sci USA 2004, 101(1):87\u201392. 10.1073\/pnas.2634884100","journal-title":"Proc Natl Acad Sci USA"},{"issue":"29","key":"3228_CR12","doi-asserted-by":"publisher","first-page":"10584","DOI":"10.1073\/pnas.0403756101","volume":"101","author":"MI Ivanova","year":"2004","unstructured":"Ivanova MI, Sawaya MR, Gingery M, Attinger A, Eisenberg D: An amyloid-forming segment of beta2-microglobulin suggests a molecular model for the fibril. Proc Natl Acad Sci USA 2004, 101(29):10584\u201310589. 10.1073\/pnas.0403756101","journal-title":"Proc Natl Acad Sci USA"},{"issue":"5","key":"3228_CR13","doi-asserted-by":"publisher","first-page":"437","DOI":"10.1016\/j.cbpa.2006.07.009","volume":"10","author":"A Caflisch","year":"2006","unstructured":"Caflisch A: Computational models for the prediction of polypeptide aggregation propensity. Curr Opin Chem Biol 2006, 10(5):437\u2013444. 10.1016\/j.cbpa.2006.07.009","journal-title":"Curr Opin Chem Biol"},{"issue":"10","key":"3228_CR14","doi-asserted-by":"publisher","first-page":"2723","DOI":"10.1110\/ps.051471205","volume":"14","author":"GG Tartaglia","year":"2005","unstructured":"Tartaglia GG, Cavalli A, Pellarin R, Caflisch A: Prediction of aggregation rate and aggregation-prone segments in polypeptide sequences. Protein Sci 2005, 14(10):2723\u20132734. 
10.1110\/ps.051471205","journal-title":"Protein Sci"},{"issue":"7","key":"3228_CR15","doi-asserted-by":"publisher","first-page":"1939","DOI":"10.1110\/ps.04663504","volume":"13","author":"GG Tartaglia","year":"2004","unstructured":"Tartaglia GG, Cavalli A, Pellarin R, Caflisch A: The role of aromaticity, exposed surface, and dipole moment in determining protein aggregation rates. Protein Sci 2004, 13(7):1939\u20131941. 10.1110\/ps.04663504","journal-title":"Protein Sci"},{"issue":"5","key":"3228_CR16","doi-asserted-by":"publisher","first-page":"1317","DOI":"10.1016\/j.jmb.2004.06.043","volume":"341","author":"KF DuBay","year":"2004","unstructured":"DuBay KF, Pawar AP, Chiti F, Zurdo J, Dobson CM, Vendruscolo M: Prediction of the absolute aggregation rates of amyloidogenic polypeptide chains. J Mol Biol 2004, 341(5):1317\u20131326. 10.1016\/j.jmb.2004.06.043","journal-title":"J Mol Biol"},{"issue":"6950","key":"3228_CR17","doi-asserted-by":"publisher","first-page":"805","DOI":"10.1038\/nature01891","volume":"424","author":"F Chiti","year":"2003","unstructured":"Chiti F, Stefani M, Taddei N, Ramponi G, Dobson CM: Rationalization of the effects of mutations on peptide and protein aggregation rates. Nature 2003, 424(6950):805\u2013808. 10.1038\/nature01891","journal-title":"Nature"},{"issue":"10","key":"3228_CR18","doi-asserted-by":"publisher","first-page":"1302","DOI":"10.1038\/nbt1012","volume":"22","author":"AM Fernandez-Escamilla","year":"2004","unstructured":"Fernandez-Escamilla AM, Rousseau F, Schymkowitz J, Serrano L: Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nat Biotechnol 2004, 22(10):1302\u20131306. 
10.1038\/nbt1012","journal-title":"Nat Biotechnol"},{"issue":"4","key":"3228_CR19","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1093\/protein\/gzi022","volume":"18","author":"S Idicula-Thomas","year":"2005","unstructured":"Idicula-Thomas S, Balaji PV: Understanding the relationship between the primary structure of proteins and their amyloidogenic propensity: clues from inclusion body formation. Protein Eng Des Sel 2005, 18(4):175\u2013180. 10.1093\/protein\/gzi022","journal-title":"Protein Eng Des Sel"},{"issue":"11","key":"3228_CR20","doi-asserted-by":"publisher","first-page":"4074","DOI":"10.1073\/pnas.0511295103","volume":"103","author":"MJ Thompson","year":"2006","unstructured":"Thompson MJ, Sievers SA, Karanicolas J, Ivanova MI, Baker D, Eisenberg D: The 3D profile method for identifying fibril-forming segments of proteins. Proc Natl Acad Sci USA 2006, 103(11):4074\u20134078. 10.1073\/pnas.0511295103","journal-title":"Proc Natl Acad Sci USA"},{"issue":"8","key":"3228_CR21","doi-asserted-by":"publisher","first-page":"2149","DOI":"10.1110\/ps.04790604","volume":"13","author":"S Yoon","year":"2004","unstructured":"Yoon S, Welsh WJ: Detecting hidden sequence propensity for amyloid fibril formation. Protein Sci 2004, 13(8):2149\u20132160. 10.1110\/ps.04790604","journal-title":"Protein Sci"},{"issue":"25","key":"3228_CR22","doi-asserted-by":"publisher","first-page":"16052","DOI":"10.1073\/pnas.252340199","volume":"99","author":"M Lopez De La Paz","year":"2002","unstructured":"Lopez De La Paz M, Goldie K, Zurdo J, Lacroix E, Dobson CM, Hoenger A, Serrano L: De novo designed peptide-based amyloid fibrils. Proc Natl Acad Sci USA 2002, 99(25):16052\u201316057. 10.1073\/pnas.252340199","journal-title":"Proc Natl Acad Sci USA"},{"key":"3228_CR23","volume-title":"Statistical Learning Theory","author":"VN Vapnik","year":"1998","unstructured":"Vapnik VN: Statistical Learning Theory. 
New York: Wiley; 1998."},{"key":"3228_CR24","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning Theory","author":"VN Vapnik","year":"1995","unstructured":"Vapnik VN: The Nature of Statistical Learning Theory. 1st edition. New York: Springer; 1995.","edition":"1"},{"issue":"1","key":"3228_CR25","doi-asserted-by":"publisher","first-page":"374","DOI":"10.1093\/nar\/28.1.374","volume":"28","author":"S Kawashima","year":"2000","unstructured":"Kawashima S, Kanehisa M: AAindex: amino acid index database. Nucleic Acids Res 2000, 28(1):374. 10.1093\/nar\/28.1.374","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"3228_CR26","doi-asserted-by":"publisher","first-page":"368","DOI":"10.1093\/nar\/27.1.368","volume":"27","author":"S Kawashima","year":"1999","unstructured":"Kawashima S, Ogata H, Kanehisa M: AAindex: Amino Acid Index Database. Nucleic Acids Res 1999, 27(1):368\u2013369. 10.1093\/nar\/27.1.368","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"3228_CR27","doi-asserted-by":"publisher","first-page":"379","DOI":"10.1016\/j.jmb.2005.04.016","volume":"350","author":"AP Pawar","year":"2005","unstructured":"Pawar AP, Dubay KF, Zurdo J, Chiti F, Vendruscolo M, Dobson CM: Prediction of \"aggregation-prone\" and \"aggregation-susceptible\" regions in proteins associated with neurodegenerative diseases. J Mol Biol 2005, 350(2):379\u2013392. 10.1016\/j.jmb.2005.04.016","journal-title":"J Mol Biol"},{"issue":"3","key":"3228_CR28","doi-asserted-by":"publisher","first-page":"278","DOI":"10.1093\/bioinformatics\/bti810","volume":"22","author":"S Idicula-Thomas","year":"2006","unstructured":"Idicula-Thomas S, Kulkarni AJ, Kulkarni BD, Jayaraman VK, Balaji PV: A support vector machine-based method for predicting the propensity of a protein to be soluble or to form inclusion body on overexpression in Escherichia coli. Bioinformatics 2006, 22(3):278\u2013284. 
10.1093\/bioinformatics\/bti810","journal-title":"Bioinformatics"},{"issue":"1\u20133","key":"3228_CR29","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1023\/A:1012406528296","volume":"46","author":"Y Lin","year":"2002","unstructured":"Lin Y, Lee Y, Wahba G: Support Vector Machines for Classification in Nonstandard Situations. Machine Learning 2002, 46(1\u20133):191\u2013202. 10.1023\/A:1012406528296","journal-title":"Machine Learning"},{"key":"3228_CR30","unstructured":"Pafig[http:\/\/www.mobioinfor.cn\/pafig]"},{"key":"3228_CR31","volume-title":"Genetic Algorithms in Search, Optimization and Machine Learning","author":"DE Goldberg","year":"1989","unstructured":"Goldberg DE: Genetic Algorithms in Search, Optimization and Machine Learning. Boston: Addison-Wesley; 1989."},{"key":"3228_CR32","unstructured":"LIBSVM[http:\/\/www.csie.ntu.edu.tw\/~cjlin\/]"},{"issue":"2","key":"3228_CR33","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1002\/prot.340190207","volume":"19","author":"M Vihinen","year":"1994","unstructured":"Vihinen M, Torkkila E, Riikonen P: Accuracy of protein flexibility predictions. Proteins 1994, 19(2):141\u2013149. 10.1002\/prot.340190207","journal-title":"Proteins"},{"key":"3228_CR34","doi-asserted-by":"crossref","unstructured":"Xia H, Hu B: Feature selection using fuzzy support vector machines. Fuzzy Optim Decis Making 2006, (5):187\u2013192. 10.1007\/s10700-006-7336-8","DOI":"10.1007\/s10700-006-7336-8"},{"key":"3228_CR35","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1186\/1471-2105-8-245","volume":"8","author":"E Jung","year":"2007","unstructured":"Jung E, Kim J, Kim M, Jung DH, Rhee H, Shin JM, Choi K, Kang SK, Kim MK, Yun CH, et al.: Artificial neural network models for prediction of intestinal permeability of oligopeptides. BMC Bioinformatics 2007, 8: 245. 
10.1186\/1471-2105-8-245","journal-title":"BMC Bioinformatics"},{"issue":"5","key":"3228_CR36","doi-asserted-by":"publisher","first-page":"412","DOI":"10.1093\/bioinformatics\/16.5.412","volume":"16","author":"P Baldi","year":"2000","unstructured":"Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H: Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 2000, 16(5):412\u2013424. 10.1093\/bioinformatics\/16.5.412","journal-title":"Bioinformatics"},{"issue":"22","key":"3228_CR37","doi-asserted-by":"publisher","first-page":"2729","DOI":"10.1093\/bioinformatics\/btl423","volume":"22","author":"E Capriotti","year":"2006","unstructured":"Capriotti E, Calabrese R, Casadio R: Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics 2006, 22(22):2729\u20132734. 10.1093\/bioinformatics\/btl423","journal-title":"Bioinformatics"},{"issue":"8","key":"3228_CR38","doi-asserted-by":"publisher","first-page":"721","DOI":"10.1093\/bioinformatics\/17.8.721","volume":"17","author":"S Hua","year":"2001","unstructured":"Hua S, Sun Z: Support vector machine approach for protein subcellular localization prediction. Bioinformatics 2001, 17(8):721\u2013728. 10.1093\/bioinformatics\/17.8.721","journal-title":"Bioinformatics"},{"key":"3228_CR39","doi-asserted-by":"publisher","first-page":"450","DOI":"10.1186\/1471-2105-8-450","volume":"8","author":"J Tian","year":"2007","unstructured":"Tian J, Wu N, Guo X, Guo J, Zhang J, Fan Y: Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines. BMC Bioinformatics 2007, 8: 450. 
10.1186\/1471-2105-8-450","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"3228_CR40","doi-asserted-by":"publisher","first-page":"R9","DOI":"10.1016\/S1359-0278(98)00002-9","volume":"3","author":"AL Fink","year":"1998","unstructured":"Fink AL: Protein aggregation: folding aggregates, inclusion bodies and amyloid. Fold Des 1998, 3(1):R9\u201323. 10.1016\/S1359-0278(98)00002-9","journal-title":"Fold Des"},{"issue":"9","key":"3228_CR41","doi-asserted-by":"publisher","first-page":"620","DOI":"10.1021\/ar050067x","volume":"39","author":"F Bemporad","year":"2006","unstructured":"Bemporad F, Calloni G, Campioni S, Plakoutsi G, Taddei N, Chiti F: Sequence and structural determinants of amyloid fibril formation. Acc Chem Res 2006, 39(9):620\u2013627. 10.1021\/ar050067x","journal-title":"Acc Chem Res"},{"issue":"5","key":"3228_CR42","doi-asserted-by":"publisher","first-page":"1037","DOI":"10.1016\/j.jmb.2005.11.035","volume":"355","author":"F Rousseau","year":"2006","unstructured":"Rousseau F, Serrano L, Schymkowitz JW: How evolutionary pressure against protein aggregation shaped chaperone specificity. J Mol Biol 2006, 355(5):1037\u20131047. 10.1016\/j.jmb.2005.11.035","journal-title":"J Mol Biol"},{"issue":"1","key":"3228_CR43","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/75556","volume":"25","author":"M Ashburner","year":"2000","unstructured":"Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al.: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature genetics 2000, 25(1):25\u201329. 
10.1038\/75556","journal-title":"Nature genetics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-10-S1-S45.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T10:47:00Z","timestamp":1630493220000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-10-S1-S45"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,1]]},"references-count":43,"journal-issue":{"issue":"S1","published-print":{"date-parts":[[2009,1]]}},"alternative-id":["3228"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-10-s1-s45","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2009,1]]},"assertion":[{"value":"30 January 2009","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S45"}}