{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:04:53Z","timestamp":1773270293678,"version":"3.50.1"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Polypeptides are composed of amino acids covalently bonded via a peptide bond. The majority of peptide bonds in proteins is found to occur in the <jats:italic>trans<\/jats:italic> conformation. In spite of their infrequent occurrence, <jats:italic>cis<\/jats:italic> peptide bonds play a key role in the protein structure and function, as well as in many significant biological processes.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We perform a systematic analysis of regions in protein sequences that contain a proline <jats:italic>cis<\/jats:italic> peptide bond in order to discover non-random associations between the primary sequence and the nature of proline <jats:italic>cis\/trans<\/jats:italic> isomerization. For this purpose an efficient pattern discovery algorithm is employed which discovers regular expression-type patterns that are overrepresented (i.e. appear frequently repeated) in a set of sequences. Four types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, iii) pattern discovery using a structural equivalency set and iv) pattern discovery using certain amino acids' physicochemical properties. The extracted patterns are carefully validated using a specially implemented scoring function and a significance measure (i.e. log-probability estimate) indicative of their specificity. The score threshold for the first three types of pattern discovery is 0.90 while for the last type of pattern discovery 0.80. Regarding the significance measure, all patterns yielded values in the range [-9, -31] which ensure that the derived patterns are highly unlikely to have emerged by chance. Among the highest scoring patterns, most of them are consistent with previous investigations concerning the neighborhood of <jats:italic>cis<\/jats:italic> proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database, in order to gain insight into the functional implications of <jats:italic>cis<\/jats:italic> prolyl bonds.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>\n              <jats:italic>Cis<\/jats:italic> patterns with matches in the PROSITE database fell mostly into two main functional clusters: family signatures and protein signatures. However considerable propensity was also observed for targeting signals, active and phosphorylation sites as well as domain signatures.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-10-113","type":"journal-article","created":{"date-parts":[[2009,4,21]],"date-time":"2009-04-21T06:17:33Z","timestamp":1240294653000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation"],"prefix":"10.1186","volume":"10","author":[{"given":"Konstantinos P","family":"Exarchos","sequence":"first","affiliation":[]},{"given":"Themis P","family":"Exarchos","sequence":"additional","affiliation":[]},{"given":"Costas","family":"Papaloukas","sequence":"additional","affiliation":[]},{"given":"Anastassios N","family":"Troganis","sequence":"additional","affiliation":[]},{"given":"Dimitrios I","family":"Fotiadis","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2009,4,20]]},"reference":[{"issue":"1","key":"2843_CR1","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1016\/0022-2836(90)90159-J","volume":"214","author":"DE Stewart","year":"1990","unstructured":"Stewart DE, Sarkar A, Wampler JE: Occurrence and role of cis peptide bonds in protein structures. Journal of molecular biology 1990, 214(1):253\u2013260. 10.1016\/0022-2836(90)90159-J","journal-title":"Journal of molecular biology"},{"issue":"8","key":"2843_CR2","doi-asserted-by":"publisher","first-page":"676","DOI":"10.1038\/1368","volume":"5","author":"MS Weiss","year":"1998","unstructured":"Weiss MS, Jabs A, Hilgenfeld R: Peptide bonds revisited. Nature structural biology 1998, 5(8):676. 10.1038\/1368","journal-title":"Nature structural biology"},{"issue":"10","key":"2843_CR3","doi-asserted-by":"publisher","first-page":"619","DOI":"10.1038\/nchembio.2007.35","volume":"3","author":"KP Lu","year":"2007","unstructured":"Lu KP, Finn G, Lee TH, Nicholson LK: Prolyl cis-trans isomerization as a molecular timer. Nature chemical biology 2007, 3(10):619\u2013629. 10.1038\/nchembio.2007.35","journal-title":"Nature chemical biology"},{"issue":"3","key":"2843_CR4","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1002\/prot.20342","volume":"58","author":"S Lorenzen","year":"2005","unstructured":"Lorenzen S, Peters B, Goede A, Preissner R, Frommel C: Conservation of cis prolyl bonds in proteins during evolution. Proteins 2005, 58(3):589\u2013595. 10.1002\/prot.20342","journal-title":"Proteins"},{"issue":"1","key":"2843_CR5","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1006\/jmbi.1999.3217","volume":"294","author":"D Pal","year":"1999","unstructured":"Pal D, Chakrabarti P: Cis peptide bonds in proteins: residues involved, their conformations, interactions and locations. Journal of molecular biology 1999, 294(1):271\u2013288. 10.1006\/jmbi.1999.3217","journal-title":"Journal of molecular biology"},{"issue":"7","key":"2843_CR6","doi-asserted-by":"publisher","first-page":"2475","DOI":"10.1021\/cr0104375","volume":"103","author":"C Dugave","year":"2003","unstructured":"Dugave C, Demange L: Cis-trans isomerization of organic molecules and biomolecules: implications and applications. Chemical reviews 2003, 103(7):2475\u20132532. 10.1021\/cr0104375","journal-title":"Chemical reviews"},{"issue":"2","key":"2843_CR7","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1096\/fasebj.14.2.231","volume":"14","author":"BK Kay","year":"2000","unstructured":"Kay BK, Williamson MP, Sudol M: The importance of being proline: the interaction of proline-rich motifs in signaling proteins with their cognate domains. Faseb J 2000, 14(2):231\u2013241.","journal-title":"Faseb J"},{"issue":"2","key":"2843_CR8","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1016\/j.bpc.2004.05.006","volume":"111","author":"YK Kang","year":"2004","unstructured":"Kang YK, Choi HY: Cis-trans isomerization and puckering of proline residue. Biophysical chemistry 2004, 111(2):135\u2013142. 10.1016\/j.bpc.2004.05.006","journal-title":"Biophysical chemistry"},{"issue":"3","key":"2843_CR9","doi-asserted-by":"publisher","first-page":"725","DOI":"10.1016\/0022-2836(92)90859-I","volume":"228","author":"EJ Milner-White","year":"1992","unstructured":"Milner-White EJ, Bell LH, Maccallum PH: Pyrrolidine ring puckering in cis and trans-proline residues in proteins and polypeptides. Different puckers are favoured in certain situations. Journal of molecular biology 1992, 228(3):725\u2013734. 10.1016\/0022-2836(92)90859-I","journal-title":"Journal of molecular biology"},{"issue":"12","key":"2843_CR10","doi-asserted-by":"publisher","first-page":"2627","DOI":"10.1110\/ps.ps.26601a","volume":"10","author":"L Vitagliano","year":"2001","unstructured":"Vitagliano L, Berisio R, Mastrangelo A, Mazzarella L, Zagari A: Preferred proline puckerings in cis and trans peptide groups: implications for collagen stability. Protein Sci 2001, 10(12):2627\u20132632.","journal-title":"Protein Sci"},{"issue":"12","key":"2843_CR11","doi-asserted-by":"publisher","first-page":"2623","DOI":"10.1002\/bip.1981.360201209","volume":"20","author":"C Grathwohl","year":"1981","unstructured":"Grathwohl C, Wuethrich K: NMR studies of the rates of proline cis-trans isomerization in oligopeptides. Biopolymers 1981, 20(12):2623\u20132633. 10.1002\/bip.1981.360201209","journal-title":"Biopolymers"},{"issue":"1\u20132","key":"2843_CR12","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1016\/0014-5793(90)80833-5","volume":"277","author":"C Frommel","year":"1990","unstructured":"Frommel C, Preissner R: Prediction of prolyl residues in cis-conformation in protein structures on the basis of the amino acid sequence. FEBS letters 1990, 277(1\u20132):159\u2013163. 10.1016\/0014-5793(90)80833-5","journal-title":"FEBS letters"},{"issue":"1","key":"2843_CR13","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1046\/j.1399-3011.2004.00100.x","volume":"63","author":"ML Wang","year":"2004","unstructured":"Wang ML, Li WJ, Wang ML, Xu WB: Support vector machines for prediction of peptidyl prolyl cis\/trans isomerization. J Pept Res 2004, 63(1):23\u201328. 10.1046\/j.1399-3011.2004.00100.x","journal-title":"J Pept Res"},{"key":"2843_CR14","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1186\/1471-2105-7-124","volume":"7","author":"J Song","year":"2006","unstructured":"Song J, Burrage K, Yuan Z, Huber T: Prediction of cis\/trans isomerization in proteins using PSI-BLAST profiles and secondary structure information. BMC bioinformatics 2006, 7: 124. 10.1186\/1471-2105-7-124","journal-title":"BMC bioinformatics"},{"issue":"5","key":"2843_CR15","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1093\/bioinformatics\/bti089","volume":"21","author":"D Pahlke","year":"2005","unstructured":"Pahlke D, Leitner D, Wiedemann U, Labudde D: COPS \u2013 cis\/trans peptide bond conformation prediction of amino acids on the basis of secondary structure information. Bioinformatics (Oxford, England) 2005, 21(5):685\u2013686. 10.1093\/bioinformatics\/bti089","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2843_CR16","doi-asserted-by":"publisher","first-page":"5009","DOI":"10.1109\/IEMBS.2007.4353465","volume-title":"Conf Proc IEEE Eng Med Biol Soc: 2007; Lyon, France","author":"KP Exarchos","year":"2007","unstructured":"Exarchos KP, Exarchos TP, Papaloukas C, Troganis AN, Fotiadis DI: Predicting peptide bond conformation using feature selection and the Naive Bayes approach. Conf Proc IEEE Eng Med Biol Soc: 2007; Lyon, France 2007, 5009\u20135012."},{"issue":"1","key":"2843_CR17","doi-asserted-by":"publisher","first-page":"140","DOI":"10.1016\/j.jbi.2008.05.006","volume":"42","author":"KP Exarchos","year":"2009","unstructured":"Exarchos KP, Papaloukas C, Exarchos TP, Troganis AN, Fotiadis DI: Prediction of cis\/trans isomerization using feature selection and support vector machines. J Biomed Inform 2009, 42(1):140\u2013149. 10.1016\/j.jbi.2008.05.006","journal-title":"J Biomed Inform"},{"key":"2843_CR18","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1186\/1472-6807-5-8","volume":"5","author":"D Pahlke","year":"2005","unstructured":"Pahlke D, Freund C, Leitner D, Labudde D: Statistically significant dependence of the Xaa-Pro peptide bond conformation on secondary structure and amino acid sequence. BMC structural biology 2005, 5: 8. 10.1186\/1472-6807-5-8","journal-title":"BMC structural biology"},{"issue":"7065","key":"2843_CR19","doi-asserted-by":"publisher","first-page":"248","DOI":"10.1038\/nature04130","volume":"438","author":"SC Lummis","year":"2005","unstructured":"Lummis SC, Beene DL, Lee LW, Lester HA, Broadhurst RW, Dougherty DA: Cis-trans isomerization at a proline opens the pore of a neurotransmitter-gated ion channel. Nature 2005, 438(7065):248\u2013252. 10.1038\/nature04130","journal-title":"Nature"},{"issue":"2","key":"2843_CR20","doi-asserted-by":"publisher","first-page":"337","DOI":"10.1016\/S0022-2836(05)80195-0","volume":"213","author":"MJ Rooman","year":"1990","unstructured":"Rooman MJ, Rodriguez J, Wodak SJ: Relations between protein sequence and structure and their significance. Journal of molecular biology 1990, 213(2):337\u2013350. 10.1016\/S0022-2836(05)80195-0","journal-title":"Journal of molecular biology"},{"key":"2843_CR21","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1002\/prot.340090108","volume":"9","author":"MJ Rooman","year":"1991","unstructured":"Rooman MJ, Wodak SJ: Weak Correlation Between Predictive Power Of Individual Sequence Patterns and Overall Prediction Accuracy in Proteins. Proteins: Structure, Function, and Genetics 1991, 9: 69\u201378. 10.1002\/prot.340090108","journal-title":"Proteins: Structure, Function, and Genetics"},{"issue":"1","key":"2843_CR22","doi-asserted-by":"publisher","first-page":"144","DOI":"10.1002\/prot.20279","volume":"58","author":"S Lise","year":"2005","unstructured":"Lise S, Jones DT: Sequence patterns associated with disordered regions in proteins. PROTEINS: Structure, Function, and Bioinformatics 2005, 58(1):144\u2013150. 10.1002\/prot.20279","journal-title":"PROTEINS: Structure, Function, and Bioinformatics"},{"key":"2843_CR23","volume-title":"Genomics and proteomics engineering in medicine and biology","author":"M Akay","year":"2007","unstructured":"Akay M: Genomics and proteomics engineering in medicine and biology. Edited by: Piscataway NJ, Hoboken NJ. IEEE Press; John Wiley & Sons, Inc; 2007."},{"issue":"6","key":"2843_CR24","doi-asserted-by":"publisher","first-page":"451","DOI":"10.1016\/S1359-0278(96)00061-2","volume":"1","author":"A Elofsson","year":"1996","unstructured":"Elofsson A, Fischer D, Rice DW, Le Grand SM, Eisenberg D: A study of combined structure\/sequence profiles. Folding and Design 1996, 1(6):451\u2013461. 10.1016\/S1359-0278(96)00061-2","journal-title":"Folding and Design"},{"key":"2843_CR25","doi-asserted-by":"crossref","unstructured":"Hulo N, Bairoch A, Bulliard V, Cerutti L, Cuche BA, de Castro E, Lachaize C, Langendijk-Genevaux PS, Sigrist CJ: The 20 years of PROSITE. Nucleic acids research 2008, (36 Database):D245\u2013249.","DOI":"10.1093\/nar\/gkm977"},{"issue":"1","key":"2843_CR26","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","volume":"28","author":"HM Berman","year":"2000","unstructured":"Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic acids research 2000, 28(1):235\u2013242. 10.1093\/nar\/28.1.235","journal-title":"Nucleic acids research"},{"key":"2843_CR27","doi-asserted-by":"crossref","unstructured":"Wang G, Dunbrack RL Jr: PISCES: recent improvements to a PDB sequence culling server. Nucleic acids research 2005, (33 Web Server):W94\u201398. 10.1093\/nar\/gki402","DOI":"10.1093\/nar\/gki402"},{"issue":"13","key":"2843_CR28","doi-asserted-by":"publisher","first-page":"3316","DOI":"10.1093\/nar\/gkg565","volume":"31","author":"L Willard","year":"2003","unstructured":"Willard L, Ranjan A, Zhang H, Monzavi H, Boyko RF, Sykes BD, Wishart DS: VADAR: a web server for quantitative evaluation of protein structure quality. Nucleic acids research 2003, 31(13):3316\u20133319. 10.1093\/nar\/gkg565","journal-title":"Nucleic acids research"},{"issue":"1","key":"2843_CR29","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1093\/bioinformatics\/14.1.55","volume":"14","author":"I Rigoutsos","year":"1998","unstructured":"Rigoutsos I, Floratos A: Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm. Bioinformatics (Oxford, England) 1998, 14(1):55\u201367. 10.1093\/bioinformatics\/14.1.55","journal-title":"Bioinformatics (Oxford, England)"},{"issue":"2","key":"2843_CR30","doi-asserted-by":"publisher","first-page":"264","DOI":"10.1002\/(SICI)1097-0134(19991101)37:2<264::AID-PROT11>3.0.CO;2-C","volume":"37","author":"I Rigoutsos","year":"1999","unstructured":"Rigoutsos I, Floratos A, Ouzounis C, Gao Y, Parida L: Dictionary building via unsupervised hierarchical motif discovery in the sequence space of natural proteins. Proteins 1999, 37(2):264\u2013277. 10.1002\/(SICI)1097-0134(19991101)37:2<264::AID-PROT11>3.0.CO;2-C","journal-title":"Proteins"},{"key":"2843_CR31","first-page":"D26","volume-title":"Nucleic acids research","author":"DA Benson","year":"2009","unstructured":"Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW: GenBank. Nucleic acids research 2009, (37 Database):D26\u201331. 10.1093\/nar\/gkn723"},{"key":"2843_CR32","doi-asserted-by":"publisher","first-page":"1188","DOI":"10.1101\/gr.849004","volume":"14","author":"GE Crooks","year":"2004","unstructured":"Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: A Sequence Logo Generator. Genome Research 2004, 14: 1188\u20131190. 10.1101\/gr.849004","journal-title":"Genome Research"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-10-113.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T21:38:04Z","timestamp":1630445884000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-10-113"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,4,20]]},"references-count":32,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,12]]}},"alternative-id":["2843"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-10-113","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2009,4,20]]},"assertion":[{"value":"26 December 2008","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 April 2009","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 April 2009","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"113"}}