{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,5]],"date-time":"2026-04-05T19:05:37Z","timestamp":1775415937611,"version":"3.50.1"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"24","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2904,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Phosphorylation is a crucial post-translational protein modification mechanism with important regulatory functions in biological systems. It is catalyzed by a group of enzymes called kinases, each of which recognizes certain target sites in its substrate proteins. Several authors have built computational models trained from sets of experimentally validated phosphorylation sites to predict these target sites for each given kinase. All of these models suffer from certain limitations, such as the fact that they do not take into account the dependencies between amino acid motifs within protein sequences in a global fashion.<\/jats:p>\n               <jats:p>Results: We propose a novel approach to predict phosphorylation sites from the protein sequence. The method uses a positive dataset to train a conditional random field (CRF) model. The negative training dataset is used to specify the decision threshold corresponding to a desired false positive rate. Application of the method on experimentally verified benchmark phosphorylation data (Phospho.ELM) shows that it performs well compared to existing methods for most kinases. 
To our knowledge, this is the first report of the use of CRFs to predict post-translational modification sites in protein sequences.<\/jats:p>\n               <jats:p>Availability: The source code of the implementation, called CRPhos, is available from http:\/\/www.ptools.ua.ac.be\/CRPhos\/<\/jats:p>\n               <jats:p>Contact: kris.laukens@ua.ac.be<\/jats:p>\n               <jats:p>Supplementary Information: Supplementary data are available at http:\/\/www.ptools.ua.ac.be\/CRPhos\/<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn546","type":"journal-article","created":{"date-parts":[[2008,10,22]],"date-time":"2008-10-22T00:23:54Z","timestamp":1224635034000},"page":"2857-2864","source":"Crossref","is-referenced-by-count":56,"title":["Prediction of kinase-specific phosphorylation sites using conditional random fields"],"prefix":"10.1093","volume":"24","author":[{"given":"Thanh Hai","family":"Dang","sequence":"first","affiliation":[{"name":"1 Intelligent Systems Laboratory and 2Advanced Database Research and Modelling, Department of Mathematics and Computer Science, Middelheimlaan 1, B-2020 Antwerpen, Belgium"}]},{"given":"Koenraad","family":"Van Leemput","sequence":"additional","affiliation":[{"name":"1 Intelligent Systems Laboratory and 2Advanced Database Research and Modelling, Department of Mathematics and Computer Science, Middelheimlaan 1, B-2020 Antwerpen, Belgium"}]},{"given":"Alain","family":"Verschoren","sequence":"additional","affiliation":[{"name":"1 Intelligent Systems Laboratory and 2Advanced Database Research and Modelling, Department of Mathematics and Computer Science, Middelheimlaan 1, B-2020 Antwerpen, Belgium"}]},{"given":"Kris","family":"Laukens","sequence":"additional","affiliation":[{"name":"1 Intelligent Systems Laboratory and 2Advanced Database Research and Modelling, Department of Mathematics and Computer Science, Middelheimlaan 1, B-2020 Antwerpen, 
Belgium"}]}],"member":"286","published-online":{"date-parts":[[2008,10,20]]},"reference":[{"key":"2023020212304831200_B1","doi-asserted-by":"crossref","first-page":"1351","DOI":"10.1006\/jmbi.1999.3310","article-title":"Sequence and structure-based prediction of eukaryotic protein phosphorylation sites","volume":"294","author":"Blom","year":"1999","journal-title":"J. Mol. Biol."},{"key":"2023020212304831200_B2","doi-asserted-by":"crossref","first-page":"1633","DOI":"10.1002\/pmic.200300771","article-title":"Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence","volume":"4","author":"Blom","year":"2004","journal-title":"Proteomics"},{"key":"2023020212304831200_B3","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/nar\/gkg095","article-title":"The Swiss-Prot protein knowledgebase and its supplement TrEMBL in 2003","volume":"31","author":"Boeckmann","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020212304831200_B4","doi-asserted-by":"crossref","first-page":"i125","DOI":"10.1093\/bioinformatics\/btm187","article-title":"Kernel-based data fusion for gene prioritization","volume":"23","author":"De Bie","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020212304831200_B5","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1186\/1471-2105-5-79","article-title":"Phospho.ELM: a database of experimentally verified phosphorylation sites in eukaryotic proteins","volume":"5","author":"Diella","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023020212304831200_B6","doi-asserted-by":"crossref","first-page":"D240","DOI":"10.1093\/nar\/gkm772","volume":"36","author":"Diella","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023020212304831200_B7","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-3247-4","volume-title":"Statistical Methods in Bioinformatics: An 
Introduction.","author":"Ewens","year":"2001"},{"key":"2023020212304831200_B8","first-page":"584","article-title":"Information extraction with HMM structures learned by stochastic optimization","volume-title":"Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence.","author":"Freitag","year":"2000"},{"key":"2023020212304831200_B9","doi-asserted-by":"crossref","first-page":"R250","DOI":"10.1186\/gb-2007-8-11-r250","article-title":"PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites","volume":"8","author":"Gnad","year":"2007","journal-title":"Genome Biol"},{"key":"2023020212304831200_B10","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1093\/nar\/gkm812","article-title":"PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant specific phosphorylation site predictor","volume":"36","author":"Heazlewood","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023020212304831200_B11","doi-asserted-by":"crossref","first-page":"1551","DOI":"10.1002\/pmic.200300772","article-title":"PhosphoSite: a bioinformatics resource dedicated to physiological protein phosphorylation","volume":"4","author":"Hornbeck","year":"2004","journal-title":"Proteomics"},{"key":"2023020212304831200_B12","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1093\/nar\/gki471","article-title":"KinasePhos: a web tool for identifying protein kinase-specific phosphorylation sites","volume":"33","author":"Huang","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020212304831200_B13","doi-asserted-by":"crossref","first-page":"1032","DOI":"10.1002\/jcc.20235","article-title":"Incorporating hidden Markov model for identifying protein kinase-specific phosphorylation sites","volume":"26","author":"Huang","year":"2005","journal-title":"J. Comput. 
Chem"},{"key":"2023020212304831200_B14","doi-asserted-by":"crossref","first-page":"1037","DOI":"10.1093\/nar\/gkh253","article-title":"The importance of intrinsic disorder for protein phosphorylation","volume":"32","author":"Iakoucheva","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023020212304831200_B15","doi-asserted-by":"crossref","first-page":"895","DOI":"10.1093\/bioinformatics\/btm020","article-title":"NetPhosYeast: prediction of protein phosphorylation sites in yeast","volume":"7","author":"Ingrell","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020212304831200_B16","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/j.cbpa.2003.12.009","article-title":"Modification-specific proteomics: characterization of post-translational modifications by mass spectrometry","volume":"8","author":"Jensen","year":"2004","journal-title":"Curr. Opin. Chem. Biol"},{"key":"2023020212304831200_B17","doi-asserted-by":"crossref","first-page":"3179","DOI":"10.1093\/bioinformatics\/bth382","article-title":"Prediction of phosphorylation sites using SVMs","volume":"20","author":"Kim","year":"2004","journal-title":"Bioinformatics"},{"key":"2023020212304831200_B18","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1016\/j.bbapap.2005.07.036","article-title":"Substrate specificity of protein kinases and computational prediction of substrates","volume":"1754","author":"Kobe","year":"2005","journal-title":"Biochim. Biophys. 
Acta"},{"key":"2023020212304831200_B19","first-page":"282","article-title":"Conditional random fields: probabilistic models for segmenting and labeling sequence data","volume-title":"Proceedings of the Eighteenth International Conference on Machine Learning.","author":"Lafferty","year":"2001"},{"key":"2023020212304831200_B20","doi-asserted-by":"crossref","first-page":"1912","DOI":"10.1126\/science.1075762","article-title":"The protein kinase complement of the human genome","volume":"298","author":"Manning","year":"2002","journal-title":"Science"},{"key":"2023020212304831200_B21","first-page":"403","article-title":"Efficiently inducing features of conditional random fields","volume-title":"Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence.","author":"McCallum","year":"2003"},{"key":"2023020212304831200_B22","first-page":"591","article-title":"Maximum entropy Markov models for information extraction and segmentation","volume-title":"Proceedings of ICML 2000.","author":"McCallum","year":"2000"},{"key":"2023020212304831200_B23","first-page":"R23","article-title":"Spatial clustering of phosphorylation site recognition motifs can be exploited to predict the targets of cyclin-dependent kinase","volume":"8","author":"Moses","year":"2007"},{"key":"2023020212304831200_B24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1745-6150-2-1","article-title":"pkaPS: prediction of protein kinase A phosphorylation sites with the simplified kinase-substrate binding model","volume":"2","author":"Neuberger","year":"2007","journal-title":"Biol. 
Direct"},{"key":"2023020212304831200_B25","doi-asserted-by":"crossref","first-page":"3635","DOI":"10.1093\/nar\/gkg584","article-title":"Scansite 2.0: proteome-wide prediction of cell signaling interactions using short sequence motifs","volume":"31","author":"Obenauer","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020212304831200_B26","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1109\/34.588021","article-title":"Inducing features of random fields","volume":"19","author":"Pietra","year":"1997","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell"},{"key":"2023020212304831200_B27","first-page":"73","article-title":"A support vector machine approach to the identification of phosphorylation sites","volume":"10","author":"Plewczynski","year":"2005","journal-title":"Cell. Mol. Biol. Lett"},{"key":"2023020212304831200_B28","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1007\/s00894-007-0250-3","article-title":"Automotif server for prediction of phosphorylation sites in proteins using vector machine","volume":"14","author":"Plewczynski","year":"2008","journal-title":"J. Mol. 
Model"},{"key":"2023020212304831200_B29","doi-asserted-by":"crossref","DOI":"10.3115\/1073445.1073473","article-title":"Shallow parsing with conditional random fields","volume-title":"Proceedings of the 2003 Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics.","author":"Sha","year":"2003"},{"key":"2023020212304831200_B30","doi-asserted-by":"crossref","first-page":"e22","DOI":"10.1093\/nar\/gkm848","article-title":"Meta-prediction of phosphorylation sites with weighted voting and restricted grid search parameter selection","volume":"36","author":"Wan","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023020212304831200_B31","doi-asserted-by":"crossref","first-page":"W588","DOI":"10.1093\/nar\/gkm322","article-title":"KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns","volume":"35","author":"Wong","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020212304831200_B32","doi-asserted-by":"crossref","first-page":"W184","DOI":"10.1093\/nar\/gki393","article-title":"GPS: a comprehensive www server for phosphorylation sites prediction","volume":"33","author":"Xue","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023020212304831200_B33","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1186\/1471-2105-7-163","article-title":"PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory","volume":"7","author":"Xue","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020212304831200_B34","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/j.bbrc.2004.11.001","article-title":"GPS: a novel group-based phosphorylation predicting and scoring method","volume":"325","author":"Zhou","year":"2004","journal-title":"Biochem. Biophys. Res. 
Commun."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/24\/2857\/49056349\/bioinformatics_24_24_2857.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/24\/2857\/49056349\/bioinformatics_24_24_2857.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T15:17:01Z","timestamp":1675351021000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/24\/2857\/196927"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,10,20]]},"references-count":34,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2008,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn546","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,12,15]]},"published":{"date-parts":[[2008,10,20]]}}}