{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,2,19]],"date-time":"2023-02-19T18:09:09Z","timestamp":1676830149473},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"23","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Regulatory proteases modulate proteomic dynamics with a spectrum of specificities against substrate proteins. Predictions of the substrate sites in a proteome for the proteases would facilitate understanding the biological functions of the proteases. High-throughput experiments could generate suitable datasets for machine learning to grasp complex relationships between the substrate sequences and the enzymatic specificities. But the capability in predicting protease substrate sites by integrating the machine learning algorithms with the experimental methodology has yet to be demonstrated.<\/jats:p>\n               <jats:p>Results: Factor Xa, a key regulatory protease in the blood coagulation system, was used as model system, for which effective substrate site predictors were developed and benchmarked. The predictors were derived from bootstrap aggregation (machine learning) algorithms trained with data obtained from multilevel substrate phage display experiments. 
The experimental sampling and computational learning on substrate specificities can be generalized to proteases for which the active forms are available for the in vitro experiments.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/asqa.iis.sinica.edu.tw\/fXaWeb\/<\/jats:p>\n               <jats:p>Contact: \u00a0hsu@iis.sinica.edu.tw; yangas@gate.sinica.edu.tw<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn538","type":"journal-article","created":{"date-parts":[[2008,10,30]],"date-time":"2008-10-30T00:25:35Z","timestamp":1225326335000},"page":"2691-2697","source":"Crossref","is-referenced-by-count":18,"title":["Protease substrate site predictors derived from machine learning on multilevel substrate phage display data"],"prefix":"10.1093","volume":"24","author":[{"given":"Ching-Tai","family":"Chen","sequence":"first","affiliation":[{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life Sciences, National Defense Medical University, Taipei 114, Taiwan"},{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life Sciences, National Defense Medical University, Taipei 114, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ei-Wen","family":"Yang","sequence":"additional","affiliation":[{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life Sciences, National Defense Medical 
University, Taipei 114, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hung-Ju","family":"Hsu","sequence":"additional","affiliation":[{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life Sciences, National Defense Medical University, Taipei 114, Taiwan"},{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life Sciences, National Defense Medical University, Taipei 114, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yi-Kun","family":"Sun","sequence":"additional","affiliation":[{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life Sciences, National Defense Medical University, Taipei 114, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wen-Lian","family":"Hsu","sequence":"additional","affiliation":[{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life Sciences, National Defense Medical University, Taipei 114, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"An-Suei","family":"Yang","sequence":"additional","affiliation":[{"name":"1 Institute of Information Science, Academia Sinica, Taipei 115, 2Institute of bioinformatics, National Chiao Tung University, Hsin Chu 300, 3Genomics Research Center, Academia Sinica, Taipei 115 and 4Graduate Institute of Life 
Sciences, National Defense Medical University, Taipei 114, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2008,10,29]]},"reference":[{"key":"2023020212243515800_B1","doi-asserted-by":"crossref","first-page":"W208","DOI":"10.1093\/nar\/gki433","article-title":"GraBCas: a bioinformatics tool for score-based prediction of Caspase-and Granzyme B-cleavage sites in protein sequences","volume":"33","author":"Backes","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023020212243515800_B2","first-page":"372","article-title":"PoPS: a computational tool for modeling and predicting protease specificity","volume-title":"Proceedings of the IEEE Computational Systems Bioinformatics Conference","author":"Boyd","year":"2004"},{"key":"2023020212243515800_B3","doi-asserted-by":"crossref","first-page":"29988","DOI":"10.1074\/jbc.271.47.29988","article-title":"X-ray structure of active site-inhibited clotting factor Xa. Implications for drug design and substrate recognition","volume":"271","author":"Brandstetter","year":"1996","journal-title":"J. Biol. Chem"},{"key":"2023020212243515800_B4","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn."},{"key":"2023020212243515800_B5","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1023\/A:1009715923555","article-title":"A tutorial on support vector machines for pattern recognition","volume":"2","author":"Burges","year":"1998","journal-title":"Data Min. Knowl. 
Discov."},{"key":"2023020212243515800_B6","volume-title":"LIBSVM: A library for support vector machines","author":"Chang","year":"2001"},{"key":"2023020212243515800_B7","doi-asserted-by":"crossref","first-page":"24074","DOI":"10.1074\/jbc.274.34.24074","article-title":"Revisiting catalysis by chymotrypsin family serine proteases using peptide substrates and inhibitors with unnatural main chains","volume":"274","author":"Coombs","year":"1999","journal-title":"J. Biol. Chem."},{"key":"2023020212243515800_B8","doi-asserted-by":"crossref","first-page":"1107","DOI":"10.1515\/BC.2002.119","article-title":"Phage display substrate: a blind method for determining protease specificity","volume":"383","author":"Deperthes","year":"2002","journal-title":"Biol. Chem"},{"key":"2023020212243515800_B9","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1016\/j.bioorg.2006.10.002","article-title":"Direct crystallographic observation of an acyl-enzyme intermediate in the elastase-catalyzed hydrolysis of a peptidyl ester substrate: exploiting the \u201cglass transition\u201d in protein dynamics","volume":"34","author":"Ding","year":"2006","journal-title":"Bioorg. Chem."},{"issue":"Suppl. 
1","key":"2023020212243515800_B10","doi-asserted-by":"crossref","first-page":"i169","DOI":"10.1093\/bioinformatics\/bti1034","article-title":"CaSPredictor: a new computer-based tool for caspase substrate prediction","volume":"21","author":"Garay-Malpartida","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020212243515800_B11","doi-asserted-by":"crossref","first-page":"1292","DOI":"10.1002\/pmic.200401011","article-title":"Profiling serine protease substrate specificity with solution phase fluorogenic peptide microarrays","volume":"5","author":"Gosalia","year":"2005","journal-title":"Proteomics"},{"key":"2023020212243515800_B12","doi-asserted-by":"crossref","first-page":"2471","DOI":"10.2174\/092986707782023659","article-title":"The discovery of the Factor Xa inhibitor otamixaban: from lead identification to clinical development","volume":"14","author":"Guertin","year":"2007","journal-title":"Curr. Med. Chem."},{"key":"2023020212243515800_B13","doi-asserted-by":"crossref","first-page":"7754","DOI":"10.1073\/pnas.140132697","article-title":"Rapid and general profiling of protease specificity by using combinatorial fluorogenic substrate libraries","volume":"97","author":"Harris","year":"2000","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020212243515800_B14","doi-asserted-by":"crossref","first-page":"4501","DOI":"10.1021\/cr000033x","article-title":"Serine protease mechanism and specificity","volume":"102","author":"Hedstrom","year":"2002","journal-title":"Chem. 
Rev"},{"key":"2023020212243515800_B15","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1016\/0268-960X(94)90007-8","article-title":"Biochemistry of factor X","volume":"8","author":"Hertzberg","year":"1994","journal-title":"Blood Rev."},{"key":"2023020212243515800_B16","doi-asserted-by":"crossref","first-page":"12343","DOI":"10.1074\/jbc.M708843200","article-title":"Factor Xa active site substrate specificity with substrate phage display and computational molecular modeling","volume":"283","author":"Hsu","year":"2008","journal-title":"J. Biol. Chem."},{"key":"2023020212243515800_B17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S1046-5928(03)00168-2","article-title":"A critical review of the methods for cleavage of fusion proteins with thrombin and factor Xa","volume":"31","author":"Jenny","year":"2003","journal-title":"Protein Expr. Purif."},{"key":"2023020212243515800_B18","doi-asserted-by":"crossref","first-page":"139","DOI":"10.4161\/cbt.4.2.1508","article-title":"Overview of cell death signaling pathways","volume":"4","author":"Jin","year":"2005","journal-title":"Cancer Biol. Ther."},{"key":"2023020212243515800_B19","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1093\/nar\/28.1.374","article-title":"AAindex: amino acid index database","volume":"28","author":"Kawashima","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023020212243515800_B20","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1016\/S0167-4838(99)00284-8","article-title":"What can the structures of enzyme-inhibitor complexes tell us about the structures of enzyme substrate complexes?","volume":"1477","author":"Laskowski","year":"2000","journal-title":"Biochim. Biophys. 
Acta"},{"key":"2023020212243515800_B21","doi-asserted-by":"crossref","first-page":"3227","DOI":"10.1093\/bioinformatics\/bti524","article-title":"HYPROSP II \u2013 a knowledge-based hybrid method for protein secondary structure prediction based on local prediction confidence","volume":"21","author":"Lin","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020212243515800_B22","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1186\/1471-2105-7-182","article-title":"Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models","volume":"7","author":"Liu","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020212243515800_B23","volume-title":"An Introduction to Information Retrieval","author":"Manning","year":"2007"},{"key":"2023020212243515800_B24","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.tibtech.2004.12.010","article-title":"Papa's got a brand new tag: advances in identification of proteases and their substrates","volume":"23","author":"Marnett","year":"2005","journal-title":"Trends Biotechnol"},{"key":"2023020212243515800_B25","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","article-title":"Comparison of predicted and observed secondary structure of T4 phage lysozyme","volume":"405","author":"Matthews","year":"1975","journal-title":"Biochim. Biophys. Acta"},{"key":"2023020212243515800_B26","doi-asserted-by":"crossref","first-page":"1113","DOI":"10.1126\/science.8493554","article-title":"Substrate phage: selection of protease substrates by monovalent phage display","volume":"260","author":"Matthews","year":"1993","journal-title":"Science"},{"issue":"Suppl. 
1","key":"2023020212243515800_B27","doi-asserted-by":"crossref","first-page":"S5","DOI":"10.1093\/bioinformatics\/18.suppl_1.S5","article-title":"Mining viral protease data to extract cleavage knowledge","volume":"18","author":"Narayanan","year":"2002","journal-title":"Bioinformatics"},{"key":"2023020212243515800_B28","doi-asserted-by":"crossref","first-page":"573","DOI":"10.2174\/1386207013330788","article-title":"Substrate phage as a tool to identify novel substrate sequences of proteases","volume":"4","author":"Ohkubo","year":"2001","journal-title":"Comb. Chem. High Throughput Screen"},{"key":"2023020212243515800_B29","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1038\/cr.2008.17","article-title":"Intracellular protease activation in apoptosis and cell-mediated cytotoxicity characterized by cell-permeable fluorogenic protease substrates","volume":"18","author":"Packard","year":"2008","journal-title":"Cell Res"},{"key":"2023020212243515800_B30","first-page":"392","article-title":"Advances in gamma-secretase modulation","volume":"10","author":"Pissarnitski","year":"2007","journal-title":"Curr. Opin. Drug Discov. Devel."},{"key":"2023020212243515800_B31","doi-asserted-by":"crossref","first-page":"D320","DOI":"10.1093\/nar\/gkm954","article-title":"MEROPS: the peptidase database","volume":"36","author":"Rawlings","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023020212243515800_B32","volume-title":"Learning Internal Representations by Error Propagation","author":"Rumelhart","year":"1986"},{"key":"2023020212243515800_B33","doi-asserted-by":"crossref","first-page":"14868","DOI":"10.1021\/ja027477q","article-title":"Peptide microarrays for the determination of protease substrate specificity","volume":"124","author":"Salisbury","year":"2002","journal-title":"J. Am. Chem. 
Soc."},{"key":"2023020212243515800_B34","doi-asserted-by":"crossref","first-page":"10788","DOI":"10.1074\/jbc.M011772200","article-title":"Reaction kinetics of protease with substrate phage. Kinetic model developed using stromelysin","volume":"276","author":"Sharkov","year":"2001","journal-title":"J. Biol. Chem."},{"key":"2023020212243515800_B35","doi-asserted-by":"crossref","first-page":"6440","DOI":"10.1074\/jbc.270.12.6440","article-title":"Rapid identification of highly active and selective substrates for stromelysin and matrilysin using bacteriophage peptide display libraries","volume":"270","author":"Smith","year":"1995","journal-title":"J. Biol. Chem."},{"key":"2023020212243515800_B36","doi-asserted-by":"crossref","first-page":"973","DOI":"10.1021\/cr040669e","article-title":"Proteases universally recognize beta strands in their active sites","volume":"105","author":"Tyndall","year":"2005","journal-title":"Chem. Rev."},{"key":"2023020212243515800_B37","first-page":"975","article-title":"Probability estimates for multi-class classification by pairwise coupling","volume":"5","author":"Wu","year":"2004","journal-title":"J. Mach. Learn. 
Res."},{"key":"2023020212243515800_B38","doi-asserted-by":"crossref","first-page":"2644","DOI":"10.1093\/bioinformatics\/bti404","article-title":"Mining SARS-CoV protease cleavage data using non-orthogonal decision trees: a novel method for decisive template selection","volume":"21","author":"Yang","year":"2005","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/23\/2691\/49056215\/bioinformatics_24_23_2691.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/23\/2691\/49056215\/bioinformatics_24_23_2691.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T15:09:28Z","timestamp":1675350568000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/23\/2691\/181331"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,10,29]]},"references-count":38,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2008,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn538","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,12,1]]},"published":{"date-parts":[[2008,10,29]]}}}