{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T14:20:49Z","timestamp":1771078849698,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: One of the major problems in shotgun proteomics is the low peptide coverage when analyzing complex protein samples. Identifying more peptides, e.g. non-tryptic peptides, may increase the peptide coverage and improve protein identification and\/or quantification that are based on the peptide identification results. Searching for all potential non-tryptic peptides is, however, time consuming for shotgun proteomics data from complex samples, and poses a challenge for a routine data analysis.<\/jats:p>\n               <jats:p>Results: We hypothesize that non-tryptic peptides are mainly created from the truncation of regular tryptic peptides before separation. We introduce the notion of truncatability of a tryptic peptide, i.e. the probability of the peptide to be identified in its truncated form, and build a predictor to estimate a peptide's truncatability from its sequence. We show that our predictions achieve useful accuracy, with the area under the ROC curve from 76% to 87%, and can be used to filter the sequence database for identifying truncated peptides. After filtering, only a limited number of tryptic peptides with the highest truncatability are retained for non-tryptic peptide searching. By applying this method to identification of semi-tryptic peptides, we show that a significant number of such peptides can be identified within a searching time comparable to that of tryptic peptide identification.<\/jats:p>\n               <jats:p>Contact: \u00a0predrag@indiana.edu; rarnold@indiana.edu; hatang@indiana.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm545","type":"journal-article","created":{"date-parts":[[2007,11,23]],"date-time":"2007-11-23T01:33:56Z","timestamp":1195781636000},"page":"102-109","source":"Crossref","is-referenced-by-count":48,"title":["Fast and accurate identification of semi-tryptic peptides in shotgun proteomics"],"prefix":"10.1093","volume":"24","author":[{"given":"Pedro","family":"Alves","sequence":"first","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Randy J.","family":"Arnold","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"},{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David E.","family":"Clemmer","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"},{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yixue","family":"Li","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"James P.","family":"Reilly","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"},{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Quanhu","family":"Sheng","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"},{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"},{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haixu","family":"Tang","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"},{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"},{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhiyin","family":"Xun","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rong","family":"Zeng","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Predrag","family":"Radivojac","sequence":"additional","affiliation":[{"name":"1 School of Informatics, 2Department of Chemistry, 3Department of Biology, Center for Genomics and Bioinformatics, 4National Center for Glycomics & Glycoproteomics, Indiana University, Bloomington, IN, USA and 5Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2007,11,22]]},"reference":[{"key":"2023020209444896300_B1","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1038\/nature01511","article-title":"Mass spectrometry-based proteomics","volume":"422","author":"Aebersold","year":"2003","journal-title":"Nature"},{"key":"2023020209444896300_B2","first-page":"409","article-title":"Advancements in protein identification from shotgun proteomics using predicted peptide detectability","volume":"12","author":"Alves","year":"2007","journal-title":"Pac. Symp. Biocomput"},{"key":"2023020209444896300_B3","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1146\/annurev.bi.34.070165.000405","article-title":"Mechanism of action of proteolytic enzymes","volume":"34","author":"Bender","year":"1965","journal-title":"Annu. Rev. Biochem"},{"key":"2023020209444896300_B4","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1073\/pnas.81.1.140","article-title":"The hydrophobic moment detects periodicity in protein hydrophobicity","volume":"81","author":"Eisenberg","year":"1984","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020209444896300_B5","doi-asserted-by":"crossref","first-page":"964","DOI":"10.1021\/ac048788h","article-title":"PepNovo: de novo peptide sequencing via probabilistic network modeling","volume":"77","author":"Frank","year":"2005","journal-title":"Anal. Chem"},{"key":"2023020209444896300_B6","doi-asserted-by":"crossref","first-page":"1287","DOI":"10.1021\/pr050011x","article-title":"Peptide sequence tags for fast database search in mass-spectrometry","volume":"4","author":"Frank","year":"2005","journal-title":"J. Proteome Res"},{"key":"2023020209444896300_B7","doi-asserted-by":"crossref","first-page":"2839","DOI":"10.1021\/pr060328c","article-title":"Proteomic studies of the intrinsically unstructured mammalian proteome","volume":"5","author":"Galea","year":"2006","journal-title":"J. Proteome Res"},{"key":"2023020209444896300_B8","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1038\/nbt1270","article-title":"Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation","volume":"25","author":"Lu","year":"2007","journal-title":"Nat. Biotechnol"},{"key":"2023020209444896300_B9","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1038\/nbt1275","article-title":"Computational prediction of proteotypic peptides for quantitative proteomics","volume":"25","author":"Mallick","year":"2007","journal-title":"Nat. Biotechnol"},{"key":"2023020209444896300_B10","doi-asserted-by":"crossref","first-page":"566","DOI":"10.1002\/prot.10532","article-title":"Predicting intrinsic disorder from amino acid sequence","volume":"53","author":"Obradovic","year":"2003","journal-title":"Proteins"},{"key":"2023020209444896300_B11","doi-asserted-by":"crossref","first-page":"608","DOI":"10.1074\/mcp.T400003-MCP200","article-title":"Trypsin cleaves exclusively C-terminal to arginine and lysine residues","volume":"3","author":"Olsen","year":"2004","journal-title":"Mol. Cell Proteomics"},{"key":"2023020209444896300_B12","doi-asserted-by":"crossref","first-page":"3551","DOI":"10.1002\/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2","article-title":"Probability-based protein identification by searching sequence databases using mass spectrometry data","volume":"20","author":"Perkins","year":"1999","journal-title":"Electrophoresis"},{"key":"2023020209444896300_B13","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1110\/ps.03128904","article-title":"Protein flexibility and intrinsic disorder","volume":"13","author":"Radivojac","year":"2004","journal-title":"Protein Sci"},{"key":"2023020209444896300_B14","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1016\/j.febslet.2004.12.001","article-title":"Proteomics strategies for protein identification","volume":"579","author":"Resing","year":"2005","journal-title":"FEBS Lett"},{"key":"2023020209444896300_B15","doi-asserted-by":"crossref","first-page":"586","DOI":"10.1109\/ICNN.1993.298623","article-title":"A direct adaptive method for faster backpropagation learning: the RPROP algorithm","volume":"1","author":"Riedmiller","year":"1993","journal-title":"Proc. IEEE Int. Conf. Neural Netw"},{"key":"2023020209444896300_B16","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1002\/1097-0134(20010101)42:1<38::AID-PROT50>3.0.CO;2-3","article-title":"Sequence complexity of disordered protein","volume":"42","author":"Romero","year":"2001","journal-title":"Proteins"},{"key":"2023020209444896300_B17","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/S0074-7742(04)61006-3","article-title":"Proteomic informatics","volume":"61","author":"Russell","year":"2004","journal-title":"Int. Rev. Neurobiol"},{"key":"2023020209444896300_B18","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1038\/nbt1183","article-title":"Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study","volume":"24","author":"States","year":"2006","journal-title":"Nat. Biotechnol"},{"key":"2023020209444896300_B19","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1021\/ac051348l","article-title":"Efficient and specific trypsin digestion of microgram to nanogram quantities of proteins in organic-aqueous solvent systems","volume":"78","author":"Strader","year":"2006","journal-title":"Anal. Chem"},{"key":"2023020209444896300_B20","doi-asserted-by":"crossref","first-page":"e481","DOI":"10.1093\/bioinformatics\/btl237","article-title":"A computational approach toward label-free protein quantification using predicted peptide detectability","volume":"22","author":"Tang","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020209444896300_B21","doi-asserted-by":"crossref","first-page":"1562","DOI":"10.1038\/nbt1168","article-title":"Identification of post-translational modifications by blind search of mass spectra","volume":"23","author":"Tsur","year":"2005","journal-title":"Nat. Biotechnol"},{"key":"2023020209444896300_B22","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1002\/prot.340190207","article-title":"Accuracy of protein flexibility predictions","volume":"19","author":"Vihinen","year":"1994","journal-title":"Proteins"},{"key":"2023020209444896300_B23","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1002\/prot.10437","article-title":"Flavors of protein disorder","volume":"52","author":"Vucetic","year":"2003","journal-title":"Proteins"},{"key":"2023020209444896300_B24","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1146\/annurev.biophys.33.111502.082538","article-title":"Mass spectral analysis in proteomics","volume":"33","author":"Yates","year":"2004","journal-title":"Annu. Rev. Biophys. Biomol. Struct"},{"key":"2023020209444896300_B25","doi-asserted-by":"crossref","first-page":"1426","DOI":"10.1021\/ac00104a020","article-title":"Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database","volume":"67","author":"Yates","year":"1995","journal-title":"Anal. Chem"},{"key":"2023020209444896300_B26","doi-asserted-by":"crossref","first-page":"3549","DOI":"10.1021\/pr070230d","article-title":"Proteomic parsimony through bipartite graph analysis improves accuracy and transparency","volume":"6","author":"Zhang","year":"2007","journal-title":"J. Proteome Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/1\/102\/49044820\/bioinformatics_24_1_102.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/1\/102\/49044820\/bioinformatics_24_1_102.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T10:08:19Z","timestamp":1675332499000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/1\/102\/206010"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,11,22]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm545","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,1,1]]},"published":{"date-parts":[[2007,11,22]]}}}