{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T15:52:36Z","timestamp":1775058756232,"version":"3.50.1"},"reference-count":55,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2019,3,23]],"date-time":"2019-03-23T00:00:00Z","timestamp":1553299200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australia Research Council","doi-asserted-by":"crossref","award":["DP180102060"],"award-info":[{"award-number":["DP180102060"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100000925","name":"National Health and Medical Research Council","doi-asserted-by":"publisher","award":["1121629"],"award-info":[{"award-number":["1121629"]}],"id":[{"id":"10.13039\/501100000925","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Institute for Glycomics"},{"name":"Australian Government Research Training Program Scholarship"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Protein glycosylation is one of the most abundant post-translational modifications that plays an important role in immune responses, intercellular signaling, inflammation and host-pathogen interactions. However, due to the poor ionization efficiency and microheterogeneity of glycopeptides identifying glycosylation sites is a challenging task, and there is a demand for computational methods. Here, we constructed the largest dataset of human and mouse glycosylation sites to train deep learning neural networks and support vector machine classifiers to predict N-\/O-linked glycosylation sites, respectively.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>The method, called SPRINT-Gly, achieved consistent results between ten-fold cross validation and independent test for predicting human and mouse glycosylation sites. For N-glycosylation, a mouse-trained model performs equally well in human glycoproteins and vice versa, however, due to significant differences in O-linked sites separate models were generated. Overall, SPRINT-Gly is 18% and 50% higher in Matthews correlation coefficient than the next best method compared in N-linked and O-linked sites, respectively. This improved performance is due to the inclusion of novel structure and sequence-based features.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>http:\/\/sparks-lab.org\/server\/SPRINT-Gly\/<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz215","type":"journal-article","created":{"date-parts":[[2019,3,21]],"date-time":"2019-03-21T13:23:27Z","timestamp":1553174607000},"page":"4140-4146","source":"Crossref","is-referenced-by-count":67,"title":["SPRINT-Gly: predicting<i>N-<\/i>and<i>O-<\/i>linked glycosylation sites of human and mouse proteins by using sequence and predicted structural properties"],"prefix":"10.1093","volume":"35","author":[{"given":"Ghazaleh","family":"Taherzadeh","sequence":"first","affiliation":[{"name":"School of Information and Communication Technology, Griffith University , Gold Coast, QLD, Australia"}]},{"given":"Abdollah","family":"Dehzangi","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Morgan State University , Baltimore, MD, USA"}]},{"given":"Maryam","family":"Golchin","sequence":"additional","affiliation":[{"name":"School of Information and Communication Technology, Griffith University , Gold Coast, QLD, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9958-5699","authenticated-orcid":false,"given":"Yaoqi","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Information and Communication Technology, Griffith University , Gold Coast, QLD, Australia"},{"name":"Institute for Glycomics, Griffith University, Parklands Drive , Gold Coast, QLD, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9525-792X","authenticated-orcid":false,"given":"Matthew P","family":"Campbell","sequence":"additional","affiliation":[{"name":"Institute for Glycomics, Griffith University, Parklands Drive , Gold Coast, QLD, Australia"}]}],"member":"286","published-online":{"date-parts":[[2019,3,23]]},"reference":[{"key":"2023013108282645500_btz215-B1","first-page":"265","volume-title":"Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201916)","author":"Abadi","year":"2016"},{"key":"2023013108282645500_btz215-B2","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1016\/j.tibs.2009.10.001","article-title":"N-glycan structures: recognition and processing in the ER","volume":"35","author":"Aebi","year":"2010","journal-title":"Trends Biochem. Sci"},{"key":"2023013108282645500_btz215-B3","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023013108282645500_btz215-B4","doi-asserted-by":"crossref","first-page":"D115","DOI":"10.1093\/nar\/gkh131","article-title":"UniProt: the universal protein knowledgebase","volume":"32","author":"Apweiler","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023013108282645500_btz215-B5","doi-asserted-by":"crossref","first-page":"714","DOI":"10.1002\/msb.201304521","article-title":"Evolution and functional cross-talk of protein post-translational modifications","volume":"9","author":"Beltrao","year":"2013","journal-title":"Mol. Syst. Biol"},{"key":"2023013108282645500_btz215-B6","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1093\/glycob\/cwh004","article-title":"Biases and complex patterns in the residues flanking protein N-glycosylation sites","volume":"14","author":"Ben-Dor","year":"2004","journal-title":"Glycobiology"},{"key":"2023013108282645500_btz215-B7","doi-asserted-by":"crossref","first-page":"1633","DOI":"10.1002\/pmic.200300771","article-title":"Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence","volume":"4","author":"Blom","year":"2004","journal-title":"Proteomics"},{"key":"2023013108282645500_btz215-B8","doi-asserted-by":"crossref","first-page":"D215","DOI":"10.1093\/nar\/gkt1128","article-title":"UniCarbKB: building a knowledge platform for glycoproteomics","volume":"42","author":"Campbell","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023013108282645500_btz215-B9","doi-asserted-by":"crossref","first-page":"438.","DOI":"10.1186\/1471-2105-8-438","article-title":"Glycosylation site prediction using ensembles of support vector machine classifiers","volume":"8","author":"Caragea","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023013108282645500_btz215-B10","doi-asserted-by":"crossref","first-page":"27.","DOI":"10.1145\/1961189.1961199","article-title":"LIBSVM: a library for support vector machines","volume":"2","author":"Chang","year":"2011","journal-title":"ACM Trans. Intell. Syst. Technol"},{"key":"2023013108282645500_btz215-B11","doi-asserted-by":"crossref","first-page":"e40155.","DOI":"10.1371\/journal.pone.0040155","article-title":"GlycoPP: a webserver for prediction of N-and O-glycosites in prokaryotic protein sequences","volume":"7","author":"Chauhan","year":"2012","journal-title":"PLoS One"},{"key":"2023013108282645500_btz215-B12","doi-asserted-by":"crossref","first-page":"e67008.","DOI":"10.1371\/journal.pone.0067008","article-title":"In silico platform for prediction of N-, O-and C-glycosites in eukaryotic protein sequences","volume":"8","author":"Chauhan","year":"2013","journal-title":"PLoS One"},{"key":"2023013108282645500_btz215-B13","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1101\/gr.849004","article-title":"WebLogo: a sequence logo generator","volume":"14","author":"Crooks","year":"2004","journal-title":"Genome Res"},{"key":"2023013108282645500_btz215-B14","author":"Gupta","year":"2004"},{"key":"2023013108282645500_btz215-B15","doi-asserted-by":"crossref","first-page":"500.","DOI":"10.1186\/1471-2105-9-500","article-title":"Prediction of glycosylation sites using random forests","volume":"9","author":"Hamby","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023013108282645500_btz215-B16","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1002\/prot.20379","article-title":"An amino acid has two sides: a new 2D measure provides a different view of solvent exposure","volume":"59","author":"Hamelryck","year":"2005","journal-title":"Proteins"},{"key":"2023013108282645500_btz215-B17","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1023\/A:1006960004440","article-title":"NetOglyc: prediction of mucin type O-glycosylation sites based on sequence context and surface accessibility","volume":"15","author":"Hansen","year":"1998","journal-title":"Glycoconj. J"},{"key":"2023013108282645500_btz215-B18","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1093\/bioinformatics\/btw678","article-title":"Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks","volume":"33","author":"Hanson","year":"2016","journal-title":"Bioinformatics"},{"key":"2023013108282645500_btz215-B19","doi-asserted-by":"crossref","first-page":"11476","DOI":"10.1038\/srep11476","article-title":"Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning","volume":"5","author":"Heffernan","year":"2015","journal-title":"Sci. Rep"},{"key":"2023013108282645500_btz215-B20","doi-asserted-by":"crossref","first-page":"2842","DOI":"10.1093\/bioinformatics\/btx218","article-title":"Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility","volume":"33","author":"Heffernan","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013108282645500_btz215-B21","doi-asserted-by":"crossref","first-page":"D435","DOI":"10.1093\/nar\/gkv1240","article-title":"dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins","volume":"44","author":"Huang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013108282645500_btz215-B22","doi-asserted-by":"crossref","first-page":"632","DOI":"10.1016\/j.cell.2018.01.016","article-title":"SnapShot: o -glycosylation pathways across kingdoms","volume":"172","author":"Joshi","year":"2018","journal-title":"Cell"},{"key":"2023013108282645500_btz215-B23","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1093\/glycob\/cwh151","article-title":"Prediction, conservation analysis, and structural characterization of mammalian mucin-type O-glycosylation sites","volume":"15","author":"Julenius","year":"2005","journal-title":"Glycobiology"},{"key":"2023013108282645500_btz215-B24","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1007\/978-4-431-56454-6_11","volume-title":"A Practical Guide to Using Glycomics Databases","author":"Kaji","year":"2017"},{"key":"2023013108282645500_btz215-B25","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1007\/s00216-016-9970-5","article-title":"Use of an informed search space maximizes confidence of site-specific assignment of glycoprotein glycosylation","volume":"409","author":"Khatri","year":"2017","journal-title":"Anal. Bioanal. Chem"},{"key":"2023013108282645500_btz215-B26","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/S0031-3203(99)00041-2","article-title":"Comparison of algorithms that select features for pattern classifiers","volume":"33","author":"Kudo","year":"2000","journal-title":"Pattern Recognit"},{"key":"2023013108282645500_btz215-B27","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1016\/j.sbi.2009.06.004","article-title":"Glycoprotein folding, quality control and ER-associated degradation","volume":"19","author":"Lederkremer","year":"2009","journal-title":"Curr. Opin. Struct. Biol"},{"key":"2023013108282645500_btz215-B28","doi-asserted-by":"crossref","first-page":"1411","DOI":"10.1093\/bioinformatics\/btu852","article-title":"GlycoMine: a machine learning-based approach for predicting N-, C-and O-linked glycosylation in the human proteome","volume":"31","author":"Li","year":"2015","journal-title":"Bioinformatics"},{"key":"2023013108282645500_btz215-B29","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.neucom.2016.12.038","article-title":"A survey of deep neural network architectures and their applications","volume":"234","author":"Liu","year":"2017","journal-title":"Neurocomputing"},{"key":"2023013108282645500_btz215-B30","doi-asserted-by":"crossref","first-page":"713.","DOI":"10.1038\/nchembio.437","article-title":"A systematic approach to protein glycosylation analysis: a path through the maze","volume":"6","author":"Mari\u00f1o","year":"2010","journal-title":"Nat. Chem. Biol"},{"key":"2023013108282645500_btz215-B31","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1007\/s008940100038","article-title":"Generation and evaluation of dimension-reduced amino acid parameter representations by artificial neural networks","volume":"7","author":"Meiler","year":"2001","journal-title":"Mol. Model. Annu"},{"key":"2023013108282645500_btz215-B32","doi-asserted-by":"crossref","first-page":"448.","DOI":"10.1038\/nrm3383","article-title":"Vertebrate protein glycosylation: diversity, synthesis and function","volume":"13","author":"Moremen","year":"2012","journal-title":"Nat. Rev. Mol. Cell Biol"},{"key":"2023013108282645500_btz215-B33","doi-asserted-by":"crossref","first-page":"1052","DOI":"10.1016\/j.chembiol.2015.06.017","article-title":"Enhanced aromatic sequons increase oligosaccharyltransferase glycosylation efficiency and glycan homogeneity","volume":"22","author":"Murray","year":"2015","journal-title":"Chem. Biol"},{"key":"2023013108282645500_btz215-B34","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1002\/prot.25489","article-title":"SPIN2: predicting sequence profiles from protein structures using deep neural networks","volume":"86","author":"O'Connell","year":"2018","journal-title":"Proteins"},{"key":"2023013108282645500_btz215-B35","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1093\/glycob\/cwh008","article-title":"Statistical analysis of the protein environment of N-glycosylation sites: implications for occupancy, structure, and folding","volume":"14","author":"Petrescu","year":"2004","journal-title":"Glycobiology"},{"key":"2023013108282645500_btz215-B36","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.artmed.2017.02.007","article-title":"Identify and analysis crotonylation sites in histone by using support vector machines","volume":"83","author":"Qiu","year":"2017","journal-title":"Artif. Intell. Med"},{"key":"2023013108282645500_btz215-B37","doi-asserted-by":"crossref","first-page":"761","DOI":"10.1093\/bioinformatics\/btu703","article-title":"DANN: a deep learning approach for annotating the pathogenicity of genetic variants","volume":"31","author":"Quang","year":"2015","journal-title":"Bioinformatics"},{"key":"2023013108282645500_btz215-B38","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1007\/s00726-016-2362-5","article-title":"Novel \u201cextended sequons\u201d of human N-glycosylation sites improve the precision of qualitative predictions: an alignment-free study of pattern recognition using ProtDCal protein features","volume":"49","author":"Ruiz-Blanco","year":"2017","journal-title":"Amino Acids"},{"key":"2023013108282645500_btz215-B39","doi-asserted-by":"crossref","first-page":"791","DOI":"10.1002\/pro.5560040419","article-title":"Site-specific detection and structural characterization of the glycosylation of human plasma proteins lecithin: cholesterol acyltransferase and apolipoprotein D using HPLC\/electrospray mass spectrometry and sequential glycosidase digestion","volume":"4","author":"Schindler","year":"1995","journal-title":"Protein Sci"},{"key":"2023013108282645500_btz215-B40","doi-asserted-by":"crossref","first-page":"2079","DOI":"10.1016\/j.bbagen.2012.09.014","article-title":"Site-specific protein O-glycosylation modulates proprotein processing-deciphering specific functions of the large polypeptide GalNAc-transferase gene family","volume":"1820","author":"Schjoldager","year":"2012","journal-title":"Biochim. Biophys. Acta"},{"key":"2023013108282645500_btz215-B41","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1080\/10409239891204198","article-title":"Concepts and principles of O-linked glycosylation","volume":"33","author":"Steen","year":"1998","journal-title":"Crit. Rev. Biochem. Mol. Biol"},{"key":"2023013108282645500_btz215-B42","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1093\/glycob\/cwy059","article-title":"Analysis of protein landscapes around N-glycosylation sites from the PDB repository for understanding the structural basis of N-glycoprotein processing and maturation","volume":"8","author":"Suga","year":"2018","journal-title":"Glycobiology"},{"key":"2023013108282645500_btz215-B43","doi-asserted-by":"crossref","first-page":"2115","DOI":"10.1021\/acs.jcim.6b00320","article-title":"Sequence-based prediction of protein-carbohydrate binding sites using support vector machines","volume":"56","author":"Taherzadeh","year":"2016","journal-title":"J. Chem. Inf. Model"},{"key":"2023013108282645500_btz215-B44","doi-asserted-by":"crossref","first-page":"477","DOI":"10.1093\/bioinformatics\/btx614","article-title":"Structure-based prediction of protein-peptide binding regions using Random Forest","volume":"34","author":"Taherzadeh","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013108282645500_btz215-B45","doi-asserted-by":"crossref","first-page":"1757","DOI":"10.1002\/jcc.25353","article-title":"Predicting lysine-malonylation sites of proteins using sequence and predicted structural features","volume":"39","author":"Taherzadeh","year":"2018","journal-title":"J. Comput. Chem"},{"key":"2023013108282645500_btz215-B46","doi-asserted-by":"crossref","first-page":"1440","DOI":"10.1093\/glycob\/cws110","article-title":"Site-specific glycoproteomics confirms that protein structure dictates formation of N-glycan type, core fucosylation and branching","volume":"22","author":"Thaysen-Andersen","year":"2012","journal-title":"Glycobiology"},{"key":"2023013108282645500_btz215-B47","doi-asserted-by":"crossref","first-page":"1536","DOI":"10.1093\/bioinformatics\/btl151","article-title":"Two Sample Logo: a graphical representation of the differences between two sets of sequence alignments","volume":"22","author":"Vacic","year":"2006","journal-title":"Bioinformatics"},{"key":"2023013108282645500_btz215-B48","volume-title":"The Nature of Statistical Learning Theory","author":"Vapnik","year":"2013"},{"key":"2023013108282645500_btz215-B49","volume-title":"Essentials of Glycobiology","author":"Varki","year":"2009","edition":"2"},{"key":"2023013108282645500_btz215-B50","doi-asserted-by":"crossref","first-page":"18962.","DOI":"10.1038\/srep18962","article-title":"Protein secondary structure prediction using deep convolutional neural fields","volume":"6","author":"Wang","year":"2016","journal-title":"Sci. Rep"},{"key":"2023013108282645500_btz215-B51","doi-asserted-by":"crossref","first-page":"91R","DOI":"10.1093\/glycob\/cwj099","article-title":"Asparagine-linked protein glycosylation: from eukaryotic to prokaryotic systems","volume":"16","author":"Weerapana","year":"2006","journal-title":"Glycobiology"},{"key":"2023013108282645500_btz215-B52","doi-asserted-by":"crossref","first-page":"735","DOI":"10.1093\/bioinformatics\/btg477","article-title":"Bio-support vector machines for computational proteomics","volume":"20","author":"Yang","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013108282645500_btz215-B53","doi-asserted-by":"crossref","first-page":"2412","DOI":"10.1096\/fj.14-267096","article-title":"The atypical N-glycosylation motif, Asn-Cys-Cys, in human GPR109A is required for normal cell surface expression and intracellular signaling","volume":"29","author":"Yasuda","year":"2015","journal-title":"FASEB J"},{"key":"2023013108282645500_btz215-B54","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1007\/978-3-540-37256-1_89","volume-title":"Intelligent Control and Automation","author":"Yen","year":"2006"},{"key":"2023013108282645500_btz215-B55","doi-asserted-by":"crossref","first-page":"R73.","DOI":"10.1186\/gb-2006-7-8-r73","article-title":"UniPep-a database for human N-linked glycosites: a resource for biomarker discovery","volume":"7","author":"Zhang","year":"2006","journal-title":"Genome Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz215\/28492207\/btz215.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/20\/4140\/48976706\/bioinformatics_35_20_4140.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/20\/4140\/48976706\/bioinformatics_35_20_4140.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,14]],"date-time":"2023-09-14T19:02:44Z","timestamp":1694718164000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/20\/4140\/5418954"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,3,23]]},"references-count":55,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2019,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz215","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,10,15]]},"published":{"date-parts":[[2019,3,23]]}}}