{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T05:11:36Z","timestamp":1775193096466,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2024,10,10]],"date-time":"2024-10-10T00:00:00Z","timestamp":1728518400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 CA073808"],"award-info":[{"award-number":["R01 CA073808"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R35 GM148220"],"award-info":[{"award-number":["R35 GM148220"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Post-translational modifications (PTMs) increase the diversity of the proteome and are vital to organismal life and therapeutic strategies. Deep learning has been used to predict PTM locations. Still, limitations in datasets and their analyses compromise success.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We evaluated the use of known PTM sites in prediction via sequence-based deep learning algorithms. For each PTM, known locations of that PTM were encoded as a separate amino acid before sequences were encoded via word embedding and passed into a convolutional neural network that predicts the probability of that PTM at a given site. Without labeling known PTMs, our models are on par with others. With labeling, however, we improved significantly upon extant models. Moreover, knowing PTM locations can increase the predictability of a different PTM. Our findings highlight the importance of PTMs for the installation of additional PTMs. We anticipate that including known PTM locations will enhance the performance of other proteomic machine learning algorithms.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Sitetack is available as a web tool at https:\/\/sitetack.net; the source code, representative datasets, instructions for local use, and select models are available at https:\/\/github.com\/clair-gutierrez\/sitetack.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae602","type":"journal-article","created":{"date-parts":[[2024,10,8]],"date-time":"2024-10-08T23:23:32Z","timestamp":1728429812000},"source":"Crossref","is-referenced-by-count":6,"title":["Sitetack: a deep learning model that improves PTM prediction by using known PTMs"],"prefix":"10.1093","volume":"40","author":[{"given":"Clair S","family":"Gutierrez","sequence":"first","affiliation":[{"name":"Department of Chemistry, Massachusetts Institute of Technology , Cambridge, MA 02139,","place":["United States"]},{"name":"Broad Institute of MIT and Harvard , Cambridge, MA 02143,","place":["United States"]}]},{"given":"Alia A","family":"Kassim","sequence":"additional","affiliation":[{"name":"Department of Chemistry, Massachusetts Institute of Technology , Cambridge, MA 02139,","place":["United States"]}]},{"given":"Benjamin D","family":"Gutierrez","sequence":"additional","affiliation":[{"name":"Independent Researcher"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7164-1719","authenticated-orcid":false,"given":"Ronald T","family":"Raines","sequence":"additional","affiliation":[{"name":"Department of Chemistry, Massachusetts Institute of Technology , Cambridge, MA 02139,","place":["United States"]},{"name":"Broad Institute of MIT and Harvard , Cambridge, MA 02143,","place":["United States"]},{"name":"Koch Institute for Integrated Cancer Research at MIT , Cambridge, MA 02139,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2024,10,10]]},"reference":[{"key":"2024111117073706400_btae602-B1","doi-asserted-by":"crossref","first-page":"106276","DOI":"10.1016\/j.isci.2023.106276","article-title":"An inventory of crosstalk between ubiquitination and other post-translational modifications in orchestrating cellular processes","volume":"26","author":"Barbour","year":"2023","journal-title":"iScience"},{"key":"2024111117073706400_btae602-B2","doi-asserted-by":"crossref","first-page":"D523","DOI":"10.1093\/nar\/gkac1052","article-title":"UniProt: the universal protein knowledgebase in 2023","volume":"51","author":"Bateman","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024111117073706400_btae602-B063339046","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.1016\/j.cmet.2022.07.003","article-title":"Phosphoproteomics of three exercise modalities identifies canonical signaling and C18ORF25 AS AN AMPK substrate regulating skeletal muscle function","volume":"34","author":"Blazev","year":"2022","journal-title":"Cell Metab"},{"key":"2024111117073706400_btae602-B3","doi-asserted-by":"crossref","first-page":"15512","DOI":"10.1038\/s41598-018-33951-5","article-title":"SUMOgo: prediction of sumoylation sites on lysines by motif screening models and the effects of various post-translational modifications","volume":"8","author":"Chang","year":"2018","journal-title":"Sci Rep"},{"key":"2024111117073706400_btae602-B4","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1101\/gr.849004","article-title":"WebLogo: a sequence logo generator","volume":"14","author":"Crooks","year":"2004","journal-title":"Genome Res"},{"key":"2024111117073706400_btae602-B5","doi-asserted-by":"crossref","first-page":"101099","DOI":"10.1016\/j.mam.2022.101099","article-title":"Ageing\u2014oxidative stress, PTMs and disease","volume":"86","author":"Ebert","year":"2022","journal-title":"Mol Aspects Med"},{"key":"2024111117073706400_btae602-B6","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2024111117073706400_btae602-B7","doi-asserted-by":"crossref","first-page":"825","DOI":"10.1146\/annurev-biochem-060608-102511","article-title":"Cross talk between O-GlcNAcylation and phosphorylation: roles in signaling, transcription, and chronic disease","volume":"80","author":"Hart","year":"2011","journal-title":"Annu Rev Biochem"},{"key":"2024111117073706400_btae602-B8","doi-asserted-by":"crossref","first-page":"101066","DOI":"10.1016\/j.mam.2022.101066","article-title":"Identification and characterization of post-translational modifications: clinical implications","volume":"86","author":"Hermann","year":"2022","journal-title":"Mol Aspects Med"},{"key":"2024111117073706400_btae602-B9","doi-asserted-by":"crossref","first-page":"27470","DOI":"10.1021\/acsomega.0c03972","article-title":"Computational prediction of protein arginine methylation based on composition\u2212transition\u2212distribution features","volume":"5","author":"Hou","year":"2020","journal-title":"ACS Omega"},{"key":"2024111117073706400_btae602-B10","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1021\/acs.jproteome.3c00458","article-title":"O-GlcNAcPRED-DL: prediction of protein O-GlcNAcylation sites based on an ensemble model of deep learning","volume":"23","author":"Hu","year":"2024","journal-title":"J Proteome Res"},{"key":"2024111117073706400_btae602-B11","doi-asserted-by":"crossref","first-page":"D542","DOI":"10.1093\/nar\/gkx1104","article-title":"iPTMnet: an integrated resource for protein post-translational modification network discovery","volume":"46","author":"Huang","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2024111117073706400_btae602-B12","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1016\/j.gpb.2020.05.003","article-title":"OGP: a repository of experimentally characterized O-glycoproteins to facilitate studies on O-glycosylation","volume":"19","author":"Huang","year":"2021","journal-title":"Genomics Proteomics Bioinf"},{"key":"2024111117073706400_btae602-B13","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1038\/s41586-022-05575-3","article-title":"An atlas of substrate specificities for the human serine\/threonine kinome","volume":"613","author":"Johnson","year":"2023","journal-title":"Nature"},{"key":"2024111117073706400_btae602-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4061\/2011\/207691","article-title":"Small changes huge impact: the role of protein posttranslational modifications in cellular homeostasis and disease","volume":"2011","author":"Karve","year":"2011","journal-title":"J Amino Acids"},{"key":"2024111117073706400_btae602-B15","author":"Kingma"},{"key":"2024111117073706400_btae602-B16","doi-asserted-by":"crossref","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","article-title":"Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences","volume":"22","author":"Li","year":"2006","journal-title":"Bioinformatics"},{"key":"2024111117073706400_btae602-B17","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","article-title":"Least squares quantization in PCM","volume":"28","author":"Lloyd","year":"1982","journal-title":"IEEE Trans Inform Theory"},{"key":"2024111117073706400_btae602-B18","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1016\/S0962-8924(02)02253-5","article-title":"Pinning down proline-directed phosphorylation signaling","volume":"12","author":"Lu","year":"2002","journal-title":"Trends Cell Biol"},{"key":"2024111117073706400_btae602-B19","doi-asserted-by":"crossref","first-page":"2766","DOI":"10.1093\/bioinformatics\/bty1051","article-title":"DeepPhos: prediction of protein phosphorylation sites with deep learning","volume":"35","author":"Luo","year":"2019","journal-title":"Bioinformatics"},{"key":"2024111117073706400_btae602-B20","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1093\/glycob\/cwab003","article-title":"O-GlcNAcAtlas: a database of experimentally identified O-GlcNAc sites and proteins","volume":"31","author":"Ma","year":"2021","journal-title":"Glycobiology"},{"key":"2024111117073706400_btae602-B21","doi-asserted-by":"crossref","first-page":"7314","DOI":"10.3390\/molecules26237314","article-title":"DeepNGlyPred: a deep neural network-based approach for human N-linked glycosylation site prediction","volume":"26","author":"Pakhrin","year":"2021","journal-title":"Molecules"},{"key":"2024111117073706400_btae602-B22","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1093\/glycob\/cwad033","article-title":"LMNglyPred: prediction of human N-linked glycosylation sites using embeddings from a pre-trained protein language model","volume":"33","author":"Pakhrin","year":"2023","journal-title":"Glycobiology"},{"key":"2024111117073706400_btae602-B23","doi-asserted-by":"crossref","first-page":"101097","DOI":"10.1016\/j.mam.2022.101097","article-title":"Pathological implication of protein post-translational modifications in cancer","volume":"86","author":"Pan","year":"2022","journal-title":"Mol Aspects Med"},{"key":"2024111117073706400_btae602-B24","doi-asserted-by":"crossref","first-page":"15975","DOI":"10.1038\/s41598-019-52341-z","article-title":"N-GlyDE: a two-stage N-linked glycosylation site prediction incorporating gapped dipeptides and pattern-based encoding","volume":"9","author":"Pitti","year":"2019","journal-title":"Sci Rep"},{"key":"2024111117073706400_btae602-B25","doi-asserted-by":"crossref","first-page":"baab012","DOI":"10.1093\/database\/baab012","article-title":"Post-translational modifications in proteins: resources, tools and prediction methods","volume":"2021","author":"Ramazi","year":"2021","journal-title":"Database"},{"key":"2024111117073706400_btae602-B26","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1016\/j.jprot.2013.03.025","article-title":"Effect of posttranslational modifications on enzyme function and assembly","volume":"92","author":"Ry\u0161lav\u00e1","year":"2013","journal-title":"J Proteomics"},{"key":"2024111117073706400_btae602-B27","doi-asserted-by":"crossref","first-page":"232","DOI":"10.3390\/brainsci10040232","article-title":"Do post-translational modifications influence protein aggregation in neurodegenerative diseases: a systematic review","volume":"10","author":"Schaffert","year":"2020","journal-title":"Brain Sci"},{"key":"2024111117073706400_btae602-B28","doi-asserted-by":"crossref","first-page":"929","DOI":"10.1146\/annurev.biochem.77.032207.120833","article-title":"Collagen structure and stability","volume":"78","author":"Shoulders","year":"2009","journal-title":"Annu Rev Biochem"},{"key":"2024111117073706400_btae602-B29","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1038\/s41598-019-46385-4","article-title":"Large-scale discovery of substrates of the human kinome","volume":"9","author":"Sugiyama","year":"2019","journal-title":"Sci Rep"},{"key":"2024111117073706400_btae602-B30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12014-019-9254-0","article-title":"N-GlycositeAtlas: a database resource for mass spectrometry-based human N-linked glycoprotein and glycosylation site mapping","volume":"16","author":"Sun","year":"2019","journal-title":"Clin Proteom"},{"key":"2024111117073706400_btae602-B31","author":"Sundarajan","year":"2017"},{"key":"2024111117073706400_btae602-B32","doi-asserted-by":"crossref","first-page":"3152","DOI":"10.1111\/febs.14491","article-title":"Crosstalk between phosphorylation and O-GlcNAcylation: friend or foe","volume":"285","author":"van der Laarse","year":"2018","journal-title":"FEBS J"},{"key":"2024111117073706400_btae602-B33","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"van der Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2024111117073706400_btae602-B34","doi-asserted-by":"crossref","first-page":"7342","DOI":"10.1002\/anie.200501023","article-title":"Protein posttranslational modifications: the chemistry of proteome diversifications","volume":"44","author":"Walsh","year":"2005","journal-title":"Angew Chem Int Ed Engl"},{"key":"2024111117073706400_btae602-B35","doi-asserted-by":"crossref","first-page":"2386","DOI":"10.1093\/bioinformatics\/bty977","article-title":"Capsule network for protein post-translational modification site prediction","volume":"35","author":"Wang","year":"2019","journal-title":"Bioinformatics"},{"key":"2024111117073706400_btae602-B36","doi-asserted-by":"crossref","first-page":"W140","DOI":"10.1093\/nar\/gkaa275","article-title":"MusiteDeep: a deep-learning based webserver for protein post-translational modification site prediction and visualization","volume":"48","author":"Wang","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2024111117073706400_btae602-B37","doi-asserted-by":"crossref","first-page":"3909","DOI":"10.1093\/bioinformatics\/btx496","article-title":"MusiteDeep: a deep-learning framework for general and kinase-specific phosphorylation site prediction","volume":"33","author":"Wang","year":"2017","journal-title":"Bioinformatics"},{"key":"2024111117073706400_btae602-B38","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-2632-9","article-title":"A deep learning method to more accurately recall known lysine acetylation sites","volume":"20","author":"Wu","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2024111117073706400_btae602-B39","doi-asserted-by":"crossref","first-page":"112796","DOI":"10.1016\/j.celrep.2023.112796","article-title":"Systematic analysis of the impact of phosphorylation and O-GlcNAcylation on protein subcellular localization","volume":"42","author":"Xu","year":"2023","journal-title":"Cell Rep"},{"key":"2024111117073706400_btae602-B40","doi-asserted-by":"crossref","first-page":"100430","DOI":"10.1016\/j.crmeth.2023.100430","article-title":"MIND-S is a deep-learning prediction model for elucidating protein post-translational modifications in human diseases","volume":"3","author":"Yan","year":"2023","journal-title":"Cell Rep Methods"},{"key":"2024111117073706400_btae602-B41","doi-asserted-by":"crossref","first-page":"4668","DOI":"10.1093\/bioinformatics\/btab551","article-title":"PhosIDN: an integrated deep neural network for improving protein phosphorylation site prediction by combining sequence and protein\u2013protein interaction information","volume":"37","author":"Yang","year":"2021","journal-title":"Bioinformatics"},{"key":"2024111117073706400_btae602-B42","doi-asserted-by":"crossref","first-page":"34943","DOI":"10.7554\/eLife.64943","article-title":"A subcellular map of the human kinome","volume":"10","author":"Zhang","year":"2021","journal-title":"Elife"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae602\/59713762\/btae602.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae602\/60592587\/btae602.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae602\/60592587\/btae602.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T12:08:24Z","timestamp":1731326904000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae602\/7817805"}},"subtitle":[],"editor":[{"given":"Macha","family":"Nikolski","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,10,10]]},"references-count":43,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2024,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae602","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.06.03.596298","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,10,10]]},"article-number":"btae602"}}