{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T00:02:34Z","timestamp":1780358554438,"version":"3.54.1"},"reference-count":62,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T00:00:00Z","timestamp":1687219200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["T32-GM135131 R35-GM141881 R01-AI162381"],"award-info":[{"award-number":["T32-GM135131 R35-GM141881 R01-AI162381"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Carbohydrates dynamically and transiently interact with proteins for cell\u2013cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate-binding sites on any given protein. Here, we present two deep learning (DL) models named CArbohydrate\u2013Protein interaction Site IdentiFier (CAPSIF) that predicts non-covalent carbohydrate-binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate-binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. 
CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2-predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein\u2013carbohydrate structures.<\/jats:p>","DOI":"10.3389\/fbinf.2023.1186531","type":"journal-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T10:04:15Z","timestamp":1687255455000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":23,"title":["Structure-based neural network protein\u2013carbohydrate interaction predictions at the residue level"],"prefix":"10.3389","volume":"3","author":[{"given":"Samuel W.","family":"Canner","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sudhanshu","family":"Shanker","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jeffrey J.","family":"Gray","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1965","published-online":{"date-parts":[[2023,6,20]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"6659","DOI":"10.1128\/mcb.00205-07","article-title":"Polysialic acid-directed migration and differentiation of neural precursors are essential for mouse brain development","volume":"27","author":"Angata","year":"2007","journal-title":"Mol. 
Cell Biol."},{"key":"B55","volume-title":"Essentials of glycobiology","author":"Varki","year":"2017"},{"key":"B2","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"B3","doi-asserted-by":"publisher","first-page":"D1236","DOI":"10.1093\/nar\/gky832","article-title":"UniLectin3D, a database of carbohydrate binding proteins with curated information on 3D structures and interacting ligands","volume":"47","author":"Bonnardel","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B4","doi-asserted-by":"publisher","first-page":"6669","DOI":"10.1039\/d1sc05681f","article-title":"GlyNet: A multi-task neural network for predicting protein\u2013glycan interactions","volume":"13","author":"Carpenter","year":"2022","journal-title":"Chem. Sci."},{"key":"B5","doi-asserted-by":"publisher","first-page":"689","DOI":"10.1093\/bioinformatics\/btq007","article-title":"PyRosetta: A script-based interface for implementing molecular modeling algorithms using Rosetta","volume":"26","author":"Chaudhury","year":"2010","journal-title":"Bioinformatics"},{"key":"B6","doi-asserted-by":"publisher","first-page":"e1006705","DOI":"10.1371\/journal.pcbi.1006705","article-title":"Inherent versus induced protein flexibility: Comparisons within and between apo and holo structures","volume":"15","author":"Clark","year":"2019","journal-title":"PLoS Comput. Biol."},{"key":"B7","volume-title":"Diffusion steps, twists, and turns for molecular docking","author":"Corso","year":"2023"},{"key":"B8","doi-asserted-by":"publisher","first-page":"e2016198118","DOI":"10.1073\/pnas.2016198118","article-title":"A glycan FRET assay for detection and characterization of catalytic antibodies to the Cryptococcus neoformans capsule","volume":"118","author":"Crawford","year":"2021","journal-title":"Proc. Natl. Acad. 
Sci."},{"key":"B9","doi-asserted-by":"publisher","first-page":"15202","DOI":"10.3390\/molecules200815202","article-title":"Protein-carbohydrate interactions, and beyond","volume":"20","author":"de Schutter","year":"2015","journal-title":"Molecules"},{"key":"B10","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1016\/j.molimm.2015.02.028","article-title":"Structural biology of antibody recognition of carbohydrate epitopes and potential uses for targeted cancer immunotherapies","volume":"67","author":"Dingjan","year":"2015","journal-title":"Mol. Immunol."},{"key":"B11","doi-asserted-by":"publisher","first-page":"5634","DOI":"10.1038\/s41596-021-00628-9","article-title":"The trRosetta server for fast and accurate protein structure prediction","volume":"16","author":"Du","year":"2021","journal-title":"Nat. Protoc."},{"key":"B12","doi-asserted-by":"publisher","first-page":"269","DOI":"10.1016\/b978-0-12-374546-0.00015-8","article-title":"Viral surface glycoproteins in carbohydrate recognition","author":"Dyason","year":"2010","journal-title":"Microb. Glycobiol."},{"key":"B13","doi-asserted-by":"publisher","first-page":"661","DOI":"10.1038\/nrd2852","article-title":"From carbohydrate leads to glycomimetic drugs","volume":"8","author":"Ernst","year":"2009","journal-title":"Nat. Rev. Drug Discov."},{"key":"B14","doi-asserted-by":"publisher","first-page":"2897","DOI":"10.1021\/acs.jcim.1c00204","article-title":"Finding druggable sites in proteins using TACTICS","volume":"61","author":"Evans","year":"2021","journal-title":"J. Chem. Inf. 
Model"},{"key":"B16","doi-asserted-by":"publisher","first-page":"951","DOI":"10.1093\/glycob\/10.10.951","article-title":"MINI REVIEW keratan sulfate: Structure, biosynthesis, and function","volume":"10","author":"Funderburgh","year":"2000","journal-title":"Glycobiology"},{"key":"B17","doi-asserted-by":"publisher","first-page":"2223","DOI":"10.1016\/j.jmb.2019.04.016","article-title":"Protein and glycan mimicry in HIV vaccine design","volume":"431","author":"Ge","year":"2019","journal-title":"J. Mol. Biol."},{"key":"B18","doi-asserted-by":"publisher","first-page":"3168","DOI":"10.1038\/s41467-021-23303-9","article-title":"Structure-based protein function prediction using graph convolutional networks","volume":"12","author":"Gligorijevi\u0107","year":"2021","journal-title":"Nat. Commun."},{"key":"B19","doi-asserted-by":"publisher","first-page":"920","DOI":"10.1093\/glycob\/cwv037","article-title":"Antibody recognition of carbohydrate epitopes","volume":"25","author":"Haji-Ghassemi","year":"2015","journal-title":"Glycobiology"},{"key":"B20","doi-asserted-by":"publisher","first-page":"5246","DOI":"10.1021\/acs.jcim.1c00233","article-title":"Mechanism of glycans modulating cholesteryl ester transfer protein: Unveiled by molecular dynamics simulation","volume":"62","author":"Hao","year":"2022","journal-title":"J. Chem. Inf. Model"},{"key":"B21","doi-asserted-by":"publisher","first-page":"2486","DOI":"10.1021\/acsanm.0c03047","article-title":"Aromaphilicity index of amino acids: Molecular dynamics simulations of the protein binding affinity for carbon nanomaterials","volume":"4","author":"Hirano","year":"2021","journal-title":"ACS Appl. Nano Mater"},{"key":"B22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.5555\/3454287.3455704","article-title":"Generative models for graph-based protein design","volume":"32","author":"Ingraham","year":"2019","journal-title":"Adv. Neural Inf. 
Process Syst."},{"key":"B23","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2009.01411","article-title":"Learning from protein structure with geometric vector perceptrons","author":"Jing","year":"2021"},{"key":"B24","doi-asserted-by":"publisher","first-page":"1583","DOI":"10.1021\/acs.jcim.0c01306","article-title":"Improved protein\u2013ligand binding affinity prediction with structure-based deep fusion inference","volume":"61","author":"Jones","year":"2021","journal-title":"J. Chem. Inf. Model"},{"key":"B25","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"B26","unstructured":"Karlsson, K. A. (2001). Pathogen-host protein-carbohydrate interactions as the basis of important infections. Adv. Exp. Med. Biol. 491, 431\u201343. doi:10.1007\/978-1-4615-1267-7_28"},{"key":"B27","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1186\/s13321-021-00547-7","article-title":"PUResNet: Prediction of protein-ligand binding sites using deep residual neural network","volume":"13","author":"Kandel","year":"2021","journal-title":"J. Cheminform"},{"key":"B28","doi-asserted-by":"publisher","first-page":"224","DOI":"10.1038\/s41435-020-0105-9","article-title":"Emergence and significance of carbohydrate-specific antibodies","volume":"21","author":"Kappler","year":"2020","journal-title":"Genes Immun."},{"key":"B29","doi-asserted-by":"publisher","first-page":"41","DOI":"10.2149\/tmh.2014-25","article-title":"The role of carbohydrates in infection strategies of enteric pathogens","volume":"43","author":"Kato","year":"2015","journal-title":"Trop. Med. 
Health"},{"key":"B30","doi-asserted-by":"publisher","first-page":"308","DOI":"10.3389\/fimmu.2014.00308","article-title":"Carbohydrate-mimetic peptides for pan anti-tumor responses","volume":"5","author":"Kieber-Emmons","year":"2014","journal-title":"Front. Immunol."},{"key":"B31","volume-title":"Proceedings of the 3rd international conference on learning representations","author":"Kingma","year":"2015"},{"key":"B32","doi-asserted-by":"publisher","first-page":"733","DOI":"10.1038\/nprot.2015.043","article-title":"The FTMap family of web servers for determining and characterizing ligand-binding hot spots of proteins","volume":"10","author":"Kozakov","year":"2015","journal-title":"Nat. Protoc."},{"key":"B33","doi-asserted-by":"publisher","first-page":"2175","DOI":"10.1038\/s41467-023-37701-8","article-title":"PeSTo: Parameter-free geometric deep learning for accurate prediction of protein binding interfaces","volume":"14","author":"Krapp","year":"2023","journal-title":"Nat. Commun."},{"key":"B34","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1016\/j.jmgm.2009.08.009","article-title":"InCa-SiteFinder: A method for structure-based prediction of inositol and carbohydrate binding sites on proteins","volume":"28","author":"Kulharia","year":"2009","journal-title":"J. Mol. Graph Model"},{"key":"B35","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1016\/0022-2836(82)90515-0","article-title":"A simple method for displaying the hydropathic character of a protein","volume":"157","author":"Kyte","year":"1982","journal-title":"J. Mol. 
Biol."},{"key":"B36","doi-asserted-by":"publisher","first-page":"168","DOI":"10.1186\/1471-2105-10-168","article-title":"Fpocket: An open source platform for ligand pocket detection","volume":"10","author":"le Guilloux","year":"2009","journal-title":"BMC Bioinforma."},{"key":"B37","doi-asserted-by":"publisher","first-page":"e2107440118","DOI":"10.1073\/pnas.2107440118","article-title":"Shotgun scanning glycomutagenesis: A simple and efficient strategy for constructing and characterizing neoglycoproteins","volume":"118","author":"Li","year":"2021","journal-title":"Proc. Natl. Acad. Sci."},{"key":"B38","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1080\/17460441.2019.1573813","article-title":"Carbohydrate\u2013protein interactions and multivalency: Implications for the inhibition of influenza A virus infections","volume":"14","author":"Lu","year":"2019","journal-title":"Expert Opin. Drug Discov."},{"key":"B39","doi-asserted-by":"publisher","first-page":"2103807","DOI":"10.1002\/advs.202103807","article-title":"LectinOracle: A generalizable deep learning model for lectin\u2013glycan binding prediction","volume":"9","author":"Lundstr\u00f8m","year":"2022","journal-title":"Adv. Sci."},{"key":"B40","doi-asserted-by":"publisher","first-page":"816","DOI":"10.2174\/138920312804871175","article-title":"Protein-carbohydrate interactions studied by NMR: From molecular recognition to drug design","volume":"13","author":"M","year":"2012","journal-title":"Curr. Protein Pept. Sci."},{"key":"B41","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1472-6807-7-1","article-title":"Sequence and structural features of carbohydrate binding in proteins and assessment of predictability using a neural network","volume":"7","author":"Malik","year":"2007","journal-title":"BMC Struct. 
Biol."},{"key":"B42","doi-asserted-by":"publisher","first-page":"W13","DOI":"10.1093\/nar\/gkac250","article-title":"3DLigandSite: Structure-based prediction of protein\u2013ligand binding sites","volume":"50","author":"McGreig","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B43","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1038\/s41592-022-01488-1","article-title":"ColabFold: Making protein folding accessible to all","volume":"19","author":"Mirdita","year":"2022","journal-title":"Nat. Methods"},{"key":"B44","doi-asserted-by":"publisher","first-page":"1681","DOI":"10.1093\/bioinformatics\/btab009","article-title":"DeepSurf: A surface-based deep learning approach for the prediction of ligand binding sites on proteins","volume":"37","author":"Mylonas","year":"2021","journal-title":"Bioinformatics"},{"key":"B45","doi-asserted-by":"publisher","first-page":"6807","DOI":"10.1021\/acs.jpcb.1c00910","article-title":"Development and evaluation of GlycanDock: A protein-glycoligand docking refinement algorithm in Rosetta","volume":"125","author":"Nance","year":"2021","journal-title":"J. Phys. Chem. B"},{"key":"B46","volume-title":"Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies","author":"Ruffolo","year":""},{"key":"B47","doi-asserted-by":"publisher","first-page":"100406","DOI":"10.1016\/j.patter.2021.100406","article-title":"Antibody structure prediction using interpretable deep learning","volume":"3","author":"Ruffolo","year":"","journal-title":"Patterns"},{"key":"B48","first-page":"9323","article-title":"Equivariant graph neural networks","volume":"139","author":"Satorras","year":"2021","journal-title":"Proc. 38th Int. Conf. Mach. Learn. 
(PMLR)"},{"key":"B49","doi-asserted-by":"publisher","first-page":"3615","DOI":"10.1093\/bioinformatics\/btaa141","article-title":"ProCaff: Protein\u2013carbohydrate complex binding affinity database","volume":"36","author":"Siva Shanmugam","year":"2020","journal-title":"Bioinformatics"},{"key":"B50","doi-asserted-by":"publisher","first-page":"5035","DOI":"10.1038\/s41598-020-61860-z","article-title":"Improving detection of protein-ligand binding sites with 3D segmentation","volume":"10","author":"Stepniewska-Dziubinska","year":"2020","journal-title":"Sci. Rep."},{"key":"B15","first-page":"15267","article-title":"Fast end-to-end learning on protein surfaces","author":"Sverrisson","year":"2021"},{"key":"B51","doi-asserted-by":"publisher","first-page":"2115","DOI":"10.1021\/acs.jcim.6b00320","article-title":"Sequence-based prediction of protein\u2013carbohydrate binding sites using support vector machines","volume":"56","author":"Taherzadeh","year":"2016","journal-title":"J. Chem. Inf. Model"},{"key":"B52","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1093\/protein\/13.2.89","article-title":"Analysis and prediction of carbohydrate binding sites","volume":"13","author":"Taroni","year":"2000","journal-title":"Protein Eng. Des. Sel."},{"key":"B53","doi-asserted-by":"publisher","first-page":"e40846","DOI":"10.1371\/journal.pone.0040846","article-title":"Prediction of carbohydrate binding sites on protein surfaces with 3-dimensional probability density distributions of interacting atoms","volume":"7","author":"Tsai","year":"2012","journal-title":"PLoS One"},{"key":"B54","doi-asserted-by":"publisher","first-page":"607","DOI":"10.1016\/j.jmb.2010.11.008","article-title":"Alternate states of proteins revealed by detailed energy landscape mapping","volume":"405","author":"Tyka","year":"2011","journal-title":"J. Mol. 
Biol."},{"key":"B56","article-title":"Scalars are universal: Equivariant machine learning, structured like classical physics","volume-title":"Advances in neural information processing systems","author":"Villar","year":"2021"},{"key":"B57","doi-asserted-by":"publisher","first-page":"383","DOI":"10.1007\/978-1-4939-1465-4_17","article-title":"Methods for predicting protein\u2013ligand binding sites","volume":"1215","author":"Xie","year":"2015","journal-title":"Methods Mol. Biol."},{"key":"B58","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1038\/s41392-020-00435-w","article-title":"G protein-coupled receptors: Structure- and function-based drug discovery","volume":"6","author":"Yang","year":"2021","journal-title":"Signal Transduct. Target Ther."},{"key":"B59","doi-asserted-by":"publisher","first-page":"1496","DOI":"10.1073\/pnas.1914677117","article-title":"Improved protein structure prediction using predicted interresidue orientations","volume":"117","author":"Yang","year":"2020","journal-title":"Proc. Natl. Acad. Sci."},{"key":"B60","doi-asserted-by":"publisher","first-page":"2139","DOI":"10.1158\/1535-7163.mct-06-0082","article-title":"Therapeutic value of glycosaminoglycans in cancer","volume":"5","author":"Yip","year":"2006","journal-title":"Mol. Cancer Ther."},{"key":"B61","doi-asserted-by":"publisher","first-page":"2177","DOI":"10.1002\/jcc.23730","article-title":"Carbohydrate-binding protein identification by coupling structural similarity searching with binding affinity prediction","volume":"35","author":"Zhao","year":"2014","journal-title":"J. Comput. 
Chem."},{"key":"B62","article-title":"UNet++: A nested U-net architecture for medical image segmentation","volume-title":"Lecture notes in computer science","author":"Zhou","year":"2018"}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2023.1186531\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T10:04:27Z","timestamp":1687255467000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2023.1186531\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,20]]},"references-count":62,"alternative-id":["10.3389\/fbinf.2023.1186531"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2023.1186531","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,20]]},"article-number":"1186531"}}