{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T11:37:04Z","timestamp":1774352224551,"version":"3.50.1"},"reference-count":66,"publisher":"Public Library of Science (PLoS)","issue":"1","license":[{"start":{"date-parts":[[2022,1,28]],"date-time":"2022-01-28T00:00:00Z","timestamp":1643328000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["R35GM142502"],"award-info":[{"award-number":["R35GM142502"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"U.S. National Library of Medicine","doi-asserted-by":"publisher","award":["T15LM007359"],"award-info":[{"award-number":["T15LM007359"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Machine learning with multi-layered artificial neural networks, also known as \u201cdeep learning,\u201d is effective for making biological predictions. However, model interpretation is challenging, especially for sequential input data used with recurrent neural network architectures. Here, we introduce a framework called \u201cPositional SHAP\u201d (PoSHAP) to interpret models trained from biological sequences by utilizing SHapley Additive exPlanations (SHAP) to generate positional model interpretations. We demonstrate this using three long short-term memory (LSTM) regression models that predict peptide properties, including binding affinity to major histocompatibility complexes (MHC), and collisional cross section (CCS) measured by ion mobility spectrometry. 
Interpretation of these models with PoSHAP reproduced MHC class I (rhesus macaque Mamu-A1*001 and human A*11:01) peptide binding motifs, reflected known properties of peptide CCS, and provided new insights into interpositional dependencies of amino acid interactions. PoSHAP should have widespread utility for interpreting a variety of models trained from biological sequences.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1009736","type":"journal-article","created":{"date-parts":[[2022,1,28]],"date-time":"2022-01-28T13:34:41Z","timestamp":1643376881000},"page":"e1009736","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":67,"title":["Positional SHAP (PoSHAP) for Interpretation of machine learning models trained from biological sequences"],"prefix":"10.1371","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7744-3083","authenticated-orcid":true,"given":"Quinn","family":"Dickinson","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2753-3926","authenticated-orcid":true,"given":"Jesse G.","family":"Meyer","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,1,28]]},"reference":[{"issue":"36","key":"pcbi.1009736.ref001","doi-asserted-by":"crossref","first-page":"22059","DOI":"10.1016\/S0021-9258(18)45665-7","article-title":"Protein structure determination in solution by NMR spectroscopy","volume":"265","author":"K. W\u00fcthrich","year":"1990","journal-title":"J Biol Chem"},{"key":"pcbi.1009736.ref002","unstructured":"Developments, applications, and prospects of cryo\u2010electron microscopy\u2014Benjin\u20142020\u2014Protein Science\u2014Wiley Online Library [Internet]. [cited 2021 Mar 2]. 
Available from: https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/pro.3805."},{"key":"pcbi.1009736.ref003","unstructured":"Protein crystallography from the perspective of technology developments: Crystallography Reviews: Vol 21, No 1\u20132 [Internet]. [cited 2021 Mar 2]. Available from: https:\/\/www.tandfonline.com\/doi\/abs\/10.1080\/0889311X.2014.973868."},{"issue":"2","key":"pcbi.1009736.ref004","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1016\/j.cell.2015.06.043","article-title":"The BioPlex Network: A Systematic Exploration of the Human Interactome","volume":"162","author":"EL Huttlin","year":"2015","journal-title":"Cell"},{"key":"pcbi.1009736.ref005","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/978-94-024-1069-3_1","volume-title":"From Protein Structure to Function with Bioinformatics","author":"J Lee","year":"2017"},{"issue":"11","key":"pcbi.1009736.ref006","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1038\/s41580-019-0163-x","article-title":"Advances in protein structure prediction and design","volume":"20","author":"B Kuhlman","year":"2019","journal-title":"Nat Rev Mol Cell Biol"},{"issue":"7792","key":"pcbi.1009736.ref007","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","article-title":"Improved protein structure prediction using potentials from deep learning","volume":"577","author":"AW Senior","year":"2020","journal-title":"Nature"},{"issue":"7553","key":"pcbi.1009736.ref008","first-page":"436","volume":"521","author":"Y LeCun","year":"2015","journal-title":"Deep learning. 
Nature"},{"issue":"5","key":"pcbi.1009736.ref009","doi-asserted-by":"crossref","first-page":"1039","DOI":"10.1021\/ac0205154","article-title":"Use of Artificial Neural Networks for the Accurate Prediction of Peptide Liquid Chromatography Elution Times in Proteome Analyses","volume":"75","author":"K Petritis","year":"2003","journal-title":"Anal Chem"},{"issue":"4","key":"pcbi.1009736.ref010","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF00344251","article-title":"Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position","volume":"36","author":"K. Fukushima","year":"1980","journal-title":"Biol CybernApr"},{"issue":"6088","key":"pcbi.1009736.ref011","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"DE Rumelhart","year":"1986","journal-title":"Nature"},{"issue":"3","key":"pcbi.1009736.ref012","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1109\/MCI.2018.2840738","article-title":"Recent Trends in Deep Learning Based Natural Language Processing [Review Article].","volume":"13","author":"T Young","year":"2018","journal-title":"IEEE ComputIntell Mag"},{"issue":"22","key":"pcbi.1009736.ref013","doi-asserted-by":"crossref","first-page":"3685","DOI":"10.1093\/bioinformatics\/btx531","article-title":"An introduction to deep learning on biological sequence data: examples and solutions","volume":"33","author":"VI Jurtz","year":"2017","journal-title":"Bioinformatics"},{"issue":"8","key":"pcbi.1009736.ref014","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory.","volume":"9","author":"S Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"pcbi.1009736.ref015","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/978-3-030-28954-6_11","volume-title":"Explainable AI: 
Interpreting, Explaining and Visualizing Deep Learning [Internet].:","author":"L Arras","year":"2019"},{"key":"pcbi.1009736.ref016","article-title":"Neural Machine Translation by Jointly Learning to Align and Translate.","author":"D Bahdanau","year":"2016","journal-title":"ArXiv14090473 Cs Stat"},{"issue":"1","key":"pcbi.1009736.ref017","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"L. Breiman","year":"2001","journal-title":"Mach LearnOct 1"},{"key":"pcbi.1009736.ref018","first-page":"4768","volume-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems","author":"SM Lundberg","year":"2017"},{"issue":"10","key":"pcbi.1009736.ref019","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1038\/s41551-018-0304-0","article-title":"Explainable machine-learning predictions for the prevention of hypoxaemia during surgery","volume":"2","author":"SM Lundberg","year":"2018","journal-title":"Nat Biomed Eng"},{"issue":"1","key":"pcbi.1009736.ref020","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1111\/j.1600-065X.1999.tb01353.x","article-title":"The nature of the MHC class I peptide loading complex","volume":"172","author":"P Cresswell","year":"1999","journal-title":"Immunol Rev"},{"issue":"12","key":"pcbi.1009736.ref021","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1038\/nri3084","article-title":"Towards a systems understanding of MHC class I and MHC class II antigen presentation","volume":"11","author":"J Neefjes","year":"2011","journal-title":"Nat Rev Immunol"},{"issue":"1","key":"pcbi.1009736.ref022","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1186\/s13059-017-1207-1","article-title":"The MHC locus and genetic susceptibility to autoimmune and infectious diseases","volume":"18","author":"V Matzaraki","year":"2017","journal-title":"Genome 
Biol"},{"issue":"9","key":"pcbi.1009736.ref023","doi-asserted-by":"crossref","first-page":"1491","DOI":"10.1007\/s00018-011-0657-y","article-title":"The role of the proteasome in the generation of MHC class I ligands and immune responses","volume":"68","author":"Kloetzel P-M Sijts EJAM","year":"2011","journal-title":"Cell Mol Life Sci"},{"issue":"1","key":"pcbi.1009736.ref024","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.it.2005.11.001","article-title":"Have we cut ourselves too short in mapping CTL epitopes?","volume":"27","author":"SR Burrows","year":"2006","journal-title":"Trends Immunol"},{"issue":"8","key":"pcbi.1009736.ref025","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1038\/s41571-020-0350-x","article-title":"Roles and mechanisms of alternative splicing in cancer\u2014implications for care.","volume":"17","author":"SC Bonnal","year":"2020","journal-title":"Nat Rev Clin OncolAug"},{"issue":"1","key":"pcbi.1009736.ref026","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1146\/annurev-biodatasci-021920-100259","article-title":"Immunoinformatics: Predicting Peptide\u2013MHC Binding.","volume":"3","author":"M Nielsen","year":"2020","journal-title":"Annu Rev Biomed Data Sci"},{"issue":"26","key":"pcbi.1009736.ref027","doi-asserted-by":"crossref","first-page":"2239","DOI":"10.2174\/1568026619666181224101744","article-title":"Structure-based Methods for Binding Mode and Binding Affinity Prediction for Peptide-MHC Complexes","volume":"18","author":"DA Antunes","year":"2018","journal-title":"Curr Top Med Chem"},{"issue":"1","key":"pcbi.1009736.ref028","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.cels.2018.05.014","article-title":"MHCflurry: Open-Source Class I MHC Binding Affinity Prediction.","volume":"7","author":"TJ O\u2019Donnell","year":"2018","journal-title":"Cell 
Syst"},{"issue":"14","key":"pcbi.1009736.ref029","doi-asserted-by":"crossref","first-page":"i278","DOI":"10.1093\/bioinformatics\/btz330","article-title":"DeepLigand: accurate prediction of MHC class I ligands using peptide embedding","volume":"35","author":"H Zeng","year":"2019","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1009736.ref030","doi-asserted-by":"crossref","first-page":"794","DOI":"10.1038\/s41598-018-37214-1","article-title":"DeepSeqPan, a novel deep convolutional neural network model for pan-specific class I HLA-peptide binding affinity prediction.","volume":"9","author":"Z Liu","year":"2019","journal-title":"Sci Rep."},{"key":"pcbi.1009736.ref031","article-title":"Deep learning pan-specific model for interpretable MHC-I peptide binding prediction with improved attention mechanism","author":"J Jin","year":"2021","journal-title":"Proteins Struct FunctBioinforma"},{"issue":"23","key":"pcbi.1009736.ref032","doi-asserted-by":"crossref","first-page":"4946","DOI":"10.1093\/bioinformatics\/btz427","article-title":"ACME: pan-specific peptide\u2013MHC class I binding prediction through attention-based deep neural networks","volume":"35","author":"Y Hu","year":"2019","journal-title":"Bioinformatics"},{"key":"pcbi.1009736.ref033","doi-asserted-by":"crossref","first-page":"100003","DOI":"10.1016\/j.crmeth.2021.100003","article-title":"Deep learning neural network tools for proteomics","author":"JG Meyer","year":"2021","journal-title":"Cell Rep Methods"},{"issue":"9","key":"pcbi.1009736.ref034","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1186\/s12864-019-6297-6","article-title":"MS2CNN: predicting MS\/MS spectrum based on protein sequence using deep convolutional neural networks","volume":"20","author":"Y-M Lin","year":"2019","journal-title":"BMC Genomics"},{"issue":"10","key":"pcbi.1009736.ref035","doi-asserted-by":"crossref","first-page":"2099","DOI":"10.1074\/mcp.TIR119.001412","article-title":"Prediction of LC-MS\/MS Properties of 
Peptides from Sequence by Deep Learning*[S]","volume":"18","author":"S Guan","year":"2019","journal-title":"Mol Cell Proteomics"},{"issue":"6","key":"pcbi.1009736.ref036","doi-asserted-by":"crossref","first-page":"4275","DOI":"10.1021\/acs.analchem.9b04867","article-title":"Full-Spectrum Prediction of Peptides Tandem Mass Spectra using Deep Neural Network","volume":"92","author":"K Liu","year":"2020","journal-title":"Anal Chem"},{"issue":"1","key":"pcbi.1009736.ref037","doi-asserted-by":"crossref","first-page":"146","DOI":"10.1038\/s41467-019-13866-z","article-title":"In silico spectral libraries by deep learning facilitate data-independent acquisition proteomics","volume":"11","author":"Y Yang","year":"2020","journal-title":"Nat Commun"},{"issue":"1","key":"pcbi.1009736.ref038","doi-asserted-by":"crossref","first-page":"1759","DOI":"10.1038\/s41467-020-15456-w","article-title":"Cancer neoantigen prioritization through sensitive and reliable proteogenomics analysis","volume":"11","author":"B Wen","year":"2020","journal-title":"Nat Commun"},{"key":"pcbi.1009736.ref039","unstructured":"Deep learning the collisional cross sections of the peptide universe from a million experimental values | Nature Communications [Internet]. [cited 2021 Feb 26]. 
Available from: https:\/\/www.nature.com\/articles\/s41467-021-21352-8."},{"issue":"1","key":"pcbi.1009736.ref040","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1016\/j.cels.2020.06.010","article-title":"MHCflurry 2.0: Improved Pan-Allele Prediction of MHC Class I-Presented Peptides by Incorporating Antigen Processing.","volume":"11","author":"TJ O\u2019Donnell","year":"2020","journal-title":"Cell Syst"},{"issue":"3","key":"pcbi.1009736.ref041","doi-asserted-by":"crossref","first-page":"396","DOI":"10.1158\/2326-6066.CIR-19-0464","article-title":"High-Throughput Prediction of MHC Class I and II Neoantigens with MHCnuggets.","volume":"8","author":"XM Shao","year":"2020","journal-title":"Cancer Immunol Res"},{"issue":"1","key":"pcbi.1009736.ref042","doi-asserted-by":"crossref","first-page":"99.11","DOI":"10.4049\/jimmunol.200.Supp.99.11","article-title":"A Random Forest based approach to MHC class I epitope prediction and analysis","volume":"200","author":"EA Wilson","year":"2018","journal-title":"J Immunol"},{"issue":"1","key":"pcbi.1009736.ref043","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/s12859-018-2561-z","article-title":"Predicting peptide presentation by major histocompatibility complex class I: an improved machine learning approach to the immunopeptidome.","volume":"20","author":"KM Boehm","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1009736.ref044","article-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.","author":"J Devlin","year":"2019","journal-title":"ArXiv181004805 Cs"},{"key":"pcbi.1009736.ref045","article-title":"Language Models are Few-Shot Learners.","author":"TB Brown","year":"2020","journal-title":"ArXiv200514165 CsInternet]"},{"key":"pcbi.1009736.ref046","article-title":"BERTology Meets Biology: Interpreting Attention in Protein Language Models.","author":"J Vig","year":"2021","journal-title":"ArXiv200615222 Cs 
Q-Bio"},{"issue":"6","key":"pcbi.1009736.ref047","first-page":"1689","article-title":"High-Throughput Identification of MHC Class I Binding Peptides Using an Ultradense Peptide Array","volume":"204","author":"AK Haj","year":"1950","journal-title":"J Immunol Baltim Md"},{"key":"pcbi.1009736.ref048","doi-asserted-by":"crossref","first-page":"2565","DOI":"10.1145\/3447548.3467166","article-title":"TimeSHAP: Explaining Recurrent Models through Sequence Perturbations.","author":"J Bento","year":"2021","journal-title":"Proc 27th ACM SIGKDD Conf KnowlDiscov Data Min."},{"key":"pcbi.1009736.ref049","first-page":"16","volume-title":"Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation","author":"E Kokalj"},{"key":"pcbi.1009736.ref050","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1186\/1471-2105-15-241","article-title":"Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions","volume":"15","author":"Y Kim","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1009736.ref051","doi-asserted-by":"crossref","first-page":"D405","DOI":"10.1093\/nar\/gku938","article-title":"The immune epitope database (IEDB) 3.0.","volume":"43","author":"R Vita","year":"2015","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"pcbi.1009736.ref052","doi-asserted-by":"crossref","first-page":"4690","DOI":"10.1172\/JCI88590","article-title":"MHC class I\u2013associated peptides derive from selective regions of the human genome","volume":"126","author":"H Pearson","year":"2016","journal-title":"J Clin Invest"},{"issue":"5","key":"pcbi.1009736.ref053","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1110\/ps.0239403","article-title":"Reliable prediction of T-cell epitopes using neural networks with novel sequence representations","volume":"12","author":"M Nielsen","year":"2003","journal-title":"Protein Sci Publ Protein 
Soc"},{"key":"pcbi.1009736.ref054","unstructured":"Chollet, Fran\u00e7ois. Keras [Internet]. [cited 2021 Jan 12]. Available from: https:\/\/keras.io\/."},{"key":"pcbi.1009736.ref055","article-title":"TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.:","volume":"19","author":"M Abadi"},{"key":"pcbi.1009736.ref056","article-title":"Adam: A Method for Stochastic Optimization.","author":"DP Kingma","year":"2017","journal-title":"ArXiv14126980 Cs"},{"key":"pcbi.1009736.ref057","first-page":"I-115-I","volume-title":"Proceedings of the 30th International Conference on International Conference on Machine Learning\u2014Volume 28","author":"J Bergstra","year":"2013"},{"key":"pcbi.1009736.ref058","volume":"9","author":"JS Bergstra","journal-title":"Algorithms for Hyper-Parameter Optimization."},{"issue":"3","key":"pcbi.1009736.ref059","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/MCSE.2007.55","article-title":"Matplotlib: A 2D Graphics Environment.","volume":"9","author":"JD Hunter","year":"2007","journal-title":"Comput Sci Eng"},{"key":"pcbi.1009736.ref060","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python.","volume":"12","author":"F Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"pcbi.1009736.ref061","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1145\/2939672.2939785","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"T Chen","year":"2016"},{"issue":"11","key":"pcbi.1009736.ref062","doi-asserted-by":"crossref","first-page":"6387","DOI":"10.4049\/jimmunol.165.11.6387","article-title":"Definition of the Mamu A*01 Peptide Binding Specificity: Application to the Identification of Wild-Type and Optimized Ligands from Simian Immunodeficiency Virus Regulatory Proteins","volume":"165","author":"J Sidney","year":"2000","journal-title":"J 
Immunol"},{"key":"pcbi.1009736.ref063","doi-asserted-by":"crossref","DOI":"10.2172\/1248095","volume-title":"HIV Molecular Immunology 2015","author":"K Yusim","year":"2016"},{"issue":"7","key":"pcbi.1009736.ref064","doi-asserted-by":"crossref","first-page":"1167","DOI":"10.1007\/s13361-011-0118-8","article-title":"What Happens to Hydrophobic Interactions during Transfer from the Solution to the Gas Phase? The Case of Electrospray-Based Soft Ionization Methods","volume":"22","author":"K Barylyuk","year":"2011","journal-title":"J Am Soc Mass Spectrom"},{"issue":"3","key":"pcbi.1009736.ref065","doi-asserted-by":"crossref","first-page":"394","DOI":"10.1111\/imm.12889","article-title":"Improved methods for predicting peptide binding affinity to MHC class II molecules","volume":"154","author":"KK Jensen","year":"2018","journal-title":"Immunology"},{"key":"pcbi.1009736.ref066","article-title":"What made you do this? Understanding black-box decisions with sufficient input subsets.","author":"B Carter","year":"2019","journal-title":"ArXiv181003805 Cs Stat"}],"container-title":["PLOS Computational 
Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009736","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T22:12:15Z","timestamp":1674598335000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009736"}},"subtitle":[],"editor":[{"given":"Ilya","family":"Ioshikhes","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,1,28]]},"references-count":66,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1,28]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009736","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.03.04.433939","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,28]]}}}