{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T06:20:41Z","timestamp":1773382841895,"version":"3.50.1"},"reference-count":59,"publisher":"Oxford University Press (OUP)","issue":"24","license":[{"start":{"date-parts":[[2021,7,21]],"date-time":"2021-07-21T00:00:00Z","timestamp":1626825600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"Coordination for the Improvement of Higher Education Personnel","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"CAPES","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,12,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>In silico identification of linear B-cell epitopes represents an important step in the development of diagnostic tests and vaccine candidates, by providing potential high-probability targets for experimental investigation. Current predictive tools were developed under a generalist approach, training models with heterogeneous datasets to develop predictors that can be deployed for a wide variety of pathogens. However, continuous advances in processing power and the increasing amount of epitope data for a broad range of pathogens indicate that training organism or taxon-specific models may become a feasible alternative, with unexplored potential gains in predictive performance.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>This article shows how organism-specific training of epitope prediction models can yield substantial performance gains across several quality metrics when compared to models trained with heterogeneous and hybrid data, and with a variety of widely used predictors from the literature. These results suggest a promising alternative for the development of custom-tailored predictive models with high predictive power, which can be easily implemented and deployed for the investigation of specific pathogens.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The data underlying this article, as well as the full reproducibility scripts, are available at https:\/\/github.com\/fcampelo\/OrgSpec-paper. The R package that implements the organism-specific pipeline functions is available at https:\/\/github.com\/fcampelo\/epitopes.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary materials are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab536","type":"journal-article","created":{"date-parts":[[2021,7,19]],"date-time":"2021-07-19T19:12:31Z","timestamp":1626721951000},"page":"4826-4834","source":"Crossref","is-referenced-by-count":16,"title":["Organism-specific training improves performance of linear B-cell epitope prediction"],"prefix":"10.1093","volume":"37","author":[{"given":"Jodie","family":"Ashford","sequence":"first","affiliation":[{"name":"Department of Computer Science, College of Engineering and Physical Sciences, Aston University , Birmingham B4 7ET, UK"}]},{"given":"Jo\u00e3o","family":"Reis-Cunha","sequence":"additional","affiliation":[{"name":"Department of Preventive Veterinary Medicine, Universidade Federal de Minas Gerais , Belo Horizonte 31270-901, Brazil"}]},{"given":"Igor","family":"Lobo","sequence":"additional","affiliation":[{"name":"Graduate Program in Genetics, Universidade Federal de Minas Gerais , Belo Horizonte 31270-901, Brazil"}]},{"given":"Francisco","family":"Lobo","sequence":"additional","affiliation":[{"name":"Department of General Biology, Universidade Federal de Minas Gerais , Belo Horizonte 31270-901, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8432-4325","authenticated-orcid":false,"given":"Felipe","family":"Campelo","sequence":"additional","affiliation":[{"name":"Department of Computer Science, College of Engineering and Physical Sciences, Aston University , Birmingham B4 7ET, UK"}]}],"member":"286","published-online":{"date-parts":[[2021,7,21]]},"reference":[{"key":"2023051607135575700_btab536-B1","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1016\/S0264-410X(99)00329-1","article-title":"Predictive estimation of protein linear epitopes by using the program people","volume":"18","author":"Alix","year":"1999","journal-title":"Vaccine"},{"key":"2023051607135575700_btab536-B2","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.3201\/eid2407.171928","article-title":"Integrated serologic surveillance of population immunity and disease transmission","volume":"24","author":"Arnold","year":"2018","journal-title":"Emerging Infect. Dis"},{"key":"2023051607135575700_btab536-B3","doi-asserted-by":"crossref","first-page":"e371","DOI":"10.1371\/journal.pmed.0030371","article-title":"River blindness: a success story under threat?","volume":"3","author":"Bas\u00e1\u00f1ez","year":"2006","journal-title":"PLoS Med"},{"key":"2023051607135575700_btab536-B4","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1110\/ps.041059505","article-title":"Benchmarking b cell epitope prediction: underperformance of existing methods","volume":"14","author":"Blythe","year":"2005","journal-title":"Protein Sci"},{"key":"2023051607135575700_btab536-B5","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1186\/s12864-019-6413-7","article-title":"The advantages of the Matthews correlation coefficient (MCC) over f1 score and accuracy in binary classification evaluation","volume":"21","author":"Chicco","year":"2020","journal-title":"BMC Genomics"},{"key":"2023051607135575700_btab536-B6","doi-asserted-by":"crossref","first-page":"448","DOI":"10.1093\/bioinformatics\/btaa773","article-title":"EpiDope: a deep neural network for linear B-cell epitope prediction","volume":"37","author":"Collatz","year":"2021","journal-title":"Bioinformatics"},{"key":"2023051607135575700_btab536-B7","volume-title":"Bootstrap Methods and Their Application","author":"Davison","year":"2013"},{"key":"2023051607135575700_btab536-B8","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1002\/jmr.893","article-title":"Predicting linear B-cell epitopes using string kernels","volume":"21","author":"EL-Manzalawy","year":"2008","journal-title":"J. Mol. Recognit. Interdiscipl. J"},{"key":"2023051607135575700_btab536-B9","doi-asserted-by":"crossref","first-page":"327","DOI":"10.4254\/wjh.v7.i3.327","article-title":"HCV syndrome: a constellation of organ- and non-organ specific autoimmune disorders, B-cell non-Hodgkin\u2019s lymphoma, and cancer","volume":"7","author":"Ferri","year":"2015","journal-title":"World J. Hepatol"},{"key":"2023051607135575700_btab536-B10","doi-asserted-by":"crossref","first-page":"e0121673","DOI":"10.1371\/journal.pone.0121673","article-title":"Dissecting antibodies with regards to linear and conformational epitopes","volume":"10","author":"Forsstr\u00f6m","year":"2015","journal-title":"PLoS One"},{"key":"2023051607135575700_btab536-B11","doi-asserted-by":"crossref","first-page":"703","DOI":"10.1089\/cmb.2008.0173","article-title":"Interpretable numerical descriptors of amino acid space","volume":"16","author":"Georgiev","year":"2009","journal-title":"J. Comput. Biol"},{"key":"2023051607135575700_btab536-B12","first-page":"1","author":"Getzoff","year":"1988"},{"key":"2023051607135575700_btab536-B13","first-page":"11","article-title":"B-pred, a structure based B-cell epitopes prediction server","volume":"5","author":"Giac\u00f2","year":"2012","journal-title":"Adv. Appl. Bioinf. Chem"},{"key":"2023051607135575700_btab536-B14","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1002\/jmr.815","article-title":"Towards a consensus on datasets and evaluation metrics for developing B-cell epitope prediction tools","volume":"20","author":"Greenbaum","year":"2007","journal-title":"J. Mol. Recognit. Interdiscipl. J"},{"key":"2023051607135575700_btab536-B15","doi-asserted-by":"crossref","first-page":"2558","DOI":"10.1110\/ps.062405906","article-title":"Prediction of residues in discontinuous B-cell epitopes using protein 3D structures","volume":"15","author":"Haste Andersen","year":"2006","journal-title":"Protein Sci"},{"key":"2023051607135575700_btab536-B16","first-page":"65","article-title":"A simple sequentially rejective multiple test procedure","volume":"6","author":"Holm","year":"1979","journal-title":"Scand. J. Stat"},{"key":"2023051607135575700_btab536-B17","doi-asserted-by":"crossref","first-page":"3824","DOI":"10.1073\/pnas.78.6.3824","article-title":"Prediction of protein antigenic determinants from amino acid sequences","volume":"78","author":"Hopp","year":"1981","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051607135575700_btab536-B18","doi-asserted-by":"crossref","first-page":"W24","DOI":"10.1093\/nar\/gkx346","article-title":"Bepipred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes","volume":"45","author":"Jespersen","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023051607135575700_btab536-B19","doi-asserted-by":"crossref","first-page":"298","DOI":"10.3389\/fimmu.2019.00298","article-title":"Antibody specific B-cell epitope predictions: leveraging information from antibody-antigen protein complexes","volume":"10","author":"Jespersen","year":"2019","journal-title":"Front. Immunol"},{"key":"2023051607135575700_btab536-B20","author":"Kaufman","year":"2011"},{"key":"2023051607135575700_btab536-B21","author":"Kindt","year":"2007"},{"key":"2023051607135575700_btab536-B22","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1016\/0014-5793(90)80535-Q","article-title":"A semi-empirical method for prediction of antigenic determinants on protein antigens","volume":"276","author":"Kolaskar","year":"1990","journal-title":"FEBS Lett"},{"key":"2023051607135575700_btab536-B23","doi-asserted-by":"crossref","first-page":"W168","DOI":"10.1093\/nar\/gki460","article-title":"CEP: a conformational epitope prediction server","volume":"33","author":"Kulkarni-Kale","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023051607135575700_btab536-B24","doi-asserted-by":"crossref","first-page":"3149","DOI":"10.2174\/138161210793292447","article-title":"Epitope discovery and their use in peptide based vaccines","volume":"16","author":"Dudek","year":"2010","journal-title":"Curr. Pharm. Des"},{"key":"2023051607135575700_btab536-B25","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/1745-7580-2-2","article-title":"Improved method for predicting linear B-cell epitopes","volume":"2","author":"Larsen","year":"2006","journal-title":"Immunome Res"},{"key":"2023051607135575700_btab536-B26","first-page":"149","author":"Leinikki","year":"1993"},{"key":"2023051607135575700_btab536-B27","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/1471-2105-14-S4-S3","article-title":"Prediction of conformational epitopes with the use of a knowledge-based energy function and geometrically related neighboring residue characteristics","volume":"14","author":"Lo","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023051607135575700_btab536-B28","author":"Lodish","year":"2000","edition":"4th edn"},{"key":"2023051607135575700_btab536-B29","doi-asserted-by":"crossref","first-page":"1695","DOI":"10.3389\/fimmu.2018.01695","article-title":"iBCE-EL: a new ensemble learning framework for improved linear B-cell epitope prediction","volume":"9","author":"Manavalan","year":"2018","journal-title":"Front. Immunol"},{"key":"2023051607135575700_btab536-B30","first-page":"D7","article-title":"Database resources of the national center for biotechnology information","volume":"44","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023051607135575700_btab536-B31","doi-asserted-by":"crossref","first-page":"2021","DOI":"10.1016\/S0140-6736(07)60942-8","article-title":"Prevalence and intensity of Onchocerca volvulus infection and efficacy of ivermectin in endemic communities in Ghana: a two-phase epidemiological study","volume":"369","author":"Osei-Atweneboana","year":"2007","journal-title":"Lancet"},{"key":"2023051607135575700_btab536-B32","doi-asserted-by":"crossref","first-page":"4","DOI":"10.32614\/RJ-2015-001","article-title":"Peptides: a package for data mining of antimicrobial peptides","volume":"7","author":"Osorio","year":"2015","journal-title":"R. J"},{"key":"2023051607135575700_btab536-B33","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1002\/pro.3774","article-title":"Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and MCSM, using machine learning","volume":"29","author":"Pandurangan","year":"2020","journal-title":"Protein Sci"},{"key":"2023051607135575700_btab536-B34","doi-asserted-by":"crossref","first-page":"5425","DOI":"10.1021\/bi00367a013","article-title":"New hydrophilicity scale derived from high-performance liquid chromatography peptide retention data: correlation of predicted surface residues with antigenicity and X-ray-derived accessible sites","volume":"25","author":"Parker","year":"1986","journal-title":"Biochemistry"},{"key":"2023051607135575700_btab536-B35","volume-title":"Fundamental Immunology","author":"Paul","year":"2012","edition":"7th edn"},{"key":"2023051607135575700_btab536-B36","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1016\/0263-7855(93)80074-2","article-title":"Preditop: a program for antigenicity prediction","volume":"11","author":"Pellequer","year":"1993","journal-title":"J. Mol. Graph"},{"key":"2023051607135575700_btab536-B37","first-page":"176","author":"Pellequer","year":"1991"},{"key":"2023051607135575700_btab536-B38","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1016\/0165-2478(93)90072-A","article-title":"Correlation between the location of antigenic sites and the prediction of turns in proteins","volume":"36","author":"Pellequer","year":"1993","journal-title":"Immunol. Lett"},{"key":"2023051607135575700_btab536-B39","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1186\/1472-6807-7-64","article-title":"Antibody-protein interactions: benchmark datasets and prediction tools evaluation","volume":"7","author":"Ponomarenko","year":"2007","journal-title":"BMC Struct. Biol"},{"key":"2023051607135575700_btab536-B40","doi-asserted-by":"crossref","first-page":"6760830","DOI":"10.1155\/2016\/6760830","article-title":"An introduction to B-cell epitope mapping and in silico epitope prediction","volume":"2016","author":"Potocnakova","year":"2016","journal-title":"J. Immunol. Res"},{"key":"2023051607135575700_btab536-B41","year":"2020"},{"key":"2023051607135575700_btab536-B42","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1016\/j.humpath.2018.05.020","article-title":"Epstein\u2013Barr virus (EBV)-associated lymphoid proliferations, a 2018 update","volume":"79","author":"Rezk","year":"2018","journal-title":"Hum. Pathol"},{"key":"2023051607135575700_btab536-B43","first-page":"197","author":"Saha","year":"2004"},{"key":"2023051607135575700_btab536-B44","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1002\/prot.21078","article-title":"Prediction of continuous B-cell epitopes in an antigen using recurrent neural network","volume":"65","author":"Saha","year":"2006","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2023051607135575700_btab536-B45","doi-asserted-by":"crossref","first-page":"2680160","DOI":"10.1155\/2017\/2680160","article-title":"Fundamentals and methods for T- and B-cell epitope prediction","volume":"2017","author":"Sanchez-Trincado","year":"2017","journal-title":"J. Immunol. Res"},{"key":"2023051607135575700_btab536-B46","doi-asserted-by":"crossref","first-page":"648","DOI":"10.1089\/omi.2015.0095","article-title":"Harnessing computational biology for exact linear B-cell epitope prediction: a novel amino acid composition-based feature descriptor","volume":"19","author":"Saravanan","year":"2015","journal-title":"Omics J. Integr. Biol"},{"key":"2023051607135575700_btab536-B47","doi-asserted-by":"crossref","first-page":"4337","DOI":"10.1073\/pnas.0607879104","article-title":"Predicting protein\u2013protein interactions based only on sequences information","volume":"104","author":"Shen","year":"2007","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051607135575700_btab536-B48","doi-asserted-by":"crossref","first-page":"e62216","DOI":"10.1371\/journal.pone.0062216","article-title":"Improved method for linear B-cell epitope prediction using antigen\u2019s primary sequence","volume":"8","author":"Singh","year":"2013","journal-title":"PLoS One"},{"key":"2023051607135575700_btab536-B50","author":"Tan","year":"2005"},{"key":"2023051607135575700_btab536-B51","first-page":"D480","article-title":"UniProt: the universal protein knowledgebase in 2021","volume":"49","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2023051607135575700_btab536-B52","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023051607135575700_btab536-B53","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1006\/meth.1996.0054","article-title":"Mapping epitope structure and activity: from one-dimensional prediction to four-dimensional description of antigenic specificity","volume":"9","author":"Van Regenmortel","year":"1996","journal-title":"Methods"},{"key":"2023051607135575700_btab536-B54","doi-asserted-by":"crossref","first-page":"D339","DOI":"10.1093\/nar\/gky1006","article-title":"The immune epitope database (IEDB): 2018 update","volume":"47","author":"Vita","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023051607135575700_btab536-B55","doi-asserted-by":"crossref","first-page":"2373","DOI":"10.3390\/ijms18112373","article-title":"Protein\u2013protein interactions prediction using a novel local conjoint triad descriptor of amino acid sequences","volume":"18","author":"Wang","year":"2017","journal-title":"Int. J. Mol. Sci"},{"key":"2023051607135575700_btab536-B56","year":"2019"},{"key":"2023051607135575700_btab536-B57","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v077.i01","article-title":"ranger: a fast implementation of random forests for high dimensional data in C++ and R","volume":"77","author":"Wright","year":"2017","journal-title":"J. Stat. Softw"},{"key":"2023051607135575700_btab536-B58","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1002\/rmv.602","article-title":"An introduction to epitope prediction methods and software","volume":"19","author":"Yang","year":"2009","journal-title":"Rev. Med. Virol"},{"key":"2023051607135575700_btab536-B59","doi-asserted-by":"crossref","first-page":"e45152","DOI":"10.1371\/journal.pone.0045152","article-title":"SVMTriP: a method to predict antigenic epitopes using support vector machine to integrate tri-peptide similarity and propensity","volume":"7","author":"Yao","year":"2012","journal-title":"PLoS One"},{"key":"2023051607135575700_btab536-B60","doi-asserted-by":"crossref","first-page":"e62249","DOI":"10.1371\/journal.pone.0062249","article-title":"Conformational B-cell epitope prediction on antigen protein structures: a review of current algorithms and comparison with common binding site prediction methods","volume":"8","author":"Yao","year":"2013","journal-title":"PLoS One"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab536\/39711202\/btab536.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/24\/4826\/50334814\/btab536.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/24\/4826\/50334814\/btab536.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T07:44:45Z","timestamp":1684223085000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/24\/4826\/6325084"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,7,21]]},"references-count":59,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2021,12,11]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab536","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,12,15]]},"published":{"date-parts":[[2021,7,21]]}}}